Contents

1 Uncertainty and inference
  1.1 The goal of perception
  1.2 Hypotheses and their probabilities
  1.3 Sensory noise and perceptual ambiguity
  1.4 Bayesian inference in visual perception
  1.5 Bayesian inference in auditory perception
  1.6 Historical background: perception as unconscious inference
  1.7 Summary
  1.8 Suggested readings
  1.9 Problems
2 Using Bayes' rule
  2.1 Steps of Bayesian modeling
  2.2 Alternative form of Bayes' rule
  2.3 Areal representation
  2.4 The prosecutor's fallacy
  2.5 A changing prior: luggage carousel example
  2.6 A flat prior: Gestalt perception example
  2.7 Optimality, evolution, and motivations for Bayesian modeling
  2.8 Summary
  2.9 Suggested readings
  2.10 Problems
3 Bayesian inference under measurement noise
  3.1 The steps of Bayesian modeling
  3.2 Step 1: The generative model
    3.2.1 The measurement: an abstracted sensory representation
    3.2.2 Graphical model
    3.2.3 The stimulus distribution
    3.2.4 The measurement distribution
    3.2.5 Joint distribution
  3.3 Step 2: Inference
    3.3.1 The prior distribution
    3.3.2 The likelihood function
    3.3.3 The posterior distribution
    3.3.4 The posterior mean
    3.3.5 Width of the posterior
    3.3.6 The posterior mean estimate
    3.3.7 The MAP estimate
  3.4 Uncertainty and confidence
    3.4.1 Uncertainty
    3.4.2 Bayesian confidence
  3.5 Model mismatch in inference
    3.5.1 Prior mismatch
    3.5.2 Improper priors
  3.6 Heteroskedasticity
  3.7 Magnitude variables
  3.8 Applications
  3.9 Percepts
  3.10 Summary
  3.11 Suggested readings
  3.12 Problems
4 The response distribution
  4.1 Inherited variability
  4.2 The response distribution
  4.3 Belief versus response distributions
  4.4 Maximum-likelihood estimation
  4.5 Bias and mean squared error
    4.5.1 An 'inverted bias' perspective
    4.5.2 All expenses paid
  4.6 Other estimates
  4.7 Decision noise and response noise
  4.8 Misconceptions
  4.9 Reflections on Bayesian models
  4.10 Summary
  4.11 Suggested readings
  4.12 Problems
5 Cue combination and evidence accumulation
  5.1 What is cue combination?
  5.2 Formulation of the Bayesian model
    5.2.1 Step 1: Generative model
    5.2.2 Step 2: Inference
    5.2.3 Step 3: Estimate distribution
  5.3 Artificial cue conflict
    5.3.1 Distinguishing the distributions
  5.4 Generalizations: prior, multiple cues
  5.5 Evidence accumulation
  5.6 Cue combination under ambiguity
  5.7 Applications
  5.8 Summary
  5.9 Suggested readings
  5.10 Problems
6 Learning as inference
  6.1 The many forms of learning
  6.2 Learning the probability of a binary event
    6.2.1 Prediction
    6.2.2 Update equations
    6.2.3 Uncertainty
    6.2.4 Binomial distribution
    6.2.5 Non-uniform prior
  6.3 Linking Bayesian learning to reinforcement learning
  6.4 Learning the precision of a normal distribution
    6.4.1 Why not infer variance?
    6.4.2 Prediction
  6.5 Learning the slope of a linear relationship
  6.6 Learning the structure of a causal model
  6.7 More learning
  6.8 Summary
  6.9 Suggested readings
  6.10 Problems
7 Discrimination and detection
  7.1 Example tasks
  7.2 Discrimination
    7.2.1 Step 1: Generative model
    7.2.2 Step 2: Inference
    7.2.3 Gaussian model
Preamble

Bayesian Models of Perception and Action. Copyright Wei Ji Ma, Konrad Kording, Daniel Goldreich, 2022, with original artwork by Brennan Klein. This is only a draft; the book will be published by MIT Press. The LaTeX template for this draft was based on The Legrand Orange Book, downloaded from http://www.LaTeXTemplates.com. Companion website: www.bayesianmodeling.com.

Dedication

We dedicate this book to the memory of David Knill (1961-2014). All three of us have learned a good part of what we know about Bayesian modeling of perception and action from him. As a caring and patient mentor and as an excellent teacher, he also made studying this topic a lot more enjoyable for all of us. The field of Bayesian modeling of perception and action would not be where it is without him, and this book would probably never have been written.

Acknowledgments

This book has been a long time in the making, and we are indebted to many people. We first came up with the idea in June 2009, when – together with Alan Stocker and Jonathan Pillow – we taught a computational neuroscience course at the Instituto Gulbenkian de Ciência in Oeiras, Portugal. At the time, in an impressive display of unbridled optimism, K.K. predicted that we would be done by December 2009. A short 13 years later, we have the book in hand. The delay has come with benefits, though: over the years, we have used chapter drafts and the book's ideas to teach Bayesian modeling to hundreds of undergraduate students, graduate students, and postdocs in our courses at McMaster University, Baylor College of Medicine, Northwestern University, New York University, and the University of Pennsylvania, and in tutorials at conferences and summer schools.
Many of these students, as well as our teaching assistants – notably Ronald van den Berg, Anna Kutschireiter, Lucy Lai, Jennifer Laura Lee, Julie Lee, Jorge Menendez, Sashank Pisupati, Anne-Lene Sax, Shan Shen, Bei Xiao, and Hörmet Yiltiz – and lab members (too many to list) contributed numerous corrections, comments, problem solutions, and problem suggestions. We thank Nuwan de Silva for test-solving all problems in an earlier version of the book. We thank readers of various drafts, in particular Luigi Acerbi, Robert Jacobs, Michael Landy, Zili Liu, and Javier Traver, for providing deep and useful feedback on the content and exposition; we also thank Robert and Zili for being two of our most steadfast supporters over the years. We are deeply grateful to Brennan Klein, a postdoc at Northeastern University, for professionalizing and redesigning our figures throughout the book, as well as for many fun drawings and for making us more dutiful as authors. This book would never have been finished without the help of 1,3,7-trimethylxanthine, and we are thankful for its existence. We are grateful to Robert Prior from MIT Press, by name destined to be our editor, who repeatedly set us firm deadlines, which he patiently consented to extend when we invariably missed them, and who made a free online version possible. Finally, we would like to thank our families, who have been unreasonably supportive all these years.

The four steps of Bayesian modeling (figure overview)

Step 1: Generative model. Draw the generative model as a graph in which each node is a variable and each arrow a dependency; observations/measurements are at the bottom. For each observation, assume a noise model. For every other variable, obtain the distribution from the experimental design or from natural statistics; if a variable has incoming arrows, the distribution is a conditional one. Example from Chapters 3-4, with stimulus s and measurement x:

  p(s) = N(s; μ, σ_s²)
  p(x|s) = N(x; s, σ²)

Step 2: Bayesian inference (decision rule). Compute the posterior over the world state of interest. The optimal observer does this using the distributions in the generative model.
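As a minimal sketch of Steps 1 and 2 for the Gaussian example, the posterior is obtained by precision-weighting the prior mean and the measurement. The parameter values (μ = 0, σ_s = 2, σ = 1) and the measurement x = 2.5 are illustrative choices, not values from the book:

```python
import numpy as np

# Illustrative generative-model parameters: s ~ N(mu, sigma_s^2), x|s ~ N(s, sigma^2).
mu, sigma_s, sigma = 0.0, 2.0, 1.0
J_s, J = 1 / sigma_s**2, 1 / sigma**2  # prior and measurement precisions

def posterior(x):
    """Posterior over s given a measurement x: N(s; post_mean, post_var)."""
    post_mean = (J_s * mu + J * x) / (J_s + J)  # precision-weighted average
    post_var = 1 / (J_s + J)                    # precisions add; invert back
    return post_mean, post_var

# A measurement of 2.5 is shrunk toward the prior mean of 0.
mean, var = posterior(2.5)  # → (2.0, 0.8)
```

Because the prior is broad relative to the measurement noise (J_s = 0.25 vs. J = 1), the posterior mean stays close to the measurement.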
Alternatively, the observer might assume different distributions (e.g., wrong beliefs). Marginalize (average) over variables other than the observations and the world state of interest. Assume a utility function, then maximize expected utility under the posterior (alternative: sample from the posterior). The result is a decision rule: a mapping from observations to a decision. When utility is accuracy, the read-out is to maximize the posterior (MAP decision rule). In the example, with precisions J_s ≡ 1/σ_s² and J ≡ 1/σ²:

  L(s; x) = p(x|s)
  p(s|x) ∝ L(s; x) p(s)
  p(s|x) = N(s; (J_s μ + J x)/(J_s + J), 1/(J_s + J))
  ŝ = (J_s μ + J x)/(J_s + J)

Step 3: Response probabilities. For every unique trial in the experiment, compute the probability of each possible response. To do so, use the distribution of the observations given the stimuli (from Step 1) and the decision rule (from Step 2).
- Good method: sample observations according to Step 1; for each, apply the decision rule; tabulate responses.
- Better: integrate numerically over observations.
- Best (when possible): integrate analytically over observations.
Optional: add response noise or lapses. In the example:

  p(ŝ|s) = ∫ p(ŝ|x) p(x|s) dx = N(ŝ; (J_s μ + J s)/(J_s + J), J/(J_s + J)²)

Step 4: Model fitting and model comparison. Maximize the parameter log likelihood, the log probability of the subject's responses across all trials for a hypothesized parameter combination:

  log L(σ; data) = Σ_{i=1}^{n_trials} log p(ŝ_i | s_i, σ)

The result is parameter estimates (e.g., σ̂) and the maximum log likelihood, log L*. Test for parameter recovery and summary-statistics recovery using synthetic data; use more than one algorithm. Optional: check the fit by rerunning the fitted model and comparing summary statistics. For model comparison (e.g., varying Step 2), correct for the number of parameters (e.g., using AIC) and test for model recovery using synthetic data.

(Figure: The four steps of Bayesian modeling.)
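Steps 3 and 4 can likewise be sketched: simulate responses by the sampling ("good") method, then recover the noise parameter σ by maximizing the log likelihood under the analytic response distribution p(ŝ|s) = N(ŝ; (J_s μ + J s)/(J_s + J), J/(J_s + J)²). The parameter values, trial count, and grid below are illustrative assumptions, and a grid search stands in for a proper optimizer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative "true" parameters of the generative model.
mu, sigma_s, sigma_true = 0.0, 2.0, 1.0
J_s = 1 / sigma_s**2

# Step 3 (sampling method): draw stimuli from the prior, add measurement
# noise, and apply the posterior-mean decision rule from Step 2.
n_trials = 5000
s = rng.normal(mu, sigma_s, n_trials)      # stimuli
x = rng.normal(s, sigma_true)              # noisy measurements
J_true = 1 / sigma_true**2
s_hat = (J_s * mu + J_true * x) / (J_s + J_true)  # simulated responses

# Step 4: log likelihood of the responses for a hypothesized sigma,
# using the analytic response distribution
# p(s_hat | s) = N(s_hat; (J_s*mu + J*s)/(J_s + J), J/(J_s + J)^2).
def log_likelihood(sig):
    Jf = 1 / sig**2
    pred_mean = (J_s * mu + Jf * s) / (J_s + Jf)
    pred_var = Jf / (J_s + Jf)**2
    return np.sum(-0.5 * np.log(2 * np.pi * pred_var)
                  - (s_hat - pred_mean)**2 / (2 * pred_var))

# Grid search over candidate noise levels; the maximizer is sigma-hat.
grid = np.linspace(0.5, 2.0, 151)
sigma_hat = grid[np.argmax([log_likelihood(g) for g in grid])]
```

With the model correctly specified and thousands of trials, σ̂ should land close to the generating value σ_true = 1; in practice one would replace the grid with an optimizer and, as the figure advises, verify parameter recovery on synthetic data before fitting real subjects.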