Friday, December 14, 2012

Bayesian inference with probabilistic population codes

Ma WJ, Beck JM, Latham PE, Pouget A (2006) Bayesian inference with probabilistic population codes. Nature Neuroscience 9(11): 1432-1438.

To get Bayes-optimal performance, neurons must be carrying out computations that closely approximate Bayes' rule. Neuronal variability implies that populations of neurons automatically represent probability distributions over the stimulus - a code they call "probabilistic population codes".

Any paper that mentions death by piranha in the first paragraph has got to be good.

Poisson-like variability seen in neuronal responses allows neurons to represent probability distributions in a format that reduces optimal Bayesian inference to simple linear combinations of neural activities.

Equations 2 and 3 describe how to combine two Gaussian distributions (i.e. sensory integration) optimally according to Bayes. This is their definition of optimal:
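As I recall, these are just the standard Gaussian cue-combination formulas: the combined mean is the reliability-weighted average of the two means, and the inverse variances add.

\[
\mu_3 = \frac{\sigma_2^2\,\mu_1 + \sigma_1^2\,\mu_2}{\sigma_1^2 + \sigma_2^2},
\qquad
\frac{1}{\sigma_3^2} = \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}
\]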
So the gain of the population code reflects the variance of the encoded distribution (higher gain means lower variance), and simply adding the two population activities can lead to optimal Bayesian inference.

Figure 2: Inference with probabilistic population codes for Gaussian probability distributions and Poisson variability. The left plots correspond to population codes for two cues, c1 and c2, related to the same variable s. Each of these encodes a probability distribution with a variance inversely proportional to the gains, g1 and g2, of the population codes (K is a constant depending on the width of the tuning curve and the number of neurons). Adding these two population codes leads to the output population activity shown on the right. This output also encodes a probability distribution with a variance inversely proportional to the gain. Because the gain of this code is g1 + g2, and g1 and g2 are inversely proportional to σ1² and σ2², respectively, the inverse variance of the output population code is the sum of the inverse variances associated with c1 and c2. This is precisely the variance expected from an optimal Bayesian inference (equation (3)). In other words, taking the sum of two population codes is equivalent to taking the product of their encoded distributions.
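To convince myself of this, here is a minimal numerical sketch (mine, not the authors' code) under simplifying assumptions: independent Poisson neurons with identical Gaussian tuning curves that tile the stimulus space, so the encoded posterior is Gaussian with mean equal to the spike-weighted average of preferred stimuli and variance equal to the tuning width squared divided by the total spike count. Adding the two spike-count vectors then adds the inverse variances, as in the caption above.

import numpy as np

rng = np.random.default_rng(0)

s_pref = np.linspace(-10, 10, 201)   # preferred stimuli tiling the space
sigma_tc = 2.0                        # tuning-curve width (my choice)
s_true = 1.0                          # the stimulus

def tuning(s, gain):
    # Gaussian tuning curves, scaled by the gain
    return gain * np.exp(-(s - s_pref) ** 2 / (2 * sigma_tc ** 2))

def decode_gaussian(r):
    # For this idealized code the posterior over s is Gaussian with
    # mean = sum(r * s_pref) / sum(r) and variance = sigma_tc^2 / sum(r)
    return (r @ s_pref) / r.sum(), sigma_tc ** 2 / r.sum()

r1 = rng.poisson(tuning(s_true, gain=5.0))    # cue 1: low gain, high variance
r2 = rng.poisson(tuning(s_true, gain=20.0))   # cue 2: high gain, low variance

mu1, var1 = decode_gaussian(r1)
mu2, var2 = decode_gaussian(r2)
mu3, var3 = decode_gaussian(r1 + r2)          # just add the spike counts

print("cue 1 :", mu1, var1)
print("cue 2 :", mu2, var2)
print("summed:", mu3, var3)
print("1/var1 + 1/var2 =", 1/var1 + 1/var2, " vs  1/var3 =", 1/var3)

The last line comes out equal by construction, since the total spike count of the summed pattern is the sum of the two totals - which is exactly the point of Figure 2.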

They derive generalizations of this - i.e. to tuning curves and distributions that are not Gaussian. Essentially, optimality can be obtained even if the neurons are not independent, or if their tuning curves are not all of the same form (e.g. some Gaussian, some sigmoidal). The covariance matrix of the neural responses must be proportional to the gain.
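As I remember the formalism, the family that makes this work is the one where the stimulus enters linearly in the exponent of the response distribution (their "Poisson-like" variability; any gain dependence is absorbed into the prefactor):

\[
p(\mathbf{r}\mid s) = \phi(\mathbf{r})\, e^{\mathbf{h}(s)^{\top}\mathbf{r}}
\;\;\Longrightarrow\;\;
p(s\mid \mathbf{r}_1+\mathbf{r}_2) \;\propto\; e^{\mathbf{h}(s)^{\top}(\mathbf{r}_1+\mathbf{r}_2)} \;\propto\; p(s\mid\mathbf{r}_1)\, p(s\mid\mathbf{r}_2)
\]

(assuming a flat prior), which is exactly the statement that adding spike counts multiplies the encoded distributions.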

Can also incorporate a prior distribution that is not flat.
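My notes don't record exactly how the paper handles this, but with the code above one natural way (my assumption, not necessarily their scheme) is to encode the prior as a fixed activity pattern r0 with \(p(s) \propto e^{\mathbf{h}(s)^{\top}\mathbf{r}_0}\); adding r0 to the summed responses then gives the full posterior:

\[
p(s\mid\mathbf{r}_1,\mathbf{r}_2) \;\propto\; p(s)\, p(\mathbf{r}_1\mid s)\, p(\mathbf{r}_2\mid s) \;\propto\; e^{\mathbf{h}(s)^{\top}(\mathbf{r}_0+\mathbf{r}_1+\mathbf{r}_2)}
\]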

They do a simulation with integrate-and-fire neurons, similar to Figure 2, and show that it works.

The population code reflects not only the value of the stimulus but also the uncertainty, which is carried by the gain of the population.

Need divisive normalization to prevent saturation.
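For reference, divisive normalization just means rescaling each neuron by the pooled activity so the total gain stays bounded; a generic sketch (the constants are mine - the paper's exact normalization isn't in my notes):

def divisive_normalization(r, sigma=1.0, target_gain=20.0):
    # r: numpy array of spike counts / rates
    # sigma: semi-saturation constant (assumed)
    # target_gain: overall scale of the normalized output (assumed)
    return target_gain * r / (sigma + r.sum())

In the sketch above this would be applied to the summed pattern, e.g. divisive_normalization(r1 + r2).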


Pretty cool stuff. Another example of why divisive normalization is an essential computation for the brain. I also like how they create separate populations that each represent a distribution and are then combined in a higher-level population.
