Wednesday, October 31, 2012

Adaptive resonance theory: How a brain learns to consciously attend, learn, and recognize a changing world II

Grossberg, S. (2012) Adaptive resonance theory: How a brain learns to consciously attend, learn, and recognize a changing world. Neural Networks.

Starting back with section 9.

He links consciousness to resonance through the idea that the "contents" of an experience - the conscious qualia - are bound to a "symbolic", compressed representation. The feedback in the brain/ART binds the pixels and the symbols together, and this binding is the basis of a conscious experience.

Learning occurs in the resonant state. When there is resonance, learning is activated in both the bottom-up adaptive filter and the top-down expectation pathways (the weights going up and down the hierarchy).

Match learning causes gamma oscillations (ergo gamma is consciousness); mismatch/reset leads to beta oscillations.

The attentional system knows how inputs are categorized, but not whether the categorization is correct; the orienting system knows whether the categorization is correct, but not what is being categorized. This means the orienting system's activation needs to be nonspecific. Medium-term memory (synaptic depression) can be used to lower the chances of getting stuck in the same local-minimum category during the search process. The self-normalizing network is essential - it can act as a real-time probability distribution. The search cycle is probabilistic hypothesis testing and decision making.

ART prototypes are not averages, but the actively selected critical feature patterns upon which the top-down expectations of the category focus attention. "Vigilance" is the level of acceptable matching - low vigilance learns general categories with abstract prototypes, while high vigilance forces a prototype to encode an individual exemplar.

ρ is the vigilance parameter in figure 2. It controls how bad a match can be before search for a new category is initiated. Vigilance can be controlled by a process of match tracking: vigilance "tracks" the degree of match between the input exemplar and the matched prototype, being increased just enough to trigger a reset when a predictive error occurs.
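
As a concrete toy, here's roughly how I picture the vigilance test in an ART-1-style search cycle (my own simplified sketch - the function, the overlap-based choice rule, and all parameter values are mine, not the paper's; inputs are assumed binary and nonzero):

```python
import numpy as np

def art1_category_search(inputs, vigilance=0.7, n_categories=10):
    """Toy ART-1-style search: binary inputs, fuzzy-AND matching.

    Each input is compared against candidate prototypes; if the match
    ratio |I AND w| / |I| falls below vigilance, the orienting system
    "resets" that category and the search moves on, eventually
    recruiting an uncommitted node (weights all ones) for novel inputs.
    """
    dim = len(inputs[0])
    weights = np.ones((n_categories, dim))   # uncommitted nodes start all-ones
    labels = []
    for I in inputs:
        I = np.asarray(I, dtype=float)       # assumed binary, nonzero
        # simplified choice: try categories in order of raw overlap
        overlap = np.minimum(weights, I).sum(axis=1)
        for j in np.argsort(-overlap):
            match = np.minimum(weights[j], I).sum() / I.sum()
            if match >= vigilance:           # resonance: learn
                weights[j] = np.minimum(weights[j], I)  # keep critical features
                labels.append(int(j))
                break
        else:
            labels.append(None)              # capacity exhausted
    return labels, weights
```

Prototypes learn by fuzzy-AND, so they retain only the critical features shared by the exemplars they code; in general, stricter (higher) vigilance recruits more, narrower categories.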

The What stream learns spatially invariant object categories; the Where stream knows object positions and how to move. Interactions between What and Where overcome these complementary informational deficiencies: the two streams interact to bind view-invariant and positionally-invariant object categories.
A view-specific category of a novel object is learned and activates cells at a higher level that will become a view-invariant object category as multiple view-specific categories are associated with it. As the eyes move around the object's surface, multiple view-specific categories are learned and associated with the emerging invariant category. An attentional shroud prevents the view-invariant category from being reset, even while new view-specific categories are reset, as the eyes explore an object. This is done by inhibiting the ITa reset mechanism.

The surface-shroud resonance is formed between the surface representation (V4) and spatial attention (PPC), and focuses attention on the object to be learned. When the shroud collapses, the view-invariant category can be reset, and the eyes can move to a new object.

Next is section 18, page 23.

Tuesday, October 30, 2012

Adaptive resonance theory: How a brain learns to consciously attend, learn, and recognize a changing world I

Grossberg, S. (2012) Adaptive resonance theory: How a brain learns to consciously attend, learn, and recognize a changing world. Neural Networks.

Wow, so don't go overboard with that title. He actually did talk about the "C" word in his talk, and actually he said a lot of things that I agree with. I'm sure we'll get to that. This is a 98 page paper which, after just skimming through and looking at the figures, summarizes 40 years of his research into this idea. When I saw his talk, I thought on the surface that he had some really great insights and understood many of the fundamental principles that I know in my gut are there. So, I think this will be a good read. And so this will just be mostly notes as I read through the paper.

"The review illustrates that ART is currently the most highly developed cognitive and neural theory available, with the broadest explanatory and predictive range."

And if you thought the title was overboard, the abstract is just like, "look, this is how the brain works. It explains everything." And then it lists everything. But really, it basically lists everything...

One of the main problems of modern algorithms is "catastrophic forgetting", basically that algorithms forget as quickly as they learn. The brain somehow overcomes this problem...This is the stability-plasticity dilemma.

Consciousness, Learning, Expectation, Attention, Resonance and synchrony (CLEARS). "All conscious states are resonant states".

"ART accomplishes these properties by proposing how top-down expectations focus attention on salient combinations of cues, and characterizes how attention may operate via a form of self-normalizing "biased competition" (Desimone, 1998)."

ART allows for modal architectures. These are designed to be general-purpose self-organizing processors - more general than AI systems, less general than a Turing machine. Evolutionary pressure shaped these modal architectures to create "Complementary Computing" and "Laminar Computing".

Ok, one of the things that was really interesting was the shunting component of the model. It was never clear to me from his talk what that meant, or if it meant shunting in the way I've been thinking about it. But the equations say this for shunting:
- dx/dt = -Ax - FxJ
where J is the inhibitory input. And this for additive:
- dx/dt = -Ax - EJ

Right. So this to me looks like exactly what I mean by shunting in the leech. x is directly related to V, which in the model is directly related to spiking. So, if x is 0, then the shunting inhibition causes no current to leak out. Compare the conductance form:
dV/dt = gl * (Er - V) + gi * (Er - V) + ge * (Ena - V)
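
To check that intuition numerically, here's a minimal Euler-integration sketch of the two equations (my own toy, not from the paper; I add a constant excitatory drive term so the steady states are nonzero, and all parameter values are arbitrary):

```python
def steady_state(J, A=1.0, E=1.0, F=1.0, drive=2.0, shunting=True,
                 dt=0.01, steps=5000):
    """Euler-integrate dx/dt to steady state under inhibitory input J."""
    x = 0.0
    for _ in range(steps):
        if shunting:
            dx = -A * x + drive - F * x * J  # inhibition scales with x: divisive
        else:
            dx = -A * x + drive - E * J      # inhibition subtracts a fixed amount
        x += dt * dx
    return x
```

The shunting steady state is drive/(A + F*J): inhibition divides the response and can silence it but never drive it negative. The additive steady state, (drive - E*J)/A, just subtracts, and can go arbitrarily negative.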

That was short-term memory. I'm just on page 5.

Medium-term memory sounds like it's based on synaptic depression. This could fit in with that working memory facilitation-depression paper. [Review]

LTM: Gated Steepest Descent Learning. Two algorithms: outstar and instar. In the outstar equation, a source cell i learns a spatial pattern of activation across a network of sampled cells. The instar is the opposite: cell j learns the pattern of signals that activates it. Instar is the competitive learning rule used in self-organizing maps.
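
In update form (my own shorthand for the gated steepest descent rules - function and variable names are mine), instar gates learning by the postsynaptic activity, outstar by the source cell's activity:

```python
import numpy as np

def instar_step(w, x, y, lr=0.1):
    """Instar: the target cell (activity y) learns the incoming signal
    pattern x; learning only happens while y is active (gating)."""
    return w + lr * y * (x - w)

def outstar_step(w, y_pattern, x_source, lr=0.1):
    """Outstar: the source cell (activity x_source) learns the spatial
    pattern of activity y_pattern across the cells it samples."""
    return w + lr * x_source * (y_pattern - w)
```

Both rules move the weights toward the sampled pattern only while their gate is open, which is why the learned prototype tracks what the cell actually codes.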

So he talks about two types of learning. ART is match-based learning. However, ART does not describe mismatch learning, which is like the "Where" pathway and motor system - continually updated.

"3.3. Why procedural memories are not conscious. Brain systems that use inhibitory matching and mismatch learning cannot generate excitatory resonances. Hence, if "all conscious states are resonant states", then spatial and motor representations are not conscious. This way of thinking provides a mechanistic reason why declarative memories (or "learning that"), which are the sort of memories learned by ART, may be conscious, whereas procedural memories (or "learning how"), which are the sort of memories that control spatial orienting and action, are not conscious (Cohen and Squire 1980)."

Top-down activation primes bottom-up excitability - excitatory matching. This is like top-down signals activating the apical tuft, leading to calcium spikes in a subset of neurons. When these neurons then get bottom-up input, they start bursting. The neurons without the top-down inputs are merely spiking, and are not heard as loudly.

Excitatory matching generates resonant brain states: positive feedback between bursting neurons and top-down neurons. This focuses attention on a combination of features (the "critical feature pattern") and triggers learning. Resonance provides a global, context-sensitive indicator that the system is processing data worthy of learning.

"Grossberg (1973) that the shunting, or gain control, properties of membrane equation neurons in an on-center off-surround network enable them to self-normalize their activities"
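
The steady state of such a network can be written down directly. Here's my own restatement of that standard self-normalization result (A is the decay rate, B the activity ceiling; obtained by setting dx_i/dt = -A*x_i + (B - x_i)*I_i - x_i*sum_{k!=i} I_k to zero):

```python
import numpy as np

def shunting_network_steady_state(I, A=1.0, B=1.0):
    """Steady state x_i = B * I_i / (A + sum(I)) of a shunting
    on-center off-surround network: each activity codes the *ratio*
    of its input to the total, and total activity stays bounded by B."""
    I = np.asarray(I, dtype=float)
    return B * I / (A + I.sum())
```

Scaling all inputs by the same factor leaves the pattern of relative activities unchanged - that's the self-normalization, and it's also why the activities can be read as a real-time probability distribution.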

"Zeki and Shipp (1988, p. 316) wrote that “backward connections seem not to excite cells in lower areas, but instead influence the way they respond to stimuli”; that is, they are modulatory"

"Sillito et al. (1994) concluded that “the cortico-thalamic input is only strong enough to exert an effect on those dLGN cells that are additionally polarized by their retinal input...the feedback circuit searches for correlations that support the ‘hypothesis’ represented by a particular pattern of cortical...”

"7. Imagining, Planning, and Hallucinations: Prediction without Action
A top-down expectation is not always modulatory. The excitatory/inhibitory balance in the modulatory on-center of a top-down expectation can be modified by volitional control from the basal ganglia. If, for example, volitional signals inhibit inhibitory interneurons in the on-center, then read-out of a top-down expectation from a recognition category can fire cells in the on-center prototype, not merely modulate them. Such volitional control has been predicted to control mental imagery and the ability to think and plan ahead without external action, a crucial type of predictive competence in humans and other mammals. If these volitional signals become tonically hyperactive, then top-down expectations can fire without overt intention, leading to properties like schizophrenic hallucinations (Grossberg, 2000a). In summary, our ability to learn quickly without catastrophic forgetting led to circuits that can be volitionally modulated to enable imagination, internal thought, and planning. This modulation, which brings a huge evolutionary advantage to those who have it, also carries with it the risk of causing hallucinations."

To learn novel things a complementary system needs to be involved: the orienting system. This system is sensitive to unexpected and unfamiliar events - events where bottom-up inputs do not trigger a strong resonant state. The orienting system resets the poorly matching top-down expectation. This pulls the system out of a poor local minimum, possibly leading to a better hypothesis. But if the matches continue to be poor, the orienting system activates uncommitted cells to learn about the novel information.
Ok, enough for today. On page 15, just about to start section 9.

Thursday, October 25, 2012

Stephen Grossberg

Just went to a talk by Stephen Grossberg from Boston University. It always sucks when you realize that all your ideas have already been thought of by someone else... But basically his talk was about laminar theories of cortical computation. He didn't dive too much into the details of the model, but he showed its predictions and tons of evidence supporting many of them. It was also very clear how much of the stuff in this blog fits in with his model.

So just as an overview: basically the model is that layer 2/3 acts like a pattern-completion circuit. This circuit gets input from layer 4 and feeds back to layer 6. Layers 4 and 6 together act like a pattern-separation/competition/decision circuit. So this idea of two types of microcircuits in cortex is becoming more prevalent.

The general theory is based on what he calls "ART": Adaptive Resonance Theory. And he talked a lot about the balance of excitation and inhibition. Lateral competition. Gain control and shunting inhibition. I'm not sure how deeply biophysical his model gets, or how fully-functional he's made it.

He went into a big discussion about all the different visual areas and how they are cooperatively computing. Each area is essentially doing the same thing on an abstract level, but computing with different inputs and feeding-back in different ways to influence other cortical areas. There are some structures that are not cortical that are playing specific roles such as maintaining object and spatial representations while the eyes move.

He went into a discussion about how top-down control of cortex is done through the apical dendrites. He was mainly talking about attention. It was as if attentional processes feed back to cortex through layer 1 and activate the apical tufts of pyramidal cells. This can turn up the gain of a particular area. He mentioned the point that the top-down processes shouldn't be enough alone to cause activation - as then you'd be hallucinating. He made a lot of mentions of illusory contours. The illusory contour activations are due to lateral connections coming in from "both sides" that lead to activation.

All this made me think of the bursting of pyramidal cells. Top-down influences on the apical tuft will lead to calcium spikes etc., but if the pyramidal cell isn't also receiving bottom-up/lateral activations, then it won't fire. This prevents top-down attentional processes from causing firing and thus causing hallucinations. However, if there are both types of inputs, then the pyramidal cell will be bursting, which turns up the contrast of the representation. Bursting is also known to be important for some types of learning rules (it is easier to get LTP if the cells are bursting), and Grossberg discussed how learning during top-down modulation is important for his model.

So, yeah, definitely need to read his papers...

Tuesday, October 23, 2012

Brief Bursts Self-Inhibit and Correlate the Pyramidal Network

Berger, TK. Silberberg, G. Perin, R. Markram, H. (2010) Brief Bursts Self-Inhibit and Correlate the Pyramidal Network. PLoS Biology 8(9).

A prominent inhibitory pathway is frequency-dependent disynaptic inhibition (FDDI) between L5 pyramidal cells (PCs) and Martinotti cells (MCs). MC inhibition is facilitating, and MC axons extend up to layer 1, targeting the oblique, apical, and tuft dendrites of neighboring PCs.

Whole-cell recordings from TTL5 PCs and L5 MCs. FDDI is modulated by Ih in PC dendrites. PCs that receive both disynaptic inhibition and monosynaptic excitation from neighboring PCs transition from net depolarization to hyperpolarization based on the frequency of excitation.
By activating the pre-synaptic MC and PC simultaneously, the resulting PSPs sum linearly on average. This is because PCs target the basal tree and MCs target the apical tree, so there is no shunting. If the apical trunk is directly excited then there is stronger inhibition at the soma; the further away the trunk electrode, the larger this effect.

Only a few MCs are likely to mediate the FDDI effect. They shut one down by hyperpolarizing it and saw a big reduction in inhibition in the post-synaptic PC. It's possible that hyperpolarizing one MC could shut down multiple (thus underestimating the number of MCs) due to electrical coupling.

MCs can synchronize the activity of PCs. Activating 8-9 PCs with bursts can saturate the MC effect on PCs.

Basically, this fits in with the gist that PC bursting is a communication protocol, and MC inhibition is part of the inhibitory feedback loop that interacts with burst-based signaling. PC bursts seem to arise when the apical tree is activated - typically because of an NMDA and VGCC calcium spike in the apical tuft/trunk. This bursting will activate MCs, which feed back onto the apical tree and reduce the likelihood of further bursting. It's unclear whether this could be a multiplicative mechanism via shunting of the apical tree, or whether it's just additive.

Monday, October 22, 2012

A Hierarchical Structure of Cortical Interneuron Electrical Diversity Revealed by Automated Statistical Analysis

Druckmann, S. Hill, S. Schurmann, F. Markram, H. Segev, I. (2012) A Hierarchical Structure of Cortical Interneuron Electrical Diversity Revealed by Automated Statistical Analysis. Cerebral Cortex.

Electrical diversity of cortical interneurons is well known; there is a standardized classification from the Petilla Interneuron Nomenclature Group (PING). They analyzed ~500 neurons' electrical responses and used statistical hierarchical classification to subdivide the neurons into e-types. They use 38 features to describe the voltage responses, and use clustering methods to put the cells into groups.

Features were extracted from each cell's voltage trace (e.g. AP amplitude, half-width, firing rate, adaptation, ISI), and each cell was then represented by a vector in m-dimensional space - 466 data points from the recordings. PCA was used to determine the prominent components.

Each class is treated as a multidimensional Gaussian. Nested clustering analysis is performed by repeatedly applying k-means clustering. Here's the process:
PCA(Data) -> 10 Features -> 2 clusters of Data -> repeat for each cluster.
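
A sketch of how I read that nested procedure (pure numpy; the split criterion, depth handling, and function names are my guesses, not the paper's actual pipeline):

```python
import numpy as np

def pca_reduce(X, k):
    """Project data onto the top-k principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def kmeans2(Z, iters=20):
    """Minimal 2-means, initialized at the extremes of the first PC axis."""
    order = np.argsort(Z[:, 0])
    centers = np.array([Z[order[0]], Z[order[-1]]])
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for c in (0, 1):
            if (labels == c).any():
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

def nested_cluster(X, depth=2, n_components=10):
    """PCA -> split into 2 clusters -> recurse on each cluster.
    Returns leaf groups as arrays of row indices into X."""
    if depth == 0 or len(X) < 4:
        return [np.arange(len(X))]
    Z = pca_reduce(X, min(n_components, X.shape[1]))
    labels = kmeans2(Z)
    groups = []
    for c in (0, 1):
        idx = np.where(labels == c)[0]
        for sub in nested_cluster(X[idx], depth - 1, n_components):
            groups.append(idx[sub])
    return groups
```

Each recursion re-runs PCA on the subset, so the features driving later splits can differ from those driving the first split.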

PCA classification pulls out many of the clusters created by the subjective PING analysis. The data appear to be genuinely high-dimensional, as 10 PCs are needed to explain 80% of the variance. PCA shows some separation of the subjective types, but not complete classification. This is just due to the biases in the data for different features - PCA doesn't necessarily pull out classification features.

They used linear discriminant analysis on the data to find the dimensions that best separate the classes. LDA is supervised, and thus requires the subjective classification to work.

Unsupervised nested clustering of the features leads to a hierarchical picture of neuronal e-types. The first split separated the majority of interneurons from the pyramids. The next split separated the FS cells from the adapting cells. Then the groups were further split and named as in Figure 8:

Statistical connectivity provides a sufficient foundation for specific functional connectivity in neocortical neural microcircuits

Hill, SL. Wang, Y. Riachi, I. Schurmann, F. Markram, H. (2012) Statistical connectivity provides a sufficient foundation for specific functional connectivity in neocortical neural microcircuits. PNAS 109 (42): E2885-E2894.

At SfN this year there was a big poster session on Markram's Blue Brain project. The goal of this project is to basically take everything we know about cortex and then put it together in a bottom-up fashion. I don't think that putting together cortex in this manner will create a working cortex, but it will definitely guide ideas about a theory of cortex. This is one of several papers from Blue Brain project that I'm going to review.

The conclusion of this paper is that for most neurons/neuron types, the probability that two neurons are connected is directly related to the amount of overlap between the pre-synaptic axons and the post-synaptic dendrites. The model is basically that if neurons can wire together (i.e. because they actually make contact), then they will. The alternative is that chemical cues lead to specific types of connectivity, but for this to work tons of different cue combinations would be needed. However, on a larger scale this may be the case, especially across neuron types - i.e. one neuronal type may avoid another neuronal type, even though their axons/dendrites overlap significantly.

Rat somatosensory cortex, P12-16, 10 types of neurons. For functional connectivity they measured responses in paired whole-cell patch-clamp recordings. They also stained neurons and made light-based reconstructions to identify putative synapses. Putative synapses were described with two measurements: dendritic and axonal branch order, and path distance. Putative synapses showed domain-specific patterning, where different classes target specific areas of the dendritic tree - e.g. Pyr->Pyr synapses are typically on the basal tree, Martinotti neurons innervate distal dendrites of Pyr, and small baskets innervate somata and proximal dendrites.

To get statistical connectivity they took arbors from different animals and placed them randomly in their cortex model, then measured sites of potential appositions. Statistical connectivity also reflected the domain-specific innervation seen in functional connectivity.
Tweaks to the model - such as slightly repositioning the neurons or changing the range of putative synapses - do not have major effects on the statistical structural connectome. Altering the morphologies and changing the number of morphological types of pyramids leads to similar results, suggesting that the structural connectome is robust to perturbations and that synapse distributions are invariant across animals.
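
The core idea - reading connection probability off axo-dendritic overlap - can be caricatured in a few lines (entirely my own toy, not their pipeline; it just bins two point clouds into voxels and counts co-occupied voxels as potential appositions):

```python
import numpy as np

def potential_appositions(axon_pts, dend_pts, voxel=5.0):
    """Count voxels occupied by both an axonal and a dendritic point
    cloud (coordinates in the same units as `voxel`), as a crude proxy
    for potential synaptic appositions between the two arbors."""
    ax = set(map(tuple, np.floor(np.asarray(axon_pts) / voxel).astype(int)))
    dn = set(map(tuple, np.floor(np.asarray(dend_pts) / voxel).astype(int)))
    return len(ax & dn)
```

Connection probability between two cells would then be modeled as an increasing function of this apposition count - the "wire if you touch" assumption the paper is testing.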

The statistical connectivity can equally be calculated by combining the morphologies of pyramids into a distribution and calculating the overlaps of the probabilistic distributions. This yields essentially the same result as computing statistical connections from real neuronal morphologies and finding overlaps. Here is what the statistical distributions of pyramidal cells look like in their model:
In conclusion, chemical signals are present that set up the layer-specific targeting of dendrites and axons. They may be involved in some other minor adjustments, like repulsion of Py-Py synapses away from the soma. But most of the connectivity can be approximated from statistical overlap. It is then likely that experience-based plasticity strengthens and removes synapses.

Thursday, October 11, 2012

Cell-type homologies and the origins of the neocortex.

Dugas-Ford, J. Rowell, JJ. Ragsdale, CW. (2012) Cell-type homologies and the origins of the neocortex. PNAS.

The idea of these experiments is to address Harvey Karten's hypothesis that there is a homology between different layers of neocortex and the nuclei of bird brains. To answer the question they are looking for genetic markers that distinctly label L4 (input) pyramids and L5 (output) pyramids in mammalian cortex. They then basically look for the same markers in turtles, chickens and zebra finches to claim a homology.

They conclude that Karten was basically right. They find similar markers of L4 pyramids in mammals as they find in the suggested nuclei of birds. They also see homologies in turtles. 

The most interesting part is that turtle dorsal cortex appears to be structured like archeocortex in mammals (olfactory and hippocampus). They see that the input pyramids are grouped together in the rostral part of dorsal cortex, and the output pyramids are grouped in the caudal part. They even state the homology to hippocampus: "An analogous example is provided by the mammalian hippocampus, a three-layered cortex with separate fields, CA1 and CA3, containing pyramidal cells that differ in their connections and their molecular identities and that have between them a transitional field, CA2".

So this gives more credence to the idea that cortical processing may require two types of pyramidal-cell circuits. Which means that, in a way, neocortex is two three-layer cortices stacked on top of each other. But it is really the two spatially separated circuits - like in turtle, hippocampus, and olfactory cortex - put on top of each other. It's probably a lot more efficient to wire them in the 6-layer fashion. For birds, instead of stacking, it's more like the circuits just clumped together, which made them form nuclei.

So, wtf are all those black neurons? What are L2/3 and L6 adding to neocortex? They discuss a study that does a similar experiment for L2/3 and claims that homologous cells exist in chick pallium, but they say these claims are problematic. But the question is whether these black neurons are doing something special or what? Do they have homologies in turtle?

(Note: cortex develops in an "inside-out" manner, the deep layer neurons are born first, superficial layer neurons migrate past them later: review Rakic 2009).

They talk about the development of these regions in the different species in the discussion. There's no SVZ in turtle, but given the spatial separation it may be that the SVZ corresponds to the caudal VZ of turtle.