Tuesday, October 30, 2012

Adaptive resonance theory: How a brain learns to consciously attend, learn, and recognize a changing world I

Grossberg, S. (2012) Adaptive resonance theory: How a brain learns to consciously attend, learn, and recognize a changing world. Neural Networks.

Wow, so don't go overboard with that title. He actually did talk about the "C" word in his talk, and he said a lot of things that I agree with. I'm sure we'll get to that. This is a 98-page paper which, after just skimming through and looking at the figures, summarizes 40 years of his research into this idea. When I saw his talk, I thought he had some really great insights and understood many of the fundamental principles that I know in my gut are there. So I think this will be a good read. What follows is mostly notes as I read through the paper.

"The review illustrates that ART is currently the most highly developed cognitive and neural
theory available, with the broadest explanatory and predictive range."

And if you thought the title was overboard, the abstract is just like, "look, this is how the brain works. It explains everything." And then it lists everything. But really, it basically lists everything...

One of the main problems of modern algorithms is "catastrophic forgetting": basically, algorithms forget as quickly as they learn. The brain somehow overcomes this problem. This is the stability-plasticity dilemma.

Consciousness, Learning, Expectation, Attention, Resonance and synchrony (CLEARS). "All conscious states are resonant states".

"ART accomplishes these properties by proposing how top-down expectations focus attention on salient combinations of cues, and characterizes how attention may operate via a form of self-normalizing "biased competition" (Desimone, 1998)."

ART allows for modal architectures. They are designed to be general-purpose self-organizing processing systems: more general than AI algorithms, less general than a Turing machine. Evolutionary pressure shaped these modal architectures to create "Complementary Computing" and "Laminar Computing".

Ok, one of the things that was really interesting was the shunting component of the model. It was never clear to me from his talk what that meant, or whether it meant shunting in the way I've been thinking about it. But the equation for shunting is:
- dx/dt = -Ax - FxJ
where J is the inhibitory input. And for additive:
- dx/dt = -Ax - EJ
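A quick numerical sketch of the difference between the two forms (my own constants, not the paper's): with shunting, the inhibitory term -F*x*J scales with the activity x itself, so inhibition can only drag activity toward zero, never past it; the additive term -E*J subtracts a fixed amount and can push x negative.

```python
# Euler integration of the two inhibition models from the notes.
# Shunting:  dx/dt = -A*x - F*x*J   (inhibition scales with activity x)
# Additive:  dx/dt = -A*x - E*J     (inhibition is a fixed subtraction)
A, F, E, J = 1.0, 2.0, 2.0, 1.0
dt, steps = 0.01, 500

x_shunt, x_add = 0.5, 0.5
for _ in range(steps):
    x_shunt += dt * (-A * x_shunt - F * x_shunt * J)
    x_add   += dt * (-A * x_add   - E * J)

print(x_shunt)  # decays toward 0 but never crosses it
print(x_add)    # driven below 0 by the constant subtraction
```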

Right. So this looks like exactly what I mean by shunting in the leech: x is directly related to V, which in the model is directly related to spiking. So if x is 0, then the shunting inhibition causes no current to leak out. Compare the conductance-based membrane equation:
dV/dt = gl * (Er - V) + gi * (Er - V) + ge * (Ena - V)
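As a sanity check on that intuition, here's a tiny conductance sketch (my own illustrative parameter values, not from the paper): the inhibitory term gi * (Er - V) passes zero current when V sits at its reversal potential Er, yet the conductance still divisively pulls the steady-state voltage back toward Er once excitation arrives.

```python
# Conductance-based membrane equation from the notes
# (illustrative parameter values, chosen by me).
Er, Ena = -70.0, 55.0   # reversal potentials (mV)
gl, gi, ge = 1.0, 5.0, 0.0

def dVdt(V):
    # dV/dt = gl*(Er - V) + gi*(Er - V) + ge*(Ena - V)
    return gl * (Er - V) + gi * (Er - V) + ge * (Ena - V)

# At rest (V == Er) the inhibitory conductance passes zero current,
# even though gi is large -- inhibition is "silent":
print(dVdt(Er))  # -> 0.0

# But once excitation turns on, the same gi divisively drags the
# steady-state voltage back toward Er (shunting):
ge = 1.0
V_ss = (gl * Er + gi * Er + ge * Ena) / (gl + gi + ge)
print(round(V_ss, 2))
```

With gi = 0 the same excitatory drive would settle at (Er + Ena) / 2 = -7.5 mV, so the shunting conductance clearly suppresses the response without injecting any current of its own at rest.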

That was short-term memory. I'm just on page 5.

Medium-term memory sounds like it's based on synaptic depression. This could fit in with that working-memory facilitation-depression paper. [Review]

LTM: Gated Steepest Descent Learning. Two algorithms: outstar and instar. In the outstar equation, the ith cell learns a spatial pattern of activation across a network of sampled cells. The instar is the opposite kind of learning: the jth cell learns the pattern of signals that activates it. Instar is the competitive learning rule used in self-organizing maps.
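A minimal sketch of the two gated steepest descent rules as I understand them (learning rate and patterns are made up by me; the key point is the gating term, so that with no activity in the gating cell there is no weight change):

```python
import numpy as np

def instar_step(w, x, y_j, lr=0.1):
    """Instar: cell j's incoming weights w track the input pattern x,
    gated by cell j's own activity y_j (y_j == 0 -> no learning)."""
    return w + lr * y_j * (x - w)

def outstar_step(w, y, x_i, lr=0.1):
    """Outstar: cell i's outgoing weights w track the activity pattern y
    across the sampled cells, gated by the source cell's activity x_i."""
    return w + lr * x_i * (y - w)

rng = np.random.default_rng(0)
x = np.array([1.0, 0.0, 1.0, 0.0])   # input pattern to be learned
w = rng.random(4)                     # random initial weights
for _ in range(200):
    w = instar_step(w, x, y_j=1.0)    # active cell j samples the pattern
print(np.round(w, 3))                 # weights converge to the pattern x
```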

So he talks about two types of learning. ART is match-based learning. However, ART does not describe mismatch learning, which is used by the "Where" pathway and the motor system, where representations are continually updated.

"3.3. Why procedural memories are not conscious. Brain systems that use inhibitory matching and
mismatch learning cannot generate excitatory resonances. Hence, if "all conscious states are resonant
states", then spatial and motor representations are not conscious. This way of thinking provides a
mechanistic reason why declarative memories (or "learning that"), which are the sort of memories
learned by ART, may be conscious, whereas procedural memories (or "learning how"), which are the
sort of memories that control spatial orienting and action, are not conscious (Cohen and Squire 1980)."

Top-down activation primes bottom-up excitability - excitatory matching. This is like top-down signals activating the apical tuft, leading to calcium spikes in a bunch of neurons. Then these neurons get bottom-up input and start bursting. The neurons without the top-down input are merely spiking, and are not heard as loudly.

Excitatory matching generates resonant brain states: positive feedback between bursting neurons and top-down neurons. This focuses attention on a combination of features (the "critical feature pattern") and triggers learning. Resonance provides a global, context-sensitive indicator that the system is processing data worthy of learning.
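Here's a toy loop of my own (not Grossberg's actual equations) just to capture the intuition: features supported by both the bottom-up input and the top-down expectation get mutually amplified, while unmatched features decay, leaving only the critical feature pattern.

```python
import numpy as np

# Toy resonance loop (my sketch, not the paper's dynamics):
# the top-down expectation is multiplicative, so it can only gate
# bottom-up activity, never drive a feature on by itself.
bottom_up = np.array([1.0, 1.0, 1.0, 0.0])
top_down  = np.array([1.0, 1.0, 0.0, 0.0])   # expectation / prime

x = bottom_up.copy()
for _ in range(20):
    feedback = top_down * x                      # modulatory feedback
    x = x + 0.2 * (bottom_up * feedback - 0.5 * x)
    x = np.clip(x, 0.0, 2.0)                     # saturation ceiling

print(np.round(x, 2))  # matched features amplified, unmatched suppressed
```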

"Grossberg (1973) [showed] that the shunting, or gain control, properties of membrane equation neurons in an on-center off-surround network enable them to self-normalize their activities"
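The self-normalization falls out of the steady state. For the standard shunting on-center off-surround form (constants A, B are illustrative, not from the paper), setting dx_i/dt = 0 gives x_i = B * I_i / (A + sum(I)), so the relative pattern is preserved while total activity saturates at B:

```python
# Steady state of a shunting on-center off-surround network:
#   dx_i/dt = -A*x_i + (B - x_i)*I_i - x_i * sum(I_k for k != i)
# Setting dx_i/dt = 0 and solving gives x_i = B*I_i / (A + sum(I)).
A, B = 1.0, 1.0

def steady_state(I):
    total = sum(I)
    return [B * Ii / (A + total) for Ii in I]

weak   = steady_state([1.0, 2.0, 1.0])      # dim input pattern
strong = steady_state([10.0, 20.0, 10.0])   # same pattern, 10x brighter

# The ratios between cells are preserved, and total activity stays
# below B no matter how intense the input -- self-normalization.
print([round(x, 3) for x in weak])
print([round(x, 3) for x in strong])
```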

"Zeki and Shipp (1988, p. 316) wrote that “backward connections seem not to excite cells in lower areas, but instead influence the way they respond to stimuli”; that is, they are modulatory"

"Sillito et al. (1994) concluded that “the cortico-thalamic input is only strong enough to exert an
effect on those dLGN cells that are additionally polarized by their retinal input...the feedback circuit
searches for correlations that support the ‘hypothesis’ represented by a particular pattern of cortical
activity”."

"7. Imagining, Planning, and Hallucinations: Prediction without Action
A top-down expectation is not always modulatory. The excitatory/inhibitory balance in the modulatory on-center of a top-down expectation can be modified by volitional control from the basal ganglia. If, for example, volitional signals inhibit inhibitory interneurons in the on-center, then read-out of a top-down expectation from a recognition category can fire cells in the on-center prototype, not merely modulate them. Such volitional control has been predicted to control mental imagery and the ability to think and plan ahead without external action, a crucial type of predictive competence in humans and other mammals. If these volitional signals become tonically hyperactive, then top-down expectations can fire without overt intention, leading to properties like schizophrenic hallucinations (Grossberg, 2000a). In summary, our ability to learn quickly without catastrophic forgetting led to circuits that can be volitionally modulated to enable imagination, internal thought, and planning. This modulation, which brings a huge evolutionary advantage to those who have it, also carries with it the risk of causing hallucinations."

To learn novel things a complementary system needs to be involved. This is called the orienting system. This system is sensitive to unexpected and unfamiliar events - events where bottom-up inputs do not trigger a strong resonant state. The orienting system resets the poorly matching top-down expectation. This pulls the system out of a poor local minimum, possibly leading to a better hypothesis. But if the matches continue to be poor, the orienting system activates uncommitted cells to learn the novel information.
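This match/reset/search cycle is easy to sketch in an ART-1 style (a simplified binary version of my own, with a made-up vigilance value): a category resonates when the match ratio |I AND w| / |I| clears the vigilance parameter, gets reset by the orienting system when it doesn't, and a fresh uncommitted cell is recruited when nothing matches.

```python
import numpy as np

# Simplified ART-1 style search (binary inputs; my own toy version).
# The orienting system resets a category whenever the match ratio
# |I AND w| / |I| falls below the vigilance parameter rho.
def art_search(I, categories, rho=0.8):
    for j, w in enumerate(categories):
        match = np.sum(np.minimum(I, w)) / np.sum(I)
        if match >= rho:
            return j, "resonance"       # good match: resonate and learn
        # else: reset this category and keep searching
    categories.append(I.copy())         # recruit an uncommitted cell
    return len(categories) - 1, "new category"

categories = [np.array([1, 1, 1, 0])]
I_familiar = np.array([1, 1, 0, 0])   # both active features match
I_novel    = np.array([0, 0, 1, 1])   # only half match -> reset

print(art_search(I_familiar, categories))  # resonates with category 0
print(art_search(I_novel, categories))     # recruits a new category
```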
Ok, enough for today. On page 15, just about to start section 9.
