ARTBM? AR-RBM? ART-RBM?
Anyway: an adaptive resonance restricted Boltzmann machine. Let's go even simpler and try to put the ART rules in the context of an RBM. Consider just two layers, where stimuli are presented to the first layer through the basal tree. The firing of this layer stimulates the basal tree of the second layer, and the spiking of the second layer influences the apical tree of the first layer (and would influence the basal tree of a third layer). We are ignoring any lateral connections for now.
So I want to consider establishing a resonance, learning, and representation dimensionality in the context of an RBM. For a regular RBM, I like to think of the learning rule as basically back-propagation in a feed-forward neural network where the supervised learning signal is the input itself. That's not quite right, though: a standard RBM has stochastic binary units and is trained with contrastive divergence, a Hebbian-style contrast between data-driven and model-driven correlations, rather than with back-propagation.
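For reference, here is a minimal sketch of that standard rule, CD-1, for a binary RBM (biases omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, v0, lr=0.01):
    """One CD-1 step for a binary RBM with weights W (n_visible x n_hidden)."""
    # Positive phase: hidden probabilities and a binary sample, given the data.
    h0_prob = sigmoid(v0 @ W)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: reconstruct the visible layer, then re-infer the hidden layer.
    v1 = sigmoid(h0 @ W.T)
    h1_prob = sigmoid(v1 @ W)
    # Hebbian difference between data-driven and reconstruction-driven correlations.
    return W + lr * (np.outer(v0, h0_prob) - np.outer(v1, h1_prob))
```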
The first layer receives a stimulus and, let's say, is normalized to a constant vector length. At first the second layer will have low activity (nothing has been learned), and over time its excitability increases. Eventually enough second-layer neurons are activated that the feedback starts causing the first layer to burst. One parameter is how much longer a burst is than a spike. The bursts and the spikes keep being normalized by the inhibition to a constant length. This should create a positive feedback loop that leads to more bursting in the first layer and more excitation in the second layer; the second layer will also need to be normalized. Synapses from the second layer that cause a burst in the first layer (spike before burst) are strengthened, and synapses through which a first-layer burst causes a spike in the second layer (burst before spike) are also strengthened.
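Here is a toy sketch of that search loop, assuming symmetric weights shared by the feedforward and feedback paths (as in an RBM) and modeling rising excitability as a falling layer-2 threshold; the parameter names and values are placeholders, not fixed by the scheme:

```python
import numpy as np

def resonance_search(x, W, theta=1.0, theta_step=0.05, burst_thresh=0.5, max_iters=100):
    """Toy resonance loop: layer-2 excitability rises (its threshold drops)
    until top-down feedback drives layer-1 units over a burst threshold.
    W is used for both the feedforward and feedback paths, as in an RBM."""
    x = x / np.linalg.norm(x)                   # layer-1 rates, constant length
    y = np.zeros(W.shape[1])
    for _ in range(max_iters):
        y = np.maximum(W.T @ x - theta, 0.0)    # layer-2 units recruited so far
        norm = np.linalg.norm(y)
        if norm > 0:
            y = y / norm                        # normalization by inhibition
            feedback = W @ y                    # drive onto layer-1 apical trees
            bursting = feedback > burst_thresh  # which layer-1 cells now burst
            if bursting.any():
                return y, bursting              # resonance established
        theta -= theta_step                     # excitability keeps increasing
    return y, np.zeros(x.size, dtype=bool)
```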
If this kept going up a hierarchy, the second layer would receive both bottom-up and top-down signals, which would cause it to burst. So the question is: should there be learning if and only if both pre and post are bursting, or is one enough? And what ultimately happens at the top layer, which receives no top-down signals?
The valuable thing about this setup is that the second layer can increase or decrease the dimensionality of the representation as it learns. So there need to be two opposing forces: one that increases the dimensionality when the matching is poor, and one that decreases it when the matching is good. The increasing excitability under a poor match will naturally increase the dimensionality. Divisive normalization/competition could turn off the weaker-firing neurons when the match is good, so there probably also needs to be an LTD rule for when there is a burst but no spiking.
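To make the "turn off the weaker neurons" idea concrete, a divisive normalization step might look like the following sketch; the pooling constant and cutoff are made-up parameters:

```python
import numpy as np

def divisive_prune(rates, sigma=0.1, cutoff=0.05):
    """Divide each rate by the pooled activity. When a few units dominate
    (a good match), the weak ones fall below the cutoff and go silent,
    shrinking the effective dimensionality of the representation."""
    normalized = rates / (sigma + rates.sum())
    return np.where(normalized > cutoff, normalized, 0.0)
```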
A big question is the relationship between the burst and the spike. Consider bottom-up inputs causing a neuron to fire at 10 Hz, and then top-down inputs arriving and increasing its activity. How is this activity increased? Does the top-down input just increase the gain of the output, i.e., all bottom-up rates are multiplied by 2? Or does it add a constant amount, say another 10 Hz, so that a 10 Hz burster looks like a 20 Hz spiker? And is the top-down signal to burst binary, or is it graded?
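The two candidate readouts side by side, as a sketch (the gain range and the increment size are placeholders):

```python
def burst_rate_multiplicative(bottom_up_hz, top_down, max_gain=3.0):
    """Top-down drive in [0, 1] scales the bottom-up rate by 1x up to max_gain."""
    return bottom_up_hz * (1.0 + (max_gain - 1.0) * top_down)

def burst_rate_additive(bottom_up_hz, top_down, increment_hz=10.0):
    """Top-down drive adds a fixed increment: a 10 Hz burster reads as 20 Hz."""
    return bottom_up_hz + increment_hz * top_down
```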
I would say the first thing to do is start with a rate-coded ARBM. Each neuron has two "compartments", where each compartment just sums its inputs and passes the result through some sigmoid-like function. The bottom-up compartment sets the firing rate from the inputs. All the firing rates are normalized by recurrent multiplicative inhibition (or just constantly normalized by the program; perhaps the vector length can be lower than the maximum, but it has some ceiling). The top-down compartment, let's say, increases the gain of the output over some range, like 1x to 3x max. The top-down compartment crossing some threshold level would provide the learning signal: if both pre and post are bursting, LTP; if only one side is bursting, LTD.
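A minimal NumPy sketch of one such update step, assuming symmetric weights, a global burst threshold, and a crude stand-in for layer-2 bursting (since there is no third layer here to supply top-down input); all parameter values are placeholders:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def arbm_step(x, W, lr=0.01, burst_thresh=0.5, max_gain=3.0):
    """One update of the rate-coded ARBM described above.
    x: layer-1 bottom-up rates; W: (n1, n2) weights used in both directions."""
    # Bottom-up compartment of layer 2: sum, squash, normalize
    # (the normalization stands in for recurrent multiplicative inhibition).
    y = sigmoid(W.T @ x)
    y = y / np.linalg.norm(y)
    # Apical compartment of layer 1: top-down drive sets a gain in [1, max_gain].
    apical = sigmoid(W @ y)
    gain = 1.0 + (max_gain - 1.0) * apical
    x_out = gain * x
    x_out = x_out / np.linalg.norm(x_out)
    # Bursting = apical compartment over threshold; for layer 2, which has no
    # layer above it here, use strong activity as a crude proxy.
    pre_burst = apical > burst_thresh
    post_burst = y > burst_thresh * y.max()
    # LTP where both sides burst, LTD where exactly one side does.
    both = np.outer(pre_burst, post_burst).astype(float)
    only_one = (np.outer(pre_burst, ~post_burst)
                | np.outer(~pre_burst, post_burst)).astype(float)
    W = W + lr * (both - only_one) * np.outer(x_out, y)
    return x_out, y, W
```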