Multi-modular Associative Memory 
Nir Levy David Horn 
School of Physics and Astronomy 
Tel-Aviv University Tel Aviv 69978, Israel 
Eytan Ruppin 
Departments of Computer Science & Physiology 
Tel-Aviv University Tel Aviv 69978, Israel 
Abstract 
Motivated by the findings of modular structure in the association 
cortex, we study a multi-modular model of associative memory that 
can successfully store memory patterns with different levels of ac- 
tivity. We show that the segregation of synaptic conductances into 
intra-modular linear and inter-modular nonlinear ones considerably 
enhances the network's memory retrieval performance. Compared 
with the conventional, single-module associative memory network, 
the multi-modular network has two main advantages: It is less sus- 
ceptible to damage to columnar input, and its response is consistent 
with the cognitive data pertaining to category specific impairment. 
I Introduction 
Cortical modules were observed in the somatosensory and visual cortices a few 
decades ago. These modules differ in their structure and functioning but are likely to 
be an elementary unit of processing in the mammalian cortex. Within each module 
the neurons are interconnected. Input and output fibers from and to other cortical 
modules and subcortical areas connect to these neurons. More recently, modules 
were also found in the association cortex [1] where memory processes supposedly 
take place. Ignoring the modular structure of the cortex, most theoretical models 
of associative memory have treated single module networks. This paper develops 
a novel multi-modular network that mimics the modular structure of the cortex. 
In this framework we investigate the computational rational behind cortical multi- 
modular organization, in the realm of memory processing. 
Does multi-modular structure lead to computational advantages? Naturally one 
Multi-modular Associative Memo ry 53 
may think that modules are necessary in order to accommodate memories of dif- 
ferent coding levels. We show in the next section that this is not the case, since 
one may accommodate such memories in a standard sparse coding network . In 
fact, when trying to capture the same results in a modular network we run into 
problems, as shown in the third section: If both inter and intra modular synapses 
have linear characteristics, the network can sustain memory patterns with only a 
limited range of activity levels. The solution proposed here is to distinguish be- 
tween intra-modular and inter-modular couplings, endowing the inter-modular ones 
with nonlinear characteristics. From a computational point of view, this leads to 
a modular network that has a large capacity for memories with different coding 
levels. The resulting network is particularly stable with regard to damage to mod- 
ular inputs. From a cognitive perspective it is consistent with the data concerning 
category specific impairment. 
2 Homogeneous Network 
We study an excitatory-inhibitory associative memory network [2], having N ex- 
citatory neurons. We assume that the network stores M1 memory patterns r/ of 
sparse coding level p and M2 patterns v with coding level f such that p < f << 1. 
The synaptic efficacy Zij between the jth (presynaptic) neuron and the ith (post- 
synaptic) neuron is chosen in the Hebbian manner 
1 M1 1 M 
Jij = Np Z rl"rl'J + pp  v,vj , (1) 
/=1 /=1 
The updating rule for the activity state 1 of the ith binary neuron is given by 
(t + 1) = 13 (hi(t) - O) (2) 
where 13 is the step function and 0 is the threshold. 
hi(t) = 
p 
is the local field, or membrane potential. It includes the excitatory Hebbian coupling 
of all other excitatory neurons, 
N 
(4) 
for the two memory populations as 
1 N 1 N 
mv(t) = vjw(t) ' = . (6) 
The storage capacity ct = M/N of this network has two critical capacities. Ctc 
above which the population of f v patterns is unstable and etch above which the 
population of r/ patterns is unstable. We derived equations for the overlap and 
total activity of the two populations using mean field analysis. Here we give the 
and global inhibition that is proportional to the total activity of the excitatory 
neurons 
N 
1 
Q(t) =   (t). () 
J 
The overlap re(t) between the network activity and the memory patterns is defined 
54 N. Levy, D. Horn and E. Ruppin 
fixed-point equations for the case of M1 -- M2 -- M and = 
'5- 7 Mxf  + MP . 
resulting equations are 
The 
(7) 
and 
where 
and 
(8) 
(9) 
(a) 
(b) 
001, 
0.1 
0.06 
006 0.06 
0.04 " 0.0 
0.04 
0.0 0.0 
f 0 0 
1 
/ ' q 
' o.4.1  
0 , 
0.1 0.06 0.060  
0.06 ' 
.... 0.02",,,,, 
f 0 0 
0.1 0.1 
0.06 
0.06 
0.04 
p p 
Figure 1' (a) The critical capacity ac, rs. f and p for f _> p, 0 = 0.8 and N = 1000. 
(b) (ac n- a)/a nversus f and p for the same parameters as in (a). The validity 
of these analytical results was tested and verified in simulations. 
Next, we look for the critical capacities, a nand a at which the fixed-point 
equations become marginally stable. The results are shown in Figure 1. Figure 1 (a) 
shows acnvs. the coding levels f and p (f _> p). Similar results were obtained for 
ac. As evident the critical capacities of both populations are smaller than the one 
observed in a homogeneous network in which f = p. One hence necessarily pays a 
price for the ability to store patterns with different levels of activity. 
Figure l(b) plots the relative capacity difference (ac n- ac)/acnvs. f and p. The 
function is non negative, i.e., etch >_ rtc for all f and p. Thus, low activity memories 
are more stable than high activity ones. 
Assuming that high activity codes more features [3], these results seem to be at 
odds with the view [3, 4] that memories that contain more semantic features, and 
therefore correspond to larger Hebbian cell assemblies, are more stable, such as 
concrete versus abstract words. The homogeneous network, in which the memories 
with high activity are more susceptible to damage, cannot account for these obser- 
vations. In the next section we show how a modular network can store memories 
with different activity levels and account for this cognitive phenomenon. 
Multi-modular Associative Memo ry 55 
3 Modular Network 
We study a multi modular excitatory-inhibitory associative memory network, stor- 
ing M memory patterns in L modules of N neurons each. The memories are coded 
such that in every memory a variable number f of I to L modules is active. This 
number will be denoted as modular coding. The coding level inside the modules 
is sparse and fixed, i.e., each modular Hebbian cell assembly consists of pN active 
neurons with p << 1. The synaptic efficacy Zij lk between the jth (presynaptic) 
neuron from the kth module and the ith (postsynaptic) neuron from the/th module 
is chosen in a Hebbian manner 
M 
Jijt k _ 1 
where ri'it are the stored memory patterns. The updating rule for the activity state 
t of the ith binary neuron in the/th module is given by 
(12) 
where 0, is the threshold, and $(x) is a stochastic sigmoid function, getting the 
value 1 with probability (1 + e-) -1 and 0 otherwise. The neuron's local field, or 
membrane potential has two components, 
hit(t) = hiti,t,.,.t(t) + hit:t,.,.t(t) . (13) 
The internal field, hitinternat (t), includes the contributions from all other excitatory 
neurons that are situated in the/th module, and inhibition that is proportional to 
the total modular activity of the excitatory neurons, i.e., 
N 
j.i 
(14) 
where 
The 
N 
@t(t) = Np2Vjt(t) . (15) 
J 
external field component, hit:t,.,,t(t), includes the contributions from all 
other excitatory neurons that are situated outside the/th module, and inhibition 
that is proportional to the total network activity. 
= 6 - Ok(t) - od 
k7l j k 
(16) 
We allow here for the freedom of using more complicated behavior than the standard 
6(x) - x one. In fact, as we will see, the linear case is problematic, since only 
memory storage with limited modular coding is possible. 
The retrieval quality at each trial is measured by the overlap function, defined by 
L N 
rn ' (t) = pNfY' k=l i=1 
where f is the modular coding of 
56 N. Levy, D. Horn and E. Ruppin 
In the simulations we constructed a network of L -- 10 modules, where each module 
contains N = 500 neurons. The network stores M - 50 memory patterns randomly 
distributed over the modules. Five sets of ten memories each are defined. In each 
set the modular coding is distributed homogeneously between one to ten active 
modules. The sparse coding level within each module was set to be p = 0.05. Every 
simulation experiment is composed of many trials. In each trial we use as initial 
condition a corrupted version of a stored memory pattern with error rate of 5%, 
and check the network's retrieval after it converges to a stable state. 
1 
o.g 
o.8 
0.7 
0.6 
o.5 
0.4 
0.3 
0.2 
o.1 
o 
1 2 3 
4 5 
Modular Coding 
8 9 lO 
Figure 2: Quality of retrieval rs. memory modular coding. The dark shading repre- 
sents the mean overlap achieved by a network with linear intra-modular and inter- 
modular synaptic couplings. The light shading represents the mean overlap of a 
network with sigmoidal inter-modular connections, which is perfect for all memory 
patterns. The simulation parameters were: L - 10, N - 500, M - 50, p = 0.05, 
.k = 0.7, 0,/= 2 and O = 0.6. 
We start with the standard choice of 6(x) - x, i.e. treating similarly the intra- 
modular and inter-modular synaptic couplings. The performance of this network 
is shown in Figure 2. As evident, the network can store only a relatively narrow 
span of memories with high modular coding levels, and completely fails to retrieve 
memories with low modular coding levels (see also [5]). If, however, 6 is chosen to be 
a sigmoid function, a completely stable system is obtained, with all possible coding 
levels allowed. A sigmoid function on the external connections is hence very effective 
in enhancing the span of modular coding of memories that the network can sustain. 
The segregation of the synaptic inputs to internal and external connections has been 
motivated by observed patterns of cortical connectivity: Axons forming excitatory 
intra-modular connections make synapses more proximal to the cell body than do 
inter-modular connections [6]. Dendrites, having active conductances, embody a 
rich repertoire of nonlinear electrical and chemical dynamics (see [7] for a review). 
In our model, the setting of 6 to be a sigmoid function crudely mimics these active 
conductance properties. 
We may go on and envisage the use of a nested set of sigmoidal dendritic transmis- 
sion functions. This turns out to be useful when we test the effects of pathologic 
alterations on the retrieval of memories with different modular codings. The amaz- 
ing result is that if the damage is done to modular inputs, the highly nonlinear 
transmission functions are very resistible to it. An example is shown in Fig. 3. 
Multi-modular Associative Memory 57 
Here we compare two nonlinear functions: 
= ,xo j? vk(t)- 
kl j 
] 
The second one is the nested sigmoidal function mentioned above. Two types of 
input cues are compared: correct /i to one of the modules and no input to the 
rest, or partial input to all modules. 
0.6 
1 
o.g 
0.8 
0.7 
o.5 
0.4 
0.3 
0.2 
o.1 
o 
0 0.1 0. 0.3 0.4 0.5 0.6 0.7 0.8 0.9 
re(t-O) 
Figure 3: The performance of modular networks with different types of non-linear 
inter-connections when partial input cues are given. The mean overlap is plotted 
vs. the overlap of the input cue. The solid line represents the performance of 
the network with 62 and the dash-dot line represents 6x. The left curve of 62 
corresponds to the case when full input is presented to only one module (out of 
the 5 that comprise a memory), while the right solid curve corresponds to partial 
input to all modules. The two 61 curves describe partial input to all modules, but 
correspond to two different choices of the threshold parameter 0a, 1.5 (left) and 2 
(right). Parameters are L = 5, N = 1000, p = 0.05, A = 0.8,  = 5, O, = 0.7 and 
Ok = 0.7. 
As we can see, the nested nonlinearities enable retrieval even if only the input to 
a single module survives. One may therefore conclude that, under such conditions, 
patterns of h'igh modular coding have a grater chance to be retrieved from an input 
to a single module and thus are more resilient to afferent damage. Adopting the 
assumption that different modules code for distinct semantic features, we now find 
that a multi-modular network with nonlinear dendritic transmission can account 
for the view of [3], that memories with more features are more robust. 
4 Summary 
We have studied the ability of homogeneous (single-module) and modular networks 
to store memory patterns with variable activity levels. Although homogeneous net- 
works can store such memory patterns, the critical capacity of low activity memories 
was shown to be larger than that of high activity ones. This result seems to be in- 
consistent with the pertaining cognitive data concerning category specific semantic 
58 N. Levy, D. Horn and E. Ruppin 
impairment, which seem to imply that high activity memories should be the more 
stable ones. 
Motivated by the findings of modular structure in associative cortex, we developed a 
multi-modular model of associative memory. Adding the assumption that dendritic 
non-linear processing operates on the signals of inter-modular synaptic connections, 
we obtained a network that has two important features: coexistence of memories 
with different modular codings and retrieval of memories from cues presented to a 
small fraction of all modules. The latter implies that memories encoded in many 
modules should be more resilient to damage in afferent connections, hence it is 
consistent with the conventional interpretation of the data on category specific 
impairment. 
References 
[1] R. F. Hevner. More modules. TINS, 16(5):178, 1993. 
[2] M. V. Tsodyks. Associative memory in neural networks with the hebbian learn- 
ing rule. Modern Physics Letters B, 3(7):555-560, 1989. 
[3] G. E. Hinton and T. Shallice. Lesioning at attractor network: investigations of 
acquired dyslexia. Psychological Review, 98(1):74-95, 1991. 
[4] G. V. Jones. Deep dyslexia, imageability, and ease of predication. Brain and 
Language, 24:1-19, 1985. 
[5] R. Lauro Grotto, S. Reich, and M. A. Virasoro. The computational role of 
conscious processing in a model of semantic memory. In Proceedings of the IIAS 
Symposium on Cognition Computation and Consciousness, 1994. 
[6] P. A. Hetherington and L. M. Shapiro. Simulating hebb cell assemblies: the 
necessity for partitioned dendritic trees and a post-not-pre ltd rule. Network, 
4:135-153, 1993. 
[7] R. Yuste and D. W. Tank. Dendritic integration in mammalian neurons a cen- 
tury after cajal. Neuron, 16:701-716, 1996. 
