A solvable connectionist model of 
immediate recall of ordered lists 
Nell Burgess 
Department of Anatomy, University College London 
London WCiE 6BT, England 
(e-mail: n .burgessucl. ac. uk) 
Abstract 
A model of short-term memory for serially ordered lists of verbal 
stimuli is proposed as an implementation of the 'articulatory loop' 
thought to mediate this type of memory (Baddeley, 1986). The 
model predicts the presence of a repeatable time-varying 'context' 
signal coding the timing of items' presentation in addition to a 
store of phonological information and a process of serial rehearsal. 
Items are associated with context nodes and phonemes by Hebbian 
connections showing both short and long term plasticity. Items are 
activated by phonemic input during presentation and reactivated 
by context and phonemic feedback during output. Serial selection 
of items occurs via a winner-take-all interaction amongst items, 
with the winner subsequently receiving decaying inhibition. An 
approximate analysis of error probabilities due to Gaussian noise 
during output is presented. The model provides an explanatory 
account of the probability of error as a function of serial position, 
list length, word length, phonemic similarity, temporal grouping, 
item and list familiarity, and is proposed as the starting point for 
a model of rehearsal and vocabulary acquisition. 
Introduction 
Short-term memory for serially ordered lists of pronounceable stimuli is well de- 
scribed, at a crude level, by the idea of an 'articulatory loop' (AL). This postulates 
that information is phonologically encoded and decays within 2 seconds unless re- 
freshed by serial rehearsal, see (Baddeley, 1986). It successfully accounts for (i) 
5 2 Neil Burgess 
the linear relationship between memory span s (the number of items s such that 
50% of lists of s items are correctly recalled) and articulation rate r (the number 
of items that can be said per second) in which s  2r + c, where r varies as a 
function of the items, language and development; (ii) the fact that span is lower for 
lists of phonemically similar items than phonemically distinct ones; (iii) unattended 
speech and articulatory distractor tasks (e.g. saying blah-blah-blah...) both reduce 
memory span. Recent evidence suggests that the AL plays a role in the learning of 
new words both during development and during recovery after brain traumas, see 
e.g. (Gathercole & Baddeley, 1993). Positron emission tomography studies indicate 
that the phonological store is localised in the left supramarginal gyrus, whereas sub- 
vocal rehearsal involves Broca's area and some of the motor areas involved in speech 
planning and production (Paulesu et al., 1993). 
However, the detail of the types of errors committed is not addressed by the AL 
idea. Principally: (iv) the majority of errors are 'order errors' rather than 'item 
errors', and tend to involve transpositions of neighbouring or phonemically similar 
items; (v) the probability of correctly recalling a list as a function of list length 
is a sigmoidl (vi) the probability of correctly recalling an item as a function of its 
serial position in the list (the 'serial position curve') has a bowed shape; (vii) span 
increases with the familiarity of the items used, specifically the c in s  2r + c can 
increase from 0 to 2.5 (see (Hulme et al., 1991)), and also increases if a list has been 
previously presented (the 'Hebb effect'); (viii) 'position specific intrusions' occur, 
in which an item from a previous list is recalled at the same position in the current 
list. Taken together, these data impose strong functional constraints on any neural 
mechanism implementing the AL. 
Most models showing serial behaviour rely on some form of 'chaining' mech- 
anism which associates previous states to successive states, via recurrent con- 
nections of various types. Chaining of item or phoneme representations gener- 
ates errors that are incompatible with human data, particularly (iv) above, see 
(Burgess & Hitch, 1992, Henson, 1994). Here items are maintained in serial order 
by association to a repeatable time-varying signal (which is suggested by position 
specific intrusions and is referred to below as 'context'), and by the recovery from 
suppression involved in the selection process - a modification of the 'competitive 
queuing' model for speech production (Houghton, 1990). The characteristics of 
STM for serially ordered items arise due to the way that context and phoneme 
information prompts the selection of each item. 
2 The model 
The model consists of 3 layers of artificial neurons representing context, phonemes 
and items respectively, connected by Hebbian connections with long and short term 
plasticity, see Fig. 1. There is a winner-take-all (WTA) interaction between item 
nodes: at each time step the item with the greatest input is given activation 1, and 
the others 0. The winner at the end of each time step receives a decaying inhibition 
that prevents it from being selected twice consecutively. 
During presentation, phoneme nodes are activated by acoustic or (translated) 
visual input, activation in the context layer follows the pattern shown in Fig. 1, 
item nodes receive input from phoneme nodes via connections wij. Connections 
A $olvable Connectionist Model of Immediate Recall of Ordered Lists 53 
A) 
ooo 
ooo 
ooo 
t=l 
t=2 
t=3 
B) 
context 
0000000 
Wij (t) wij (t/ 
0000 O0 
items (WTA + 
suppression) [ output 
phonemes .. 
0000000 
 translated 
visual input 
'ij(t) 
acoustic 
input buffer 
Figure 1: A) Context states as a function of serial position t; filled circles are active 
nodes, empty circles are inactive nodes. B) The architecture of the model. Full 
lines are connections with short and long term plasticity; dashed lines are routes by 
which information enters the model. 
Wij(t) learn the association between the context state and the winning item, and 
wij and i learn the association with the active phonemes. During recall, the 
context layer is re-activated as in presentation, activation spreads to the item layer 
(via Wq(t)) where one item wins and activates its phonemes (via i(t)). The 
item that now wins, given both context and phoneme inputs, is output, and then 
suppressed. 
As described so far, the model makes no errors. Errors occur when Gaussian noise 
is added to items' activations during the selection of the winning item to be output. 
Errors are likely when there are many items with similar activation levels due to 
decay of connection weights and inhibition since presentation. Items may then be 
selected in the wrong order, and performance will decrease with the time taken to 
present or recall a list. 
2.1 Learning and familiarity 
Connection weights have both long and short term plasticity: Wij(t) (similarly 
wo(t ) and eq(t)) have an incremental long term component Wig(t), and a one- 
shot short term component W/ (t) which decays by a factor A per second. The net 
weight of the connection is the sum of the two components: Wq(t) = Wi (t)+Wi (t). 
Learning occurs according to: 
w6(t + s) = w6(t ) 
if c(t)ai(t) > Wq(t); 
otherwise, 
54 Neil Burgess 
Witi(t) + ecj(t)ai(t) if cj(t)ai(t) > 0; 
Wi(t + 1) = Wig(t) otherwise, (1) 
where c(t) and ai(t) are the pre- and post-connection activations, and e decreases 
with IW)l so that the long term component saturates at some m:hum value. 
These mbdifiable connection weights are never negative. 
An item's 'familiarity' is reflected by the size of the long term components wit and 
-t of the weights storing the association with its phonemes. These components 
increase with each (error-free) presentation or recall of the item. For lists of to- 
tally unfamiliar items, the item nodes are completely interchangeable having only 
the short-term connections  to phoneme nodes that are learned at presentation. 
Whereas the presentation of a familiar item leads to the selection of a particular 
item node (due to the weights wj) and, during output, this item will activate its 
phonemes more strongly due to the weights ;. Unfamiliar items that are phone- 
mically similar to a familiar item will tend to ge represented by the familiar item 
node, and can take advantage of its long-term item-phoneme weights j. 
Presentation of a list leads to an increase in the long term component of the context- 
item association. Thus, if the same list is presented more than once its recall 
improves, and position specific intrusions from previous lists may also occur. Notice 
that only weights to or from an item winning at presentation or output are increased. 
3 Details 
There are n items per list, np phonemes per item, and 
seconds to present or recall. At time t, item node i has activation ai(t), context node 
i has activation ci(t), t is the set of ne context nodes active at time t, phoneme 
node i has activation bi (t) and 7>i is the set of n phonemes comprising item i. 
Context nodes have activation 0 or x//2n,, phonemes take activation 0 or 1/vf, 
so Wi.(t) < x/-/2n, and w.(t) = ji(t) < 1/v/-, see (1) This sets the relative 
effect that context and phoneme layers hae on tems' actvatmn, and ensures that 
items of neither few nor many phonemes are favoured, see (Burgess & Hitch, 1992). 
The long-term components of phoneme-item weights w[(t) and %t.i(t ) are 0.45/v/  
for familiar items, and 0.15/v/'ff  for unfamiliar items (chosen to match the data 
in Fig. 3B). The long-term components of context-item weights Wi(t ) increase by 
0.15/vr, for each of the first few presentations or recalls of a list. 
Apart from the WTA interaction, each item node i has input: 
n(t) = E(t) + h(t) + m, (2) 
where Ii(t) < 0 is a decaying inhibition imposed following an item's selection at 
presentation or output (see below), r/ is a (0, a) Gaussian random variable added 
at output only, and Ei (t) is the excitatory input to the item from the phoneme layer 
during presentation and the context and phoneme layers during recall: 
Ei(t)--{ .wi(t)b(t) during presentation; 
+ w,(t)bj(t) during recall. (3) 
During recall phoneme nodes are activated according to 
A Solvable Connectionist Model of Immediate Recall of Ordered Lists 55 
One time step refers to the presentation or recall of an item and has duration 
The variable t increases by 1 per time step, and refers to both time and serial 
position. Short term connection weights and inhibition Ii(t) decay by a factor A 
per second, or A' per time step. 
The algorithm is as follows; rehearsal corresponds to repeating the recall phase. 
Presentation 
0. Set activations, inhibitions and short term weights to zero, t - 1. 
1. Set the context layer to state t : ci(t) -- X//2nc if i  t; ci(t) = 0 otherwise. 
2. Input items, i.e. set the phoneme layer to state t: b(t) - 1/vr  if i  t; 
hi(t) - 0 otherwise. 
3. Select the winning item, i.e. an(t) - i where h,k(t) -- max/{h,,(t)}; ai(t) - 0, for 
4. Learning, i.e. increment all connection weights according to (1). 
5. Decay, i.e. multiply short-term connection weights W/(t), wj(t) and 
and inhibitions Ii(t) by a factor 
6. Inhibit winner, i.e. set Ik(t) -- -2, where k is the item selected in 3. 
7. t-- t + 1, go to 1. 
Recall 
O. t-- 1. 
1. Set the context layer to state Ct, as above. 
2. Set all phoneme activations to zero. 
3. Select the winning item, as above. 
4. Output. Activate phonemes via ji(t), select the winning item (in the presence 
of noise). 
5. Learning, as above. 
6. Decay, as above. 
7. Inhibit winner, i.e. set I, (t) - -2, where/c is the item selected in 4. 
8. t--, t-F 1, go to 1. 
4 Analysis 
The output of the model, averaged over many trials, depends on (i) the activation 
values of all items at the output step for each time t and, (ii) given these activations 
and the noise level, the probability of each item being the winner. Estimation is 
necessary since there is no simple exact expression for (ii), and (i) depends on which 
items were output prior to time t. 
I define 7(t, i) to be the time elapsed, by output at time t, since item i was last 
selected (at presentation or output), i.e. in the absence of errors: 
i) = { (t -' 
if/ t; 
if i > t. (4) 
If there have been no prior errors, then at time t the inhibition of item i is 
Ii(t) -- --2(A) 7(t'+x), and short term weights to and from item i have decayed 
by a factor A'r(t,i). For a novel list of familiar items, the excitatory input to item i 
during output at time t is, see (3): 
5 6 Neil Burgess 
A) 
0.90 
0.85 
0.80 
0.75 
'-,3'" '-..  ' /,,  
4"' ......... 
4 6 
0.7 
0.6 
I I I 
2 4 6 
Figure 2: Serial position curves. Full lines show the estimation, extra markers are 
error bars at one standard deviation of 5 simulations of 1,000 trials each, see 5 
for parameter values. A) Rehearsal. Four consecutive recalls of a list of 7 digits 
('1',..,'4'). B) Phonemic similarity. SPCs are shown for lists of dissimilar letters 
('d'), similar letters ('s'), and alternating similar and dissimilar letters with the 
similar ones in odd ('o') and even ('e') positions. C.f. (addeley, 1968, expt. v). 
where llxll is the number of elements in set X. 
The probability p(t, i) that item i wins at time t is estimated by the softmax 
function(Brindle, 1990): 
p(t,i) (6) 
exp 
where m/(t) is hi(t) without the noise term, see (2-3), and a' = 0.75a. For a = 0.5 
(the value used below), the r.m.s. difference between p(t, i) estimated by simulation 
(500 trials) and by (6) is always less than 0.035 for -1 < rm(t) < 1 with 2 to 6 
items. 
Which items have been selected prior to time t affects Ii(t) in hi(t) via 7(t, i). 
p(t, i) is estimated for all combinations of up to two prior errors using (6) with 
appropriate values of mi(t), and the average, weighted by the probability of each 
error combination, is used. The 'missing' probability corresponding to more than 
two prior errors is corrected for by normalising p(t, i) so that Y.i p(t, i) - 1 for 
t = 1, .., r,,,. This overestimates the recency effect, especially in super-span lists. 
5 Performance 
The parameter values used are A = 0.75, r, - 6, a = 0.5. Different types of item are 
modelled by varying (r, p) :'digits' correspond to (2,0.15), 'letters' to (2,0.2), and 
'words' to (5,0.15-0.3). 'Similar' items all have 1 phoneme in common, dissimilar 
items have none. Unless indicated otherwise, items are dissimilar and familiar, see 
3 for how familiarity is modelled. The size of a relative to A is set so that digit 
span  7. np and  are such that approximately 7 digits can be said in 2 seconds. 
The model's performance is shown in Figs. 2 and 3. Fig. 2A: the increase in the 
long-term component of context-item connections during rehearsal brings stability 
after a small number of rehearsals, i.e. no further errors are committed. Fig. 2B: 
serial position curves show the correct effect of phonemic similarity among items. 
A Solvable Connectionist Model of Immediate Recall of Ordered Lists 5 7 
A) 
0.8 
0.6 
0.4 
0.2 
0.0 
B 
i i 
f 
5 lO o.o 0.5 1.o 1.5 
Figure 3: Item span. Full lines show the estimation, extra markers (A only) are 
error bars at one standard deviation of 3 simulations of 1,000 trials each, see 5 
and 3 for parameter values. A) The probability of correctly recalling a whole list 
versus list length. Lists of digits ('d'), unfamiliar items (of the same length, 'u'), 
and experimental data on digits (adapted from (Guildford & Dallenbach, 1925), 
'x') are shown. B) Span versus articulation rate (rate= Ur, with r = 5 and 
 =0.15,0.2, and 0.3). Calculated curves are shown for novel lists of familiar 
('f') and unfamiliar ('u') words and lists of familiar words after 5 repetitions ('r'). 
Data on recall of words ('w') and non-words ('n') are also shown, adapted from 
(Hulme el: al., 1991). 
Fig. 3A: the probability of recalling a list correctly as a function of list length 
shows the correct sigmoidal relationship. Fig. 3B: item span shows the correct, 
approximately linear, relationship to articulation rate, with span for unfamiliar 
items below that for familiar items. Span increases with repeated presentations of 
a list in accordance with the 'Hebb effect'. Note that span is slightly overestimated 
for short lists of very long words. 
5.1 Discussion and relation to previous work 
This model is an extension of (Burgess & Hitch, 1992), primarily to model effects 
of rehearsal and item and list familiarity by allowing connection weights to show 
plasticity over different timescales, and secondly to show recency and phonemic 
similarity effects simultaneously by changing the way phoneme nodes are activated 
during recall. Note that the 'context' timing signal varies with serial position: re- 
flecting the rhythm of presentation rather than absolute time (indeed the effect of 
temporal grouping can be modelled by modifying the context representations to 
reflect the presence of pauses during presentation (Hitch et al., 1995)), so presenta- 
tion and recall rates cannot be varied. 
The decaying inhibition that follows an items selection increases the locality of 
errors, i.e. if item i + I replaces item i, then item i is most likely to replace item 
 + 1 in turn (rather than e.g. item i + 2). The model has two remaining problems: 
(i) selecting an item node to form the long term representation of a new item, 
without taking over existing item nodes, and (ii) learning the correct order of the 
phonemes within an item - a possible extension to address this problem is presented 
in (Hartley & Houghton, 1995). 
The mechanism for selecting items is a modification of competitive queuing 
58 Neil Burgess 
(Houghton, 1990) in that the WTA interaction occurs at the item layer, rather than 
in an extra layer, so that only the winner is active and gets associated to context 
and phoneme nodes (this avoids partial associations of a context state to all items 
similar to the winner, which would prevent the zig-zag curves in Fig. 2B). The basic 
selection mechanism is sufficient to store serial order in itself, since items recover 
from suppression in the same order in which they were selected at presentation. The 
model maps onto the articulatory loop idea in that the selection mechanism corre- 
sponds to part of the speech production ('articulation') system and the phoneme 
layer corresponds to the 'phonological store', and predicts that a 'context' timing 
signal is also present. Both the phoneme and context inputs to the item layer serve 
to increase span, and in addition, the former causes phonemic similarity effects and 
the latter causes recency, position specific intrusions and temporal grouping effects. 
6 Conclusion 
I have proposed a simple mechanism for the storage and recall of serially ordered 
lists of items. The distribution of errors predicted by the model can be estimated 
mathematically and models a very wide variety of experimental data. By virtue of 
long and short term plasticity of connection weights, the model begins to address 
familiarity and the role of rehearsal in vocabulary acquisition. Many of the predicted 
error probabilities have not yet been checked experimentally: they are predictions. 
However, the major prediction of this model, and of (Burgess & Hitch, 1992), is 
that, in addition to a short-term store of phonological information and a process of 
sub-vocal rehearsal, STM for ordered lists of verbal items involves a third component 
which provides a repeatable time-varying signal reflecting the rhythm of the items' 
presentation. 
Acknowledgements: I am grateful for discussions with Rik Henson and Graham 
Hitch regarding data, and with Tom Hartley and George Houghton regarding error 
probabilities, and to Mike Page for suggesting the use of the softmax function. This 
work was supported by a Royal Society University Research Fellowship. 
References 
Baddeley A D (1968) Quarterly Jrournal of Ezperimental Psychology 20 249-264. 
Baddeley A D (1986) Working Memory, Clarendon Press. 
Brindle, J S (1990) in: D S Touretzky (ed.) Advances in Neural Information Processing 
Systems . San Mateo, CA: Morgan Kaufmann. 
Burgess N & Hitch G J (1992) jr. Memory and Language 31 429-460. 
Gathercole S E & Baddeley A D (1993) Working memory and language, Erlbaum. 
Guildford J P & Dallenbach K M (1925) American J. of Psychology 36 621-628. 
Hartley T & Houghton G (1995) J. Memory and Language to be published. 
Henson R (1994) Tech. Report, M.R.C. Applied Psychology Unit, Cambridge, U.K. 
Hitch G, Burgess N, Towse J & Culpin V (1995) Quart. J. of Ezp. Psychology, submitted. 
Houghton G (1990) in: R Dale, C Mellish & M Zock (eds.), Current Research in Natural 
Language Generation 287-319. London: Academic Press. 
Hulme C, Maughan S & Brown G D A (1991) J. Memory and Language 30 685-701. 
Paulesu E, Frith C D & Frackowiak R S J (1993) Nature 362 342-344. 
PART H 
NEUROSCIENCE 
