Explorations with the Dynamic Wave Model
Thomas P. Rebotier 
Department of Cognitive Science 
UCSD, 9500 Gilman Dr 
LA JOLLA CA 92093-0515 
rebotier@cogsci.ucsd.edu 
Jeffrey L. Elman 
Department of Cognitive Science 
UCSD, 9500 Gilman Dr 
LA JOLLA CA 92093-0515 
elman@cogsci.ucsd.edu 
Abstract 
Following Shrager and Johnson (1995) we study growth of logi- 
cal function complexity in a network swept by two overlapping 
waves: one of pruning, and the other of Hebbian reinforcement of 
connections. Results indicate a significant spatial gradient in the 
appearance of both linearly separable and non-linearly separable
functions of the two inputs of the network; the n.l.s. cells are much 
sparser and their slope of appearance is sensitive to parameters in 
a highly non-linear way. 
1 INTRODUCTION 
Both the complexity of the brain (and the concomitant difficulty of encoding that
complexity through any direct genetic mapping) and the apparently high degree
of cortical plasticity suggest that a great deal of cortical structure is emergent
rather than pre-specified. Several neural models have explored the emergence of
complexity. Von der Malsburg (1973) studied the grouping of orientation
selectivity by competitive Hebbian synaptic modification. Linsker (1986.a, 1986.b and
1986.c) showed how spatial selection cells (off-center on-surround), orientation
selective cells, and finally orientation columns, emerge in successive layers from random
input by simple, Hebbian-like learning rules. Miller (1992, 1994) studied the
emergence of orientation selective columns from activity-dependent competition between
on-center and off-center inputs.
Kerszberg, Dehaene and Changeux (1992) studied a model with a dual-aspect learning
mechanism: Hebbian reinforcement of the connection strengths in case of
correlated activity, and gradual pruning of immature connections. Cells in this model
were organized on a 2D grid, connected to each other according to a probability ex- 
ponentially decreasing with distance, and received inputs from two different sources, 
A and B, which might or might not be correlated. The analysis of the network re- 
vealed 17 different kinds of cells: those whose output after several cycles depended 
on the network's initial state, and the 16 possible logical functions of two inputs. 
Kerszberg et al. found that learning and pruning created different patches of cells
implementing common logical functions, with strong excitation within the patches 
and inhibition between patches. 
Shrager and Johnson (1995) extended that work by giving the network structure in
space (arranging the inputs in interleaved stripes) or in time, by having Hebbian
learning occur in a spatiotemporal wave that passed through the network rather
than occurring everywhere simultaneously. Their motivation was to see if these
learning conditions might create a cascade of increasingly complex functions. The 
approach was also motivated by developmental findings in humans and monkeys 
suggesting a move of the peak of maximal plasticity from the primary sensory 
and motor areas towards parietal and then frontal regions. Shrager and Johnson 
classified the logical functions into three groups: the constants (order 0), those that 
depend on one input only (order 1), those that depend on both inputs (order 2). 
They found that a slow wave favored the growth of order 2 cells, whereas a fast
wave favored order 1 cells. However, they applied the wave only to the connection
reinforcement (the growth trophic factor), so that the still-diffuse pruning affected
the rightmost connections before they could stabilize, resulting in an overall
decrease which had to be compensated for in the analysis.
In this work, we followed Shrager and Johnson in their study of the effect of a 
dynamic wave of learning. We present three novel features. First, both the growth
trophic factor (hereafter, TF) and the probability of pruning (by analogy, "death
factor", DF) travel in Gaussian-shaped waves. Second, we classify the cells into 4, not
3, orders: order 3 consists of the non-linearly separable logical functions, whereas
order 2 is now restricted to linearly separable logical functions of both inputs.
Third, we use an overall measure of network performance: the slope of appearance 
of units of a given order. The density is neglected as a measure not related to the 
specific effects we are looking for, namely, spatial changes in complexity. Thus, each 
run of our network can be analyzed using 4 values: the slopes for units of order 0, 
1, 2 and 3 (See Table 1.). This extreme summarization of functional information 
allows us to explore systematically many parameters and to study their influence 
over how complexity grows in space. 
Table 1: Orders of logical complexity 
ORDER   FUNCTIONS
0       True, False
1       A, !A, B, !B
2       A.B, !A.B, A.!B, !A.!B, AvB, !AvB, Av!B, !Av!B
3       A xor B, A==B
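
The classification of Table 1 can be checked mechanically. The following is a minimal sketch, not the analysis code used for the simulations: it enumerates the 16 truth tables over (A, B), counts how many inputs each function really depends on, and tests linear separability by brute force over a small grid of weights and thresholds. The function names and the weight grid are assumptions made for illustration.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]  # all (A, B) pairs

def depends_on(table, index):
    """True if flipping input `index` (0 for A, 1 for B) ever changes the output."""
    for a, b in INPUTS:
        flipped = (1 - a, b) if index == 0 else (a, 1 - b)
        if table[(a, b)] != table[flipped]:
            return True
    return False

def linearly_separable(table):
    """Brute-force search for a threshold unit w_a*A + w_b*B > theta.

    The small weight grid is an assumption; it is enough for two binary inputs.
    """
    weights = [-2, -1, 0, 1, 2]
    thresholds = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
    for w_a, w_b in product(weights, repeat=2):
        for theta in thresholds:
            if all((w_a * a + w_b * b > theta) == bool(table[(a, b)])
                   for a, b in INPUTS):
                return True
    return False

def order(table):
    """Order of complexity (0 to 3) as defined in Table 1."""
    n_inputs_used = depends_on(table, 0) + depends_on(table, 1)
    if n_inputs_used < 2:
        return n_inputs_used  # constants are order 0, single-input functions order 1
    return 2 if linearly_separable(table) else 3

def count_orders():
    """Number of the 16 logical functions falling in each order."""
    counts = {0: 0, 1: 0, 2: 0, 3: 0}
    for outputs in product([0, 1], repeat=4):
        counts[order(dict(zip(INPUTS, outputs)))] += 1
    return counts

print(count_orders())
```

Under these definitions the counts are 2, 4, 8 and 2 functions for orders 0 to 3, matching the rows of Table 1.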
2 METHODS 
Our basic network consisted of 4 columns of 50 units (one simulation verified the
scaling up of results, see section 3.3). Internal connections had a Gaussian
bandwidth and did not wrap around. All initial connections were of weight 1, so that the
connectivity weights given as parameters specified a number of labile connections.
Early investigations were made with a set of manually chosen parameters
("MANUAL"). Afterwards, two sets of parameters were determined by a Genetic Algorithm
(see Goldberg 1989): the first, "SYM", by maximizing the slope of appearance of
order 3 units only; the second, "ASY", by jointly optimizing the appearance of order
2 and order 3 units. The "SYM" network keeps a symmetrical rate of
presentation between inputs A and B. In contrast, the "ASY" net presents input
B much more often than input A. Parameters are specified in Table 2 and are in
"natural" units: bandwidths and distances are in "cells apart", trophic factor is
expressed in the same units as a weight, and pruning is a total probability. Initial
values and pruning required random number generation. We used a linear congruential
generator (see p. 284 in Press 1988), so that given the same seed, two different
machines could produce exactly the same run. All the points of each Figure are means
of several (usually 40) runs with different random seeds and share the same series
of random seeds.
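
The reproducibility argument can be illustrated with a small linear congruential generator. This is a sketch under stated assumptions: the multiplier, increment and modulus below are a widely used 32-bit set, and the text does not say which constants were actually used; the class name and the seed-series helper are hypothetical.

```python
class LinearCongruentialGenerator:
    """x_{n+1} = (a * x_n + c) mod m, returned as floats in [0, 1).

    The constants are an assumption, not taken from the simulations.
    """

    def __init__(self, seed, a=1664525, c=1013904223, m=2**32):
        self.a, self.c, self.m = a, c, m
        self.state = seed % m

    def next_uniform(self):
        # Pure integer arithmetic, so every machine produces the same sequence.
        self.state = (self.a * self.state + self.c) % self.m
        return self.state / self.m


def run_seeds(base_seed, n_runs=40):
    """Hypothetical helper: one fixed series of seeds shared by all data points."""
    return [base_seed + i for i in range(n_runs)]


# Two generators started from the same seed give identical draws.
first = LinearCongruentialGenerator(seed=7)
second = LinearCongruentialGenerator(seed=7)
print([first.next_uniform() for _ in range(3)] ==
      [second.next_uniform() for _ in range(3)])
```

Because the state update uses only integer operations, the sequence does not depend on the floating-point behavior of the host machine, which is the property the runs rely on.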
Table 2: Default parameters 
MAN.   SYM.   ASY.   name   description
8.5    6.20   12     Wae    mean initial weight of A excitatory connections
6.5    5.2    9.7    Wai    mean initial weight of A inhibitory connections
8.5    8.5    13.4   Wbe    mean initial weight of B excitatory connections
6.5    6.5    14.1   Wbi    mean initial weight of B inhibitory connections
5.0    6.5    9.9    Wne    mean initial density of internal excitatory connections
3.5    1.24   12.4   Wni    mean initial density of internal inhibitory connections
0.2    0.20   0.28   DW     relative variation in initial weights
7.0    1.26   0.65   Bne    bandwidth of internal excitatory connections
7.0    2.86   0.03   Bni    bandwidth of internal inhibitory connections
0.?    0.68   0.98   Cdw    celerity of dynamic wave
1.5    3.0    -3.2   Ddw    distance between the peaks of both waves
9.8    17.6   16.4   Wtf    base level of TF (= highest available weight)
0.6    0.6    0.6    Btf    bandwidth of TF dynamic wave
3.5    1.87   3.3    Tst    threshold of stabilisation (pruning stop)
0.6    0.64   0.5    Bdf    bandwidth of DF dynamic wave
0.65   0.62   0.12   Pdf    base level of DF (total probability of degeneration)
0.5    0.5    0.06   Pa     probability of A alone in the stimulus set
0.5    0.5    0.81   Pb     probability of B alone in the stimulus set
0.00   0.00   0.00   Pab    probability of simultaneous A and B
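
To make the wave parameters of Table 2 concrete, the sketch below evaluates two Gaussian-shaped profiles across the columns at a given time. It is an illustration only. The functional form, the reading of each bandwidth as a standard deviation, the use of Wtf and Pdf as peak amplitudes, and the sign convention for Ddw (DF peak located Ddw columns ahead of the TF peak) are all assumptions; the default values are those of the "SYM" column.

```python
import math

def gaussian_wave(x, t, peak_level, bandwidth, speed, offset=0.0):
    """Level of a Gaussian-shaped wave at column x and time t.

    The peak sits at speed * t + offset; bandwidth is treated as a standard
    deviation in columns (an assumption).
    """
    center = speed * t + offset
    return peak_level * math.exp(-((x - center) ** 2) / (2.0 * bandwidth ** 2))

def wave_profiles(t, n_columns=4, wtf=17.6, btf=0.6,
                  pdf=0.62, bdf=0.64, cdw=0.68, ddw=3.0):
    """TF and DF levels for each column at time t ("SYM" defaults)."""
    tf = [gaussian_wave(x, t, wtf, btf, cdw) for x in range(n_columns)]
    df = [gaussian_wave(x, t, pdf, bdf, cdw, offset=ddw) for x in range(n_columns)]
    return tf, df

# Profiles at three successive times: both peaks drift rightwards together.
for t in (0.0, 2.0, 4.0):
    tf, df = wave_profiles(t)
    print(t, [round(v, 3) for v in tf], [round(v, 3) for v in df])
```

With these assumed conventions a positive Ddw (as in "SYM") places the pruning peak ahead of the reinforcement peak and a negative Ddw (as in "ASY") places it behind.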
3 RESULTS 
3.1 RESULTS FORMAT 
All Figures have the same format and summarize 40 runs per point unless otherwise 
specified. The top graph presents the mean slope of appearance of all 4 orders 
of complexity (see Table 1) on the y axis, as a function of different values of the 
experimentally manipulated parameter, on the x axis. The bottom left graph shows 
the mean slope for order 2, surrounded by a gray area one standard deviation below 
and above. The bottom right graph shows the mean slope for order 3, also with 
a 1-s.d. surrounding area. The slopes have not been normalized, and come from
networks whose columns are 50 units high, so that a slope of 1.0 indicates that the
number of such units increases on average by one unit per column, i.e., by 3 units
across a 4-column network. Because each column always holds 50 units and there were
very few if any "undefined" units, a gain in one order is a loss in another and
the slopes approximately sum to zero. Standard deviations are obtained pointwise,
which gives a very conservative estimate of reliability.
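
As an illustration of the slope measure, the sketch below fits a least-squares line to the number of units of a given order in each column. The text does not specify how the slope was computed, so the least-squares fit is an assumption, and the unit counts are made-up numbers chosen so that every column holds 50 units.

```python
def slope_of_appearance(counts_per_column):
    """Least-squares slope of unit count against column index (units/column)."""
    n = len(counts_per_column)
    mean_x = (n - 1) / 2.0
    mean_y = sum(counts_per_column) / n
    numerator = sum((x - mean_x) * (y - mean_y)
                    for x, y in enumerate(counts_per_column))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator

# Hypothetical 4-column network; each column sums to 50 units.
example_counts = {
    0: [20, 15, 10, 5],
    1: [10, 10, 10, 10],
    2: [20, 24, 28, 32],
    3: [0, 1, 2, 3],
}
slopes = {order: slope_of_appearance(c) for order, c in example_counts.items()}
print(slopes)                # one slope per order
print(sum(slopes.values()))  # zero, because every column holds 50 units
```

In this made-up example the order 3 slope is 1.0, which corresponds to 3 more order 3 units in the last column than in the first.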
3.2 OVERALL ASPECT OF THE NETS 
Very few order 3 units appeared. Their mean slope with the "SYM" and
"ASY" networks was about 0.3 unit/column. As in Kerszberg et al., units with
identical function tended to appear in blocks. The "SYM" nets tend to have a
general overlay of order 2 units, with sparse units of other orders appearing in the
rightmost columns. The "ASY" nets manifest sharp transitions between order 0
columns to the left and order 2 columns to the right, with sparse order 3 units
almost exclusively in the rightmost column.
3.3 CHANGE IN NETWORK WIDTH 
Figure 1 presents the results of 4000 runs for each of 4 networks with identical 
parameters ("SYM") except for the number of columns, which varied from 4 to 
10. This figure shows how the other results of this paper scale when wider
networks are used. All slopes decrease in a nearly hyperbolic manner. Since
the mean network of width N is embedded as the beginning of the mean network
of width N+P, this suggests that the effects of the growth reach a ceiling between
width 4 and 6, and that the lesser slopes of networks of width 6, 8 and 10 are due
to averaging between a high slope in the first 4 columns and little slope afterwards. All
parameter sets gave similar results. In essence, this justifies our using only 4 columns 
for this study.
[Figure 1 plots omitted. Top: mean slope for orders 0 to 3 against network width in columns (4, 6, 8, 10). Bottom: order 2 and order 3 slopes.]
Figure 1: Scaling of the effects with network width.
3.4 CONFIRMATION OF SHRAGER AND JOHNSON'S RESULTS 
Although our focus on order 3 units, especially XOR, led us to choose anti-correlated
inputs (A and B never simultaneously present, Pab=0), we ran an exploration along
the simultaneity axis, which allowed us to confirm Shrager and Johnson's finding of
an increase in order 2 units when Pab=0.5. Figure 2 presents the results of varying
Pab from 0 to 1. There is a regular increase of the order 2 slope with Pab, and
the value which Shrager and Johnson used (0.5) indeed yields a positive slope
more than 3 s.d.'s above zero. The speed of the dynamic wave of trophic factor,
which was an important determinant of the emergence of complexity in Shrager and
Johnson's study, had no remarkable influence in ours.
[Figure 2 plots omitted. x axis: Pab, the probability of having simultaneous A and B, from 0 to 0.99. Top: mean slope for orders 0 to 3. Bottom: order 2 and order 3 slopes.]
Figure 2: Influence of Simultaneity of Appearance of A and B
3.5 DENSITY OF TROPHIC FACTOR 
The total amount of trophic factor distributed by the TF wave over a connection 
that does not die prematurely is one of the two factors of development, the other 
being pruning. A trophic factor of 1 means that the connection can at best double 
its strength (but will do so only if both neurons are in synchrony all the time). We 
explored amounts of trophic factor from 0 to 20 in increments of 2, with both the
"SYM" and the "ASY" default parameter sets. The results show that for low levels
of trophic factor (from 0 to 6) there is a linear increase of all effects with the TF level
(see Figure 3 for "SYM"; in "ASY" both order 2 and order 3 slopes are positive
and have the same linear-with-ceiling shape).
[Figure 3 plots omitted. Panels: mean slope for orders 0 to 3; order 3 slope.]
Figure 3: Influence of Trophic Factor density for "SYM"
3.6 PROBABILITY OF PRUNING 
Pruning is determined by a probability of connection death, which has been varied
from 0 to 0.99 (see Figures 4 and 5). In the "ASY" network, pruning has only
a small negative effect on the slope of appearance of order 2 units, but it weighs
heavily against the appearance of order 3 units. In the "SYM" network, pruning causes
a stronger slope of disappearance for order 2 units, but its influence on order 3 units
is non-linear, with an inverted-U shape. The ascending part of that curve suggests
that order 3 units appear by pruning of order 2 units. However, there is a need for
blocks of order 2 units around the blocks of order 3 units, which accounts for the
descending part of the inverted-U curve.
[Figure 4 plots omitted. x axis: total probability of pruning. Panels: mean slope for orders 0 to 3; order 2 slope; order 3 slope.]
Figure 4: Influence of Pruning probability for "SYM"
[Figure 5 plots omitted. x axis: total probability of pruning. Panels: mean slope for orders 0 to 3; order 2 slope; order 3 slope.]
Figure 5: Influence of Pruning probability for "ASY"
4 CONCLUSION 
There are two results of primary interest in these simulations. First, we find that,
with several different parameter settings, non-linearly separable functions appear
almost systematically in the right half of the network. This is the region of
"late-learning" columns. The finding that a network trained under a Hebbian learning
regime might learn the XOR function is unanticipated, since Hebbian learning in the
general case will not learn uncorrelated patterns. Given the biological plausibility
of Hebbian learning, it would be useful to discover a scheme by which such functions
could be learned. The result obtained here suggests one such plausible scheme. The
"high-order" XOR function can be composed of two lower-order functions (e.g.,
AND and OR; NAND and NOR). The effect of forcing learning to follow a
spatiotemporal gradient is that early-learning units extract low-order features from
the input. These early-learning units then provide an additional source of input to
late-learning units, which learn the XOR function.
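
The composition argument can be made concrete with three threshold units. This is a minimal sketch with hand-picked weights and thresholds, chosen only to show that the composition is possible; they are not values learned by the network, and the function names are hypothetical.

```python
def threshold_unit(inputs, weights, theta):
    """Fires (1) when the weighted sum of its inputs exceeds theta."""
    return int(sum(w * x for w, x in zip(weights, inputs)) > theta)

def early_or(a, b):
    """Order 2 unit in an early-learning column: A or B."""
    return threshold_unit((a, b), (1, 1), 0.5)

def early_and(a, b):
    """Order 2 unit in an early-learning column: A and B."""
    return threshold_unit((a, b), (1, 1), 1.5)

def late_xor(a, b):
    """Late-learning unit: excited by the OR unit, inhibited by the AND unit."""
    return threshold_unit((early_or(a, b), early_and(a, b)), (1, -1), 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, late_xor(a, b))
```

The late unit is itself a linearly separable function of its two inputs; XOR arises only because those inputs are the outputs of earlier order 2 units rather than A and B directly.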
The second result is that this learning scheme suggests one account by which brain 
organization might be produced, not through explicit assignment of function to 
different spatial regions, but as an emergent result of a developmental process. Thus, 
the current findings support Shrager and Johnson's hypothesis that a dynamical wave
model might lead to such organization. The question we are currently pursuing is 
whether complex inputs (e.g., inputs with spatial structure) might be decomposed 
in a similar manner, with early learning regions extracting simpler features, and 
later learning regions extracting higher order, more abstract features. 
Acknowledgements
We want to thank Jeff Shrager and Mark Johnson for their help and cooperation. 
This work was supported by a contract to the second author from the Office of 
Naval Research, contract N00014-93-1-0194. 
References 
Goldberg DE, 1989, "Genetic Algorithms", Addison-Wesley
Kerszberg M, Dehaene S, and Changeux JP, 1992, "Stabilization of complex
input-output functions in neural clusters formed by synapse selection", Neural Networks,
5:403-413
Linsker R, 1986.a, "From basic network principles to neural architecture: emergence
of spatial-opponent cells", Proceedings of the National Academy of Sciences USA,
83:7508-7512
Linsker R, 1986.b, "From basic network principles to neural architecture: emergence
of orientation-selective cells", Proceedings of the National Academy of Sciences USA,
83:8390-8394
Linsker R, 1986.c, "From basic network principles to neural architecture: emergence
of orientation columns", Proceedings of the National Academy of Sciences USA,
83:8779-8783
Miller KD, 1992, "Development of orientation columns via competition between on
and off-center inputs", NeuroReport, 3:73-76
Miller KD, 1994, "A model for the development of simple cell receptive fields and
the ordered arrangement of orientation columns through activity-dependent
competition between on and off-center inputs", The Journal of Neuroscience, 14:409-441
Press WH, 1988, "Numerical recipes in C: the art of scientific computing",
Cambridge University Press, Cambridge, MASS
Shrager J and Johnson MH, 1995, "Modeling the development of cortical function",
in Kovacks I and Julesz B, Eds., "Maturational Windows and Cortical Plasticity"
(working title), The Santa Fe Institute Press, Santa Fe, NM
von der Malsburg C, 1973, "Self-organisation of orientation sensitive cells in the
striate cortex", Kybernetik, 14:85-100
