Adaptive Synchronization of 
Neural and Physical Oscillators 
Kenji Doya 
University of California, San Diego 
La Jolla, CA 92093-0322, USA 
$huji Yoshizawa 
University of Tokyo 
Bunkyo-ku, Tokyo 113, Japan 
Abstract 
Animal locomotion patterns are controlled by recurrent neural networks 
called central pattern generators (GPGs). Although a GPG can oscillate 
autonomously, its rhythm and phase must be well coordinated with the 
state of the physical system using sensory inputs. In this paper we propose 
a learning algorithm for synchronizing neural and physical oscillators with 
specific phase relationships. Sensory input connections are modified by the 
correlation between cellular activities and input signals. Simulations show 
that the learning rule can be used for setting sensory feedback connections 
to a GPG as well as coupling connections between GPGs. 
1 
CENTRAL AND SENSORY MECHANISMS IN 
LOCOMOTION CONTROL 
Patterns of animal locomotion, such as walking, swimming, and flying, are generated 
by recurrent neural networks that are located in segmental ganglia of invertebrates 
and spinal cords of vertebrates (Barnes and Gladden, 1985). These networks can 
produce basic rhythms of locomotion without sensory inputs and are called central 
pattern generators (CPGs). The physical systems of locomotion, such as legs, fins, 
and wings combined with physical environments, have their own oscillatory char- 
acteristics. Therefore, in order to realize efficient locomotion, the frequency and 
the phase of oscillation of a GPG must be well coordinated with the state of the 
physical system. For example, the bursting patterns of motoneurons that drive a 
leg muscle must be coordinated with the configuration of the leg, its contact with 
the ground, and the state of other legs. 109 
1 10 Doya and Yoshizawa 
The oscillation pattern of a CPG is largely affected by proprioceptive inputs. It has 
been shown in crayfish (Siller et al., 1986) and lamprey (Grillnet et al, 1990) that the 
oscillation of a CPG is entrained by cyclic stimuli to stretch sensory neurons over a 
wide range of frequency. Both negative and positive feedback pathways are found in 
those systems. Elucidation of the function of the sensory inputs to GPGs requires 
computational studies of neural and physical dynamical systems. Algorithms for 
the learning of rhythmic patterns in recurrent neural networks have been derived by 
Doya and Yoshizawa (1989), Pearlmutter (1989), and Williams and Zipset (1989). 
In this paper we propose a learning algorithm for synchronizing a neural oscillator 
to rhythmic input signals with a specific phase relationship. 
It is well known that a coupling between nonlinear oscillators can entrainment their 
frequencies. The relative phase between oscillators is determined by the parameters 
of coupling and the difference of their intrinsic frequencies. For example, either 
in-phase or anti-phase oscillation results from symmetric coupling between neural 
oscillators with similar intrinsic frequencies (Kawato and Suzuki, 1980). Efficient 
locomotion involves subtle phase relationships between physical variables and motor 
commands. Accordingly, our goal is to derive a learning algorithm that can finely 
tune the sensory input connections by which the relative phase between physical 
and neural oscillators is kept at a specific value required by the task. 
2 LEARNING OF SYNCHRONIZATION 
We will deal with the following continuous-time model of a CPG network. 
d c s 
= + + v,,y,(t), 
j-m1 k=l 
(1) 
where x,(t) and g,(xi(t)) (i = 1,..., C) represent the states and the outputs of CPG 
neurons and y(t) (k = 1, ...,$) represents sensory inputs. We assume that the 
connection weights W = {wij ) are already established so that the network oscillates 
without sensory inputs. The goal of learning is to find the input connection weights 
V = {vii} that make the network state x(t) = (x(t), ...,xc(t))  entrained to the 
input signal y(t) = (y(t), ... ,ys(t))  with a specific relative phase. 
2.1 AN OBJECTIVE FUNCTION FOR PHASE-LOCKING 
The standard way to derive a learning algorithm is to find out an objective function 
to be minimized. If we can approximate the waveforms of xi(t) and y(t) by sine 
waves, a linear relationship 
x(t) = Py(t) 
specifies a phase-locked oscillation of x(t) and y(t). For example, if we have y = 
1 1 
sinwt and Y2 = coswt, then a matrix P = ( vrg)specifies  = x/sin(wt+r/4)and 
2 = 2 sin(wt + /3). Even when the wavefor are not sinusoidal, nimization of 
an objective function 
 1 {xiCt)- piYCt))  (2) 
Z(t): ilx(t)- Py(t)ll 2:  
i=1 k=l 
Adaptive Synchronization of Neural and Physical Oscillators 111 
determines a specific relative phase between x(t) and y(t). Thus we call P = {Pit } 
a phase-lock matrix. 
2.2 LEARNING PROCEDURE 
Using the above objective function, we will derive a learning procedure for phase- 
locked oscillation of x(t) and y(t). First, an appropriate phase-lock matrix P is 
identified while the relative phase between x(t) and y(t) changes gradually in time. 
Then, a feedback mechanism can be applied so that the network state x(t) is kept 
close to the target waveform P y(t). 
Suppose we actually have an appropriate phase relationship between x(t) and y(t), 
then the phase-lock matrix P can be obtained by gradient descent of E(t) with 
respect to Pit as follows (Widrow and Stearns, 1985). 
d OE(t) s 
Pit = --rl O-i = rl {xi(t) -- ZPisYS(t))yt(t)' 
j--1 
(3) 
If the coupling between x(t) and y(t) are weak enough, their relative phase changes 
in time unless their intrinsic frequencies are exactly equal and the systems are 
completely noiseless. By modulating the learning coefficient /by some performance 
index of the total system, for example, the speed of locomotion, it is possible to 
obtain a matrix P that satisfies the requirement of the task. 
Once a phase-lock matrix is derived, we can control x(t) close to P y(t) using the 
gradient of E(t) with respect to the network state 
s 
OE(t) = xi(t)- Zpityt(t). 
kml 
The simplest feedback algorithm is to add this term to the CPG dynamics as follows. 
d c s 
ri d-xi(t) = -xi(t) + Z wljgJ(xJ(t)) - a{xi(t) - Zpityt(t)}. 
j=l k=l 
The feedback gain a (> O) must be set small enough so that the feedback term 
does not destroy the intrinsic oscillation of the CPG. In that case, by neglecting the 
small additional decay term axi(t), we have 
d c $ 
rix,(t) = -xi(t) + Z wisgs(xs(t)) + Z apityt(t), 
j=l k=l 
(4) 
which is equivalent to the equation (1) with input weights vii = apit. 
112 
Doya and Yoshizawa 
3 DELAYED SYNCHRONIZATION 
We tested the above learning scheme on a delayed synchronization task; to find 
coupling weights between neural oscillators so that they synchronize with a specific 
time delay. We used the following coupled CPG model. 
c c 
-Zx (t) =  ri,y, (t), 
j=l k=l 
(5) 
y(t) = g(x(t)), (i = 1,...,C), 
where superscripts denote the indices of two CPGs (n = 1,2). The goal of learning 
was to synchronize the waveforms yl(t) and y(t) with a time delay AT. We used 
as the performance index. The learning coefficient U of equation (3) was modulated 
by the deviation of z(t) from its running average 2(t) using the following equations. 
,(t) = ,o (z(t) - (t)}, 
'" t (t) = -(t) + (t). 
(6) 
a 
b 
yl 
y2 
O. 0 4. 0 8. 0 12. 0 16. 0 20. 0 24. 0 28. 0 32. 0 
y2 '"' '" "" 
0. 0 4. 0 8. 0 12. 0 16. 
c 
y2 
d 
0.0 4.0 8.0 12.0 16.0 
0.0 4.0 8.0 12.0 16.0 
0.0 4.0 8.0 12.0 16.0 
Figure 1: Learning of delayed synchronization of neural oscillators. The dotted and 
solid curves represent y(t) and y(t) respectively. a:without coupling. b:AT - 0.0. 
c:AT - 1.0. c:AT = 2.0. d:AT = 3.0. 
Adaptive Synchronization of Neural and Physical Oscillators 113 
First, two CPGs were trained independently to oscillate with sinusoidal waveforms 
of period T = 4.0 and T9 = 5.0 using continuous-time back-propagation learning 
(Doya and Yoshizawa, 1989). Each CPG was composed of two neurons (C = 2) with 
time constants r = 1.0 and output functions g() = tanh(). Instead of following 
the two step procedure described in the previous section, the network dynamics (5) 
and the learning equations (3) and (6) were simulated concurrently with parameters 
a = 0.1, Uo = 0.2, and ra = 20.0. 
Figure 1 a shows the oscillation of two CPGs without coupling. Figures 1 b through 
e show the phase-locked waveforms after learning for 200 time units with different 
desired delay times. 
4 ZERO-LEGGED LOCOMOTION 
Next we applied the learning rule to the simplest locomotion system that in- 
volves a critical phase-lock between the state of the physical system and the motor 
command--a zero-legged locomotion system as shown in Figure 2 a. 
The physical system is composed of a wheel and a weight that moves back and 
forth on a track fixed radially in the wheel. It rolls on the ground by changing its 
balance with the displacement of the weight. In order to move the wheel in a given 
direction, the weight must be moved at a specific phase with the rotation angle of 
the wheel. The motion equations are shown in Appendix. 
First, a CPG network was trained to oscillate with a sinusoidal waveform of period 
T = 1.0 (Doya and Yoshizawa, 1989). The network consisted of one output and 
two hidden units (C = 3) with time constants ri = 0.2 and output functions gi( ) = 
tanh(). Next, the output of the CPG was used to drive the weight with a force 
f -- fmax g(Xl (t)). The position r and the velocity  of the weight and the rotation 
angle (cos0, sin0) and the angular velocity of the wheel J were used as sensory 
feedback inputs yi(t) (k = 1,..., 5) after scaling to [-1, 1]. 
In order to eliminate the effect of biases in x(t) and y(t), we used the following 
learning equations. 
(7) 
The rotation speed of the wheel was employed as the performance index z(t) after 
smoothing by the following equation. 
d 
r,z(t) = -z(t) + t(t). 
The learning coefficient /was modulated by equations (6). The time constants were 
rt = 4.0, ry = 1.0, rs = 1.0, and ra = 4.0. Each training run was started from a 
random configuration of the wheel and was finished after ten seconds. 
1 14 Doya and Yoshizawa 
a 
b 
_d2 
0.0 1.0 2.0 3.0 4.0 5.0 6.0 0.0 1.0 2.0 3.0 4.0 5.0 6.0 
c 
-O. 5 0.0 O. 5 
0.0 1.0 2.0 :3.0 4.0 5.0 6.0 0.0 1.0 2.0 :3.0 4.0 5.0 6.0 
-0.5 
0.0 O. 5 
Figure 2: Learning of zero-legged locomotion. 
Adaptive Synchronization of Neural and Physical Oscillators 115 
Figure 2 b is an example of the motion of the wheel without sensory feedback. 
The rhythms of the CPG and the physical system were not entrained to each other 
and the wheel wandered left and right. Figure 2 c shows an example of the wheel 
motion after 40 runs of training with parameters /0 = 0.1 and a = 0.2. At first, the 
oscillation of the CPG was slowed down by the sensory inputs and then accelerated 
with the rotation of the wheel in the right direction. 
We compared the patterns of sensory input connections made after learning with 
wheels of different sizes. Table I shows the connection weights to the output unit. 
The positive connection from sin 0 forces the weight to the right-hand side of the 
wheel and stabilize clockwise rotation. The negative connection from cos 0 with 
smaller radius fastens the rhythm of the CPG when the wheel rotates too fast and 
the weight is lifted up. The positive input from r with larger radius makes the 
weight stickier to both ends of the track and slows down the rhythm of the CPG. 
Table 1: Sensory input weights to the output unit (pt; k = 1,..., 5). 
radius 
2cm 
4cm 
6cm 
8cm 
10cm 
r  cos 0 sin 0 J 
0.15 -0.53 -1.35 1.32 0.07 
0.28 -0.55 -1.09 1.22 0.01 
0.67 -0.21 -0.41 0.98 0.00 
0.70 -0.33 -0.40 0.92 0.03 
0.90 -0.12 -0.30 0.93 -0.02 
5 DISCUSSION 
The architectures of CPGs in lower vertebrates and invertebrates are supposed to 
be determined by genetic information. Nevertheless, the way an animal utilizes the 
sensory inputs must be adaptive to the characteristics of the physical environments 
and the changing dimensions of its body parts. 
Back-propagation through forward models of physical systems can also be applied 
to the learning of sensory feedback (Jordan and Jacobs, 1990). However, learning of 
nonlinear dynamics of locomotion systems is a difficult task; moreover, multi-layer 
back-propagation is not appropriate as a biological model of learning. The learning 
rule (7) is similar to the covariance learning rule (Sejnowski and Stanton, 1990), 
which is a biological model of long term potentiation of synapses. 
Acknowledgements 
The authors thank Allen Se]verston, Peter Rowat, and those who gave comments 
to our poster at NIPS Conference. This work was partly supported by grants from 
the Ministry of Education, Culture, and Science of Japan. 
1 16 Doya and Yoshizawa 
References 
Barnes, W. J.P. & Gladden, M. H. (1985) Feedback and Motor Control in Inverte- 
brates and Vertebrates. Beckenham, Britain: Croom Helm. 
Doya, K. & Yoshizawa, S. (1989) Adaptive neural oscillator using continuous-time 
back-propagation learning. Neural Networks, 2, 375-386. 
Grillnet, S. & Matsushima, T. (1991) The neural network underlying locomotion in 
Lamprey--Synaptic and cellular mechanisms. Neuron, ?(July), 1-15. 
Jordan, M. I. & Jacobs, R. A. (1990) Learning to control an unstable system with 
forward modeling. In Touretzky, D. S. (ed.), Advances in Neural Information Pro- 
cessing Systems 2. San Mateo, CA: Morgan Kaufmann. 
Kawato, M. & Suzuki, R. (1980) Two coupled neural oscillators as a model of the 
circadian pacemaker. Journal of Theoretical Biology, 86,547-575. 
Pearlmutter, B. A. (1989) Learning state space trajectories in recurrent neural net- 
works. Neural Computation, 1,263-269. 
Sejnowski, T. J. & Stanton, P. K. (1990) Covariance storage in the Hippocampus. 
In Zornetzer, S. F. et al. (eds.), An Introduction to Neural and Electronic Networks, 
365-377. San Diego, CA: Academic Press. 
Siller, K. T., Skorupski, P., Elson, R. C., & Bush, M. H. (1986) Two identified 
afferent neurones entrain a central locomotor rhythm generator. Nature, 323,440- 
443. 
Widrow, B. & Stearns, S. D. (1985) Adaptive Signal Processing. Englewood Cliffs, 
N J: Prentice Hall. 
Williams, R. J. & Zipset, D. (1989) A learning algorithm for continually running 
fully recurrent neural networks. Neural Computation, 1,270-280. 
Appendix 
The dynamics of the zero-legged locomotion system: 
mR 2 sin 2 0) _ mg(cos 0 + mRsin20(r + Rcos 0) 
mi  - fo(l+ ;r Io ) 
+mRsin 0 v + 2m(r + R cos 0) t + mrt 2 
I0 ' 
I0' = -f0R sin 0 + mgsin O(r + Rcos 0) - (v + 2mb(r + R cos0))t), 
fo = fmax g(xl(t)) -- o'r 3 --//', 
Io = I + MR 2 + m(r + Rcos0) '. 
Parameters: the masses of the weight m = 0.2[kg] and the wheel M = 0.8[kg]; 
the radius of the wheel R = 0.02through0.1[m]; the inertial moment of the wheel 
I = MRS; the maximum force to the weight fmax = 5[N]; the stiffness of the 
limiter of the weight r = 20/R 3 [N/mS]; the damping coefficients of the weight 
motion/ = 0.2/R [N/(m/s)] and the wheel rotation v = O.05(M+m)R [N/(rad/s)]. 
