278 
THE HOPFIELD MODEL WITH MULTI-LEVEL NEURONS 
Michael Fleisher 
Depatmaent of Electrical Engineering 
Technion - Israel Institute of Technology 
Haifa 32000, Israel 
ABSTRACT 
The Hopfield neural network model for associative memory is generaEzed. The generalization 
replaces two state neurons by neurons taking a richer set of values. Two classes of neuron input output 
relations we developed guaranteeing convergence to stable states. The fu-st is a class of "continuous" rela- 
tions and the second is a class of allowed quantization rules for the neurons. The information capacity for 
networks from the second class is found to be of order S 3 bits for a network with S neurons. 
A generalization of the sum of outer products learning rule is developed and investigated as well. 
American Institute of Physics 1988 
279 
I. INTRODUCTION 
The ability to perform collective computation in a distributed system of flexible structure without 
global synchronization is an important engineering objective. Hopfield's neural network [1] is such a 
model of associative content addressable memory. 
An important property of the Hopfield neural network is its guaranteed convergence to stable states 
(interpreted as the stored memories). In this work we introduce a generalization of the Hopfield model by 
allowing the outputs of the neurons to take a richer set of values than Hopfield's original binary neurons. 
Sufficient conditions for preserving the convergence property are developed for the neuron input output 
relations. Two classes of relations are obtained. The first introduces neurons which simulate multi thres- 
hold functions, networks with such neurons will be called quantized neural networks (Q.N.N.). The secotad 
class introduces continuous neuron input output relations and networks with such neurons will be called 
continuous neural networks (C.N.N.). 
In Section II, we introduce Hopfield's neural network and show its convergence property. C.N.N. 
are introduced in Section HI and a sufficient condition for the neuron input output continuous relations is 
developed for preserving convergence. In Section IV, Q.N.N. are introduced and their input output rela- 
tions are analyzed in the same manner as in HI. In Section IV we look further at Q.N.N. by using the 
definition of information capacity for neural networks of [2] to obtain a tight asymptotic estimate of the 
capacity for a Qffq. N. with N neurons. Section VI is a generalized sum of outer products learning for the 
Q.N.N. and section VII is the discussion. 
II. THE HOPFIELD NEURAL NETWORK 
A neural network consists of N pairwise connected neurons. The i 'th neuron can be in one of two 
states: X i = -1 or X i = +1. The connections are fLxed real numbers denoted by Wij (the connection 
from neuron i to neuron j ). Define the state vector X to be a binary vector whose i 'th component 
corresponds to the state of the i 'th neuron. Randomly and asynchronously, each neuron examines its input 
and decides its next output in the following manner. Let t i be the threshold voltage of the i 'th neuron. If 
the weighted sum of the present other N-1 neuron outputs (which compose the i 'th neuron input) is 
280 
greater or equal to t i , the next X i (Xi +) is + 1, if not, Xi + is - 1. This action is given in (1). 
We give the following theorem 
N 
Xi+: sgn [ 2 WijXj-ti ] 
O) 
Theorem ,1, (of [1]) 
The network described with symmetric (llVij=W fi ) zero diagonal (Vii--0) connection matrix W 
has the convergence property. 
Pmo, ,,f. 
Define the quantity 
INN N 
E(x) =-  I; I; %xixj + I; tixi 
i j=l i=1 
(2) 
We show that E (X) can only de,,crea as a result of the action of the network. Suppose that X k changed 
to Xff = Xk +l, k , the resulting change in E is given by 
N 
=-axk ( E wkjxj-t) (3) 
j=l 
(Eq. (3) is correct because of the restrictions on W). The tetra in brackets is exactly the argument of the 
sgn function in (1) and therefore the signs of AX k and the tetra in brackets is the same (or Z k =0) and 
we get AE _< 0. Combining this with the fact that E (J.) is bounded shows that eventually the network 
will remain in a local minimum {fie (.). This completes the proof. 
The technique used in the proof of Theorem 1 is an important tool in analyzing neural networks. A 
network with a particular underlying E (X) function can be used to solve optimization problems with 
E (.) as the object of optimization. Thus we see another use of neural networks. 
281 
m. THE C.N.N. 
We ask ourselves the following question: How can we change the sgn function in (1) without affect- 
ing the convergence property? The new action rule for the i 'th neuron is 
N 
Xi+-- f i [ Z WijXj ] (4) 
j=l 
Our attention is focused on possible choices forfi ('). The following theorem gives a part of the answer. 
Theorem 2 
The network described by (4) (with symmetric zero diagonal W) has the convergence property if 
fi (') are strictly increasing and bounded. 
Proof 
Define 
1 N N N Xi 
E(.) =- -- Z WijXiXj 4-  I fi-l(tl)atl 
t j i=10 
(5) 
We show as before that E () can only decrease and since E is bounded (because of the boundedhess of 
fi 's) the theorem is proved. 
Xi 
Usingi(Xi)= I 
f i-l(u )du we have 
N 
i=l AXk 
] (6) 
Using the intermediale value theorem we get 
AE = - AXi [  WkjXj-gk(C )I = -AXi [ f ff X(Xk +AXi )-f ff l(C )] 
j=l 
(7) 
282 
where C is a point between X k and Xk+ k. Now, ff /k >0 we have 
C - X k +l( k = > f-l(c ) - f-l(x k +AX k ) and the term in brackets is greater or equal to zero 
"-> AE _<0. A similar argument holds for AX k < 0 (of course AX k :=0 => AE =0). This completes 
the proof. 
Some remarks: 
(a) Strictly increasing hounded neuron relalions are not the whole class of relations conserving the conver- 
gence property. This is seen immediately from the fact that Hopfield's original model (1) is not in' this 
class. 
Co) The E (X_) in the C.N.N. coincides with Hopfield's continuous neural network [3]. The difference 
between the two networks lies in the updating scheme. In our C.N.N. the neurons update their outputs at 
the moments they examine their inputs while in [3] the updating is in the form of a set of differential equa- 
tions featuring the time evolution of the network outputs. 
(c) The boundedness requirement of the neuron relations results from the boundedness of E (.). It is 
possible to impose further restrictions on W resulting in unbounded neuron relations but keeping E (X) 
hounded (from below). This was done in [4] where the neurons exhibit linear relations. 
IV. THE Q.N.N. 
We develop the class of quantization rules for the neurons, keeping the convergence property. 
Denote the set of possible neuron outputs by Yo < Y 1 < '.. < Yn and the set of threshold values by 
t 1 < t 2 < ' ' ' < tn the action of the neurons is given by 
N 
Xi +-1 if t I < Z Wijj -< tl+l /=O,...,n (8) 
j=! 
and t o = -o% t n + l = +oo. 
The following theorem gives a class of quantization rules with the convergence property. 
283 
Any quantization rule for the neurons which is an increasing step functioa that is 
Yo<Yi< '"Yn;tl< ... <t n 
Yields a network with the convergence property (with a g synunetric and zero diagonal). 
(9) 
Proof 
We proceed to prove. 
Define 
1N N N N 
E(X)=- I5 Z WijXiXj + tG(Xi)+ ZdXi 
i j=l i=1 i=1 
(lO) 
where G (X) is a piecewise linear convex U function defined by the relation 
G (YI)-G (YI-1) 
t +d=t I l=l,...,n (10 
Yi-YI_i 
As before we show AF _< 0. Suppose a change occurred in X k such thatX k =Yi_i,X;=Yi . We then 
have 
AE = -ZSX k 
[ - t 
j=l 
o(xD-c(xk) 
(12) 
A similar argument follows whenXk=Yi,X=Yi_ 1 < X k . Any bigger change inX k (from Yi to Yj 
with I i -j I > 1) yields the same result since it can be viewed as a sequence of I i -j I changes from Yi 
to Yj each resulting in AE -<0. The proof is completed by noting that ZSX k----O=-->AE =0 and E (X) is 
284 
Corollary 
Hopfield's original model is a special case of (9). 
V. INFORMATION CAPACITY OF THE Q.N.N. 
We use the dfinition of [2] for the information capacity of the Q.N.N. 
Definition 1 
The information capacity of the Q.N.N. (bits) is the log (Base 2) of the number of distinguishable 
networks of N neurons. Two networks are distinguishable if observing the state transitions of the neurons 
yields different observations. For Hopfield's original model it was shown in [2] that the capacity C of a 
network of N neurons is bounded by C < log (2(N-1)') 'v = 0(N3)b. It was also shown that 
C -> (N3)b and thus is exactly of the order N3b. It is obvious that in our case (which contains the 
original model) we must have C -> (N3)b as well (since the lower bound cannot decrease in this 
richer case). It is shown in the Appendix that the number of multi threshold functions of N-1 variables 
with n+l output levels is at most (n+l) N2+N+I since we have N neurons there will be 
( (n + ly v'+N +yv ti.ghe network thus 
C _(log ( (n+l)'V'-+v+f = 0(N3)b (14) 
o as before, C is exactly of 0(N 3)b. In fact, the rise in C is probably a factor of 0(log2n ) as can be 
seen from the upper bound. 
VI. "OUTER PRODUCT" LEARNING RULE 
For Hopfield's original network with two state neurons (taking the values 1) a natural and exten- 
sively investigated [ ],[ 1,[ ] learning mle is the so called sum of outer products construction. 
1 tc 
1--1 
where X 1 , X K re the desired stable states of the network. A well-known result for (15) is that the 
asymptotic capacity K of the network is 
285 
N-1 
K =  + 1 (16) 
41ogN 
In this section we introduce a natural generalization of (15) and prove a similar result for the asymp- 
totic capacity. We first limit the possible quantization rules to: 
X i = F (tl i ) =' 
Yo t l > Ui ->to 
Yn tn+l > ui -> tn 
(17) 
withYo < ''' < Yn 
to =-oo ; tn+l =oo 
with 
(a) n+l is even 
CO) V i Yi  O 
(C) Yi = - Yn-i 
Neat we state that the desired stable vectors X 1,   - X_ K are such that each component is picked 
independently at random from { Yo," ' YM } with equal probability. Thus, the K  N components of 
theX's are zero mean i.i.D random variables. Our modified learning rule is 
Wij = S Z Xl' (18) 
/=1 
Note that orXi  {+1, -1 } (18) is identical to (16). 
Define 
286 
A =max  
i,j IYj l 
We state that 
PROPOSITION: 
The asymptotic capacity of the above network is given by 
N 
16A 2 
 log N 
PROOF: 
(19) 
Define 
K vectors chosen randomly as 
P (K, N)= Pr[ are stable states with the W of 
described } 
() 
P(K,N)= 1 -Pr(L)A/j) > 1 -Pr(Aij) 
i =1,..., N (20) 
j=l ..... K 
where Aij is the event that the i th component of j th vector is in error. We concentrate on the event A 11 
W.L.G. 
The input U 1 when X ' is presented is given by 
N K-1 1 KN ,,. 
u= Z w,.xf =x +x +- Z Z x xf (2,) 
j=l N N /=2 j=2 XJ 
The first term is mapped by (17) into itseft and corresponds to the desired signal. 
The last term is a sum of (K-1)(N-1) i.i.D zero mean random variables and corresponds to 
noise. 
287 
K-1 X  is disposed of by assuming 
The middle term N 
O. (With a zero diagonal 
choice of W (using (18) with i ;j) this term does not appear). 
Pr(All)=Pr { noise gets us out of range } 
Denoting the noise by I we have 
Pt(All) <Pr(ll I > -) < 2exp - 
(K-1)(N-1)4A 2 
where the first inequality is from the der'tuition of AYand the second uses the lemma of [6] p. 58. We thus 
get 
P (K, N) > 1 - K  N  2exp{ - 
8(K-1)(N-1)A2 
substituting (19) and taking N --> oo we get P (K, N) --> 1 and this completes the proof. 
(23) 
VII. DISCUSSION 
Two classes of generalization of the Hopfield neural network model were presented. We give some 
remarks: 
(a) Any combination of neurons from the two classes will have the convergence property as well. 
(b) Our definition of the information capacity for the C.N.N. is useless since a full observation of the pos- 
sible state transitions of the network is impossible. 
288 
APPENDIX 
We prove the foUowing theorem. 
Theorem 
An upper bound on the number of malt/ threshold functions with N inputs and ff 
domain (out of (n +l)tV possible poinls) CN M is the solution of lhe recurrence relat/on 
Proof 
c#-' +. 
points in the 
Let us look on the N dimensional weight space W. Each input point X divides the weight space 
N 
and the theorem is proved. 
The solution of the recurrence in the case M=(n+l) N (all possible points) we have a bound on 
the number of mult/threshold functions of S variables equal to 
N-l 
i=l 
(n+l)N-1] ni < 
i - 
and the result used is established. 
C= C -I + n .C_ l 
v- M-I 
isC=(n+l) i{I i 
into n+l regions by n parallel hyporplanes  WiXi=t k k-1,...,n. We keep adding points in such 
i=l 
a way that the new tt hyperplanes corresponding to each added point partition the W space into as many 
regions as possible. Assume M-1 points have made CN M-I regions and we add the M 'th point. Each 
hyperplane (out of n) is divided into at most CNM_ 1 regiont (being itself an N-] dimensional space 
divided by (M-1)n hyperlines). We thus have after passing the n hyperplanes: 
289 
REFERENCES 
[I] Hopfield J. J., "Neural networks and physical systems with emergent collective computational abili- 
ties", Proc. Nat. Acad. Sci. USA, Vol. 79 (1982), pp. 2554-2558. 
[2] Abu-Mostafa Y.S. and Jacques J. St., "Inforotation capacity of the Hopfield model", IEEE Trans. on 
Info. Theory, Vol. IT-31 (1985, pp. 461-464. 
[3] Hopfield J. J., "Neurons with graded response have collective computational properties like those of 
two state neurons", Proc. Nat. Acad. Sci. USA, Vol. 81 (1984). 
[4] Fleisher M., "Fast processing of autoregressive signals by a neural network", to be presented at IEEE 
Conference, Israel 1987. 
[5] Levin, E., Private communication. 
[6] Petroy, "Sums of independent random variables". 
