Compact EEPROM-based Weight Functions 
A. Kramer, C. K. Sin, R. Chu, and P. K. Ko 
Department of Electrical Engineering and Computer Science 
University of California at Berkeley 
Berkeley, CA 94720 
Abstract 
We are focusing on the development of a highly compact neural net weight 
function based on the use of EEPROM devices. These devices have already 
proven useful for analog weight storage, but existing designs rely on the 
use of conventional voltage multiplication as the weight function, requiring 
additional transistors per synapse. A parasitic capacitance between the 
floating gate and the drain of the EEPROM structure leads to an unusual 
I-V characteristic which can be used to advantage in designing a compact 
synapse. This novel behavior is well characterized by a model we have 
developed. A single-device circuit results in a 1-quadrant synapse function 
which is nonlinear, though monotonic. A simple extension employing 2 
EEPROMs results in a 2 quadrant function which is much more linear. 
This approach offers the potential for more than a ten-fold increase in the 
density of neural net implementations. 
1 INTRODUCTION - ANALOG WEIGHTING
The recent surge of interest in neural networks and parallel analog computation has 
motivated the need for compact analog computing blocks. Analog weighting is an 
important computational function of this class. Analog weighting is the combining 
of two analog values, one of which is typically varying (the input) and one of which 
is typically fixed (the weight) or at least varying more slowly. The varying value 
is "weighted" by the fixed value through the "weighting function", typically multiplication. Analog weighting is most interesting when the overall computational
task involves computing the "weighted sum of the inputs," that is, computing
$\sum_{i=1}^{n} f(w_i, v_i)$, where $f()$ is the weighting function and $W = \{w_1, w_2, \ldots, w_n\}$ and
$V = \{v_1, v_2, \ldots, v_n\}$ are the n-dimensional analog-valued weight and input vectors.
This weighted sum is simply the dot product in the case where the weighting function is multiplication.
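The weighted sum defined above can be sketched in a few lines of Python. This is only an illustration of the definition: the names `weighted_sum` and `multiply` are hypothetical, and the numeric values are arbitrary. Passing a different function for `f` models a non-multiplicative weighting function.

```python
def weighted_sum(weights, inputs, f):
    """Return the sum of f(w_i, v_i) over paired weights and inputs."""
    if len(weights) != len(inputs):
        raise ValueError("weight and input vectors must have the same length")
    return sum(f(w, v) for w, v in zip(weights, inputs))


def multiply(w, v):
    """Multiplication as the weighting function: the sum becomes a dot product."""
    return w * v


# Arbitrary example vectors (n = 3).
example_weights = [0.5, -1.0, 2.0]
example_inputs = [1.0, 2.0, 3.0]
example_result = weighted_sum(example_weights, example_inputs, multiply)

if __name__ == "__main__":
    print(example_result)
```

With multiplication as `f`, the result is the ordinary dot product of the two vectors.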
For large n, the only way to perform this computation efficiently is to use compact
weighting functions and to take advantage of current summing. Using "conductive
multiplication" as the weighting function (weights stored as conductances of single
devices) results in an efficient implementation such as that shown in figure 1a. This
implementation is probably optimal, but in practice it is not possible to implement
small single-device programmable conductances which are linear.
[Figure 1: four circuit panels, (a) through (d). Legible labels: "conventional storage", "conventional multiplication", "W = bias", "const", and "I = f(W,V)".]
Figure 1: Weighting function implementations: (a) ideal, (b) conventional, (c) 
EEPROM-based storage, (d) compact EEPROM-based nonlinear weight function 
1.1 CONVENTIONAL APPROACHES 
The problem of implementing analog weighting is often divided into the separate 
tasks of storing the fixed value (the weight) and combining the two analog values 
through the weighting function (figure 1b). Conventional approaches to storing a
fixed analog weight value are to use either digital storage with some form of D/A
conversion or to use volatile analog storage, which requires a large capacitor. Both
of these storage technologies require a large area.
The simplest and most widespread weighting function is multiplication [$f(w, v) = wv$].
Multiplication is attractive because of its mathematical and computational
simplicity. Multiplication is also a fairly straightforward operation to implement in 
analog circuitry. When conventional technologies are used for weight storage, the 
additional area required to provide a multiplication function is not significant. Of 
course, the problem with this approach is that since a large area is required for 
weight storage, the result is not sufficiently compact. 
2 EEPROMS 
EEPROMs are "electrically erasable, programmable, read-only memories". They 
are essentially a JFET with a floating gate and a thin-oxide tunneling region be- 
tween the floating gate and the drain (figure 2). A sufficiently high field across 
the tunneling oxide will cause electrons to tunnel into or out of the floating gate,
effectively altering the threshold voltage of the device as seen from the top gate.
Normal operating (reading) voltages are sufficiently small to cause only insignificant
"disturbance programming" of the charge on the floating gate, so an EEPROM can
be viewed as a compact storage capacitor with a very long storage lifetime.
[Figure 2 drawing. Legible labels: source, drain, gate oxide, tunneling oxide.]
Figure 2: EEPROM layout and cross section
Several groups have found that charge leakage on EEPROMs is sufficiently small to
guarantee that the threshold of a device can be retained with 4-8 bits of precision for
a period of years [Kramer, 1989][Holler, 1989]. There are several drawbacks to the
use of EEPROMs. Correct programming of these devices to the desired value is hard
to control and requires feedback. While the programming time for a single device
is less than a millisecond, because devices must be programmed one-at-a-time, the
time to program all the devices on a chip can be prohibitive. In addition, fabrication
of EEPROMs is a non-standard process requiring several additional masks and the 
ability to make a thin tunneling oxide. 
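The need for feedback during programming can be illustrated with a generic program-and-verify loop. The sketch below is not the authors' procedure: `SimulatedCell` is a hypothetical stand-in in which each pulse delivers only an unpredictable fraction of the requested threshold shift, and the tolerance and pulse limit are arbitrary.

```python
import random


class SimulatedCell:
    """Hypothetical device stand-in; not a physical tunneling model."""

    def __init__(self, threshold=0.0, seed=0):
        self.threshold = threshold
        self._rng = random.Random(seed)

    def pulse(self, requested_shift):
        # The delivered shift is an unknown fraction (50% to 100%) of the request.
        gain = self._rng.uniform(0.5, 1.0)
        self.threshold += gain * requested_shift

    def read(self):
        return self.threshold


def program(cell, target, tolerance=0.01, max_pulses=100):
    """Pulse and re-read until the threshold is within tolerance of the target.

    Returns the number of pulses applied.
    """
    pulses = 0
    while abs(target - cell.read()) > tolerance:
        if pulses == max_pulses:
            raise RuntimeError("device did not converge")
        cell.pulse(target - cell.read())
        pulses += 1
    return pulses


if __name__ == "__main__":
    cell = SimulatedCell()
    print(program(cell, target=2.0), cell.read())
```

Because each device needs its own sequence of pulses and reads, a loop of this kind is repeated for every device on a chip, which is why the total programming time grows with the number of devices.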
2.1 EEPROM-BASED WEIGHT STORAGE 
The most straightforward manner to use an EEPROM in a weighting function is to 
store the weight with the device. For example, the threshold of an EEPROM device 
could be programmed to produce the desired bias current for an analog amplifier 
(figure 1c). There are two advantages to this approach. Firstly, the weight storage
mechanism is divorced from the actual weight function computation and hence 
places few constraints on it, and secondly, if the EEPROM is used in a static mode 
(all applied voltages are constant), the exact I-V characteristics of the EEPROM 
device are inconsequential. 
The major disadvantage of this approach is that of inefficiency, as additional cir- 
cuitry is needed to perform the weight function computation. An example of this 
can be seen in a recent EEPROM-based neural net implementation developed by 
the Intel corporation [Holler, 1989]. Though the weight value in this implementation is stored on only two EEPROMs, an additional 4 transistors are needed for
the multiplication function. In addition, though the circuit was designed to perform 
multiplication the output is not quite linear under the best of conditions and, under 
certain conditions, exhibits severe nonlinearity. Despite these limitations, this de- 
sign demonstrates the advantage of EEPROM storage technology over conventional 
approaches, as it is the most dense neural network implementation to date. 
3 EEPROM I-V CHARACTERISTICS 
Since linearity is difficult to implement and not a strict requirement of the weighting 
function, we have investigated the possibility of using the I-V characteristics of an 
EEPROM as the weight function. This approach has the advantage that a single 
device could be used for both weight storage and weight function computation, 
providing a very compact implementation. It is our hope that this approach will
lead to useful synapses of less than 200 µm² in area, less than a tenth the area used
by the Intel synapse.
Though an EEPROM is a JFET device, a parasitic capacitance of the structure 
results in an I-V characteristic which is unique. Conventional use of EEPROM 
devices in digital circuitry does not make use of this fact, so that this effect has 
not before been characterized or modeled. The floating gate of an EEPROM is 
controlled via capacitive coupling by the top gate. In addition, the thin-ox tunneling 
region between the floating gate and the drain creates a parasitic capacitor between 
these two nodes. Though the area of this drain capacitor is small relative to that of 
the top-gate floating-gate overlap area, the tunneling oxide is much thinner than the 
insulating oxide between the two gates, resulting in a significant drain capacitance 
(figure 3). 
We have developed a model for an EEPROM which includes this parasitic drain 
capacitance (figure 3). The basic contribution of this capacitance is to couple the 
floating-gate voltage to the drain voltage. This is most obvious when the device is 
saturated; while the current through a standard JFET is to first order independent 
of drain voltage in this region, in the case of an EEPROM, the current has a square 
law dependence on the drain voltage (equation 3). While this artifact of EEPROMs 
makes them behave poorly as current sources, it may make them more useful as
single-device weighting functions. 
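The claim that a small-area, thin-oxide drain overlap can rival the top-gate capacitance follows from the parallel-plate relation C = εA/t. The sketch below illustrates this with a simple charge balance on the floating node. All dimensions are hypothetical, coupling to the channel and substrate is ignored, and the function names are not taken from the model described here.

```python
# Illustrative sketch only: every dimension and voltage here is hypothetical.
EPS_OX = 3.45e-11  # approximate permittivity of SiO2 in F/m


def plate_capacitance(area_um2, thickness_nm):
    """Parallel-plate capacitance (farads) of an oxide with the given geometry."""
    return EPS_OX * (area_um2 * 1e-12) / (thickness_nm * 1e-9)


def floating_gate_voltage(v_gate, v_drain, c_gate, c_drain, q_fg=0.0):
    """Floating-node voltage from charge balance over the two coupling capacitors.

    Coupling to the channel and substrate is deliberately ignored.
    """
    return (c_gate * v_gate + c_drain * v_drain + q_fg) / (c_gate + c_drain)


# Hypothetical geometry: the drain overlap has 1/5 the area but 1/4 the thickness.
c_gate = plate_capacitance(area_um2=20.0, thickness_nm=40.0)
c_drain = plate_capacitance(area_um2=4.0, thickness_nm=10.0)

# Fraction of a drain-voltage change that appears on the floating gate.
drain_coupling = c_drain / (c_gate + c_drain)

if __name__ == "__main__":
    print(f"C_drain / C_gate = {c_drain / c_gate:.2f}")
    print(f"drain coupling   = {drain_coupling:.2f}")
```

In this simplified picture, any change in drain voltage moves the floating gate by the coupling fraction, which is the mechanism that makes the saturated current depend on drain voltage.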
[Figure 3 drawing. Legible labels: floating gate, Cg, Cox, Cd, "EEPROM MODEL".]
Figure 3: EEPROM model and capacitor areas
There are several ways to analyze our model depending on the level of accuracy
desired [Sin, 1990]. We present here the results of the simplest of these which captures
the essential behavior of an EEPROM. This analysis is based on a linear channel
approximation and the equations which result are similar in form to those for a
normal JFET, with the addition of the dependence between the floating gate voltage
and the drain voltage and all capacitive coupling factors. The equations for drain
saturation voltage (equation 1), nonsaturated drain current (equation 2) and saturated
drain current (equation 3) are:
[Equations (1)-(3) are not recoverable from the source text. Legible fragments show the factor Kp and the capacitance sums (Cox + Cg + Cd) and (0.5 Cox + Cg + Cd).]
On EEPROM devices we have fabricated in house, our model matches measured 
I-V data well, especially in capturing the dependence of saturated drain current on 
drain voltage (figure 4). 
[Figure 4 plot: Ids (uA) versus Vds (V) from 0 to 5 V, one curve per value of Vgs, with measured and simulated curves overlaid. The inset measuring circuit is labeled Vds, Vgs and Ids.]
Figure 4: EEPROM I-V, measured and simulated.
4 EEPROM-BASED WEIGHTING FUNCTIONS 
One way to make a compact weight function using an EEPROM is to use the device 
I-V characteristics directly. This could be accomplished by storing the weight as the 
device threshold voltage (Vt), applying the input value as the drain-source voltage 
(Vds) and setting the top gate voltage to a constant reference value (figure 1d).
In this case the synapse would look exactly like the I-V measuring circuit and the 
weighting function would be exactly the EEPROM I-V shown in figure 4, except 
that rather than leaving the threshold voltage fixed and varying the gate voltage, 
as was done to generate the curves shown, the gate voltage would be fixed to a 
constant value and different curves would be generated by programming the device 
threshold to different values. 
While extremely compact (a single device), this function is only a one quadrant 
function (both weight and input values must be positive or output is zero) and for 
many applications this is not sufficient. An easy way to provide a two-quadrant 
function based on a similar approach is to use two EEPROMs configured in a 
common-input, differential-output ($I_{out} = I_{ds+} - I_{ds-}$) scheme, as in the circuit
depicted in figure 5. By programming the EEPROMs so that one is always active 
and one is always inactive, the output of the weight function can now be a "positive" 
or a "negative" current, depending on which device is chosen. Again, the weighting 
function is exactly the EEPROM I-V in this case. 
In addition to providing a two-quadrant function, this two-device circuit offers another interesting possibility. The same differential output scheme can be made to
provide a much more linear two quadrant function if both "positive" and "negative"
devices are programmed to be active (negative thresholds). The "weight" in this
case is the difference in threshold values between the two devices ($W = V_{t-} - V_{t+}$).
This scheme "subtracts" one device curve from the other. The model we have developed indicates that this has the effect of canceling out much of the nonlinearity
and results in a function which has three distinct regions, two of which are linear
in the input voltage and the weight value. 
[Figure 5 circuit and plot. Circuit labels: W = (Vt+ - Vt-), Iout = (Ids+ - Ids-), Vds = Vin, Vref = const (2.5V). Plot: Iout (uA) versus Vin (V) from 0 to 5 V, with measured and simulated curves for several values of W.]
Figure 5: 2-quadrant, 2-EEPROM weighting function. 
The first of these linear regions occurs when both devices are active and neither
is saturated (both devices modeled by equation 2). In this case, subtracting $I_{ds-}$
from $I_{ds+}$ cancels all nonlinearities and the differential is exactly the product of the
input value ($V_{ds}$) and the weight ($V_{t-} - V_{t+}$), with a scaling factor of $K_p$:
$I_{ds+} - I_{ds-} = K_p V_{ds} (V_{t-} - V_{t+})$ (4)
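The cancellation behind equation 4 can be checked numerically with a generic nonsaturated current of the form I = Kp[(Veff - Vt)Vds - c2 Vds^2]. This form is an assumption chosen for illustration, not the model's equation 2. The names `v_eff` and `c2` are hypothetical placeholders for the terms that do not depend on the threshold, and all values are arbitrary. Both devices are assumed active and unsaturated.

```python
def triode_current(v_ds, v_t, k_p=1.0, v_eff=3.0, c2=0.4):
    """Assumed nonsaturated current: k_p * ((v_eff - v_t) * v_ds - c2 * v_ds**2)."""
    return k_p * ((v_eff - v_t) * v_ds - c2 * v_ds ** 2)


def differential_output(v_ds, v_t_plus, v_t_minus, **params):
    """Current of the 'positive' device minus current of the 'negative' device."""
    return triode_current(v_ds, v_t_plus, **params) - triode_current(
        v_ds, v_t_minus, **params
    )


def ideal_product(v_ds, v_t_plus, v_t_minus, k_p=1.0):
    """Right-hand side of equation 4: k_p * v_ds * (v_t_minus - v_t_plus)."""
    return k_p * v_ds * (v_t_minus - v_t_plus)


if __name__ == "__main__":
    for v_in in (0.1, 0.5, 1.0):
        print(v_in, differential_output(v_in, -1.5, -0.5), ideal_product(v_in, -1.5, -0.5))
```

Every term that is common to the two devices, including the quadratic term in the input, drops out of the difference, leaving only the product of the input and the threshold difference.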
The other linear region occurs when both devices are saturated (both modeled 
by equation 3). All nonlinearities also cancel in this case, but there is an offset 
remaining and the scaling factor is modified: 
[Equation (5), the expression for $I_{ds+} - I_{ds-}$ when both devices are saturated, is not recoverable from the source text.]
We have fabricated structures of this type and measured, as well as simulated, their
function characteristics. Measured data again agreed with our model (figure 5).
Note that the slope in this last region [scaling factor of $K_p C_g/(0.5 C_{ox} + C_g + C_d)$]
will be strictly less than that in the first region [scaling factor $K_p$]. The model indicates
that one way to minimize this difference in slopes is to increase the size of the
parasitic drain capacitance ($C_d$) relative to the gate capacitance ($C_g$).
5 CONCLUSIONS 
While EEPROM devices have already proven useful for nonvolatile analog stor- 
age, we have discovered and characterized novel functional characteristics of the 
EEPROM device which should make them useful as analog weighting functions. A
parasitic drain-floating gate capacitance has been included in a model which accurately captures this behavior. Several compact nonlinear EEPROM-based weight
functions have been proposed, including a single-device one-quadrant function and 
a more linear two-device two-quadrant function. Problems such as the usability of 
nonlinear weighting functions, selection of optimal EEPROM device parameters and 
potential fanout limitations of feeding the input into a low impedance node (drain) 
must all be resolved before this technology can be used for a full blown implementa- 
tion. Our model will be helpful in this work. The approach of using inherent device 
characteristics to build highly compact weighting functions promises to greatly improve the density and efficiency of massively parallel analog computation such as
that performed by neural networks. 
Acknowledgements
Research sponsored by the Air Force Office of Scientific Research (AFOSR/JSEP)
under Contract Number F49620-90-C-0029. 
References 
M. Holler, et al. (1989) "An Electrically Trainable Artificial Neural Network
(ETANN) with 10240 'Floating Gate' Synapses," Proceedings of the IJCNN-89,
Washington D.C., 1989.
A. Kramer, et al. (1989) "EEPROM Device as a Reconfigurable Analog Element
for Neural Networks," 1989 IEDM Technical Digest, Beaver Press, Alexandria, VA,
Dec. 1989.
C. K. Sin (1990) EEPROM as an Analog Storage Element, Master's Thesis, Dept.
of EECS, University of California at Berkeley, Berkeley, CA, Sept. 1990.
