750 Koch, Bair, Harris, Horiuchi, Hsu and Luo 
Real- Time Computer 
Using Analog 
Vision and Robotics 
VLSI Circuits 
Christof Koch 
Wyeth Bair John G. Harris Timothy Horiuchi 
Andrew Hsu Jin Luo 
Computation and Neural Systems Program 
Caltech 216-76 
Pasadena, CA 91125 
ABSTRACT 
The long-term goal of our laboratory is the development of analog 
resistive network-based VLSI implementations of early and inter- 
mediate vision algorithms. We demonstrate an experimental cir- 
cuit for smoothing and segmenting noisy and sparse depth data 
using the resistive fuse and a 1-D edge-detection circuit for com- 
puting zero-crossings using two resistive grids with different space- 
constants. To demonstrate the robustness of our algorithms and 
of the fabricated analog CMOS VLSI chips, we are mounting these 
circuits onto small mobile vehicles operating in a real-time, labo- 
ratory environment. 
I INTRODUCTION 
A large number of computer vision algorithms for finding intensity edges, comput- 
ing motion, depth, and color, and recovering the 3-D shapes of objects have been 
developed within the framework of minimizing an associated "energy" functional. 
Such a variational formalism is attractive because it allows a priori constraints 
to be explicitly stated. The single most important constraint is that the physical 
processes underlying image formation, such as depth, orientation and surface re- 
flectance, change slowly in space. For instance, the depths of neighboring points on 
a surface are usually very similar. Standard regularization algorithms embody this 
smoothness constraint and lead to quadratic variational functionals with a unique 
global minimum (Poggio, Torre, and Koch, 1985). These quadratic functionals 
Real-Time Computer Vision and Robotics Using Analog VLSI Circuits 751 
(a) 
G 
G 
(b) 
3.1V 
Node 
Voltage 
3.0V 
I I I I I I I 
1 2 3 4 5 6 7 
Photoreceptor 
1 
Edge 
Output 
0 
Figure 1: (a) shows the schematic of the zero-crossing chip. The phototransistors 
logarithmically map light intensity to voltages that are applied via a conductance G 
onto the nodes of two linear resistive networks. The network resistances R and R2 
can be arbitrarily adjusted to achieve different space-constants. Transconductance 
amplifiers compute the difference of the smoothed network node vbltages and report 
a current proportional to that difference. The sign of current then drives exclusive-or 
circuitry (not shown) between each pair of neighboring pixels. The final output is 
a binary signal indicating the positions of the zero-crossings. The linear network 
resistances have been implemented using Mead's saturating resistor circuit (Mead, 
1989), and the vertical resistors are implemented with transconductance followers. 
(b) shows the measured response of a seven-pixel version of the chip to a bright 
background with a shadow cast across the middle three photoreceptors. The circles 
and triangles show the node voltages on the resistive networks with the smaller and 
larger space-constants, respectively. Edges are indicated by the binary output (bar 
chart at bottom) corresponding to the locations of zero-crossings. 
752 Koch, Bair, Harris, Horiuchi, Hsu and Luo 
can be mapped onto linear resistive networks, such that the stationary voltage dis- 
tribution, corresponding to the state of least power dissipation, is equivalent to 
the solution of the variational functional (Horn, 1974; Poggio and Koch, 1985). 
Smoothness breaks down, however, at discontinuities caused by occlusions or differ- 
ences in the physical processes underlying image formation (e.g., different surface 
reflectance properties). Detecting these discontinuities becomes crucial, not only 
because otherwise smoothness is incorrectly applied but also because the locations 
of discontinuities are often required for further image analysis and understanding. 
We describe two different approaches for finding discontinuities in early vision: (1) 
a 1-D edge-detection circuit for computing zero-crossings using two resistive grids 
with different space-constants, and (2) a 20 by 20 pixel circuit for smoothing and 
segmenting noisy and sparse depth data using the resistive fuse. 
Finally, while successfully demonstrating a highly integrated circuit on a stationary 
laboratory bench under controlled conditions is already a tremendous success, this 
is not the environment in which we ultimately intend them to be used. The jump 
from a sterile, well-controlled, and predictable environment such as that of the 
laboratory bench to a noisy and physically demanding environment of a mobile 
robot can often spell out the true limits of a circuit's robustness. In order to 
demonstrate the robustness and real-time performance of these circuits, we have 
mounted two such chips onto small toy vehicles. 
2 AN EDGE DETECTION CIRCUIT 
The zero-crossings of the Laplacian of the Gaussian, VaG, are often used for de- 
tecting edges. Marr and Hildreth (1980) discovered that the Mexican-hat shape 
of the X72(7 operator can be approximated by the difference of two Gaussians 
(DOG). In this spirit, we have built a chip that takes the difference of two resistive- 
network smoothings of photoreceptor input and finds the resulting zero-crossings. 
The Green's function of the resistive network, a decaying exponential, differs from 
the Gaussian, but simulations with digitized camera images have shown that the 
difference of exponentials (DOE) gives results nearly as good as the DOG. Further- 
more, resistive nets have a natural implementation in silicon, while implementing 
the Gaussian is cumbersome. 
The circuit, Figure la, uses two independent resistive networks to smooth the volt- 
ages supplied by logarithmic photoreceptors. The voltages on the two networks are 
subtracted and exclusive-or circuitry (not shown) is used to detect zero-crossings. In 
order to facilitate thresholding of edges, an additional current is computed at each 
node indicating the strength of the zero-crossing. This is particularly important 
for robust real-world performance where there will be many small (in magnitude 
of slope) zero-crossings due to noise. Figure lb shows the measured response of a 
seven-pixel version of the chip to a bright background with a shadow cast across the 
middle three photoreceptors. Subtracting the two network voltage traces shown at 
the top, we find two zero-crossings, which the chip correctly identifies in the binary 
output shown at the bottom. 
Real-Time Computer Vision and Robotics Using Analog VLSI Circuits 753 
(a) 
(b) 
300 , HRES 
I 
(hA) 
0 -vr USE 
-300-0.5 0.0 0.5 
AV (Volts) 
Figure 2: (a) Schematic diagram for the 20 by 20 pixel surface interpolation and 
smoothing chip. A rectangular mesh of resistive fuse elements (shown as rectangles) 
provide the smoothing and segmentation ability of the network. The data are given 
as battery values dij with the conductance G connecting the battery to the grid set 
to G = 1/2o '2, where o '2 is the variance of the additive Gaussian noise assumed to 
corrupt the data. (b) Measured current-voltage relationship for different settings 
of the resistive fuse. For a voltage of less than Va, across this two-terminal device, 
the circuit acts as a resistor with conductance ,X. Above Va,, the current is either 
abruptly set to zero (binary fuse) or smoothly goes to zero (analog fuse). We can 
continuously vary the I-V curve from the hyperbolic tangent of Mead's saturating 
resistor (HRES) to that of an analog fuse (Fig. 2b), effectively implementing a 
continuation method for minimizing the non-convex functional. The I-V curve of a 
binary fuse is also illustrated. 
754 Koch, Bair, Harris, Horiuchi, Hsu and Luo 
3 A CIRCUIT FOR SMOOTHING AND SEGMENTING 
Many researchers have extended regularization theory to include discontinuities. 
Let us consider the problem of interpolating noisy and sparse 1-D data (the 2-D 
generalization is straightforward), where the depth data di is given on a discrete 
grid. Associated with each lattice point is the value of the recovered surface fi 
and a binary line discontinuity i. When the surface is expected to be smooth 
(with a first-order, membrane-type stabilizer) except at isolated discontinuities, the 
functional to be minimized is given by: 
where a 2 is the variance of the additive Gaussian noise process assumed to corrupt 
the data di, and ,k and c are free parameters. The first term implements the 
piecewise smooth constraint: if all variables, with the exception of fi, fi+, and 
are held fixed and (fi+ - fi) 2 < , it is "cheaper" to pay the price ,k(fi+l - fi) 2 
and set i = 0 than to pay the larger price c; if the gradient becomes too steep, 
i = 1, and the surface is segmented at that location. The second term, with the 
sum only including those locations i where data exist, forces the surface f to be 
close to the measured data d. How close depends on the estimated magnitude of 
the noise, in this case on rr 2. The final surface f is the one that best satisfies the 
conflicting demands of piecewise smoothness and fidelity on the measured data. 
To minimize the 2-D generalization of eq. (1), we xnap the functional J onto the 
circuit shown in Fig. 2a such that the stationary voltage at every gridpoint then 
corresponds to fij. The cost functional J is interpreted as electrical co-content, 
the generalization of power for nonlinear networks. We designed a two-terminal 
nonlinear device, which we call a resistive fuse, to implement piecewise smoothness 
(Fig. 2b). If the magnitude of the voltage drop across the device is less than 
VT --' (0/) 1/2, the fuse acts as a linear resistor with conductance ,k. If VT is 
exceeded, however, the fuse breaks and the current goes to zero. The operation of 
the fuse is fully reversible. We built a 20 by 20 pixel fuse network chip and show 
its segmentation and smoothing performance in Figure 3. 
4 AUTONOMOUS VEHICLES 
Our goal--beyond the design and fabrication of analog resistive-network chips--is 
to build mobile testbeds for the evaluation of chips as well as to provide a systems 
perspective on the usefulness of certain vision algorithms. Due to the small size 
and power requirements of these chips, it is possible to utilize the vast resource of 
commercially available toy vehicles. The advantages of toy cars over robotic vehicles 
built for research are their low cost, ease of modification, high power-to-weight ratio, 
availability, and inherent robustness to the real-world. Accordingly, we integrated 
two analog resistive-network chips designed and built in Mead's laboratory onto 
small toy cars controlled by a digital microprocessor (see Figure 4). 
Real-Time Computer Vision and Robotics Using Analog VLSI Circuits 755 
(d) Node Number (e) Node Number (f) Node Number 
Figure 3: Experimental data from the fuse network chip. We use as input data a 
tower (corresponding to dij = 3.0 V) rising from a plane (corresponding to 2.0 V) 
with superimposed Gaussian noise. (a) shows the input with the variance of the 
noise set to 0.2 V, (b) the voltage output using the fuse configured as a saturating 
resistance, and (c) the output when the fuse elements are activated. (d), (e), 
and (f) illustrate the same behavior along a horizontal slice across the chip for 
a 2 = 0.4 V. We used a hardware deterministic algorithm of varying the fuse I-V 
curve of the saturating resistor to that of the analog fuse (following the arrow in 
Fig. 2b) as well as increasing the conductance A. This algorithm is closely related 
to other deterministic approximations based on continuation methods or a Mean 
Field Theory approach (Koch, Marroquin, and Yuille, 1986; Blake and Zisserman, 
1987; Geiger and Girosi, 1989). Notice that the amplitude of the noise in the last 
case (40% of the amplitude of the voltage step) is so large that a single filtering 
step on the input (d) will fail to detect the tower. Cooperativity and hysteresis 
are required for optimal performance. Notice the "bad" pixel in the middle of the 
tower (in c). Its effect is localized, however, to a single element. 
756 Koch, Bair, Harris, Horiuchi, Hsu and Luo 
Figure 4: The vehicle in the foreground, being right, utilizes the center-o[intensity 
chip for tracking light sources. The vehicle is being controlled by a MC68HC805B6, 
a micro-controller with on-board RAM, EEPROM, A/D converters, PWM (pulse 
width modulation) generators, and serial ports. The vehicle in the background 
utilizes the Silicon Retina for line/movement tracking. This vehicle has two retinas 
mounted (one being forwards and one being backwards) for the investigation of 
different steering options. The vehicle is controlled by a MC6802 nficroprocessor. 
The first vehicle uses a center-of-intensity chip (DeWeerth, 1988) to follow bright 
spots of light. The chip uses a 200 x 200 pixel array of photoreceptors with circuitry 
along the edges to compute the mean, or median, of light intensity. The car can 
successfully track a flashlight mounted on a human-driven radio-controlled car at 
approximately 1.5 n/sec. The second vehicle utilizes the Silicon Retina (Mahowald 
and Mead, 1989) to track edges and movement. The first of two behaviors tracks a 
black line of tape laid down on the floor of the laboratory at a maximum speed of 
approximately 1 m/sec. The retina is operated in the "edge-enhancement" mode 
(i.e., computing a difference-of-Gaussians-like transform) to find any high-contrast 
edges in a one-dimensional scan across the field of view. The second behavior tracks 
movement (temporal derivative of the intensity) in two dilnensions while "looking" 
down in front of the vehicle, thus acquiring crude, close-range, distance information. 
By dangling a small "bug-like" object in its field of view, the car maintains a 
Real-Time Computer Vision and Robotics Using Analog VLSI Circuits 757 
constant distance from the movement, a behavior reminiscent of a curious puppy 
playing with an insect. 
5 CONCLUSION 
By studying analog VLSI circuits, we are finding methodologies for building fast, 
cheap, low-power vision hardware that detects and utilizes discontinuities. Using 
sensing strategies that simultaneously take advantage of a robot's mobility and our 
real-time vision processing will allow the system to perform actions that assist in 
the acquisition of information, thereby transgressing some of the limitations of the 
simple sensors. As argued convincingly by Brooks (1987), we believe we can build 
mobile machines able to explore their environments relying on a small number of 
special-purpose analog circuits and avoid large, general-purpose and power-hungry 
digital processors. 
Acknowledgements 
We thank Misha Mahowald and Steve DeWeerth for making their chips available 
to us for mounting on our vehicles. Without Carver Mead, none of this research 
would be possible. Our laboratory is supported by the Office of Naval Research, as 
well as by Rockwell International and Hughes Aircraft Corporation. 
References 
Blake, A. and Zisserman, A. (1987) Visual Reconstruction. MIT Press: Cambridge, MA. 
Brooks, R. A. (1987) Proc. Workshop in Foundations of AI. Endicott House: Dedham, 
MA. 
DeWeerth, S. and Mead, C.A. (1988) A two-dimensional visual tracking array. In: Advanced 
Research in VLSI: Proceedings of the Fifth MIT Conference, pp. 259-275. MIT Press: 
Cambridge, MA. 
Geiger, D. and Girosi, F. (1989) Parallel and deterministic algorithms for MRFs: surface 
reconstruction and integration. AI Memo No. 1114, MIT, Cambridge, MA. 
Geman, S. and Geman, D. (1984) Stochastic relaxation, Gibbs distribution and the Bayesian 
restoration of images. IEEE Trans. Pattern Anal. Mach. InteII. 6:721-741. 
Harris, J. G., Koch, C., Luo, J. and Wyatt, J. (1989) Resistive fuses: analog hardware for 
detecting discontinuities in early vision. In: Analog VLSI Implementations of Neural 
Systems, Mead, C. and Ismail, M. (eds.), pp. 27-55. Kluwer: Norwell, MA. 
Horn, B. K. P. (1974) Determining lightness from an image. Computer Graphics and Image 
Processing 3:277-299. 
Koch, C., Marroquin, J., and Yuille, A. (1986) Analog "neuronal" networks in early vision. 
Proc. Natl. Acad. Sci. USA 83:4263-4267. 
Marr, D. and Hildreth, E.C. (1980) Theory of edge detection. Proc. Roy. Soc. Loud. B 
207:187-217. 
Mead, C. and Mahowald, M.A., (1988) A silicon model of early visual processing. Neural 
Networks 1:91-97. 
Mead, C. A. (1989) Analog VLSI and Neural Systems. Addison-Wesley: Reading, MA. 
Poggio, T. and Koch, C. (1985) Ill-posed problems in early vision: from computational 
theory to analogue networks. Proc. R. Soc. Loud. B 226:303-323. 
Poggio, T., Torre, V., and Koch, C. (1985) Computational vision and regularization theory. 
Nature 317:314-319. 
