Optimal Sampling of Natural Images: A Design 
Principle for the Visual System? 
William Bialek, a'b Daniel L. Ruderman, a and A. Zee c 
aDepartment of Physics, and 
Department of Molecular and Cell Biology 
University of California at Berkeley 
Berkeley, California 94720 
bNEC Research Institute 
4 Independence Way 
Princeton, New Jersey 08540 
Clnstitute for Theoretical Physics 
University of California at Santa Barbara 
Santa Barbara, California 93106 
Abstract 
We formulate the problem of optimizing the sampling of natural images 
using an array of linear filters. Optimization of information capacity is 
constrained by the noise levels of the individual channels and by a penalty 
for the construction of long-range interconnections in the array. At low 
signal-to-noise ratios the optimal filter characteristics correspond to bound 
states of a SchrSdinger equation in which the signal spectrum plays the 
role of the potential. The resulting optimal filters are remarkably similar 
to those observed in the mammalian visual cortex and the retinal ganglion 
cells of lower vertebrates. The observed scale invariance of natural images 
plays an essential role in this construction. 
363 
364 Bialek, Ruderman, and Zee 
1 Introduction 
Under certain conditions the visual system is capable of performing extremely effi- 
cient signal processing [1]. One of the major theoretical issues in neural computation 
is to understand how this efficiency is reached given the constraints imposed by the 
biological hardware. Part of the problem [2] is simply to give an informative rep- 
resentation of the visual world using a limited number of neurons, each of which 
has a limited information capacity. The information capacity of the visual system 
is determined in part by the spatial transfer characteristics, or "receptive fields," of 
the individual cells. From a theoretical point of view we can ask if there exists an 
optimal choice for these receptive fields, a choice which maximizes the information 
transfer through the system given the hardware constraints. We show that this 
optimization problem has a simple formulation which allows us to use the intuition 
developed through the variational approach to quantum mechanics. 
In general our approach leads to receptive fields which are quite unlike those ob- 
served for cells in the visual cortex. In particular orientation selectivity is not a 
generic prediction. The optimal filters, however, depend on the statistical proper- 
ties of the images we are trying to sample. Natural images have a symmetry -- scale 
invariance [4] -- which saves the theory: The optimal receptive fields for sampling 
of natural images are indeed orientation selective and bear a striking resemblance 
to observed receptive field characteristics in the mammalian visual cortex as well as 
the retinal ganglion of lower vertebrates. 
2 General Theoretical Formulation 
We assume that images are defined by a scalar field 4(x) on a two dimensional 
surface with coordinates x. This image is sampled by an array of cells whose 
outputs Y are given by 
Y, = f d2xF(x- x,)(x) + r/m, (1) 
where the cell is loacted at site Xn, its spatial transfer function or receptive field is 
defined by F, and r/is an independent noise source at each sampling point. We will 
assume for simplicity that the noise source is Gaussian, with (r/2) = 2. Our task 
is to find the receptive field F which maximizes the information provided about  
by the set of outputs {Yn}. 
If the field  is itself chosen from a stationary Gaussian distribution then the infor- 
mation carried by the {Y. } is given by [3] 
1 ln 5. m + dk'(x-x)lP(k)12S(k) (2) 
I = 2 ln  (2) 2 ' 
where S(k) is the power spectrum of the signals, 
S(k) = f d2ye -ik'y ((x + y)(x)), (3) 
and P(k) = f 2e-k'xr(x) is the receptive field in momentum (Fourier) space. 
Optimal Sampling of Natural Images 365 
At low signal-to-noise ratios (large a 2) we have 
N f d2k 
I , 21n2er 2 (2)21P(k)125(k), (4) 
where N is the total number of cells. 
To make our definition of the noise level a meaningful we must constrain the total 
"gain" of the filters F. One simple approach is to normalize the functions F in the 
usual L 2 sense, 
d2xF2(x)- (2)21P(k)12 = 1. (5) 
If we imagine driving the system with spectrally white images, this condition fixes 
the total signal power psing through the filter. 
Even with normalization, optimization of information capacity is still not well- 
posed. To avoid pathologies we must constrain the scale of variations in k-space. 
This makes sense biologically since we know that sharp features in k-space can 
be achieved only by introducing long-range interactions in real space, and cells in 
the visual system typically have rather local interconnections. We implement this 
constraint by introducing a penalty proportional to the mean square spatial extent 
of the receptive field, 
(2)e ]V,(k) (6) 
With all the constraints we find that, at low signal to noise ratio, our optimization 
problem becomes that of minimizing the functional 
f d2k 1/ d2k 
C['] = (1/2)a (2.)21V(k)12 - 21n2o. 2 
f d2k 12 
- A (2r)2 IP(k) , (7) 
where A is a Lagrange multiplier and c measures the strength of the locality con- 
straint. The optimal filters are then solutions of the variational equation, 
aVp(k) 1 
2 21n22-$(k)(k) = A(k). (8) 
We recognize this as the SchrSdinger equation for a particle moving in k-space, in 
which the mass M = h2/o, the potential V(k) = -$(k)/21n2a 2, and A is the 
energy eigenvalue. Since we are interested in normalizable F, we are restricted to 
bound states, and the optimal filter is just the bound state wave function. 
There are in general several optimal filters, corresponding to the different bound 
states. Each of these filters gives the same value for the total cost function C'[] 
and hence is equally "good" in this context. Thus each sampling point should be 
served by a set of filters rather than just one. Indeed, in the visual cortex one finds 
a given region of the visual field being sampled by many cells with different spatial 
frequency and orientation selectivities. 
366 Bialek, Ruderman, and Zee 
3 A Near-Fatal Flaw and its Resolution 
If the signal spectra $(k) are isotropic, so that features appear at all orientations 
across the visual field, all of the bound states of the corresponding SchrSdinger 
equation are eigenstates of angular momentum. But real visual neurons have recep- 
tive fields with a single optimal orientation, not the multiple optima expected if the 
filters F correspond to angular momentum eigenstates. One would like to combine 
different angular momentum eigenfunctions to generate filters which respond to lo- 
calized regions of orientation. In general, however, the different angular momenta 
are associated with different energy eigenvalues and hence it is impossible to form 
linear combinations which are still solutions of the variational problem. 
We can construct receptive fields which are localized in orientation if there is some 
extra symmetry or accidental degeneracy which allows the existence of equal-energy 
states with different angular momenta. If we believe that real receptive fields are the 
solutions of our variational problem, it must be the case that the signal spectrum 
$(k) for natural images possesses such a symmetry. 
Recently Field [4] has measured the power spectra of several natural scenes. As 
one might expect from discussions of "fractal" landscapes, these spectra are scale 
invariant, with $(k) = A/Ikl 2. It is easy to see that the corresponding quantum 
mechanics problem is a bit sick -- the energy is not bounded from below. In the 
present context, however, this sickness is a saving grace. The equivalent SchrSdinger 
equation is 
= 
- - 2 In 221kl 
If we take q = ( 2--)k, then for bound states (A < 0) we find 
B 
+ = 
(10) 
with B = A/ln2r 2. Thus we see that the energy A can be scaled away; there 
is no quantization condition. We are free to choose any value of A, but for each 
such value there are several angular momentum states. Since they correspond to 
the same energy, superpositions of these states are also solutions of the original 
variational problem. The scale invariance of natural images is the symmetry we 
need in order to form localized receptive fields. 
4 Predicting Receptive Fields 
To solve Eq. (9) we find it easier to transform back to real space. The result is 
_2, O2 F  OF 
r2(1 + r }-r 2 + r(1 + 5r )rr + [r2(4 + B + 02/Oq 2) + 02/Oq2]F = O, 
(11) 
where b is the angular variable and r = ()lxl. Angular momentum states 
F,, ~ e " have the asymptotic F,,(r << 1) ~ r +", F,,(r >> 1) ~ r *i("), with 
A+ (m) = -2 4-x/rn a - B. We see that for rn a < B the solutions are oscillatory func- 
tions of r, since A has an imaginary part. For rn a > B + 4 the solution can diverge 
Optimal Sampling of Natural Images 367 
as r becomes large, and in this case we must be careful to choose solutions which 
are regular both at the origin and at infinity if we are to maintain the constraint in 
Eq. (5). Numerically we find that there are no such solutions; the functions which 
behave as r +I'll near the origin diverge at large r if m e > B + 4. We conclude 
that for a given value of B, which measures the signal-to-noise ratio, there exists a 
finite set of angular momentum states; these states can then be superposed to give 
receptive fields with localized angular sensitivity. 
In fact all linear combinations of m-states are solutions to the variational problem 
at low signal to noise ratio, so the precise form of orientation tuning is not deter- 
mined. If we continue our expansion of the information capacity in powers of the 
signal-to-noise ratio we find terms which will select different linear combinations of 
the m-states and hence determine the precise orientation selectivity. These higher- 
order terms, however, involve multi-point correlation functions of the image. At the 
lowest SNR, corresponding to the first term in our expansion, we are sensitive only 
to the two-point function (power spectrum) of the signal ensemble, which carries 
no information about angular correlations. A truly predictive theory of orientation 
tuning must thus rest on measurements of angular correlations in natural images; 
as far as we know such measurements have not been reported. 
Even without knowing the details of the higher-order correlation functions we can 
make some progress. To begin, it is clear that at very small B orientation selectivity 
is impossible since there are only m = 0 solutions. This is the limit of very low SNR, 
or equivalently very strong constraints on the locality of the receptive field (large 
c above). The circularly symmetric receptive fields that one finds in this limit are 
center-surround in structure, with the surround becoming more prominent as the 
signal-to-noise ratio is increased. These predictions are in qualitative accord with 
what one sees in the mammalian retina, which is indeed extremely local -- receptive 
field centers for foveal ganglion cells may consist of just a single cone photoreceptor. 
As one proceeds to the the cortex the constraints of locality are weaker and orien- 
tation selectivity becomes possible. Similarly in lower vertebrates there is a greater 
range of lateral connectivity in the retina itself, and hence orientation selectivity is 
possible at the level of the ganglion cell. 
To proceed further we have explored the types of receptive fields which can be 
produced by superposing m-states at a given value of B. We consider for the 
moment only even-symmetric receptive fields, so we add all terms in phase. One 
such receptive field is shown in Fig. 1, together with experimental results for a 
simple cell in the primary visual cortex of monkeys [5]. It is clear that we can obtain 
reasonable correspondence between theory and experiment. Obviously we have 
made no detailed "fit" to the data, and indeed we are just beginning a quantitative 
comparison of theory with experiment. Much of the arbitrariness in the construction 
of Fig. I will be removed once we have control over the higher terms in the SNR 
expansion, as described above. 
It is interesting that, at low SNR, there is no preferred value for the length scale. 
Thus the optimal system may choose to sample images at many different scales and 
at different scales in different regions of the image. The experimental variability in 
spatial frequency tuning from cell to cell may thus not represent biological sloppiness 
but rather the fact that any peak spatial frequency constitutes an optimal filter in 
the sense defined here. 
368 
Bialek, Ruderman, and Zee 
Figure l: Model (left3 and monkey (right) receptive fieldS. 
reference [5]. 
Monkey IF is from 
spatiM frequency are among 
,5 Discussion  -,ion and surprisingly there have been 
  _e,ronS for orleno- ., 
[ectivity of corUCa_2'.h visual system. o perspective. One 
The pe  .n [cts uo .... rom some theoretic[ stage 
the best *' 
mmnY attempts to derive these features preprOCeSSing 
proach is to rgue thm[ such selectivity provides nturl observed organization 
computations- A very different view is that the does no 
more complex consequence of developmental rules, but this pproach organiza- 
o[ the cortex is a which mgY be expressed by cortical 
address he computational [unc[ion possibility ha[ cor[ical recepuve 
can be predicted from variaionai 
tion. Finally several authors have considered he o[hesis; he issue is whether 
is i[ hyp .  rinCpe- 
fields are in some sense optimal, so that they . 
we have adopted h _,,;oular vara[onak P . 
principle [6, 7, 8. C[[[  rumen for any p'- o apply m 
natural principle 
one can make a compelhn -  like  very . , his principle must be 
o[ information capacity seems have emphxze .m ee saUstxcs. 
Optimizaio f visual processing. s we cons[rain[s and -[ ,..a Differen 
the early st.ag a knowledge of hataware for the 
especially ' 
suppieente   _ 
uthors hmve mde derent choic :  'on of iniortlo 
[erent . ver, may be related  -. '--t trou  L o aproblem 
[ormUlat?n,  the receptive fields is equtVatet , , 
fixed information rans[er 
some fixea s redundancy at iven very successful 
minimization of the ....... a, nroach h g. _, and monkey, 
_ -. _ al This ia r ... fidds m c  ..... hope 
marion, [o nd eanCa t-' ,. - 11 recepUw :A It ts 
discussed by Atick--- cure o[ gangttn_tor o be determ-,' given 
oredictions for he s[ru .. ..... , aram .... s problems cn be 
stig some artr'Jsc[utiOn s to variafionM 
kithough there are fields  
that these ide o[ receptive 
Optimal Sampling of Natural Images 369 
more detailed tests in the lower vertebrate retinas, where it is possible to charac- 
terize signals and noise at each of three layers of processing cicuitry. 
As far as we know our work is unique in that the statistics of natural images, is 
an essential component of the theory. Indeed the scale invariance of natural im- 
ages plays a decisive role in our prediction of orientation selectivity; other classes 
of signals would result in qualitatively different receptive fields. We find this direct 
linkage between the properties of natural images and the architecture of natural 
computing systems to be extremely attractive. The semi-quantitative correspon- 
dence between predicted and observed receptive fields (Fig. 1) suggests that we 
have the kernel of a truly predictive theory for visual processing. 
Acknowledgements 
We thank K. DeValois, R. DeValois, J. D. Jackson, and N. Socci for helpful discus- 
sions. Work at Berkeley was supported in part by the National Science Foundation 
through a Presidential Young Investigator Award (to WB), supplemented by funds 
from Sun Microsystems and Cray Research, and by the Fannie and John Hertz 
Foundation through a graduate fellowship (to DLR). Work in Santa Barbara was 
supported in part by the NSF through Grant No. PHY82-17853, supplemented by 
funds from NASA. 
References 
[1] W. Bialek. In E. Jen, editor, 1989 Lectures in Complex Systems, SFI Studies 
in the Sciences of Complexity, Lect. Vol. II, pages 513-595. Addison-Wesley, 
Menlo Park, CA, 1990. 
[2] H. B. Barlow. In W. A. Rosenblith, editor, Sensory Communication, page 217. 
MIT Press, Cambridge, MA, 1961. 
[3] C. E. Shannon and W. Weaver. The Mathematical Theory of Communication. 
University of Illinois Press, Urbana, IL, 1949. 
[4] D. Field. J. Opt. Soc. Am., 4:2379, 1987. 
[5] M. A. Webster and R. L. DeValois. J. Opt. Soc. Am., 2:1124-1132, 1985. 
[6] B. Sakitt and H. B. Barlow. Biol. Cybern., 43:97-108, 1982. 
[7] R. Linsker. In D. Touretzky, editor, Advances in Neural Information Processing 
1, page 186. Morgan Kaufmann, San Mateo, CA, 1989. 
[8] J. J. Atick and A. N. Redlich. Neural Computation, 2:308, 1990 
