Information through a Spiking Neuron 
Charles F. Stevens and Anthony Zador 
Salk Institute MNL/S 
La Jolla, CA 92037 
zador@salk.edu 
Abstract 
While it is generally agreed that neurons transmit information 
about their synaptic inputs through spike trains, the code by which 
this information is transmitted is not well understood. An upper 
bound on the information encoded is obtained by hypothesizing 
that the precise timing of each spike conveys information. Here we 
develop a general approach to quantifying the information carried 
by spike trains under this hypothesis, and apply it to the leaky 
integrate-and-fire (IF) model of neuronal dynamics. We formu- 
late the problem in terms of the probability distribution p(T) of 
interspike intervals (ISIs), assuming that spikes are detected with
arbitrary but finite temporal resolution. In the absence of added
noise, all the variability in the ISIs could encode information, and
the information rate is simply the entropy of the ISI distribution,
H(T) = −Σ p(T) log₂ p(T), times the spike rate. H(T) thus provides
an exact expression for the information rate. The methods
developed here can be used to determine experimentally the infor- 
mation carried by spike trains, even when the lower bound of the 
information rate provided by the stimulus reconstruction method 
is not tight. In a preliminary series of experiments, we have used 
these methods to estimate information rates of hippocampal neu- 
rons in slice in response to somatic current injection. These pilot 
experiments suggest information rates as high as 6.3 bits/spike. 
1 Information rate of spike trains 
Cortical neurons use spike trains to communicate with other neurons. The output 
of each neuron is a stochastic function of its input from the other neurons. It is of
interest to know how much each neuron is telling other neurons about its inputs. 
How much information does the spike train provide about a signal? Consider noise 
n(t) added to a signal s(t) to produce some total input y(t) = s(t) + n(t). This 
is then passed through a (possibly stochastic) functional F to produce the output
spike train F[y(t)] → z(t). We assume that all the information contained in the
spike train can be represented by the list of spike times; that is, there is no extra 
information contained in properties such as spike height or width. Note, however, 
that many characteristics of the spike train such as the mean or instantaneous rate 
can be derived from this representation; if such a derivative property turns out to 
be the relevant one, then this formulation can be specialized appropriately. 
We will be interested, then, in the mutual information I(S(t); Z(t)) between the 
input signal ensemble S(t) and the output spike train ensemble Z(t). This is defined 
in terms of the entropy H(S) of the signal, the entropy H(Z) of the spike train, 
and their joint entropy H(S, Z), 
I(S;Z) = H(S) + H(Z) − H(S,Z).   (1)
Note that the mutual information is symmetric, I(S;Z) = I(Z;S), since the joint
entropy H(S,Z) = H(Z,S). Note also that if the signal S(t) and the spike train
Z(t) are completely independent, then the mutual information is 0, since the joint
entropy is just the sum of the individual entropies, H(S,Z) = H(S) + H(Z). This is
completely in line with our intuition, since in this case the spike train can provide 
no information about the signal. 
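As a concrete check of eq. (1), the following minimal Python sketch (ours, not part of the original analysis) computes the mutual information of a small discrete joint distribution; the joint table p_sz is a hypothetical example.

import numpy as np

def entropy(p):
    # Shannon entropy in bits; zero-probability entries contribute nothing
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Hypothetical joint distribution p(s, z): rows index s, columns index z
p_sz = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
p_s = p_sz.sum(axis=1)   # marginal over z
p_z = p_sz.sum(axis=0)   # marginal over s

# Eq. (1): I(S;Z) = H(S) + H(Z) - H(S,Z)
I = entropy(p_s) + entropy(p_z) - entropy(p_sz.ravel())
print(f"I(S;Z) = {I:.3f} bits")  # 0.278 bits; 0 if p_sz were a product of marginals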
1.1 Information estimation through stimulus reconstruction 
Bialek and colleagues (Bialek et al., 1991) have used the reconstruction method 
to obtain a strict lower bound on the mutual information in an experimental set- 
ting. This method is based on an expression mathematically equivalent to eq. (1) 
involving the conditional entropy H(S|Z) of the signal given the spike train,

I(S;Z) = H(S) − H(S|Z)
       ≥ H(S) − H_est(S|Z),   (2)

where H_est(S|Z) is an upper bound on the conditional entropy obtained from a
reconstruction s_est(t) of the signal. The entropy is estimated from the second order
statistics of the reconstruction error e(t) ≡ s(t) − s_est(t); from the maximum entropy
property of the Gaussian this is an upper bound. Intuitively, the first equation says
that the information gained about the signal by observing the spike train is just
the initial uncertainty about the signal (in the absence of knowledge of the spike train)
minus the uncertainty that remains about the signal once the spike train is known,
and the second equation says that this remaining uncertainty must be greater for any
particular estimate than for the optimal estimate.
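In practice the bound reduces to a difference of two Gaussian entropies. The following Python sketch (ours; s_est is a hypothetical stand-in for whatever reconstruction the spike train supports) illustrates the computation for a white, discretely sampled signal.

import numpy as np

def gaussian_entropy(var):
    # Differential entropy (bits) of a Gaussian: the maximum for a given variance
    return 0.5 * np.log2(2 * np.pi * np.e * var)

rng = np.random.default_rng(0)
s = rng.normal(0.0, 1.0, 10_000)                 # toy white Gaussian stimulus
s_est = 0.8 * s + rng.normal(0.0, 0.3, s.size)   # stand-in for a reconstruction

# H_est(S|Z) is bounded by the Gaussian entropy of the error e = s - s_est,
# so H(S) - H_est(S|Z) lower-bounds the mutual information (the Delta-t
# discretization terms cancel in the difference).
bound = gaussian_entropy(np.var(s)) - gaussian_entropy(np.var(s - s_est))
print(f"I(S;Z) >= {bound:.2f} bits/sample")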
1.2 Information estimation through spike train reliability 
We have adopted a different approach based on an equivalent expression for the mutual
information:

I(S;Z) = H(Z) − H(Z|S).   (3)
The first term, H(Z), is the entropy of the spike train, while the second, H(Z|S),
is the conditional entropy of the spike train given the signal; intuitively, this is
the inverse of the repeatability of the spike train across repeated applications of the same
signal. Eq. (3) has the advantage that, if the spike train is a deterministic function 
of the input, it permits exact calculation of the mutual information. This follows 
from an important difference between the conditional entropy term here and in eq. 
(2): whereas H(S|Z) has both a deterministic and a stochastic component, H(Z|S)
has only a stochastic component. Thus in the absence of added noise, the discrete
entropy H(Z|S) = 0, and eq. (3) reduces to I(S;Z) = H(Z).
If ISIs are independent, then H(Z) can be expressed simply in terms of the
entropy of the (discrete) ISI distribution p(T),

H(T) = − Σ_{i=0}^{∞} p(T_i) log₂ p(T_i),   (4)
as H(Z) = nH(T), where n is the number of spikes in Z. Here p(T_i) is the prob-
ability that the ISI falls in the interval iΔt to (i + 1)Δt. The assumption
of finite timing precision Δt keeps the potential information finite. The advantage
of considering the ISI distribution p(T) rather than the full spike train distribution 
p(Z) is that the former is univariate while the latter is multivariate; estimating the 
former requires much less data. 
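A minimal estimator of eq. (4) might look like the following Python sketch (ours; spike_times is a hypothetical sorted array of spike times in seconds):

import numpy as np

def isi_entropy(spike_times, dt=1e-3):
    # Bin the ISIs at resolution dt and apply eq. (4)
    isis = np.diff(spike_times)
    counts = np.bincount(np.floor(isis / dt).astype(int))
    p = counts[counts > 0] / counts.sum()     # empirical p(T_i)
    return -np.sum(p * np.log2(p))            # bits per spike

# Check against the exponential (maximum-entropy) case of section 1.3: a
# 10 Hz Poisson train should give close to log2(e/(lambda*dt)) ~ 8.1 bits/spike.
rng = np.random.default_rng(1)
spike_times = np.cumsum(rng.exponential(1 / 10.0, 100_000))
print(f"H(T) = {isi_entropy(spike_times):.2f} bits/spike")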
Under what conditions are ISIs independent? Correlations between ISIs can arise
either through the stimulus or the spike generation mechanism itself. Below we shall 
guarantee that correlations do not arise from the spike-generator by considering the 
forgetful integrate-and-fire (IF) model, in which all information about the previous 
spike is eliminated by the next spike. If we further limit ourselves to temporally 
uncorrelated stimuli (i.e. stimuli drawn from a white noise ensemble), then we can 
be sure that ISIs are independent, and eq. (4) can be applied.
In the presence of noise, H(T|S) must also be evaluated, to give

I(S;T) = H(T) − H(T|S).   (5)

H(T|S) is the conditional entropy of the ISI given the signal,

H(T|S) = −⟨ Σ_j p(T_j|s_i(t)) log₂ p(T_j|s_i(t)) ⟩_{s_i(t)},   (6)
where p(T_j|s_i(t)) is the probability of obtaining an ISI of T_j in response to a par-
ticular stimulus si(t) in the presence of noise n(t). The conditional entropy can be 
thought of as a quantification of the reliability of the spike generating mechanism: 
it is the average trial-to-trial variability of the spike train generated in response to 
repeated applications of the same stimulus. 
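The corresponding estimator of eq. (6), sketched below in Python under the simplifying assumption that each frozen stimulus s_i(t) is presented equally often, averages the binned ISI entropy across stimuli (trials_by_stim is a hypothetical list: one array of ISIs per stimulus):

import numpy as np

def binned_entropy(isis, dt=1e-3):
    counts = np.bincount(np.floor(np.asarray(isis) / dt).astype(int))
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

def conditional_isi_entropy(trials_by_stim, dt=1e-3):
    # Eq. (6) with equal weight per stimulus; weight by presentation
    # frequency instead if the stimuli are not shown equally often
    return np.mean([binned_entropy(isis, dt) for isis in trials_by_stim])

# A perfectly reliable neuron repeats the same ISI for each stimulus,
# so H(T|S) = 0 and eq. (5) reduces to I(S;T) = H(T)
print(conditional_isi_entropy([[0.012] * 50, [0.047] * 50]))  # -> 0.0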
1.3 Maximum spike train entropy 
In what follows, it will be useful to compare the information rate for the IF neuron
with the limiting case of an exponential ISI distribution, which has the maximum
entropy of any point process with a given rate (Papoulis, 1984). This provides an
upper bound on the information rate possible for any spike train, given the spike
rate and the temporal precision. Let f(T) = λe^{−λT} be an exponential ISI distribution
with mean spike rate λ; its differential entropy is log₂(e/λ) bits. Assuming a temporal
precision of Δt, the entropy per spike is H(T) = log₂(e/(λΔt)), and the entropy per
unit time for a rate λ is λ log₂(e/(λΔt)).
For example, if λ = 1 Hz and Δt = 0.001 sec, this gives (11.4 bits/spike) × (1
spike/second) = 11.4 bits/second. That is, if we discretize a 1 Hz spike train into
1 msec bins, it is not possible for it to transmit more than 11.4 bits/second. If
we reduce the bin size two-fold, the entropy increases by log₂ 2 = 1 bit/spike to
12.4 bits/spike, while if we double it we lose one bit/spike, leaving 10.4 bits/spike. Note
that at a different firing rate, e.g. λ = 2 Hz, halving the bin size still increases
the entropy/spike by 1 bit/spike, but because the spike rate is twice as high, this
becomes a 2 bit/second increase in the information rate.
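These numbers follow directly from H(T) = log₂(e/(λΔt)), as the following quick Python check (ours) confirms:

import numpy as np

def max_bits_per_spike(rate_hz, dt):
    # Entropy per spike of the maximum-entropy (exponential) ISI distribution
    return np.log2(np.e / (rate_hz * dt))

print(max_bits_per_spike(1.0, 1e-3))  # 11.41 bits/spike at 1 Hz, 1 msec bins
print(max_bits_per_spike(1.0, 5e-4))  # 12.41: halving the bin adds 1 bit/spike
print(max_bits_per_spike(1.0, 2e-3))  # 10.41: doubling the bin removes 1 bit/spike
# At 2 Hz the same halving still adds 1 bit/spike, hence 2 bits/second:
print(2 * (max_bits_per_spike(2.0, 5e-4) - max_bits_per_spike(2.0, 1e-3)))  # 2.0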
1.4 The IF model 
Now we consider the functional F describing the forgetful leaky IF model of spike
generation. Suppose we add some noise n(t) to a signal s(t), y(t) = n(t) + s(t),
and threshold the sum to produce a spike train z(t) = F[s(t) + n(t)]. Specifically,
suppose the voltage v(t) of the neuron obeys v̇(t) = −v(t)/τ + y(t), where τ is the
membrane time constant, both s(t) and n(t) are white Gaussian processes, and
y(t) has mean μ and variance σ². If the voltage reaches the threshold θ₀ at some
time t, the neuron emits a spike at that time and resets to the initial condition v₀.
In the language of neurobiology, this model can be thought of (Tuckwell, 1988) as 
the limiting case of a neuron with a leaky IF spike generating mechanism receiving 
many excitatory and inhibitory synaptic inputs. Note that since the input y(t) is 
white, there are no correlations in the spike train induced by the signal, and since 
the neuron resets after each spike there are no correlations induced by the spike- 
generating mechanism. Thus ISIs are independent, and eq. (4) can be applied.
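A minimal Euler-Maruyama simulation of this model might look as follows (our sketch; the parameter values are illustrative, not those of the paper):

import numpy as np

def simulate_if(mu, sigma, tau=0.020, theta0=1.0, v0=0.0,
                dt=1e-4, t_max=60.0, seed=0):
    # dv/dt = -v/tau + y(t), with y(t) white Gaussian: mean mu, variance sigma^2
    rng = np.random.default_rng(seed)
    v, t_last, isis = v0, 0.0, []
    for step in range(int(t_max / dt)):
        y = mu + sigma * rng.standard_normal() / np.sqrt(dt)  # white-noise scaling
        v += dt * (-v / tau + y)
        if v >= theta0:                       # threshold crossing: emit a spike...
            t = (step + 1) * dt
            isis.append(t - t_last)
            t_last, v = t, v0                 # ...and reset, erasing all memory
    return np.array(isis)

# Subthreshold example (mu*tau = 0.8 < theta0): firing is noise-driven and,
# at low rates, the ISI CV should approach 1 (see section 1.4 on this regime)
isis = simulate_if(mu=40.0, sigma=1.0)
print(f"{len(isis)} spikes, CV = {isis.std() / isis.mean():.2f}")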
We will estimate the mutual information I(S;Z) between the ensemble of input
signals S and the ensemble of outputs Z. Since in this model ISIs are independent by
construction, we need only evaluate H(T) and H(T|S); for this we must determine
p(T), the distribution of ISIs, and p(T|s_i), the conditional distribution of ISIs for
an ensemble of signals s_i(t). Note that p(T) corresponds to the first passage time
distribution of the Ornstein-Uhlenbeck process (Tuckwell, 1988). 
The neuron model we are considering has two regimes determined by the relation 
of the asymptotic membrane potential (in the absence of a threshold), μτ, to the
threshold θ₀. In the suprathreshold regime, μτ > θ₀, threshold crossings occur even if
the signal variance is zero (σ² = 0). In the subthreshold regime, μτ ≤ θ₀, threshold
crossings occur only if σ² > 0. However, in the limit that E{T} ≫ τ, i.e. the mean
ISI is long compared with the integration time constant (this can only occur
in the subthreshold regime), the ISI distribution is exponential, and its coefficient
of variation (CV) is unity (cf. (Softky and Koch, 1993)). In this low-rate regime the
firing is deterministically Poisson; by this we mean to distinguish it from the more
usual usage of a Poisson neuron, the stochastic situation in which the instantaneous
firing rate parameter (the probability of firing over some interval) depends on the
stimulus (i.e. λ ∝ s(t)). In the present case the exponential ISI distribution arises
from a deterministic mechanism. 
At the border between these regimes, when the threshold is just equal to the asymp- 
totic potential, θ₀ = μτ, we have an explicit and exact solution for the entire ISI
distribution (Sugiyama et al., 1970):

p(T) = [θ₀ / ((2π)^{1/2} σ (τ/2)^{3/2})] [e^{2T/τ} − 1]^{−3/2} exp( 2T/τ − θ₀² / (σ²τ (e^{2T/τ} − 1)) ).   (7)
This is the special case where, in the absence of fluctuations (σ² = 0), the membrane
potential hovers just subthreshold. Its neurophysiological interpretation is that the 
excitatory inputs just balance the inhibitory inputs, so that the neuron hovers just 
on the verge of firing. 
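Eq. (7) can be evaluated numerically and substituted into eq. (4). The Python sketch below (ours, with illustrative values for θ₀ and σ²) computes the entropy per spike and the information rate at the balance point:

import numpy as np

def p_isi(T, tau, sigma2, theta0):
    # Balance-point first-passage-time density, eq. (7)
    g = np.expm1(2 * T / tau)                 # e^{2T/tau} - 1
    coef = theta0 / (np.sqrt(2 * np.pi * sigma2) * (tau / 2) ** 1.5)
    return coef * g ** -1.5 * np.exp(2 * T / tau - theta0 ** 2 / (sigma2 * tau * g))

dt, tau, sigma2, theta0 = 1e-3, 0.050, 0.5, 1.0
T = (np.arange(2_000) + 0.5) * dt             # bin centers out to 2 s (tail ~ e^{-T/tau})
p = p_isi(T, tau, sigma2, theta0) * dt        # probability mass per Delta-t bin
p /= p.sum()                                  # absorb residual discretization error
H = -np.sum(p[p > 0] * np.log2(p[p > 0]))     # eq. (4), bits/spike
rate = 1.0 / (T @ p)                          # spikes/sec = 1 / E{T}
print(f"rate = {rate:.1f} Hz, H(T) = {H:.2f} bits/spike, {rate * H:.1f} bits/sec")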
1.5 Information rates for noisy and noiseless signals 
Here we compare the information rate for an IF neuron at the "balance point" μτ = θ₀
with the maximum entropy spike train. For simplicity and brevity we consider only
the zero-noise case, i.e. n(t) = 0. Fig. 1A shows the information per spike, calculated
from eq. (7), as a function of the firing rate, which was varied by changing
the signal variance σ². We assume that spikes can be resolved with a temporal
resolution of 1 msec, i.e. that the ISI distribution has bins 1 msec wide. The 
dashed line shows the theoretical upper bound given by the exponential distribution; 
this limit can be approached by a neuron operating far below threshold, in the 
Poisson limit. For both the IF model and the upper bound, the information per
spike is a monotonically decreasing function of the spike rate; the model almost 
achieves the upper bound when the mean ISI is just equal to the membrane time 
constant. In the model the information saturates at very low firing rates, but for the 
exponential distribution the information increases without bound. At high firing 
rates the information goes to zero when the firing rate is too fast for individual ISIs
to be resolved at the given temporal resolution. Fig. 1B shows that the information rate
(information per second) when the neuron is at the balance point goes through a 
maximum as the firing rate increases. The maximum occurs at a lower firing rate 
than for the exponential distribution (dashed line). 
1.6 Bounding information rates by stimulus reconstruction 
By construction, eq. (3) gives an exact expression for the information rate in this 
model. We can therefore compare it with the lower bound provided by the stimulus
reconstruction method, eq. (2) (Bialek et al., 1991); that is, we can assess how tight
a lower bound the reconstruction provides. Fig. 2 shows the estimates provided by the recon-
struction (solid line) and the reliability (dashed line) methods as a function of the 
firing rate. The firing rate was increased by increasing the mean μ of the input
stimulus y(t), with the noise set to 0. At low firing rates the two estimates are
nearly identical, but at high firing rates the reconstruction method substantially 
underestimates the information rate. The amount of the underestimate depends on 
the model parameters, and decreases as noise is added to the stimulus. The tight- 
ness of the bound is therefore an empirical question. While Bialek and colleagues
(Rieke et al., 1996) show that under the conditions of their experiments the underestimate is less
than a factor of two, it is clear that the potential for underestimate under different 
conditions or in different systems is greater. 
2 Discussion 
While it is generally agreed that spike trains encode information about a neuron's 
inputs, it is not clear how that information is encoded. One idea is that it is the 
mean firing rate alone that encodes the signal, and that variability about this mean 
is effectively noise. An alternative view is that it is the variability itself that encodes 
the signal, i.e. that the information is encoded in the precise times at which spikes 
occur. In this view the information can be expressed in terms of the interspike 
interval (ISI) distribution of the spike train. This encoding scheme yields much 
higher information rates than one in which only the mean rate (over some interval 
longer than the typical ISI) is considered. Here we have quantified the information 
content of spike trains under the latter hypothesis for a simple neuronal model. 
We consider a model in which by construction the ISIs are independent, so that the
information rate (in bits/sec) can be computed directly from the information per 
spike (in bits/spike) and the spike rate (in spikes/sec). The information per spike 
in turn depends on the temporal precision with which spikes can be resolved (if 
precision were infinite, then the information content would be infinite as well, since 
any message could for example be encoded in the decimal expansion of the precise 
arrival time of a single spike), the reliability of the spike transduction mechanism, 
and the entropy of the ISI distribution itself. For low firing rates, when the neuron 
is in the subthreshold limit, the ISI distribution is close to the theoretically maximal 
exponential distribution. 
Much of the recent interest in information theoretic analyses of the neural code can
be attributed to the seminal work of Bialek and colleagues (Bialek et al., 1991; Rieke
et al., 1996), who measured the information rate for sensory neurons in a number of 
systems. The present results are in broad agreement with those of DeWeese (1996), 
who considered the information rate of a linear-filtered threshold crossing (LFTC)
model. DeWeese developed a functional expansion, in which the first term describes 
the limit in which spike times (not ISIs) are independent, and the second term is
a correction for correlations. (In the LFTC model, Gaussian signal and noise are
convolved with a linear filter; the times at which the resulting waveform crosses
some threshold are called "spikes.") The LFTC model differs from the present IF
model mainly in that it does not "reset" after each spike. Consequently the "natural"
representation of the spike train in the LFTC model is as a sequence t₀, …, t_n of
firing times, while in the IF model the "natural" representation is as a sequence
T₁, …, T_n of ISIs. The choice is one of convenience, since the two representations are
equivalent. 
The two models are complementary. In the LFTC model, results can be obtained for 
colored signals and noise, while such conditions are awkward in the IF model. In the 
IF model by contrast, a class of highly correlated spike trains can be conveniently 
considered that are awkward in the LFTC model. That is, the independent-ISI condi-
tion required in the IF model is less restrictive than the independent-spike condition
of the LFTC model: spikes are independent if and only if ISIs are independent and the ISI
distribution p(T) is exponential. In particular, at high firing rates the ISI distri-
bution can be far from exponential (and therefore the spikes far from independent)
even when the ISIs themselves are independent.
Because we have assumed that the input s(t) is white, its entropy is infinite, and the 
mutual information can grow without bound as the temporal precision with which 
spikes are resolved improves. Nevertheless, the spike train is transmitting only a 
minute fraction of the total available information. The signal thereby saturates the 
capacity of the spike train. While it is not at all clear whether this is how real 
neurons actually behave, it is not implausible: a typical cortical neuron receives as 
many as 10⁴ synaptic inputs, and if the information rate of each input is the same as
that of the target, then the information rate impinging upon the target is 10⁴-fold greater
(neglecting synaptic unreliability, which could decrease this substantially) than its 
capacity. 
In a preliminary series of experiments, we have used the reliability method to esti- 
mate the information rate of hippocampal neuronal spike trains in slice in response 
to somatic current injection (Stevens and Zador, unpublished). Under these condi- 
tions ISIs appear to be independent, so the method developed here can be applied.
In these pilot experiments, information rates as high as 6.3 bits/spike were ob-
served.
References 
Bialek, W., Rieke, F., de Ruyter van Steveninck, R., and Warland, D. (1991). 
Reading a neural code. Science, 252:1854-1857. 
DeWeese, M. (1996). Optimization principles for the neural code. In Hasselmo, 
M., editor, Advances in Neural Information Processing Systems, vol. 8. MIT 
Press, Cambridge, MA. 
Papoulis, A. (1984). Probability, Random Variables and Stochastic Processes, 2nd
edition. McGraw-Hill.
Rieke, F., Warland, D., de Ruyter van Steveninck, R., and Bialek, W. (1996). Neural 
Coding. MIT Press. 
Softky, W. and Koch, C. (1993). The highly irregular firing of cortical cells is
inconsistent with temporal integration of random EPSPs. J. Neurosci.,
13:334-350.
Sugiyama, H., Moore, G., and Perkel, D. (1970). Solutions for a stochastic model 
of neuronal spike production. Mathematical Biosciences, 8:323-341. 
Tuckwell, H. (1988). Introduction to Theoretical Neurobiology (2 vols.). Cambridge.
[Figure 1 appears here: two panels titled "Information at balance point," plotted against firing rate (Hz) on a log scale.]
Figure 1: Information rate at balance point. (A; top) The information per spike 
decreases monotonically with the spike rate (solid line). It is bounded above by 
the entropy of the exponential limit (dashed line), which is the highest entropy ISI 
distribution for a given mean rate; this limit is approached for the IF neuron in 
the subthreshold regime. The information rate goes to 0 when the firing rate is of 
the same order as the temporal resolution At. The information per spike at the 
balance point is nearly optimal when E{T}  r. (r - 50 msec; At -- 1 msec); 
(B; bottom) Information per second for above conditions. The information rate 
for both the balance point (solid curve) and the exponential distribution (dashed 
curve) pass through a maximum, but the maximum is greater and occurs at an 
higher rate for the latter. For firing rates much smaller than v, the rates are almost 
indistinguishable. (r - 50 msec; At -- 1 msec) 
[Figure 2 appears here: information rate plotted against spike rate (Hz).]
Figure 2: Estimating information by stimulus reconstruction. The information
rate estimated by the reconstruction method (solid line) and the exact information
rate (dashed line) are shown as a function of the firing rate. The reconstruction
method significantly underestimates the actual information, particularly at high
firing rates. The firing rate was varied through the mean input μ. The parameters
were: membrane time constant τ = 20 msec; spike bin size Δt = 1 msec; signal
variance σ² = 0.8; threshold θ₀ = 10.
