Synaptic Transmission: An 
Information-Theoretic Perspective 
Amit Manwani and Christof Koch 
Computation and Neural Systems Program 
California Institute of Technology 
Pasadena, CA 91125 
email: quixoteklab.caltech. edu 
kochklab. caltech. edu 
Abstract 
Here we analyze synaptic transmission from an information-theoretic 
perspective. We derive closed-form expressions for the lower-bounds on 
the capacity of a simple model of a cortical synapse under two explicit 
coding paradigms. Under the "signal estimation" paradigm, we assume 
the signal to be encoded in the mean firing rate of a Poisson neuron. The 
performance of an optimal linear estimator of the signal then provides 
a lower bound on the capacity for signal estimation. Under the "signal 
detection" paradigm, the presence or absence of the signal has to be de- 
tected. Performance of the optimal spike detector allows us to compute 
a lower bound on the capacity for signal detection. We find that single 
synapses (for empirically measured parameter values) transmit informa- 
tion poorly but significant improvement can be achieved with a small 
amount of redundancy. 
1 Introduction 
Tools from estimation and information theory have recently been applied by researchers 
(Bialek et. al, 1991) to quantify how well neurons transmit information about their random 
inputs in their spike outputs. In these approaches, the neuron is treated like a black-box, 
characterized empirically by a set of input-output records. This ignores the specific nature 
of neuronal processing in terms of its known biophysical properties. However, a systematic 
study of processing at various stages in a biophysically faithful model of a single neuron 
should be able to identify the role of each stage in information transfer in terms of the pa- 
meters relating to the neuron's dendritic structure, its spiking mechanism, etc. Employing 
this reductionist approach, we focus on a important component of neural processing, the 
synapse, and analyze a simple model of a cortical synapse under two different representa- 
'tional paradigms. Under the "signal estimation" paradigm, we assume that the input signal 
202 A. Manwani and C. Koch 
is linearly encoded in the mean firing rate of a Poisson neuron and the mean-square error 
in the reconstruction of the signal from the post-synaptic voltage quantifies system per- 
formance. From the performance of the optimal linear estimator of the signal, a lower 
bound on the capacity for signal estimation can be computed. Under the "signal detection" 
paradigm, we assme that information is encoded in an all-or-none format and the error in 
deciding whether or not a presynaptic spike occurred by observing the post-synaptic voltage 
quantifies system performance. This is similar to the conventional absent/present(Yes-No) 
decision paradigm used in psychophysics. Performance of the optimal spike detector in 
this case allows us to compute a lower bound on the capacity for signal detection. 
01' 0 J 
Poisson ] ' ' ' ] J n(t) Estimator 
ke   etease J Optimal 
Encoding _  
Stimulus I / Spike  , ,  Estimate 
-J k(t) J.Train / JNo,se 
<' ' J/'P(a)J J J  / J I re(t) 
Spike / Optreal Dision 
No Spike Dettor Spike / 
 Ske 
Encodina SvnaDtic Channel Deco{!ina 
Figure 1: Schematic block diagram for the signal detection and estimation tasks. The 
synapse is modeled as a binary channel followed by a filter h(t) = a t e:cp(-t/ts). where 
a is a random variable with probability density, P(a) = c (cm)k-e:cp(-cm)/(k - 1)!. 
The binary channel, (inset, eo = Pr[spontaneous release], q = Pr [release failure]) models 
probabilistic vesicle release and h(t) models the variable epsp size observed for cortical 
synapses. n(t) denotes additive post-synaptic voltage noise and is assumed to be Gaussian 
and white over a bandwidth Bn. Performance of the optimal linear estimator (Wiener 
Filter) and the optimal spike detector (Matched Filter) quantify synaptic efficacy for signal 
estimation and detection respectively. 
2 The Synaptic Channel 
Synaptic transmission in cortical neurons is known to be highly random though the role of 
this variability in neural computation and coding is still unclear. In central synapses, each 
synaptic bouton contains only a single active release zone, as opposed to the hundreds or 
thousands found at the much more reliable neuromuscularjunction. Thus, in response to an 
action potential in the presynaptic terminal at most one vesicle is released (Korn and Faber, 
1991). Moreover, the probability of vesicle release p is known to be generally low (0.1 
to 0.4) from in vitro studies in some vertebrate and invertebrate systems (Stevens, 1994). 
This unreliability is further compounded by the trial-to-trial variability in the amplitude of 
the post-synaptic response to a vesicular release (Bekkers et. al, 1990). In some cases, the 
variance in the size of EPSP is as large as the mean. The empirically measured distribution 
of amplitudes is usually skewed to the right (possibly biased due the inability of measuring 
very small events) and can be modeled by a Gamma distribution. 
In light of the above, we model the synapse as a binary channel cascaded by a random am- 
plitude filter (Fig. 1). The binary channel accounts for the probabilistic vesicle release. eo 
Synaptic Transmission: An Information-Theoretic Perspective 203 
and q denote the probabilities of spontaneous vesicle release and failure respectively. We 
follow the binary channel convention used in digital communications (q = l-p), whereas, 
p is more commonly used in neurobiology. The filter h(t) is chosen to correspond to the 
epsp profile of a fast AMPA-like synapse. The amplitude of the filter a is modeled as ran- 
dom variable with density P(a), mean it, and standard deviation a,. The CV (standard 
deviation/mean) of the distribution is denoted by UV,. We also assume that additive Gaus- 
sian voltage noise n(t) at the post-synaptic site further corrupts the epsp response. n(t) is 
2 and a bandwidth Bn corresponding to the membrane 
assumed to white with variance o- n 
time constant r. One can define an effective signal-to-noise ratio, SNiri = E,/No, given 
by the ratio of the energy in the epsp pulse, Eh -- fc h2 (t) dt to the noise power spectral 
2/B. The performance of the synapse depends on the SNiri and not on 
density, No = a N 
the absolute values of Eh or a,. In the above model, by regarding synaptic parameters as 
constants, we have tacitly ignored history dependent effects like paired-pulse facilitation, 
vesicle depletion, calcium buffering, etc, which endow the synapse with the nature of a 
sophisticated nonlinear fiker (Markram and Tsodyks, 1997). 
a) 
Effective Continuous 
Estimation Chanl 
b) x=0 
No Spike 
Spike 
X=I 
Y=O 
////No Spike 
///  Spike 
"  :1 
Effective Binary 
Detection Chanl 
Figure 2: (a) Effective chan- 
nel model for signal estima- 
tion. re(t), ff(t), it(t) denote 
the stimulus, the best linear es- 
timate, and the reconstruction 
noise respectively. (b) Effective 
channel model for signal detec- 
tion. X and Y denote the bi- 
nary variables corresponding to 
the input and the decision re- 
spectively. Pf and Pm are the 
effective error probabilities. 
3 Signal Estimation 
Let us assume that the spike train of the presynaptic neuron can be modeled as a doubly 
stochastic Poisson process with a rate (t) = k(t)  re(t) given as a convolution between 
the stimulus re(t) and a filter k(t). The stimulus is drown from a probability distribution 
which we assume to be Gaussian. k(t) = ezp(-t/r) is a low-pass filter which models 
the phenomenological relationship between a neuron's firing rate and its input current. r 
is chosen to correspond to the membrane time constant. The exact form of k(t) is not 
cmcial and the above form is assumed primarily for analytical tractability. The objective is 
to find the optimal estimator of re(t) from the post-synaptic voltage v (t), where optimality 
is in a least-mean square sense. The optimal mean-square estimator is, in general, non- 
linear and reduces to a linear filter only when all the signals and noises are Gaussian. 
However, instead of making this assumption, we restrict ourselves to the analysis of the 
optimal linear estimator, (t) = g(t)  v(t), i.e. the filter g(t) which minimizes the 
mean-square error E = ((re(t) - (t)) 2) where (.) denotes an ensemble average. The 
overall estimation system shown in Fig. 1 can be characterized by an effective continuous 
channel (Fig. 2a) where it(t) = a(t) - re(t) denotes the effective reconstruction noise. 
System performance can be quantified by E, the lower E, the better the synapse at signal 
transmission. The expression for the optimal filter (enerfilter) in the frequency domain is 
g(w) = S.v(-w)/Sv(w) where S.(w) is the cross-spectral density (Fourier transform 
of the cross-correlation P) of re(t) and s(t) and S. (co) is the power spectral density of 
v(t). The minimummean-square erroris givenby, E = a2- fs I S,(w) I 2/S(w) dw. 
The set $ = {co [ S(w)  0} is called the support of S(w). 
204 A. Manwani and C. Koch 
Another measure of system performance is the mutual information rate I(m; v) between 
re(t) and v(t), defined as the rate of information transmitted by v(t) about s(t). By the 
Data Processing inequality (Cover 1991), I(m, v) > I(m, a). A lower bound of I(m, 
1 log 2 [ s.. (,,)] d (units 
and thus of I(m; v) is given by the simple expression fib =  fs s () 
of bits/sec). The lower bound is achieved when (t) is Gaussian and is independent of 
re(t). Since the spike train s(t) =  (t - ti) is a Poisson_process with rate k(t)  re(t), 
its power spectrum is given by the expression, Sss(w) = ,X+ [ K(w) 12 S,m(CO) where 
 is the mean firing rate. We assume that the mean (/m) and variance (a2) of re(t) are 
chosen such that the probability that )(t) < 0 is negligible 1 The vesicle release process 
is the spike train gated by the binary channel and so it is also a Poisson process with rate 
(1 - el)A(t). Since v(t) =  aih(t - ti) + n(t) is a filtered Poisson process, its power 
spectral density is given by Svv(w) =[ H(w) 12 { (12 +a2)(1-q)+lfi(1-q) 2 
Sm,(w) } + Snn (w). The cross-spectral density is given by the expression Svm (w) = 
(1 - q)lS,m(w)H(w)K(w). This allows us to write the mean-square error as, 
= 
E = - + + 
+ cv. s..(.,) 
(1 z Ig(w)12 ' szz(co)= (1- q)22 I H(co ) 121 g(o)12 
Thus, the power spectral density oft(t) is given by San = XH(w) + Sii(w). Notice 
that if K(w) --> oc, E --+ 0 i.e. perfect reconstruction takes place in the limit of high 
firing rates. For the parameter values chosen, Sii (w) << Xii (w), and can be ignored. 
Consequently, signal estimation is shot noise limited and synaptic variability increases shot 
noise by a factor Nsyn = (1 + CV,2)/(1 - q). For CV = 0.6 and e = 0.6, N, yn = 3.4, 
and for CV = 1 and el = 0.6, N, yn = 5. If re(t) is chosen to be white, band-limited to 
Bm Hz, closed-form expressions for E and Itb can be obtained. The expression for Itb is 
tedious and provides little insight and so we present only the expression for E below. 
E(7, BT) = a}[1 - 7 1 tan-  BT 
BW ( )] 
'7 = 212mNvnB m , By = 2rB,r 
E is a monotonic function of 7 (decreasing) and B- (increasing). 7 can be considered as 
the effective number of spikes available per unit signal bandwidth and By is the ratio of 
the signal bandwidth and the neuron bandwidth. Plots of normalized reconstruction error 
E = E/a2 and Itb versus mean firing rate () for different values of signal bandwidth Bm 
are shown in Fig. 3a and Fig. 3b respectively. Observe that Itb (bits/sec) is insensitive to B, 
for firing rates upto 200Hz because the decrease in quality of estimation (E increases with 
B,) is compensated by an increase in the number of independent samples (2B,) available 
per second. This phenomenon is characteristic of systems operating in the low SNR regime. 
Ib has the generic form, Ib = B log(1 + S/(NB)), where B, S and N denote signal 
bandwidth, signal power and noise power respectively. For low SNR, I 0 B S/(NB) = 
S/N, is independent of B. So one can argue that, for our choice of parameters, a single 
synapse is a low SNR system. The analysis generalizes very easily to the case of multiple 
synapses where all are driven by the same signal s (t). (Manwani and Koch, in preparation). 
However, instead of presenting the rigorous analysis, we appeal to the intuition gained from 
the single synapse case. Since a single synapse can be regarded as a shot noise source, 
n parallel synapses can be treated as n parallel noise sources. Let us make the plausible 
lWe choose/, and a, so that X = 3ax (std of,X) so that Prob[,X(t) _ 0] < 0.01. 
Synaptic Transmission: An Information-Theoretic Perspective 205 
assumption that these noises are uncorrelated. If optimal estimation is carried out separately 
for each synapse and the estimates are combined optimally, the effective noise variance 
is given by the harmonic mean of the individual variances i.e. 2 
i/0.eff = Z, 
However, if the noises are added first and optimal estimation is camed out with respect 
to the sum,, the effective noise variance is given by the arithmetic mean of the individual 
2 2 2 
0.hi/12 We assume 
variances, i.e. 0.neff = i . If that all synapses are similar so that 
2 2 = 0 '2/ft. Plots of Er and Itb for the case of 5 identical synapses are 
shown in Fig. 3c and Fig. 3d respectively. Notice that Itb increases with Bm suggesting 
that the system is no longer in the low SNR regime. Thus, though a single synapse has very 
low capacity, a small amount of redundancy causes a considerable increase in performance. 
This is consistent with the fact the in the low SNt regime, I increases linearly with SNIP, 
consequently, linearly with n, the number of synapses. 
a) b) 
1 16 
0.9 
0.8 
LU 
c o.6 
0.4 
0.9 
. 0.7 
N 
: 0.8 
Z o.s 
0.4. 
x 
x 
x 
x x x 
x x 
X m = 10Hz 
O Bin= 25 Hz 
I Brn: .50 Hz 
-- m Bin= 75 Hz 
Bm= 100 Hz 
14 
12 
14 
12 
- 0 
20 40 60 80 100 120 140 160 180 200 0 
o 
Firing Rate (Hz) 
X Bm= 10Hz 
O Bm= 25 Hz 
 am-- so Hz 
---- am= 75 Hz 
" Bin= 100 Hz 
+ 0 
+ 0 
;o 
o x x 
o o x x 
o x x 
x x i 
20 40 60 80 100 120 140 160 180 201) 
Firing Rate (Hz) 
Figure 3: E and Itb vs. mean firing rate () for n = 1 [(a) and (b)] and n = 5 [(c) and (d)] identical 
synapses respectively (different values of B,) for signal estimation. Parameter values are e - 0.6, 
eo = O, CVa = 0.6, ts = 0.5 msec, r = 10msec, a, = 0.1 mV, B, = 100 Hz. 
4 Signal Detection 
The goal in signal detection is to decide which member from a finite set of signals was 
generated by a source, on the basis of measurements related to the output only in a statistical 
sense. Our example corresponds to its simplest case, that of binary detection. The objective 
is to derive an optimal spike detector based on the post-synaptic voltage in a given time 
interval. The criterion of optimality is minimum probability of error (Pe). A false alarm 
206 A. Manwani and C. Koch 
(FA) error occurs when a spike is falsely detected even when no presynaptic spike occurs 
and a miss error (M) occurs when a spike fails to be detected. The probabilities of the errors 
are denoted by Pf and P,, respectively. Thus, Pe = (1 -po) Pf +po P, wherepo denotes 
the a priori probability of a spike occurrence. Let X and Y be binary variables denoting 
spike occurrence and the decision respectively. Thus, X - 1 if a spike occurred else X -- 
0. Similarly, Y = 1 expresses the decision that a spike occurred. The posterior likelihood 
ratio is defined as (v) = Pr(v [ X = 1)/Pr(v I X = 0) and the prior likelihood as 
o -- (1 - po)/po. The optimal spike detector employs the well-known likelihood ratio 
test, "If(v) >_ o Y = 1 else Y =0". WhenX = 1, v(t) = a h(t)+n(t) else v(t) = n(t). 
Since a is a random variable, (v): (f Pr(v I X = 1; a) P(a) da)/Pr(v I X = 0). If 
the noise n(t) is Gaussian and white, it can be shown that the optimal decision rule reduces 
to a matchedfilter 2, i.e. if the correlation, r between v(t) and h(t) exceeds a particular 
threshold (denoted by r/), Y - 1 else Y -= 0. The overall decision system shown in 
Fig. 1 can be treated as effective binary channel (Fig. 2b). The system performance can 
be quantified either by Pe or I(X; Y), the mutual information between the binary random 
variables, X and Y. Note that even when n(t) = 0 (SNR = oc), P  0 due to the 
unreliability ofvesicular release. Let P' denote the probability of error when SNR = oc. 
If eo = 0, P' = po el is the minimum possible detection error. Let P and po denote FA 
and M errors when the release is ideal (e = 0, eo: 0). It can be shown that 
Pe - P + Pm [Po( 1 -- q) -- (1 -- po)eO] + P[(1 - po)(1 - Co) - Poq] 
Pi = P;, Pro=Pro  +e,(1-Pm +P;) 
Both Pf and Pm depend on r/. The optimal value of/is chosen such that Pe is minimized. 
In general, Pf and po can not be expressed in closed-form and the optimal r] is found using 
the graphical ROC analysis procedure. If we normalize a such that/, = 1, Pf and P& can 
be parametrically expressed in terms of a normalized threshold r/*, P] = 0.5 [1 - Erf 01' )], 
po = 0.511 + fc ErfOI* -  a) P(a) da]. I(X; Y) can be computed using the 
formula for the mutual information for a binary channel, I = 7-l(po (1 - P,,) + (1 - 
Po) P) -po7-l(Pm) - (1 -po)7-l(P) where 7t(x) = -x log2 (x) - (1 - x)log2 (1 - x) is 
the binary entropy function. The analysis can be generalized to the case of n synapses but 
the expressions involve n-dimensional integrals which need to be evaluated numerically. 
The Central Limit Theorem can be used to simplify the case of very large n. Plots of 
Pc and I(X; Y) versus n for different values of SNR (1,10, c) for the case of identical 
synapses are shown in Fig. 4a and Fig. 4b respectively. Yet again, we observe the poor 
performance of a single synapse and the substantial improvement due to redundancy. The 
linear increase of I with n is similar to the result obtained for signal estimation. 
5 Conclusions 
We find that a single synapse is rather ineffective as a communication device but with 
a little redundancy neuronal communication can be made much more robust. Infact, a 
single synapse can be considered as a low SNR device, while 5 independent synapses 
in parallel approach a high SNR system. This is consistently echoed in the results for 
signal estimation and signal detection. The values of information rates we obtain are very 
small compared to numbers obtained from some peripheral sensory neurons (Rieke et. al, 
1996). This could be due to an over-conservative choice of parameter values on our part 
or could argue for the preponderance of redundancy in neural systems. What we have 
presented above are preliminary results of work in progress and so the path ahead is much 
2or deterministic a, the result is well-known, but even if a is a one-sided random variable, the 
matched filter can be shown to be optimal. 
Synaric Transmission: An Information-Theoretic Perspective 207 
a) b) 
o.4 
0.1 
 -- SNR = Inf 
I --,- SNR = 10 
'  ' ; ,  , 
Number of Synapses (n) 
o.4 
, 
 SNR: 10 /  
,  ,  ,  , 
Number of Synapses (n) 
Pe (a) and/to (b) vs. the number of synapses, n, (different values of SNR) for signal detection. 
$NR = Inf. corresponds to no post-synaptic voltage noise. All the synapses are assumed to be 
identical. Parameter values are po = 0.5, e = 0.6, eo = 0, CVa = 0.6, t = 0.5 msec, r = 10 msec, 
a, = 0.1 mV, B, = 100 Hz. 
longer than the distance we have covered so far. To the best of our knowledge, analysis 
of distinct individual components of a neuron from an communications standpoint has not 
been camed out before. 
Acknowledgements 
This research was supported by NSF, NIMH and the Sloan Center for Theoretical Neuro- 
science. We thank Fabrizio Gabbiani for illuminating discussions. 
References 
Bekkers, J.M., Richerson, G.B. and Stevens, C.F. (1990) "Origin of variability in quantal 
size in cultured hippocampal neurons and hippocampal slices," Proc. Natl. Acad. Sci. USA 
87: 5359-5362. 
Bialek, W. Rieke, F. van Steveninck, R.D.R. and Warland, D. (1991) "Reading a neural 
code," Science 252: 1854-1857. 
Cover, T.M., and Thomas, J.A. (1991) Elements of Information Theory. New York: Wiley. 
Kom, H. and Faber, D.S. (1991) "Quantal analysis and synaptic efficacy in the CNS," 
Trends Neurosci. 14: 439-445. 
Markram, H. and Tsodyks, T. (1996) "Redistibution of synaptic efficacy between neocorti- 
cal pyramidal neurons," Nature 382:807-810. 
Rieke, F. Warland, D. van Steveninck, R.D.R. and Bialek, W. (1996) Spikes.' Exploring the 
Neural Code. Cambridge: MIT Press. 
Stevens, C.F. (1994) "What form should a cortical theory take," In: Large-Scale Neuronal 
Theories of the Brain, Koch, C. and Davis, J.L., eds., pp. 239-256. Cambridge: MIT Press. 
