Learning Exact Patterns of Quasi-synchronization 
among Spiking Neurons 
from Data on Multi-unit Recordings 
Laura Martignon 
Max Planck Institute 
for Psychological Research 
Adaptive Behavior and Cognition 
80802 Munich, Germany 
laura(mpip f-muenchen.mpg.de 
Gustavo Deco 
Siemens AG 
Central Research 
Otto Hahn Ring 6 
81730 Munich 
gustavo.deco@zfe.siemens.de
Kathryn Laskey 
Dept. of Systems Engineering 
and the Krasnow Institute 
George Mason University 
Fairfax, Va. 22030 
klaskey@gmu.edu
Eilon Vaadia 
Dept. of Physiology 
Hadassah Medical School 
Hebrew University of Jerusalem 
Jerusalem 91010, Israel 
eilon@hbf.huji.ac.il
Abstract 
This paper develops arguments for a family of temporal log-linear models 
to represent spatio-temporal correlations among the spiking events in a 
group of neurons. The models can represent not just pairwise correlations 
but also correlations of higher order. Methods are discussed for inferring 
the existence or absence of correlations and estimating their strength. 
A frequentist and a Bayesian approach to correlation detection are 
compared. The frequentist method is based on the $G^2$ statistic with estimates
obtained via the Max-Ent principle. In the Bayesian approach a Markov
Chain Monte Carlo Model Composition (MC$^3$) algorithm is applied to
search over connectivity structures and Laplace's method is used to 
approximate their posterior probability. Performance of the methods was 
tested on synthetic data. The methods were applied to experimental data 
obtained by the fourth author by means of measurements carried out on 
behaving Rhesus monkeys at the Hadassah Medical School of the Hebrew 
University. As conjectured, neural connectivity structures need be
neither hierarchical nor decomposable.
Learning Quasi-synchronization Patterns among Spiking Neurons 77 
1 INTRODUCTION 
Hebb conjectured that information processing in the brain is achieved through the
collective action of groups of neurons, which he called cell assemblies (Hebb, 1949). His
followers were left with a twofold challenge:
• to define cell assemblies in an unambiguous way;
• to conceive and carry out the experiments that demonstrate their existence.
Cell assemblies have been defined in various, sometimes conflicting, ways, both in terms of
anatomy and of shared function. One persistent approach characterizes the cell assembly
by near-simultaneity or some other specific timing relation in the firing of the involved
neurons. If two neurons converge on a third one, their synaptic influence is much larger
for near-coincident firing, due to the spatio-temporal summation in the dendrite
(Abeles, 1991; Abeles et al., 1993). Thus syn-firing is directly available to the brain as a
potential code.
The second challenge has led physiologists to develop methods to observe the
simultaneous activity of individual neurons to seek evidence for spatio-temporal patterns.
It is now possible to obtain multi-unit recordings of up to 100 neurons in awake behaving
animals. In the data we analyze, the spiking events (in the 1 msec range) are encoded as
sequences of 0's and 1's, and the activity of the whole group is described as a sequence of
binary configurations. This paper presents a statistical model in which the parameters
represent spatio-temporal firing patterns. We discuss methods for estimating these
parameters and drawing inferences about which interactions are present.
2 PARAMETERS FOR SPATIO-TEMPORAL FIRING PATTERNS 
The term spatial correlation has been used to denote synchronous firing of a group of
neurons, while the term temporal correlation has been used to indicate chains of firing
events at specific temporal intervals. Terms like "couple" or "triplet" have been used to
denote spatio-temporal patterns of two or three neurons (Abeles et al., 1993; Grün, 1996)
firing simultaneously or in sequence. Establishing the presence of such patterns is not
straightforward. For example, three neurons may fire together more often than expected
by chance¹ without exhibiting an authentic third order interaction. This phenomenon may
be due, for instance, to synchronous firing of two couples out of the three neurons.
Authentic triplets, and, in general, authentic n-th order correlations, must therefore be
distinguished from correlations that can be explained in terms of lower order interactions.
In what follows, we present a parameterized model that represents a spatio-temporal
correlation by a parameter that depends on the involved neurons and on a set of time
intervals, where synchronization is characterized by all time intervals being zero.
Assume that the sequence of configurations $x_t = (x_{(1,t)}, \ldots, x_{(N,t)})$ of $N$ neurons forms a
Markov chain of order $r$. Let $\delta$ be the time step, and denote the conditional distribution
for $x_t$ given previous configurations by $p(x_t \mid x_{(t-\delta)}, x_{(t-2\delta)}, \ldots, x_{(t-r\delta)})$. We
assume that all transition probabilities are strictly positive and expand the logarithm of
the conditional distribution as:
1 That is to say, more often than predicted by the null hypothesis of independence.
78 L. Martignon, K. Laskey, G. Deco and E. Vaadia 
$$\ln p(x_t \mid x_{(t-\delta)}, x_{(t-2\delta)}, \ldots, x_{(t-r\delta)}) = \theta_0 + \sum_{A \in \Xi} \theta_A X_A \qquad (1)$$
where each $A$ is a set of pairs of subscripts of the form $(i, t - s\delta)$ that includes at
least one pair of the form $(i, t)$. Here $X_A = \prod_{1 \le j \le k} x_{(i_j,\, t - m_j\delta)}$ denotes the event that all
neurons in $A$ are active. The set $\Xi \subseteq 2^{\Lambda}$ of all subsets for which $\theta_A$ is non-zero is
called the interaction structure for the distribution $p$. The effect $\theta_A$ is called the
interaction strength for the interaction on subset $A$. Clearly, $\theta_A = 0$ is equivalent to
$A \notin \Xi$, and is taken to indicate absence of an order-$|A|$ interaction among neurons in
$A$. We denote the structure-specific vector of non-zero interaction strengths by $\theta_\Xi$.
Consider a set $\Lambda$ of $N$ binary neurons and denote by $p$ the probability distribution on
the binary configurations of $\Lambda$.
DEFINITION 1: We say that neurons $\{i_1, i_2, \ldots, i_k\}$ exhibit a spatio-temporal
pattern if there is a set of time intervals $m_1\delta, m_2\delta, \ldots, m_k\delta$ with at least one
$m_i = 0$, such that $\theta_A \neq 0$ in Equation (1), where
$A = \{(i_1, t - m_1\delta), \ldots, (i_k, t - m_k\delta)\}$.
DEFINITION 2: A subset $\{i_1, i_2, \ldots, i_k\}$ of neurons exhibits a synchronization or
spatial correlation if $\theta_A \neq 0$ for $A = \{(i_1, t), \ldots, (i_k, t)\}$.
In the absence of any temporal dependencies the configurations are independent and
we drop the time index:
$$\ln p(x) = \theta_0 + \sum_{\emptyset \neq A \subseteq \Lambda} \theta_A X_A \qquad (2)$$
where $A$ is any nonempty subset of $\Lambda$ and $X_A = \prod_{i \in A} x_i$.
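For a strictly positive distribution the coefficients of expansion (2) can be recovered by Möbius inversion of the log-probabilities, $\theta_A = \sum_{B \subseteq A} (-1)^{|A - B|} \ln p(\chi_B)$, where $\chi_B$ is the configuration with ones exactly on $B$. The Python sketch below applies this to a hypothetical three-neuron distribution built from two pairwise couplings only. The function names and the numerical weights are assumptions chosen for illustration.

```python
from itertools import combinations, product
from math import exp, log


def subsets(items):
    """Yield every subset of items as a frozenset (including the empty set)."""
    items = tuple(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def interaction_strengths(p, n):
    """Coefficients theta_A of ln p(x) = theta_0 + sum_A theta_A X_A.

    p : dict mapping each binary n-tuple to a strictly positive probability.
    Returns {frozenset A: theta_A}; the empty set holds theta_0.
    """
    def log_p_chi(B):
        return log(p[tuple(1 if i in B else 0 for i in range(n))])

    theta = {}
    for A in subsets(range(n)):
        theta[A] = sum((-1) ** (len(A) - len(B)) * log_p_chi(B)
                       for B in subsets(A))
    return theta


def log_weight(x):
    # Two "couples" (0,1) and (1,2); no third-order term by construction.
    return -2.0 * sum(x) + 1.5 * x[0] * x[1] + 1.5 * x[1] * x[2]


weights = {x: exp(log_weight(x)) for x in product((0, 1), repeat=3)}
z = sum(weights.values())
p = {x: w / z for x, w in weights.items()}
theta = interaction_strengths(p, 3)

# Probability of a triple coincidence versus the independence prediction.
p_triple = p[(1, 1, 1)]
rates = [sum(q for x, q in p.items() if x[i] == 1) for i in range(3)]
p_indep = rates[0] * rates[1] * rates[2]
```

In this toy distribution the three neurons fire together more often than independence predicts, yet the third-order coefficient is zero up to rounding: the excess is fully explained by the two couples.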
Of course (2) is unrealistic. Temporal correlation of some kind is always present, one
such example being the refractory period after firing. Nevertheless, (2) may be adequate in
cases of weak temporal correlation. Although the models (1) and (2) are statistical, not
physiological, it is an established conjecture that a synaptic connection between two
neurons will manifest itself as a non-zero $\theta_A$ for the corresponding set $A$ in the temporal
model (1). Another example leading to a non-zero $\theta_A$ is simultaneous activation of
the neurons in $A$ due to a common input, as illustrated in Figure 1. Such a $\theta_A$
will appear in model (1) with time intervals equal to 0. An attractive feature of our
models is that they are capable of distinguishing between cases a. and b. of Figure 1. This
can be seen by extending the model (2) to include the external neurons (H in case a.; H, K
in case b.) and then marginalizing.

[Figure 1: cases a. and b. (external neuron H in case a.; external neurons H, K in case b.)]

An information-theoretic argument supports the choice of $\theta_A \neq 0$ as a natural
indicator of an order-$|A|$ interaction among the neurons in $A$. Assume that we are in the
case of no temporal correlation. The absence of an interaction of order $|A|$
among neurons in $A$ should be taken to mean that the distribution is determined by the
marginal distributions on proper subsets of $A$. A well established criterion for selecting
a distribution among those matching the lower order marginals fixed by proper subsets of
$A$ is Max-Ent. According to the Max-Ent principle, the distribution that maximizes
entropy is the one which is maximally non-committal with regard to missing information.
The probability distribution $p'$ that maximizes entropy among distributions with the
same marginals as the distribution $p$ on proper subsets of $A$ has a log-linear
expansion in which only $\theta_B$, $B \subset A$, $B \neq A$, can possibly be non-zero.²
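The claim about the log-linear expansion of the entropy maximizer follows from a short Lagrange-multiplier argument, sketched here under the assumption that the maximizer is strictly positive. The multipliers $\lambda_C$ and the candidate distribution $q$ are notation introduced only for this sketch.

```latex
% Maximize entropy subject to matching the marginals p_C of p on every
% proper subset C of A (C = \emptyset enforces normalization).
\begin{aligned}
&\max_{q}\; -\sum_{x} q(x)\,\ln q(x)
\quad\text{subject to}\quad
\sum_{x \,:\, x_C = y_C} q(x) = p_C(y_C)
\quad\text{for all } C \subsetneq A \text{ and all } y_C, \\[4pt]
% Setting the derivative of the Lagrangian with respect to q(x) to zero:
&\ln q(x) = -1 + \sum_{C \subsetneq A} \lambda_C(x_C).
\end{aligned}
```

Each term $\lambda_C(x_C)$ depends only on the coordinates in $C$, so its expansion in products $X_B$ involves only $B \subseteq C \subsetneq A$. The full product $X_A$ therefore never appears, which is the statement $\theta_A = 0$.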
3 THE FREQUENTIST APPROACH 
We treat here the case of no temporal dependencies. The general case is treated in
Martignon-Deco, 1997; Deco-Martignon, 1997. We also assume that our data are
stationary. We test the presence of synchronization of neurons in $A$ by the following
procedure: we condition on silence of neurons in the complement of $A$ in $\Lambda$ and call the
resulting frequency distribution $p$. We construct the Max-Ent model determined by the
marginals of $p$ on proper subsets of $A$. The well-known method for constructing this
type of Max-Ent model is the I.P.F.P. algorithm (Bishop et al., 1975). We propose
here another, simpler and quicker procedure:
If $B$ is a subset of $A$, denote by $\chi_B$ the configuration that has a component 1 for
every index in $B$ and 0 elsewhere.
Define $p^*(\chi_B) = p(\chi_B) + (-1)^{|B|}\Delta$, where $\Delta$ is to be determined by solving
$\theta^*_A = 0$, where $\theta^*_A$ is the coefficient corresponding to $A$ in the log-expansion of $p^*$.
As can be shown (Martignon et al., 1995), $\theta^*_A$ can be written as
$$\theta^*_A = \sum_{B \subseteq A} (-1)^{|A - B|} \ln p^*(\chi_B).$$
The distribution $p^*$ maximizes entropy among those
with the same marginals as $p$ on proper subsets of $A$.³ We use $p^*$ as estimate of $p$ for
tests by means of the $G^2$ statistic (Bishop et al., 1975).

2 This was observed by I. J. Good in 1963 (Bishop et al., 1975). It is interesting to note that $p'$
minimizes the Kullback-Leibler distance from $p$ in the manifold of distributions with a
log-linear expansion in which only $\theta_B$, $B \subset A$, $B \neq A$, can possibly be non-zero.
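The shift construction and the subsequent test can be sketched in Python as follows. Solving for $\Delta$ by bisection, the helper names, and the use of one degree of freedom for the $G^2$ reference distribution (the saturated model has exactly one more free parameter, $\theta_A$, than the model with $\theta_A = 0$) are assumptions of this sketch rather than details given in the text.

```python
from itertools import product
from math import erfc, log, sqrt


def maxent_without_top_interaction(counts, k):
    """Return (p_star, delta) for the shift p*(chi_B) = p(chi_B) + (-1)^|B| delta.

    counts : dict mapping binary k-tuples (configurations of the neurons in A,
             observed while all other neurons are silent) to observed counts.
    """
    total = sum(counts.values())
    configs = list(product((0, 1), repeat=k))
    p = {x: counts.get(x, 0) / total for x in configs}
    sign = {x: (-1) ** sum(x) for x in configs}  # (-1)^{|B|}

    def f(delta):
        # Equals theta*_A up to the global sign (-1)^k; increasing in delta.
        return sum(sign[x] * log(p[x] + sign[x] * delta) for x in configs)

    # delta must keep every shifted probability strictly positive.
    lo = -min(p[x] for x in configs if sign[x] > 0)
    hi = min(p[x] for x in configs if sign[x] < 0)
    if not lo < hi:
        raise ValueError("no admissible shift: too many empty cells")
    for _ in range(200):  # bisection on the monotone function f
        mid = 0.5 * (lo + hi)
        if not lo < mid < hi:
            break
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    delta = 0.5 * (lo + hi)
    return {x: p[x] + sign[x] * delta for x in configs}, delta


def g2_test(counts, k):
    """G^2 statistic of the data against p*, with a 1-d.o.f. chi-square tail."""
    p_star, _ = maxent_without_top_interaction(counts, k)
    total = sum(counts.values())
    g2 = 2.0 * sum(n * log(n / (total * p_star[x]))
                   for x, n in counts.items() if n > 0)
    g2 = max(g2, 0.0)  # guard against tiny negative rounding error
    p_value = erfc(sqrt(g2 / 2.0))  # chi-square survival function, 1 d.o.f.
    return g2, p_value


# Hypothetical counts for a pair of neurons.
counts = {(0, 0): 40, (1, 0): 20, (0, 1): 20, (1, 1): 20}
p_star, delta = maxent_without_top_interaction(counts, 2)
g2, p_value = g2_test(counts, 2)
```

For a pair of neurons the construction reduces to the familiar independence model: both single-neuron rates are 0.4 in the example, and the shifted distribution assigns their product to the joint firing event.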
4 THE BAYESIAN APPROACH 
We treat here the case of no temporal dependencies. The general case is treated in
Laskey-Martignon, 1997. Information about $p(x)$ prior to observing any data is represented by
a joint probability distribution, called the prior distribution, over $\Xi$ and the $\theta$'s.
Observations are used to update this probability distribution to obtain a posterior
distribution over structures and parameters. The posterior probability of a cluster $A$ can
be interpreted as the probability that the $r$ nodes in cluster $A$ exhibit a degree-$r$
interaction. The posterior distribution for $\theta_A$ represents structure-specific information
about the magnitude of the interaction. The mean or mode of the posterior distribution
can be used as a point estimate of the interaction strength; the standard deviation of the
posterior distribution reflects remaining uncertainty about the interaction strength.
We exhibit a family of log-linear models capable of capturing interactions of all 
orders. An algorithm is presented for learning both structure and parameters in a unified 
Bayesian framework. Each model structure specifies a set of clusters of nodes, and 
structure-specific parameters represent the directions and strengths of interactions among 
them. The Bayesian learning algorithm gives high posterior probability to models that 
are consistent with the data. Results include a probability, given the observations, that a 
set of neurons fires simultaneously, and a posterior probability distribution for the 
strength of the interaction, conditional on its occurrence. 
The prior distribution we used has two components. The first component assigns a prior
probability to each structure. In our model, interactions are independent of each other and
each interaction has a prior probability of 0.1. This reflects the prior expectation that not
many interactions are present. The second component of the prior distribution is
the conditional distribution of interaction strengths given the structure. If an interaction is
not in the structure, the corresponding strength parameter $\theta_A$ is taken to be identically
zero given structure $\Xi$. All interactions belonging to $\Xi$ are taken to be independent and
normally distributed with mean zero and standard deviation 2. This reflects the prior
expectation that interaction strength magnitudes are rarely larger than 4 in absolute value.
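The two components of this prior can be written down directly. The Python sketch below is a minimal illustration: the function names, the representation of a structure as a set of clusters, and the explicit list of candidate clusters are assumptions, since the text does not say which clusters are eligible.

```python
from math import erfc, log, pi, sqrt

INCLUSION_PROB = 0.1  # prior probability that a given interaction is present
PRIOR_SD = 2.0        # prior standard deviation of a present strength


def log_structure_prior(structure, candidates):
    """Log prior of a structure: independent inclusion of each candidate."""
    structure = set(structure)
    if not structure <= set(candidates):
        raise ValueError("structure contains clusters outside the candidates")
    n_in = len(structure)
    n_out = len(candidates) - n_in
    return n_in * log(INCLUSION_PROB) + n_out * log(1.0 - INCLUSION_PROB)


def log_strength_prior(theta):
    """Log density of independent N(0, PRIOR_SD^2) strengths.

    theta : dict {cluster: strength} for the clusters in the structure;
            absent clusters have strength identically zero.
    """
    return sum(-0.5 * log(2.0 * pi * PRIOR_SD ** 2) - 0.5 * (v / PRIOR_SD) ** 2
               for v in theta.values())


# Hypothetical candidate clusters and a structure containing one of them.
candidates = [frozenset({5, 6}), frozenset({4, 7}), frozenset({1, 4, 5, 6})]
structure = {frozenset({5, 6})}
lp_structure = log_structure_prior(structure, candidates)
lp_strength = log_strength_prior({frozenset({5, 6}): 0.0})

# Prior probability that a present strength exceeds 4 in absolute value.
tail_beyond_4 = erfc(4.0 / (PRIOR_SD * sqrt(2.0)))
```

The last quantity is the two-sided tail of a normal distribution beyond two standard deviations, which quantifies the statement that magnitudes above 4 are rare under this prior.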
Computing the posterior probability of a structure $\Xi$ requires integrating $\theta_\Xi$ out of the
joint mass-density function of the structure $\Xi$, the interaction strengths $\theta_\Xi$, and the
data $X$. The solution to this integral cannot be obtained in closed form. We use
Laplace's method (Kass-Raftery, 1995; Tierney-Kadane, 1986) to estimate the posterior
probability of structures. The posterior distribution of $\theta_A$ given frequency data also
cannot be obtained in closed form. We use the mode of the posterior distribution as a
point estimate of $\theta_A$. The standard deviation of $\theta_A$, which indicates how precisely $\theta_A$
can be estimated from the given data, is estimated using a normal approximation to the
posterior distribution (Laskey-Martignon, 1997). The covariance matrix of the $\theta_A$ is
estimated as the inverse Fisher information matrix evaluated at the mode of the posterior
distribution. The posterior probability of an interaction $\theta_A$ is the sum over the posterior
probabilities of all structures containing $A$. We used a Markov chain Monte Carlo Model
Composition algorithm (MC$^3$) to search over structures. This stochastic algorithm
converges to a stationary distribution in which structure $\Xi$ is visited with probability
equal to its posterior probability. We ran the MC$^3$ algorithm for 15,000 runs and
estimated the posterior probability of a structure as its frequency of occurrence over the
15,000 runs. We estimated interaction strength parameters and standard deviations using
only the 100 highest-probability structures. Although the number of possible structures
is astronomical, typically most of the posterior probability is contained in relatively few
structures. We found this to be the case, which justifies using only the most probable
structures to estimate interaction strength parameters.

3 This is due to the fact that there is a unique distribution with the same marginals as $p$ on
proper subsets of $A$ such that the coefficient corresponding to $A$ in its log-expansion
is zero.
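The search over structures can be illustrated with a small Metropolis sampler. The Python sketch below is not the implementation used for the results: the move set (toggling one randomly chosen cluster), the function names, and the toy score are assumptions. In particular, `toy_log_score` is a stand-in for the Laplace-approximated log posterior of a structure, chosen so that the exact inclusion probabilities are known.

```python
import random
from collections import Counter
from math import exp, log


def mc3(log_score, candidates, n_steps, seed=0):
    """Metropolis sampler over structures (sets of clusters).

    log_score  : function mapping a frozenset of clusters to an unnormalized
                 log posterior probability.
    candidates : list of clusters that may be toggled in or out.
    Returns a Counter with the number of steps spent in each structure.
    """
    rng = random.Random(seed)
    current = frozenset()
    current_score = log_score(current)
    visits = Counter()
    for _ in range(n_steps):
        cluster = rng.choice(candidates)
        proposal = current ^ {cluster}  # toggle one cluster in or out
        proposal_score = log_score(proposal)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        if rng.random() < exp(min(0.0, proposal_score - current_score)):
            current, current_score = proposal, proposal_score
        visits[current] += 1
    return visits


def inclusion_frequency(visits, cluster):
    """Fraction of steps whose structure contains the given cluster."""
    total = sum(visits.values())
    return sum(n for s, n in visits.items() if cluster in s) / total


# Toy target: two clusters included independently with known probabilities.
target = {frozenset({5, 6}): 0.8, frozenset({4, 7}): 0.25}


def toy_log_score(structure):
    return sum(log(target[c] / (1.0 - target[c])) for c in structure)


visits = mc3(toy_log_score, list(target), n_steps=20000, seed=0)
post_56 = inclusion_frequency(visits, frozenset({5, 6}))
post_47 = inclusion_frequency(visits, frozenset({4, 7}))
```

The inclusion frequency of a cluster is the sampling estimate of the sum of posterior probabilities over all structures containing it, which is how the frequency-based posterior of an interaction is defined above.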
5 RESULTS 
We applied our models to data from an experiment in which spiking events among groups 
of neurons were analyzed through multi-unit recordings of 6-16 units in the frontal cortex 
of Rhesus monkeys. The monkeys were trained to localize a source of light and, after a 
delay, to touch the target from which the light blink was presented. At the beginning of 
each trial the monkeys touched a "ready-key", then the central ready light was turned on. 
Later, a visual cue was given in the form of a 200-ms light blink coming from either the 
left or the right. Then, after a delay of 1 to 32 seconds, the color of the ready light 
changed from red to orange and the monkeys had to release the ready key and touch the 
target from which the cue was given. The spiking events (in the 1 millisecond range) of 
each neuron were encoded as a sequence of zeros and ones, and the activity of the group 
was described as a sequence of configurations of these binary states. The fourth author 
provided data corresponding to piecewise stationary segments of the trials, which presented 
weak temporal correlation, corresponding to intervals of 2000 milliseconds around the 
ready-signal. He concatenated these 94 segments to form a data set of 188,000 msec. The
data were then binned in time windows of 40 milliseconds. The criterion we used to fix
the bin width was robustness with regard to variations of the offsets. We selected a subset
of eight of the neurons for which data were recorded. We analyzed recordings prior to the 
ready-signal separately from data recorded after the ready-signal. Each of these data sets is 
assumed to consist of independent trials from a model of the form (2). 
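The binning step can be sketched in Python as follows. The function name, the input layout (one list of spike times per neuron), and the rule that a bin is set to 1 when the neuron fires at least once in it are assumptions of this sketch; the `offset_ms` parameter is included only to show how robustness to the bin offset could be probed.

```python
from collections import Counter


def bin_spike_trains(spike_times_ms, duration_ms, bin_ms=40, offset_ms=0):
    """Turn spike times into binary configurations, one per time bin.

    spike_times_ms : list with one entry per neuron, each a list of spike
                     times in msec measured from the start of the data.
    Returns a list of tuples; entry i of a tuple is 1 if neuron i fired at
    least once in that bin, else 0. Spikes before the offset or after the
    last complete bin are ignored.
    """
    n_bins = int((duration_ms - offset_ms) // bin_ms)
    rows = [[0] * len(spike_times_ms) for _ in range(n_bins)]
    for neuron, times in enumerate(spike_times_ms):
        for t in times:
            b = int((t - offset_ms) // bin_ms)
            if 0 <= b < n_bins:
                rows[b][neuron] = 1
    return [tuple(r) for r in rows]


# Hypothetical spike times for three neurons over 160 msec.
spikes = [[5, 45, 130], [10, 95], []]
configs = bin_spike_trains(spikes, duration_ms=160)
shifted = bin_spike_trains(spikes, duration_ms=160, offset_ms=20)

# Frequency table of configurations, the input to both inference methods.
config_counts = Counter(configs)
```

At 40 msec per bin, 188,000 msec of data corresponds to 4,700 bins in total.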
Cluster A    Posterior prob. of A   Posterior prob. of A   MAP estimate    Standard deviation   Significance
             (frequency)            (best 100 models)      of $\theta_A$   of $\theta_A$
6,8          0.9                    0.89                   0.47            0.11                 4.0853
4,5,6,7      0.30                   0.32                   2.30            0.64                 No
2,3,6        0.40                   0.38                   2.30            0.64                 2.35
1,3,4        close to prior         close to prior                                              4.7
Table 1: Results for pre-ready signal data. Effects with posterior prob. > 0.1
Cluster A    Posterior prob. of A   Posterior prob. of A   MAP estimate    Standard deviation   Significance
             (frequency)            (best 100 models)      of $\theta_A$   of $\theta_A$
5,6          0.79                   0.96                   1.00            0.27                 1.82
4,7          0.246                  0.18                   0.93            0.34                 2.68
1,4,5,6      0.18                   0.13                   1.06            0.36                 No
1,3,4,6,7    0.24                   0.17                   2.69            0.13                 No
Table 2: Results for post-ready signal data. Effects with posterior prob. > 0.1
Another set of data from 5 simulated neurons was provided by the fourth author for a 
double-check of the methods. Only second order correlations had been simulated: a 
synapse lasting 2 msec, an inhibitory common input, and two excitatory common inputs. 
The Bayesian method was very accurate, detecting exactly the simulated interactions. 
The frequentist method made one mistake. Other data sets with temporal correlations
have also been analyzed. By means of the frequentist approach on shifted data, temporal
triplets and even fourth order correlations have been detected. Temporal correlograms are
computed for shifts of up to 50 msec (Martignon-Deco, 1997).
References 
Hebb, D. (1949) The Organization of Behavior. New York: Wiley.
Abeles, M. (1991) Corticonics: Neural Circuits of the Cerebral Cortex. Cambridge: Cambridge University Press.
Abeles, M., Bergman, H., Margalit, E., and Vaadia, E. (1993) "Spatiotemporal Firing Patterns in the Frontal
Cortex of Behaving Monkeys." Journal of Neurophysiology 70, 4, 1629-1638.
Grün, S. (1996) Unitary Joint-Events in Multiple-Neuron Spiking Activity: Detection, Significance and
Interpretation. Frankfurt: Verlag Harri Deutsch.
Martignon, L., and Deco, G. (1997) "Neurostatistics of Spatio-Temporal Patterns of Neural Activation: the
Frequentist Approach." Technical Report, MPI-ABC no. 3.
Deco, G., and Martignon, L. (1997) "Higher-order Phenomena among Spiking Events of Groups of Neurons."
Preprint.
Bishop, Y., Fienberg, S., and Holland, P. (1975) Discrete Multivariate Analysis. Cambridge, MA: MIT Press.
Martignon, L., von Hasseln, H., Grün, S., Aertsen, A., and Palm, G. (1995) "Detecting Higher Order Interactions
among the Spiking Events of a Group of Neurons." Biol. Cyb. 73, 69-81.
Kass, R., and Raftery, A. (1995) "Bayes Factors." Journal of the American Statistical Association 90, no. 430,
773-795.
Tierney, L., and Kadane, J. B. (1986) "Accurate Approximations for Posterior Moments and Marginal
Densities." Journal of the American Statistical Association 81, 82-86.
Laskey, K., and Martignon, L. (1997) "Neurostatistics of Spatio-temporal Patterns of Neural Activation: the
Bayesian Approach." In preparation.
Laskey, K., and Martignon, L. (1996) "Bayesian Learning of Log-linear Models for Neural Connectivity."
Proceedings of the XII Conference on Uncertainty in Artificial Intelligence, Horvitz, E., ed.,
Morgan-Kaufmann, San Mateo.
