In this paper we will use a direct method
[DeWeese, 1995, DeWeese, 1996, Stevens and Zador, 1996, de Ruyter van Steveninck et al., 1997] to
estimate the mutual information. Direct methods use another form of
the expression for mutual information in eq. (5),
$$
I(Z(t); S_i(t)) = H(Z(t)) - H(Z(t) \mid S_i(t)).
$$
The first term, $H(Z(t))$, is the entropy of the output spike train
itself, while the second, $H(Z(t) \mid S_i(t))$, is the conditional entropy of
the output given the inputs. The first term measures the
variability of the spike train in response to the ensemble of
different inputs, while the second measures the reliability of the
response to repeated presentations of the same inputs. The second
term depends on the reliability of the synapses and spike generating
mechanism: to the extent the same inputs produce the same outputs,
this term approaches zero.
The direct method has two advantages over the reconstruction method in
the present context. First, it does not require the construction of a
"reconstructor" for estimating the input from the output. Although
the optimal linear reconstructor is straightforward to estimate, the
construction of more sophisticated (i.e., nonlinear)
reconstructors can be a delicate art. Second, it provides an estimate
of information that is limited only by the errors in the estimation of
the output entropy $H(Z(t))$ and the conditional entropy
$H(Z(t) \mid S_i(t))$; the reconstruction method by contrast
provides only a lower bound on the mutual information that is limited
by the quality of the reconstructor.
As noted above, the estimation of the output entropy $H(Z(t))$
and the conditional entropy $H(Z(t) \mid S_i(t))$
can
require vast amounts of data. If, however, interspike intervals
(ISIs) in the output spike train were independent, then the entropies
could be simply expressed in terms of the entropy of the associated
ISI distributions. The information per spike $I(T; S_i(t))$
is then
given simply by
$$
I(T; S_i(t)) = H(T) - H(T \mid S_i(t)), \tag{8}
$$
where $H(T)$ and $H(T \mid S_i(t))$
are the total and conditional entropies,
respectively, of the ISI distribution. The information rate (units:
bits/second) is then just the information per spike (units:
bits/spike) times the firing rate R (units: spikes/second),
$$
I(Z(t); S_i(t)) = I(T; S_i(t)) \, R. \tag{9}
$$
The representation of the output spike train as a sequence of firing
times
$\{t_0, t_1, \ldots, t_n\}$ is entirely equivalent (except for edge
effects) to the representation as a sequence of ISIs
$\{T_1, \ldots, T_n\}$, where
$T_i = t_i - t_{i-1}$. The advantage of using
ISIs rather than spike times is that H(T) depends only on the ISI
distribution p(T), which is a univariate distribution. This
dramatically reduces the amount of data required.
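The equivalence between spike times and ISIs can be illustrated with a minimal sketch. The function names and the example spike times below are hypothetical and chosen only for illustration; the code assumes spike times given in seconds.

```python
import numpy as np

def spike_times_to_isis(spike_times):
    """Return the interspike intervals T_i = t_i - t_{i-1}."""
    t = np.sort(np.asarray(spike_times, dtype=float))
    return np.diff(t)

def isis_to_spike_times(isis, t0=0.0):
    """Rebuild spike times from ISIs, given the time t0 of the first spike."""
    return t0 + np.concatenate(([0.0], np.cumsum(isis)))

# Made-up spike times (seconds), for illustration only.
spike_times = [0.010, 0.025, 0.031, 0.060]
isis = spike_times_to_isis(spike_times)
recovered = isis_to_spike_times(isis, t0=spike_times[0])
```

A train of n spikes yields n - 1 ISIs, and the time of the first spike must be supplied separately to invert the mapping; this is the edge effect mentioned above.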
In the sequel we assume that spike times are discretized at a finite
time resolution $\Delta t$. The assumption of finite precision keeps
the potential information finite. If this assumption is not made, each
spike has potentially infinite information capacity; for example, a
message of arbitrary length could be encoded in the decimal expansion
of a single ISI.
Eq. 8 represents the information per spike as the
difference between two entropies. The first term is the total entropy
per spike,
$$
H(T) = -\sum_i p(T_i) \log_2 p(T_i), \tag{10}
$$
where $p(T_i)$ is the probability that the length of the ISI was
between $T_i$ and $T_{i+1}$. The distribution of ISIs can be obtained
from a single long (ideally, infinite) sequence of spike times.
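As a rough illustration of how the total entropy per spike might be estimated in practice, the sketch below bins the ISIs at a fixed resolution and applies the plug-in (histogram) estimator. The function name `isi_entropy_bits`, the parameter `dt`, and the exponentially distributed test data are assumptions made for this example, not part of the original method. The plug-in estimator tends to underestimate entropy when data are limited.

```python
import numpy as np

def isi_entropy_bits(isis, dt):
    """Plug-in estimate of H(T), in bits/spike, for ISIs binned at width dt."""
    isis = np.asarray(isis, dtype=float)
    bins = np.floor(isis / dt).astype(int)   # index of the dt-wide bin of each ISI
    counts = np.bincount(bins)               # histogram over bin indices
    p = counts[counts > 0] / counts.sum()    # empirical p(T_i); empty bins dropped
    return float(-np.sum(p * np.log2(p)))

# Illustrative data only: exponentially distributed ISIs with a 50 ms mean.
rng = np.random.default_rng(0)
example_isis = rng.exponential(scale=0.05, size=10_000)
h_total = isi_entropy_bits(example_isis, dt=0.001)
print(f"H(T) is approximately {h_total:.2f} bits/spike at 1 ms resolution")
```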
The second term is the conditional entropy per spike. The conditional
entropy is just the entropy of the ISI distribution in response to a
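particular set m of input spikes $S_i(t)$, averaged over all
possible sets of input spikes,
$$
H(T \mid S_i(t)) = \left\langle -\sum_j p(T_j \mid S_i(t)) \log_2 p(T_j \mid S_i(t)) \right\rangle, \tag{11}
$$
where $\langle \cdot \rangle$
represents the average. Here $p(T_j \mid S_i(t))$
is the probability of obtaining an ISI of length $T_j$
in response to a particular set of input spikes
$S_i(t)$.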
We used the following algorithm for estimating the conditional entropy:
In summary, we have described the three steps required to compute the information rate in our model. First, the total entropy per spike is computed from Eq. 10, and the conditional entropy per spike is computed from Eq. 11. Next, the information per spike is computed from Eq. 8. Finally, the information rate (information per time) is computed from Eq. 9.
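The three steps can be sketched end to end as follows. This is a hedged illustration and not necessarily the procedure used here: it assumes that ISIs have already been grouped by the input set that evoked them (the hypothetical argument `isis_by_input`), uses the plug-in histogram estimator, weights each input set by its number of ISIs when averaging, and estimates the firing rate as the reciprocal of the mean ISI. All names and the toy data are assumptions.

```python
import numpy as np

def entropy_bits(isis, dt):
    """Plug-in entropy (bits) of ISIs binned at resolution dt."""
    bins = np.floor(np.asarray(isis, dtype=float) / dt).astype(int)
    counts = np.bincount(bins)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def information_rate(isis_by_input, dt):
    """Estimate information per spike and per second from grouped ISIs.

    isis_by_input: list of 1-D sequences; entry k holds the ISIs (seconds)
    observed in response to repeated presentations of input set k.
    """
    groups = [np.asarray(g, dtype=float) for g in isis_by_input]
    pooled = np.concatenate(groups)

    h_total = entropy_bits(pooled, dt)                       # Eq. 10
    n = np.array([len(g) for g in groups])
    h_each = np.array([entropy_bits(g, dt) for g in groups])
    h_cond = float(np.average(h_each, weights=n))            # Eq. 11 (weighted mean)

    info_per_spike = h_total - h_cond                        # Eq. 8
    firing_rate = 1.0 / pooled.mean()                        # spikes/second
    return {
        "h_total": h_total,
        "h_cond": h_cond,
        "bits_per_spike": info_per_spike,
        "bits_per_second": info_per_spike * firing_rate,     # Eq. 9
    }

# Toy data (seconds): two input sets evoking ISIs near 10 ms and 20 ms.
toy = [[0.0105, 0.0108, 0.0102], [0.0205, 0.0201, 0.0209]]
result = information_rate(toy, dt=0.001)
```

Weighting by the number of ISIs keeps the conditional term consistent with the pooled distribution used for the total entropy, so the plug-in estimate of the information per spike cannot be negative.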