# Fundamental limits on persistent activity in networks of noisy neurons

Edited by Terrence J. Sejnowski, Salk Institute for Biological Studies, La Jolla, CA, and approved September 14, 2012 (received for review October 23, 2011)

This article has a correction.

## Abstract

Neural noise limits the fidelity of representations in the brain. This limitation has been extensively analyzed for sensory coding. However, in short-term memory and integrator networks, where noise accumulates and can play an even more prominent role, much less is known about how neural noise interacts with neural and network parameters to determine the accuracy of the computation. Here we analytically derive how the stored memory in continuous attractor networks of probabilistically spiking neurons will degrade over time through diffusion. By combining statistical and dynamical approaches, we establish a fundamental limit on the network’s ability to maintain a persistent state: The noise-induced drift of the memory state over time within the network is strictly lower-bounded by the accuracy of estimation of the network’s instantaneous memory state by an ideal external observer. This result takes the form of an information-diffusion inequality. We derive some unexpected consequences: Despite the persistence time of short-term memory networks, it does not pay to accumulate spikes for longer than the cellular time-constant to read out their contents. For certain neural transfer functions, the conditions for optimal sensory coding coincide with those for optimal storage, implying that short-term memory may be co-localized with sensory representation.

Short-term memory is essential for survival in a world where data sources may disappear before consequent decisions or actions are taken. In many cases, the object of the memory is a continuous variable, for example, the direction of heading during navigation, the location of the eyes, or the orientation of a visual stimulus. Short-term memory is associated with patterns of neural firing that correlate with the stored quantity (1, 2) and persist during the storage period (3–5). When the stored variable is continuous, the persistent activity can take a continuum of states (1, 2, 5, 6).

Such persistent activity is thought, in many cases, to possess an underlying continuous attractor structure (7–10): the instantaneous value of a continuous variable is represented by setting the system to a point on a continuous manifold of stable fixed points. The stability of the fixed points generates a long time-constant for memory, even when cellular time-constants are short.

Moreover, continuous attractors provide some degree of resistance to ongoing noise: Any perturbation of the state of the system away from the attractor manifold is corrected because the state quickly decays back to the attractor. However, the component of noise parallel to the attractor manifold will remain uncorrected, causing loss of information about the stored memory (9, 11). Such noise-induced drift has been observed in numerical simulations of continuous attractor networks (10, 12, 13) and has been studied analytically in a specific continuous attractor network (14, 15). However, a general understanding of how neural and network properties affect the rate of drift and short-term memory performance has been missing.

In this work, we consider the dynamics of neural networks with a continuous attractor and otherwise arbitrary connectivity, weights, and neural nonlinearity. First, we use a theory of weak fluctuations in large networks to derive analytically how, in the absence of any external anchoring input, stochastic neural activity causes the represented variable to drift. We show that this drift is diffusive. We derive an expression quantifying the rate of memory loss – which for symmetric networks takes a particularly simple form – in terms of the number of neurons, the range of the stored variable, the neural noise, the neural nonlinearity, and the shape of the tuning curves within the network.

Second, we establish a fundamental relationship between a *dynamical* property, the rate of drift of the memory state due to internal noise in the network; and an abstract *statistical* limit on estimation, which bounds the information an external observer can gain about the underlying state of the stochastic network from a finite sample of its spikes. We show that the statistical limit on estimation enforces a finite minimum rate of diffusion in the network dynamics.

Third, we derive an unexpected result on the time required to read out the contents of continuous attractor memory networks: Despite the stochastic nature of spiking and the persistence of activity in the network, there is no advantage for an ideal observer in accumulating spikes beyond the biophysical time-constant of single neurons. Fourth, our results imply that in certain cases, the conditions for good sensory representation coincide with the conditions for good short-term memory.

It has long been realized that in physical systems, a deep relationship exists between thermodynamics and memory (16–19). More recently, such principles have been explored in man-made computing devices at the nanoscale, where fluctuations become prominent (19, 20). It is of interest to understand whether similar general principles relate internal noise in nonequilibrium biological systems to the dynamics of information and memory. Our results, relating stochastic dynamics in neural networks to memory degradation, establish that a statistical measure of internal noise constrains the dissipation of memory.

## Results

We consider a network of Poisson spiking neurons: Each neuron *i* generates Poisson spikes based on its time-varying firing rate $r_i(t) = \phi[g_i(t)]$, where $g_i(t) = \sum_j W_{ij} s_j(t) + b_i$ is the total synaptic input (the summed recurrent input plus a bias *b*_{i}) and *ϕ* is the nonlinear neural transfer function. Spikes in neuron *j* drive synaptic activation *s*_{j}(*t*) [Eq. **7**] which, weighted by the synaptic strength *W*_{ij}, is the input to postsynaptic neuron *i*. The Poisson model is motivated by the Poisson-like statistics of spike trains in cortical areas (21, 22) and is a simple way of generating the spiking statistics that would be obtained under certain conditions from conductance-based neurons receiving noisy inputs (23, 24). In deriving our analytical results, we assume that each neuron receives a large number of spikes within its cellular time-constant, so that network dynamics can be reduced to rate equations, Eq. **10**, with weak fluctuations arising from the variability of the Poisson process (referred to as the *weak fluctuation limit*, *Materials and Methods*).

We are interested in networks whose rate dynamics admit a continuous manifold of stable steady states (attractors), $\bar{s}(\theta)$, parametrized by a continuous variable θ. For simplicity of presentation, we describe below our results for a one-dimensional manifold, but also show (*SI Appendix*) that they readily generalize to higher dimensions. At each attractor state, the synaptic activation obeys

$$\bar{s}_i(\theta) = \tau\,\phi\!\left[\sum_j W_{ij}\,\bar{s}_j(\theta) + b_i\right], \tag{1}$$

where τ is the synaptic time-constant.

In what follows, we consider the network evolving under its own noisy dynamics, Eqs. **7**–**9** in *Materials and Methods*. The instantaneous network state *s*(*t*) remains close to the attractor manifold, but is not precisely on it because of continuously generated neural noise (Fig. 1*A*). Nevertheless, the instantaneous state can be mapped to a single point θ(*t*) on the low-dimensional attractor manifold, defined as the point to which the state would flow under noise-free rate dynamics. We refer to this point as the *instantaneous attractor state*.

### Diffusive Drift due to Ongoing Neural Noise.

We seek to derive how noisy neural spiking drives the instantaneous attractor state to drift. The effect of noise is to perturb the network away from the attractor manifold. Most components of this perturbation are rapidly erased by flow to the attractor manifold, and are thus irrelevant, but the component parallel to the attractor manifold is not and is responsible for gradual drift of the instantaneous state along the attractor. In general, noise can introduce two types of drift along the attractor: The first is random, and the second is systematic. We assume below that the system is well-tuned to eliminate systematic drift (for discussion of this assumption, see *SI Appendix*, section II) and focus on the random component because it is responsible for memory degradation and information loss. We use the neural network dynamics [Eqs. **7**–**9**] to obtain that at short times, the variance of the attractor state, relative to its initial value, grows linearly with the elapsed time: $\langle[\theta(t) - \theta(0)]^2\rangle = 2\mathcal{D}t$ (*SI Appendix*). Such behavior is the hallmark of a diffusion process. The larger the diffusion coefficient, $\mathcal{D}$, the faster the growth of variance.

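We derive $\mathcal{D}$ for networks of Poisson spiking neurons with arbitrary nonlinearity and connectivity (*SI Appendix*, Eq. **S15**) and highlight here the result for networks with symmetric weights:^{*}

$$\mathcal{D} = \frac{1}{2\tau^2}\,\frac{\sum_i \bar{r}_i \left(\partial_\theta \bar{r}_i\right)^2 / \left[\phi'(\bar{g}_i)\right]^2}{\left[\sum_i \left(\partial_\theta \bar{r}_i\right)^2 / \phi'(\bar{g}_i)\right]^2}, \tag{2}$$

where $\bar{r}_i(\theta) = \phi[\bar{g}_i(\theta)]$ is the steady-state firing rate and $\bar{g}_i(\theta)$ is the total steady-state synaptic input. A convenient feature of Eq. **2** is that it depends on potentially observable quantities like the neural tuning curve $\bar{r}_i(\theta)$ and on the neural transfer function *ϕ* through $\phi'(\bar{g}_i)$. Thus, it could be applied to experimental data to derive a prediction for the diffusivity expected from neural noise.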

We can see directly from Eq. **2** that the diffusivity of a memory state decreases as the single-neuron time-constant τ increases and also decreases with increasing *N*, the number of neurons (assuming the weights are scaled as 1/*N*, to keep the total synaptic inputs and the tuning curves fixed). Similarly, the diffusivity decreases if the tuning curves (and hence peak firing rates) are scaled up as *N* remains fixed. Thus, the stability of stored memories increases with the biophysical neural time-constant, the peak firing rate, and the network size.
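The scaling with *N* and τ described above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the symmetric-weight diffusivity has the form $\mathcal{D} = \frac{1}{2\tau^2}\sum_i \bar{r}_i(\partial_\theta\bar{r}_i)^2/\phi'^2 \,\big/\, [\sum_i (\partial_\theta\bar{r}_i)^2/\phi']^2$, and it uses hypothetical von Mises-shaped tuning curves and an exponential transfer function. The function names, `r_max`, `kappa`, and `alpha` are illustrative choices, and the tuning curves are simply prescribed rather than obtained from a tuned weight matrix.

```python
import math

def diffusion_coefficient(rates, drates, phi_prime, tau):
    """Diffusivity for symmetric weights (assumed form of Eq. 2).

    rates      -- steady-state tuning-curve values r_i(theta)
    drates     -- derivatives d r_i / d theta
    phi_prime  -- transfer-function slopes phi'(g_i) at the steady state
    tau        -- synaptic time-constant
    """
    num = sum(r * (dr / p) ** 2 for r, dr, p in zip(rates, drates, phi_prime))
    den = sum(dr ** 2 / p for dr, p in zip(drates, phi_prime))
    return num / (2.0 * tau ** 2 * den ** 2)

def ring_tuning(n, theta, r_max=0.05, kappa=2.0):
    """Hypothetical von Mises-shaped tuning curves for n neurons on a ring."""
    centers = [2.0 * math.pi * i / n for i in range(n)]
    rates = [r_max * math.exp(kappa * (math.cos(theta - c) - 1.0)) for c in centers]
    drates = [-kappa * math.sin(theta - c) * r for c, r in zip(centers, rates)]
    return rates, drates

def ring_diffusivity(n, tau=10.0, alpha=1.0, theta=0.3):
    """Diffusivity of the hypothetical ring at attractor state theta."""
    rates, drates = ring_tuning(n, theta)
    # Exponential transfer function phi(g) ~ exp(alpha * g), so phi' = alpha * phi.
    slopes = [alpha * r for r in rates]
    return diffusion_coefficient(rates, drates, slopes, tau)

# Doubling N at fixed tuning curves should roughly halve the diffusivity.
print(ring_diffusivity(200) / ring_diffusivity(100))
# Doubling tau at fixed tuning curves should lower the diffusivity.
print(ring_diffusivity(100, tau=20.0) / ring_diffusivity(100, tau=10.0))
```

Because the tuning curves here are held fixed by construction, the sketch only illustrates the dependence on *N* and τ; it does not test whether a given set of weights actually supports those tuning curves.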

Over long times, the growth of variance is obtained from the diffusion coefficient by integrating the diffusion equation (*SI Appendix*, section III). To test our theoretical results for $\mathcal{D}$ and for the time-dependent variance, we apply them to the neural noise-driven drift in numerical simulations of two different attractor network models (Fig. 2). In one, a ring network (Fig. 2*A*), the attractor dynamics arise due to a rotation symmetry in the network weights. Such networks are proposed to underlie the representation of head direction and orientation tuning (7, 8). The other network lacks such a symmetry (Fig. 2*E*) and is similar to models of the oculomotor system (9) and decision-making circuits (25, 26). The analytical result of Eq. **2** provides an excellent quantitative prediction (with no free parameters) of the drift of the attractor state in numerical simulations of both models (Fig. 2 *C*, *D*, and *G*).

Our derivation of the diffusion coefficient is the first main result of this paper. By minimizing the diffusion coefficient of Eq. **2** (or its more general form in *SI Appendix*, Eq. **S15**) with respect to quantities like the network weights and the neural nonlinearity (under appropriate constraints), it is possible to study what network parameters and architectures will maximize the persistence of memory states.

### Internal Fisher Information (FI).

We next consider a statistical property of the network: the Fisher information rate in the network’s stochastic spikes about the instantaneous attractor state. We call this quantity the internal FI rate (*SI Appendix*, section IV), or internal FI for short:

$$J = \sum_i \frac{\left[\partial_\theta \bar{r}_i(\theta)\right]^2}{\bar{r}_i(\theta)}. \tag{3}$$

A priori, internal FI appears unconnected to the network’s diffusive dynamics, but as we will show in the next section, it is in fact intimately related to the diffusion coefficient.

The expression for internal FI appears to have the same mathematical form as the Fisher information derived for sensory coding (27, 28), but only if $\bar{r}_i(\theta)$ represents the stimulus-evoked response of the network and the variable θ is the value of a present stimulus. By contrast, here, $\bar{r}_i(\theta)$ is the input-free steady-state response of the network and θ(*t*) represents an internal, dynamical network variable. Thus, internal FI is a meaningful quantity even in the absence of any stimulus, past or present (Fig. 1*B*). In this sense, internal FI is an intrinsic property of the network independent of its representational role and is distinct from the typical use of Fisher information in neural systems as a quantity linked to a stimulus (27–29).

### Information-Diffusion Inequality.

Equipped with the expressions for the diffusion coefficient (Eq. **2**, and *SI Appendix*, Eq. **S15**) and the internal FI [Eq. **3**], we obtain the second main result of this paper, the *information-diffusion inequality* (proof in *SI Appendix*, section V for arbitrary connectivity and neural nonlinearity):

$$\mathcal{D} \ge \frac{1}{2\tau^2}\,J^{-1}. \tag{4}$$

In words, the internal FI sets a lower bound on the diffusion of the attractor state.

The information-diffusion inequality can be interpreted as follows. In maintaining its persistent activity, the network effectively acts as an estimator of its state along the attractor, based on the spikes emitted within a time τ. Under this interpretation, Eq. **4** follows from a pair of inequalities: First, an ideal observer estimating the instantaneous attractor state from spike-counts has a squared error that equals or exceeds the inverse of the internal FI, according to the well-known Cramér–Rao bound. Second, the accuracy of the neural network in estimating its own instantaneous attractor state can be no better than, and may be significantly worse than, the ideal estimator. These combined limits on the network’s estimation of its instantaneous attractor state constrain its dynamics to be diffusive with a lower bound on the diffusion coefficient provided by the internal FI.^{†} The inequality contains an overall factor of τ^{2} because over the characteristic time-scale τ, the internal FI is proportional to τ, as is the squared displacement from diffusion.
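For symmetric networks, the inequality can be traced to a single Cauchy–Schwarz step. The sketch below is not the proof given in the *SI Appendix*. It assumes the convention $\langle[\theta(t)-\theta(0)]^2\rangle = 2\mathcal{D}t$, an internal FI of the Poisson form $J = \sum_i (\partial_\theta \bar{r}_i)^2/\bar{r}_i$, and a symmetric-weight diffusivity of the form written in the first line. The auxiliary vectors $a_i$ and $c_i$ are introduced here for illustration only.

$$
\begin{aligned}
2\tau^2\,\mathcal{D} &= \frac{\sum_i a_i^2}{\left(\sum_i a_i c_i\right)^2},
\qquad a_i = \frac{\sqrt{\bar{r}_i}\;\partial_\theta \bar{r}_i}{\phi'(\bar{g}_i)},
\qquad c_i = \frac{\partial_\theta \bar{r}_i}{\sqrt{\bar{r}_i}},\\[4pt]
\Big(\sum_i a_i c_i\Big)^2 &\le \Big(\sum_i a_i^2\Big)\Big(\sum_i c_i^2\Big) = J \sum_i a_i^2,\\[4pt]
\Rightarrow\quad 2\tau^2\,\mathcal{D} &\ge J^{-1}.
\end{aligned}
$$

Under these assumptions, equality requires $a_i \propto c_i$, that is, $\bar{r}_i/\phi'(\bar{g}_i)$ equal to the same constant for every neuron with a nonzero tuning-curve slope. Since $\bar{r}_i = \phi(\bar{g}_i)$, this is the condition $\phi' \propto \phi$, which an exponential transfer function satisfies.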

#### Generalizations.

The information-diffusion inequality holds under a number of generalizations. First, for multidimensional attractors, the inequality retains the same form as in Eq. **4**, with $\mathcal{D}$ referring to the diffusion coefficient tensor and *J*^{-1} representing the inverse of the internal FI matrix. The inequality signifies that the difference between the two sides is a positive-semidefinite matrix (*SI Appendix*, section V). Second, when the neural dynamics involve more than one time-scale, for instance a synaptic time-constant and a membrane time-constant, the same information-diffusion inequality [Eq. **4**] holds. However, τ is replaced by the sum of the two time-scales. Thus, the larger of the two time-scales dominates in setting the limit on diffusivity (*SI Appendix*, section VII). Third, when the Poisson-like noise in the dynamics is replaced by additive Gaussian noise (of constant variance), then the conditions under which the information-diffusion inequality is saturated change (see below), but Eq. **4** remains the same (*SI Appendix*, section VII).

#### Saturation of the information-diffusion inequality.

Here we consider whether the information-diffusion inequality is tight, i.e., whether the bound on diffusivity given by the inequality is ever saturated. The nested inequalities contained within our interpretation of the information-diffusion inequality, above, must become equalities for the bound to be saturated. In the weak fluctuation limit—where many spikes are emitted within the cellular time-constant—it is guaranteed that the Cramér–Rao bound can be saturated by an ideal estimator (27, 28). However, it is much less clear whether any continuous attractor network could match the performance of an ideal estimator in tracking its own instantaneous attractor state. It has been shown previously that continuous attractor networks, when evolving deterministically (noise free), can estimate the location of a bump input from a sample of spikes in a Bayes-optimal way (30). The memory network faces a more difficult task because it needs to continuously estimate a state that is drifting, and its dynamics as a readout network are noisy.

Surprisingly, a large class of networks can match an ideal observer in estimating their own evolving attractor state. Using Eqs. **3** and **4** it is straightforward to verify that in symmetric networks, if the neural transfer function is exponential, *ϕ*(*x*) ∝ exp(α*x*), then diffusion equals the inverse internal FI over the time-scale τ, and the information-diffusion inequality is saturated.^{‡} Thus, an exponential transfer function enables a symmetric network to optimally estimate its own state. In the remainder of this subsection we comment on networks with symmetric connectivity.

What aspect of the network dynamics dictates that an exponential nonlinearity is optimal for self-state estimation? If the Poisson noise model for neural response is replaced by white Gaussian noise with fixed variance, the expressions for $\mathcal{D}$ and *J* change, but the information-diffusion inequality remains the same (*SI Appendix*, section VII). However, the inequality is saturated with a linear rather than exponential neural transfer function (*SI Appendix*, section VII). Therefore, the requirement of an exponential transfer function for saturation of the inequality is related to the Poisson variability in neural spiking.

We note that saturation of the information-diffusion inequality is not equivalent to minimizing diffusivity. As just shown, saturating the information-diffusion inequality constrains the neural nonlinearity but not the network weights, whereas minimizing the diffusivity could involve optimizing the weights as well.

Nevertheless, in networks with exponential neural nonlinearity, diffusivity can be minimized by maximizing internal FI. Two interesting consequences follow from the fact that internal FI has the same mathematical dependence on the tuning curves as the Fisher information for sensory representation when the internal attractor state variable θ(*t*) is replaced by the stimulus value Θ. First, the extensive literature on optimal tuning curves for sensory representation (28, 31) can be directly applied to identify which network architectures minimize diffusivity. For example, we can conclude from the sensory coding literature (31) that narrower tuning curves should improve memory persistence in a 1-d attractor, but broader tuning curves should improve persistence in 3-d. Second, if the sensory response of the network resembles its input-free tuning curves, as expected for a continuous attractor network (7), it follows that the same network architecture that optimizes sensory representation is also optimal for retaining short-term memories.

### Optimal Observation Time for Memory Readout.

Suppose that a memory network’s initial state was set to Θ on the attractor by a stimulus that was then removed at *t* = 0. At a later time *T*, a cue is presented signaling that the stimulus variable has to be recalled. Over what interval Δ*T* following the cue should a decoder collect spikes from the memory network to obtain an accurate estimate of Θ?

In sensory representation, where the sensory stimulus remains present and the aim is to decode Θ from the spikes of the encoding neurons, the variance of the ideal estimate drops linearly with the duration over which spikes are collected. Thus, the longer the time over which spikes are collected, the better the estimate of the encoded variable.

In memory networks, the stored variable diffuses not only over the delay period but also over the recall interval, so there is a cost associated with waiting to collect more spikes: The longer the time over which spikes are collected, the more statistics are gained about the recent state of the network, but the memory of the input has also further dissipated. An ideal estimator (*SI Appendix*, section VI) of Θ can continue to gain some information by waiting over longer times, but the gains are marginal (Fig. 3). Provided with the full history of neural spike times starting at the recall cue, the ideal estimator approaches its asymptotic performance in a characteristic time given by (*SI Appendix*, section VI):

$$\Delta T_c = \frac{1}{\sqrt{2\mathcal{D}J}} \le \tau, \tag{5}$$

where in the second relation we made use of the information-diffusion inequality. Thus, there is no significant advantage to waiting longer than approximately τ to estimate the stored state of a memory network.

It is unclear whether the brain implements the complex and precise dynamical updates required by the ideal estimator. Let us consider a simpler estimator that assumes the underlying attractor state is static. This naive estimator could be implemented as population averaging (28, 32) or as a continuous attractor readout network (30). As spikes are collected over time during the recall period, the naive estimator’s performance improves (Fig. 3). However, if spikes are accumulated for too long, the naive estimator’s performance begins to degrade due to the diffusion of the network state (*SI Appendix*, section VI and Fig. 3). As was the case for the ideal estimator, the information-diffusion inequality implies that there is no advantage to waiting for times longer than approximately τ when reading out the memory network. The variance of the naive estimator at its optimal readout interval is only slightly larger than the asymptotic variance of the ideal estimator, demonstrating that such an estimator could be useful for memory readout.

In summary, despite the persistent nature of activity in the attractor network, there is no advantage for memory readout in observing the network spikes for longer than the biophysical cellular time-constant τ. Further, the naive estimator is near optimal if its readout interval is comparable to τ.
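The trade-off between collecting more spikes and letting the state diffuse can be illustrated with a short numerical sketch. The two variance formulas below are assumptions made for this illustration and are not taken from the *SI Appendix*. For the ideal observer, a Kalman-type recursion for a state that diffuses as $2\mathcal{D}t$ and is observed at information rate *J* gives a variance $\sqrt{2\mathcal{D}/J}\,\coth(\sqrt{2\mathcal{D}J}\,\Delta T)$. For the naive estimator, uniform averaging over the readout window gives $1/(J\Delta T) + 2\mathcal{D}\Delta T/3$. The numerical values of `tau` and `J` are hypothetical.

```python
import math

def ideal_variance(dt_read, D, J):
    """Assumed ideal-observer variance after observing spikes for dt_read."""
    a = math.sqrt(2.0 * D * J)
    return math.sqrt(2.0 * D / J) / math.tanh(a * dt_read)

def naive_variance(dt_read, D, J):
    """Assumed variance of a static-state estimator that averages uniformly."""
    return 1.0 / (J * dt_read) + 2.0 * D * dt_read / 3.0

# Hypothetical numbers: tau in ms, J in 1/(rad^2 ms).
# D is chosen so that D = 1 / (2 tau^2 J), i.e. the bound is saturated.
tau = 10.0
J = 5.0
D = 1.0 / (2.0 * tau ** 2 * J)

# Characteristic readout time under these assumptions.
t_c = 1.0 / math.sqrt(2.0 * D * J)

for mult in (0.25, 0.5, 1.0, 2.0, 4.0, 8.0):
    t = mult * tau
    print(mult, ideal_variance(t, D, J), naive_variance(t, D, J))
```

Under these assumptions the ideal variance decreases monotonically toward $\sqrt{2\mathcal{D}/J}$, whereas the naive variance has a minimum at $\Delta T = \sqrt{3}\,\Delta T_c$, where it exceeds the ideal asymptote by a factor $2/\sqrt{3} \approx 1.15$. A different weighting of spikes within the window would change these constants.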

### Minimum Variance of Stimulus Estimation.

Our result on the diffusion of a memory state, Eq. **2**, allows us to quantify the expected squared error in recalling, at time *T*, the initial state Θ of the network, set by a stimulus that was removed at *t* = 0. For large intervals *T*, this squared error is dominated by diffusion of the memory state along the attractor, with the diffusion coefficient given by Eq. **2**. If the represented variable is unbounded and if diffusion along the attractor is independent of the attractor state (i.e., if the diffusion coefficient is independent of θ), the expected squared error will be at least

$$\left\langle (\hat{\Theta} - \Theta)^2 \right\rangle \ge 2\mathcal{D}T + \sqrt{\frac{2\mathcal{D}}{J}}, \tag{6}$$

where $\hat{\Theta}$ is the estimate of Θ based on all spikes emitted by the memory network starting at time *T*. The first term in the sum on the right-hand side is due to diffusion, and the second arises from the uncertainty in estimating the instantaneous attractor state from spikes by an ideal observer (*SI Appendix*, section VI), as analyzed in the previous section. For bounded variables and for state-dependent diffusion coefficients, the growth of diffusive variance can be similarly obtained, by integrating the diffusion equation (*SI Appendix*, section III) with the derived value of $\mathcal{D}$ (Eq. **2** or *SI Appendix*, Eq. **S15**). Thus, our analysis bounds how information about a stored variable dissipates over time due to neural noise.

## Discussion

In summary, we have derived an expression in terms of neural and network parameters for how information in continuous attractor memory networks dissipates over time due to ongoing neural noise and proved that this diffusive information loss is fundamentally lower bounded by a statistical limit on estimation of the instantaneous attractor state by an ideal observer. Our results, derived under quite general conditions, are valid for highly nonlinear systems, in contrast to existing results on information loss through neural noise in linear memory networks (29, 33). Our results also contrast with existing analytical results on network models of integration and short-term memory, which are predominantly deterministic or focus on memory degradation due to systematic parameter mistuning (8, 9, 25, 26, 34–36).

Neural coding is typically analyzed from the perspective of a downstream recipient of spikes. Our results highlight that the way a variable is coded in a memory network has consequences not only for representational capacity, as viewed from outside the network, but also for the network’s own internal dynamics.

Our spiking model for single neurons is quite simple, and more detailed models of neural spiking will doubtless produce somewhat different results for the diffusivity and internal FI. Already, in two extensions of the basic Poisson neuron model (a modified noise model and a model of neurons with two time-constants), the diffusion coefficient and the internal FI had modified forms (*SI Appendix*, section VII). Interestingly, however, in each case the information-diffusion inequality remained valid (*SI Appendix*, section VII). The very general nature of the principle underlying the information-diffusion inequality suggests that it may hold for neural models that contain more biophysical detail (24) and include more dynamical time-scales (37); for networks where spike–spike correlations are important and modify the Fisher information (38, 39) and diffusivity; as well as for different kinds of biological systems like gene–protein regulatory networks where stochasticity in the copy number of transcriptional regulators plays the role of Poisson noise (40). Deterministic spiking networks can have chaotic dynamics and produce Poisson-like spike trains (41, 42). If our assumption of Poisson independence holds true in those networks, then our results may also apply to such systems. These conjectures remain to be tested.

Our results suggest that in continuous attractor neural integrators (8–10, 25, 26, 34), neural noise has a similar effect as noise in the input variable being integrated, justifying the phenomenological approach taken by many (25, 26, 43), who add a Gaussian noise term to the input of an otherwise perfect integrator. When this added term represents noise in only the stimulus [e.g., in random-dot motion stimuli (26)], its statistics are known and can be used to set the noise variance. However, in general the variance of added noise should also include the effects of stochasticity in the neural integrator, which is the diffusivity we have computed here. The variance of this term depends on the network structure and the nonlinearity of the integrator neurons, as given in Eq. **2** or *SI Appendix*, Eq. **S15**. This intrinsic neural noise reduces the accuracy of the integrator but does not shorten the integration time-constant.

Our results have implications for empirical studies of memory. The result that memory degradation due to neural noise is diffusive implies that the variance in the recall of analog variables (44, 45), if stored directly in a continuous attractor network, is predicted by the diffusion equation (*SI Appendix*, Eq. **S26**), using the diffusion coefficient $\mathcal{D}$. The latter could be derived from our results, Eq. **2** or *SI Appendix*, Eq. **S15**, and in addition (or instead) be measured experimentally, by observing the instantaneous growth of recall variance starting from all possible states of the variable. At short times in the tuned system, the variance of recall should increase linearly, with slope $2\mathcal{D}$.

The quantitative expression for the diffusion rate derived here can be used to probe to what extent short-term memory errors are due to fundamental neural fluctuations and, thus, whether the system performance is otherwise optimized. For example, it may be possible to estimate how much of the drift during oculomotor fixations is due to neural noise in the oculomotor integrator (46). The diffusion formula of Eq. **2** cannot be directly applied in this case because oculomotor spikes are significantly sub-Poisson. However, the information-diffusion inequality suggests that the internal FI can provide a good estimate of neural noise-driven diffusivity. In turn, at least in the goldfish oculomotor integrator with its well-characterized tuning curves and parameters, it may be possible to compute the internal FI (2, 47). Thus, this work provides a theoretical framework for assessing whether drift during fixations is dominated by integrator noise rather than other sources of variability, such as motor noise.

Using the information-diffusion inequality, we found that estimation of the memory network’s state does not significantly improve by waiting to collect spikes over longer intervals than approximately τ. This result solves what would otherwise be a conundrum: If it were profitable to integrate spikes for longer than a single-neuron time-constant, then a memory network would be required to read out the variable stored in another memory network. Instead, a network with no persistence can be near optimal in this task. We predict that our readout performance curves (Fig. 3) should match psychometric curves on recall performance as a function of the length of the postcue recall interval, with τ likely corresponding to biophysical single-neuron time-scales. Further, whether the measured psychometric curve is found to be monotonically increasing or peaked at the single-neuron time-constant may reveal whether memory readout is based on an ideal estimator or a simple but near-optimal estimator that assumes a static memory state.

We conclude with an intriguing implication of the information-diffusion inequality: With an exponential neural nonlinearity, minimizing diffusivity in symmetric networks is equivalent to maximizing internal FI, which implies that the same conditions optimize both memory performance and sensory representation.^{§} This suggests that a single brain area could perform both sensory representation and maintain a short-term memory, and thus provides mathematical support for the hypothesis from recent empirical findings (4, 5, 48), that sensory brain areas may be involved in storing the variables they represent.

## Materials and Methods

### Neural Network.

We consider a network of *N* linear–nonlinear Poisson (LNP) neurons with the following dynamics: The synaptic activation *s*_{i} due to spikes produced by neuron *i* is obtained by convolving the spike train with an exponential function:

$$s_i(t) = \sum_{\alpha:\; t_i^{\alpha} < t} \exp\!\left[-\frac{t - t_i^{\alpha}}{\tau}\right], \tag{7}$$

where τ is the synaptic time-constant, taken for simplicity to be identical in all synapses, and $t_i^{\alpha}$ are the spike times of neuron *i*. The total synaptic input to neuron *i* is given by a weighted sum over the synaptic activations of its afferents

$$g_i(t) = \sum_j W_{ij}\, s_j(t) + b_i, \tag{8}$$

where *W*_{ij} is the weight matrix. The neuron emits spikes as an inhomogeneous Poisson process with rate

$$r_i(t) = \phi[g_i(t)]. \tag{9}$$
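One Euler time-step of these dynamics can be sketched as follows. This is an illustrative implementation, not the authors' simulation code: it assumes that each spike increments *s*_{i} by one (a unit-height exponential kernel), that the total input includes the bias, and that `dt` is small enough for `rate * dt` to act as a spike probability. The function name `lnp_euler_step` and the two-neuron parameters in the usage lines are hypothetical.

```python
import math
import random

def lnp_euler_step(s, W, b, phi, tau, dt, rng):
    """Advance the synaptic activations s by one Euler step of length dt.

    s    -- list of synaptic activations s_i
    W    -- weight matrix as a list of rows, W[i][j]
    b    -- list of biases b_i
    phi  -- transfer function mapping total input g to a firing rate
    rng  -- random.Random instance used to draw the spikes
    """
    n = len(s)
    # Total synaptic input: weighted recurrent input plus bias.
    g = [sum(W[i][j] * s[j] for j in range(n)) + b[i] for i in range(n)]
    rates = [phi(g_i) for g_i in g]
    # Bernoulli approximation of Poisson spiking within one small time-step.
    spikes = [1 if rng.random() < rate * dt else 0 for rate in rates]
    # Exponential decay with time-constant tau, plus a unit jump per spike.
    s_next = [s_i * (1.0 - dt / tau) + k for s_i, k in zip(s, spikes)]
    return s_next, spikes

# Usage with hypothetical parameters (times in ms, rates in 1/ms).
rng = random.Random(0)
tau, dt = 10.0, 0.1
W = [[0.0, -0.1], [-0.1, 0.0]]
b = [1.0, 1.0]
phi = lambda g: math.exp(g) / tau
s = [0.0, 0.0]
for _ in range(1000):
    s, spikes = lnp_euler_step(s, W, b, phi, tau, dt, rng)
print(s)
```

Passing the random number generator explicitly keeps runs reproducible, which is convenient when averaging drift statistics over many trials.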

### Weak Fluctuations.

In the limit in which each neuron is driven by many spikes, Eq. **7** can be replaced by (see *SI Appendix*, section I)

$$\dot{s}_i + \frac{s_i}{\tau} = \phi(g_i) + \sqrt{\phi(g_i)}\;\xi_i, \tag{10}$$

where ξ_{i} is a Gaussian process with covariance

$$\langle \xi_i(t)\,\xi_j(t') \rangle = \delta_{ij}\,\delta(t - t'). \tag{11}$$

This approximation is justified when the corresponding fluctuations in *g*_{i} are small compared to the mean. It may seem more justified to introduce the small noise term in *g* rather than *s*, but formally the two formulations are equivalent (*SI Appendix*, section I) because *g* and *s* are linearly related. One way to obtain the limit of weak fluctuations is to increase the number of neurons *N* in a network while scaling synaptic weights by 1/*N* to keep the total synaptic activation and neural tuning curves of individual neurons fixed (49). All the mathematical expressions in this work are lowest order terms of an expansion in 1/*N*. The noise-free rate equations are obtained by setting ξ_{i} to zero on the right-hand side of Eq. **10**: $\dot{s}_i + s_i/\tau = \phi(g_i)$. Finally, the network weights are assumed to be tuned to support a continuous attractor manifold in the noise-free dynamics.^{¶}

### Network Simulations.

The ring-attractor network of Fig. 2 *A*–*C* consisted of *N* = 1,024 neurons with rotationally invariant inhibitory weights *W*_{ij} = *w*(θ_{i} - θ_{j}), where θ_{n} = 2π*n*/*N* and *w*(θ) = *A* exp[*k*_{1}(cos(θ) - 1)] - *A* exp[*k*_{2}(cos(θ) - 1)], with parameters: *A* = 1, *k*_{1} = 1, and *k*_{2} = 0.3. The neural transfer functions used in Fig. 2*C* were τ*ϕ*(*g*) = 10 exp(*g*) or τ*ϕ*(*g*) = 0.2[1 + tanh(*g* + 4)] (shown in Fig. 2*C*, *Inset*); *b* = -2 and the synaptic time-constant was τ = 10 ms. Synaptic activations *s*_{i} were initialized as uniformly distributed random numbers between 0 and 0.01, and the dynamics were simulated for 100 trials lasting 1,000 s each, using a stochastic Euler integrator with time-step *dt* = 0.1 ms. At each time-step of each trial, a spike in neuron *i* was generated with probability *r*_{i}(*t*)*dt*. To produce Fig. 2*C*, the instantaneous attractor state θ_{0}(*t*) was measured at each time-step by finding the angle θ_{i} at which *r*_{i}(*t*) is maximal. Subsequently, [θ(*t* + Δ*t*) - θ(*t*)]^{2} was averaged over the duration of the whole simulation, except for the first 10 ms. [Here θ(*t* + Δ*t*) - θ(*t*) was defined to be in the range (-π, π).] Values of the diffusivity and *J* were obtained from Eqs. **2**, **3** and the steady-state numerical solution for the synaptic activations in the noise-free limit. The black trace in Fig. 2*D* was obtained using the theoretical prediction for the diffusivity and the analytic expression for the variance of a periodic variable, *SI Appendix*, Eq. **S32**.
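
The weight profile and the analysis steps described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the helper names (`ring_weights`, `wrap_angle`, `decode_peak`, `mean_squared_displacement`) are hypothetical, and differences are wrapped into the half-open interval [-π, π) rather than the open interval quoted in the text.

```python
import math


def ring_weights(n, A=1.0, k1=1.0, k2=0.3):
    """Return (W, thetas) with W[i][j] = w(theta_i - theta_j) on a ring of n neurons."""
    thetas = [2.0 * math.pi * i / n for i in range(n)]

    def w(theta):
        c = math.cos(theta) - 1.0
        return A * math.exp(k1 * c) - A * math.exp(k2 * c)

    W = [[w(thetas[i] - thetas[j]) for j in range(n)] for i in range(n)]
    return W, thetas


def wrap_angle(d):
    """Map an angular difference into [-pi, pi)."""
    return (d + math.pi) % (2.0 * math.pi) - math.pi


def decode_peak(rates, thetas):
    """Attractor-state estimate: preferred angle of the neuron with the highest rate."""
    return thetas[max(range(len(rates)), key=lambda i: rates[i])]


def mean_squared_displacement(theta_series, lag):
    """Average of [theta(t + lag) - theta(t)]^2 with differences wrapped on the circle."""
    diffs = [wrap_angle(theta_series[t + lag] - theta_series[t])
             for t in range(len(theta_series) - lag)]
    return sum(d * d for d in diffs) / len(diffs)


# Example with a small ring (the paper uses N = 1,024).
W, thetas = ring_weights(64)
```

With *k*_{1} > *k*_{2}, every entry of `W` is non-positive and the self-coupling *w*(0) vanishes, which matches the description of the weights as purely inhibitory.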

The two-neuron network of Fig. 2 *E*–*G* consisted of *N* = 2 neurons coupled with weight matrix: *W*_{11} = *W*_{22} = 0; *W*_{12} = *W*_{21} = -1, and common feedforward input *b* = 5,000. The synaptic time-constant was τ = 100 ms, and synaptic dynamics, given by Eq. **7**, were integrated using the Euler method with time-step *dt* = 0.4 ms. Neural spiking was generated by an inhomogeneous Poisson process with rate *r*(*t*), i.e., an expected *r*(*t*)*dt* spikes per time-step, where *r*(*t*) = *ϕ*[*Ws*(*t*) + *b*] = *ϕ*[*g*(*t*)]. The neural transfer function was linear, τ*ϕ*(*x*) = *x*. Dynamics were simulated for *T*_{max} = 200 s. The two neurons may equivalently be thought of as two neuron groups of *M* neurons each, with a neuron from one group connecting to all neurons in the other group through uniform weights *W*/*M*. In this case, the firing rate *r*_{i}(*t*) of the *i*th group is simply the sum of the firing rates of all its individual cells. Fig. 2*F* shows the firing rates *r*_{1}(*t*) = *g*_{1}(*t*), *r*_{2}(*t*) = *g*_{2}(*t*) of the two neuron groups. The numerical variance σ^{2}(*T*) of Fig. 2*G* is computed as $\langle [\Delta r(t + T) - \Delta r(t)]^{2} \rangle$, with Δ*r*(*t*) = *r*_{1}(*t*) - *r*_{2}(*t*) and with the expectation taken over the times *t*∈[0,*T*_{max} - *T*] in the simulation. The value of the diffusivity was obtained from Eq. **2** to be *b*/2τ, by inserting the analytical solution for the steady states of neural activity in this two-neuron network: $s_1 + s_2 = b$.
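
A rough sketch of such a two-group simulation is given below, under several assumptions that go beyond the text: spikes are drawn as Poisson counts per time-step (the group rates here give about 10 expected spikes per step, so a Bernoulli draw would not do), each spike increments the synaptic activation by one, rates are clipped at zero as a safeguard, and σ^{2}(*T*) is read as the mean squared change of Δ*r* over a lag *T*. With τ*ϕ*(*x*) = *x*, the noise-free steady states then satisfy *s*_{1} + *s*_{2} = *b*, which is used as the initial condition. All function names are hypothetical.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw a Poisson count by Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_two_groups(b, tau, dt, n_steps, rng):
    """Euler-integrate two mutually inhibiting groups (W12 = W21 = -1, tau*phi(x) = x)."""
    s1 = s2 = b / 2.0                 # start on the assumed line s1 + s2 = b
    delta_r, total_s = [], []
    for _ in range(n_steps):
        r1 = max(b - s2, 0.0) / tau   # r = phi(g), g = W s + b; clipped at zero
        r2 = max(b - s1, 0.0) / tau
        s1 += -s1 * dt / tau + poisson_sample(r1 * dt, rng)
        s2 += -s2 * dt / tau + poisson_sample(r2 * dt, rng)
        delta_r.append(r1 - r2)
        total_s.append(s1 + s2)
    return delta_r, total_s


def lagged_variance(x, lag):
    """Mean of [x(t + lag) - x(t)]^2 over all t for which both samples exist."""
    diffs = [x[t + lag] - x[t] for t in range(len(x) - lag)]
    return sum(d * d for d in diffs) / len(diffs)


# Example: a short 2 s run (the paper simulates 200 s) with the quoted parameters.
rng = random.Random(0)
delta_r, total_s = simulate_two_groups(b=5000.0, tau=0.1, dt=4e-4, n_steps=5000, rng=rng)
sigma2 = lagged_variance(delta_r, lag=250)   # lag T = 250 * dt = 0.1 s
```

In this sketch the sum *s*_{1} + *s*_{2} relaxes back to *b* after each fluctuation, whereas the difference has no restoring force and performs a random walk; the growth of `lagged_variance` with the lag is what a plot like Fig. 2*G* would display.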

## Acknowledgments

We thank Sophie Deneve, Bard Ermentrout, and Haim Sompolinsky for helpful conversations and Ishay Mor for comments on the manuscript. Ila Fiete is a Sloan Foundation Fellow, a Searle Scholar, and a McKnight Scholar, and acknowledges funding from the National Science Foundation through NSF-EAGER 1148973.

## Footnotes

^{1}To whom correspondence may be addressed. E-mail: ilafiete{at}mail.clm.utexas.edu or yoram.burak{at}elsc.huji.ac.il.

Author contributions: Y.B. and I.R.F. designed research, performed research, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1117386109/-/DCSupplemental.

^{*}The result for symmetric networks is of special interest because many continuous attractor network models involve symmetric or nearly symmetric weights (7, 8, 11).

^{†}Because the internal FI is defined as the rate of Fisher information over an infinitesimally short time interval, there is no diffusion over such an interval. Thus, the diffusivity does not contribute to *J* (but *J* bounds the diffusivity).

^{‡}The converse is also true: If the information-diffusion inequality is saturated, the transfer function is exponential (*SI Appendix*, section V).

^{§}The conclusion supposes that internal FI equals the Fisher information in sensory representation, or in other words, that tuning curves with and without external inputs are similar, which is typically true in continuous attractor networks (7).

^{¶}A small tuning correction, of higher order in 1/*N*, may be needed to eliminate all systematic drift in the noisy system; see *SI Appendix*, section II.

## References

- (3) Miller EK, Erickson CA, Desimone R
- (4) Supèr H, Spekreijse H, Lamme VA
- (7) Ben-Yishai R, Bar-Or R, Sompolinsky H
- (8) Zhang K
- (9) Seung HS
- (12) Compte A, Brunel N, Rakic GP, Wang X
- (18) Leff H, Rex A
- (23) Stevens C, Zador A
- (24) Gerstner W, Kistler W
- (28) Seung HS, Sompolinsky H
- (29) Ganguli S, Huh D, Sompolinsky H
- (37) Mongillo G, Barak O, Tsodyks M
- (40) Rosenfeld N, Young JW, Alon U, Swain PS, Elowitz MB
- (45) Bays PM, Gorgoraptis N, Wee N, Marshall L, Husain M
- (47) Aksay E, Baker R, Seung HS, Tank DW

## Article Classifications

- Biological Sciences
- Neuroscience

- Physical Sciences
- Physics