# Thermodynamics and signatures of criticality in a network of neurons

Contributed by William Bialek, August 4, 2015 (sent for review July 22, 2014)

## Significance

The activity of a brain—or even a small region of a brain devoted to a particular task—cannot be just the summed activity of many independent neurons. Here we use methods from statistical physics to describe the collective activity in the retina as it responds to complex inputs such as those encountered in the natural environment. We find that the distribution of messages that the retina sends to the brain is very special, mathematically equivalent to the behavior of a material near a critical point in its phase diagram.

## Abstract

The activity of a neural network is defined by patterns of spiking and silence from the individual neurons. Because spikes are (relatively) sparse, patterns of activity with increasing numbers of spikes are less probable, but, with more spikes, the number of possible patterns increases. This tradeoff between probability and numerosity is mathematically equivalent to the relationship between entropy and energy in statistical physics. We construct this relationship for populations of up to *N* = 160 neurons in a small patch of the vertebrate retina, using a combination of direct and model-based analyses of experiments on the response of this network to naturalistic movies. We see signs of a thermodynamic limit, where the entropy per neuron approaches a smooth function of the energy per neuron as *N* increases. The form of this function corresponds to the distribution of activity being poised near an unusual kind of critical point. We suggest further tests of criticality, and give a brief discussion of its functional significance.

Our perception of the world seems a coherent whole, yet it is built out of the activities of thousands or even millions of neurons, and the same is true for our memories, thoughts, and actions. It is difficult to understand the emergence of behavioral and phenomenal coherence unless the underlying neural activity also is coherent. Put simply, the activity of a brain—or even a small region of a brain devoted to a particular task—cannot be just the summed activity of many independent neurons. How do we describe this collective activity?

Statistical mechanics provides a language for connecting the interactions among microscopic degrees of freedom to the macroscopic behavior of matter. It provides a quantitative theory of how a rigid solid emerges from the interactions between atoms, how a magnet emerges from the interactions between electron spins, and so on (1, 2). These are all collective phenomena: There is no sense in which a small cluster of molecules is solid or liquid; rather, solid and liquid are statements about the joint behaviors of many, many molecules.

At the core of equilibrium statistical mechanics is the Boltzmann distribution, which describes the probability of finding a system in any one of its possible microscopic states. As we consider systems with larger and larger numbers of degrees of freedom, this probabilistic description converges onto a deterministic, thermodynamic description. In the emergence of thermodynamics from statistical mechanics, many microscopic details are lost, and many systems that differ in their microscopic constituents nonetheless exhibit quantitatively similar thermodynamic behavior. Perhaps the oldest example of this idea is the “law of corresponding states” (3).

The power of statistical mechanics to describe collective, emergent phenomena in the inanimate world led many people to hope that it might also provide a natural language for describing networks of neurons (4–6). However, if one takes the language of statistical mechanics seriously, then as we consider networks with larger and larger numbers of neurons, we should see the emergence of something like thermodynamics.

## Theory

At first sight, the notion of a thermodynamics for neural networks seems hopeless. Thermodynamics is about temperature and heat, both of which are irrelevant to the dynamics of these complex, nonequilibrium systems. However, all of the thermodynamic variables that we can measure in an equilibrium system can be calculated from the Boltzmann distribution, and hence statements about thermodynamics are equivalent to statements about this underlying probability distribution. It is then only a small jump to realize that all probability distributions over *N* variables can have an associated thermodynamics in the *N* → ∞ limit.

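To be concrete, consider a system with *N* elements; each element is described by a state $\sigma_i$, and the state of the whole system is $\sigma \equiv \{\sigma_1, \sigma_2, \ldots, \sigma_N\}$. We are interested in the probability $P(\sigma)$ of finding the system in any one of its possible states. It is natural to think not about the probability itself but about its logarithm,

$$E(\sigma) = -\ln P(\sigma). \qquad [\mathbf{1}]$$

In an equilibrium system, this is the energy of each state (in units of $k_B T$), but the same definition applies to any probability distribution. As discussed in detail in *Supporting Information*, all of thermodynamics can be derived from the distribution of these energies. Specifically, what matters is how many states have an energy close to a particular value *E*. We can count this number of states or, more simply, the number of states $\mathcal{N}(E)$ with energy less than *E*, and then define an entropy $S(E) = \ln \mathcal{N}(E)$. If we imagine a family of systems in which *N* varies, then a thermodynamic limit will exist provided that both the entropy and the energy are proportional to *N* at large *N*. The existence of this limit is by no means guaranteed.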
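The construction in this paragraph can be sketched on a toy system small enough to enumerate. The sketch below assumes, purely for illustration, independent binary neurons with made-up spike probabilities; it takes the energy of a state to be the negative log of its probability and the entropy at energy *E* to be the log of the number of states with energy at most *E*. The function names are hypothetical and are not taken from the paper.

```python
import math
from bisect import bisect_right
from itertools import product

def state_energies(p_spike):
    """Return E = -ln P for every state of a set of independent binary neurons.

    p_spike is a hypothetical list of per-neuron spike probabilities;
    independence is assumed only to keep the toy model small.
    """
    energies = []
    for state in product((0, 1), repeat=len(p_spike)):
        log_p = 0.0
        for s, p in zip(state, p_spike):
            log_p += math.log(p if s else 1.0 - p)
        energies.append(-log_p)
    return energies

def entropy_vs_energy(energies):
    """Return sorted (E, S) pairs, with S = ln(number of states with energy <= E)."""
    ordered = sorted(energies)
    return [(e, math.log(bisect_right(ordered, e))) for e in ordered]

if __name__ == "__main__":
    energies = state_energies([0.05, 0.1, 0.2, 0.15])
    for e, s in entropy_vs_energy(energies):
        print(f"E = {e:.3f}  S = {s:.3f}")
```

With only four neurons there is no meaningful thermodynamic limit; the point is only to show which quantities are being counted.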

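In most systems, including the networks that we study here, there are few states with high probability, and many more states with low probability. At large *N*, the competition between decreasing probability and increasing numerosity picks out a special value of the energy, $E = E^*$, which is the energy of the typical states that we actually observe; $E^*$ is the solution of

$$\frac{dS(E)}{dE} = 1, \qquad [\mathbf{2}]$$

where $S(E)$ is the entropy at energy *E*. For most systems, the energy fluctuates only slightly around $E^*$, so that the states we see all have nearly the same log probability per degree of freedom. Hidden in the function $S(E)$, however, are all of the parameters describing the interactions among the *N* degrees of freedom in the system. At special values of these parameters, the second derivative $d^2S/dE^2$ vanishes at $E = E^*$, and the variance of *E* diverges as *N* becomes large. This is a critical point, and it is mathematically equivalent to the divergence of the specific heat in an equilibrium system (10).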
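A short derivation may help to make the typical energy and the critical point concrete. The notation is an assumption made for this sketch: $S(E)$ is the logarithm of the number of states near energy $E$, each state at energy $E$ has probability $e^{-E}$, and corrections that are not proportional to $N$ are ignored.

$$
P(E) \propto e^{S(E) - E}, \qquad
\frac{d}{dE}\Big[S(E) - E\Big]_{E = E^*} = 0
\;\Rightarrow\; \frac{dS}{dE}\bigg|_{E = E^*} = 1,
$$

$$
S(E) - E \approx S(E^*) - E^* + \frac{1}{2}\, S''(E^*)\,(E - E^*)^2
\;\Rightarrow\; \langle (\delta E)^2 \rangle \approx -\frac{1}{S''(E^*)} .
$$

If $S = N s(\epsilon)$ with $\epsilon = E/N$, then $S'' = s''(\epsilon)/N$, so the variance of the energy per element is $1/(N\,|s''|)$ and shrinks as $N$ grows, unless $s''$ itself goes to zero, which is the critical case.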

These observations focus our attention on the “density of states”

## Experimental Example

The vertebrate retina offers a unique system in which the activity of most of the neurons comprising a local circuit can be monitored simultaneously using multielectrode array recordings. As described more fully in ref. 11, we stimulated salamander retina with naturalistic grayscale movies of fish swimming in a tank (Fig. 1*A*), while recording from 100 to 200 retinal ganglion cells (RGCs); additional experiments used artificial stimulus ensembles, as described in *Supporting Information*. Sorting the raw data (12), we identified spikes from 160 neurons whose activity passed our quality checks and was stable for the whole duration of the experiment; see Fig. 1*B*. These experiments monitored a substantial fraction of the RGCs in the area of the retina from which we record, capturing the behavior of an almost complete local population responsible for encoding a small patch of the visual world. The experiment collected a total of

## Counting States

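Conceptually, estimating the entropy as a function of energy is simple: we estimate the probability of each state from how often it occurs, and then count how many states have log probability in a given range. In Fig. 1 *C* and *D*, we show the first steps in this process. We identify the unique patterns of activity (combinations of spiking and silence across all 160 neurons) that occur in the experiment, and then count how many times each of these patterns occurs.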
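The counting step described here is easy to sketch. The code below is only an illustration under stated assumptions: the "recording" is synthetic (independent, sparse spiking in 10 hypothetical neurons), each time bin is represented as a tuple of 0s and 1s, and the function names are invented for this example.

```python
import math
import random
from collections import Counter

def count_patterns(binary_words):
    """Count how many times each distinct pattern (a sequence of 0/1) occurs."""
    return Counter(tuple(word) for word in binary_words)

def empirical_energies(counts):
    """Map each observed pattern to E = -ln(empirical frequency)."""
    total = sum(counts.values())
    return {pattern: -math.log(c / total) for pattern, c in counts.items()}

if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for recorded data: 10 neurons, sparse independent spiking.
    words = [[1 if rng.random() < 0.05 else 0 for _ in range(10)]
             for _ in range(5000)]
    counts = count_patterns(words)
    energies = empirical_energies(counts)
    for pattern, c in counts.most_common(3):
        print(pattern, c, round(energies[pattern], 3))
```

Patterns that never occur in the sample have no empirical energy at all, which is one reason a model is needed to reach into the low-probability tail.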

Even without trying to compute the entropy, the results in Fig. 1*D* are surprising. With *N* neurons that can either spike or remain silent, there are $2^N$ possible states.

To probe more deeply into the tail of low-probability events, we can construct models of the distribution of states, and we have done this using the maximum entropy method (11, 13): We take from experiment certain average behaviors of the network, and then search for models that match these data but otherwise have as little structure as possible. This works if matching a relatively small number of features produces a model that predicts many other aspects of the data.
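The maximum entropy idea can be illustrated on a problem small enough to solve exactly. The sketch below is a simplification and not the model used in the paper: it assumes three binary units, matches only mean activities and pairwise correlations, computes model averages by enumerating all states, and adjusts parameters by plain gradient ascent. The parameter names `h` and `J` and the "true" values used to generate targets are hypothetical.

```python
import math
from itertools import product

def model_moments(h, J):
    """Exact means and pairwise moments of P(s) proportional to
    exp(sum_i h[i]*s[i] + sum_{i<j} J[(i, j)]*s[i]*s[j]), with s[i] in {0, 1}."""
    n = len(h)
    states = list(product((0, 1), repeat=n))
    weights = []
    for s in states:
        log_w = sum(h[i] * s[i] for i in range(n))
        log_w += sum(J[(i, j)] * s[i] * s[j] for (i, j) in J)
        weights.append(math.exp(log_w))
    z = sum(weights)  # normalization constant
    mean = [sum(w * s[i] for w, s in zip(weights, states)) / z for i in range(n)]
    corr = {(i, j): sum(w * s[i] * s[j] for w, s in zip(weights, states)) / z
            for (i, j) in J}
    return mean, corr

def fit_pairwise_maxent(target_mean, target_corr, steps=20000, lr=0.5):
    """Gradient ascent on the log-likelihood: move each parameter by the
    difference between the target moment and the current model moment."""
    h = [0.0] * len(target_mean)
    J = {pair: 0.0 for pair in target_corr}
    for _ in range(steps):
        mean, corr = model_moments(h, J)
        for i in range(len(h)):
            h[i] += lr * (target_mean[i] - mean[i])
        for pair in J:
            J[pair] += lr * (target_corr[pair] - corr[pair])
    return h, J

if __name__ == "__main__":
    # Hypothetical parameters, used only to generate self-consistent targets.
    true_h = [-1.0, -1.5, -0.5]
    true_J = {(0, 1): 0.8, (0, 2): -0.4, (1, 2): 0.3}
    target_mean, target_corr = model_moments(true_h, true_J)
    h, J = fit_pairwise_maxent(target_mean, target_corr)
    print([round(x, 3) for x in h], {k: round(v, 3) for k, v in J.items()})
```

Exact enumeration is feasible only for a handful of units; for populations of the size discussed in the text, model averages have to be estimated by other means, which this sketch does not attempt.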

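The maximum entropy approach to networks of neurons has been explored, in several different systems, for nearly a decade (14–23), and there have been parallel efforts to use this approach in other biological contexts (24–35). Recently, we have used the maximum entropy method to build models for the activity of populations of neurons in the experiments described above (11). The quantities taken from experiment include the distribution of summed activity in the network, that is, the probability that *K* out of the *N* neurons spike in the same small window of time. The resulting model has the Boltzmann form, with the probability of a state proportional to $\exp[-E(\sigma)]$ and a constant *Z* set to ensure normalization. All of the parameters are fixed by requiring that the model reproduce the measured averages.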

This model accurately predicts the correlations among triplets of neurons (figure 7 in ref. 11), and how the probability of spiking in individual neurons depends on activity in the rest of the population (figure 9 in ref. 11). One can even predict the time-dependent response of single cells from the behavior of the population, without reference to the visual stimulus (figure 15 in ref. 11). Most important for our present discussion, the distribution of the energy

The direct counting of states (Fig. 1) and the maximum entropy models (Fig. 2) give us two complementary ways of estimating the entropy as a function of energy; details are given in *Supporting Information*.

As emphasized above, the plot of entropy vs. energy contains all of the thermodynamic behavior of a system, and this has a meaning for any probability distribution, even if we are not considering a system at thermal equilibrium. Thus, Fig. 3 is as close as we can get to constructing the thermodynamics of this network. With the direct counting of states, we see less and less of the plot at larger *N*, but the part we can see is approaching a limit as *N* increases. This is not a trivial result: if we write down a model of the form in Eq. **4**, then, in a purely theoretical discussion, we can scale the couplings between neurons with *N* to guarantee the existence of a thermodynamic limit (5), but with parameters inferred from real data, no such scaling can be imposed by hand. The extrapolation to large *N* is shown in Fig. 3*A*, *Inset*. The results of this extrapolation are strikingly simple: The entropy is equal to the energy, within (small) error bars.

## Interpreting the Entropy vs. Energy Plot

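If the plot of entropy vs. energy is a straight line with unit slope, then Eq. **2** is solved not by one single value of *E* but by a whole range. Not only do we have a vanishing second derivative of the entropy, $d^2S/dE^2 = 0$, as at an ordinary critical point, but all higher-order derivatives are zero as well.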

We expect that states of lower probability (e.g., those in which more cells spike) are more numerous (because there are more ways to arrange *K* spikes among *N* cells as *K* increases from very low values). However, the usual result is that this trade-off—which is precisely the trade-off between energy and entropy in thermodynamics—selects typical states that all have roughly the same probability. The statement that

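The vanishing of the second derivative of the entropy is equivalent to a divergence of the specific heat, and the maximum entropy model in Eqs. **3** and **4** is mathematically identical to the Boltzmann distribution. Thus, we can take this model seriously as a statistical mechanics problem, and compute the specific heat in the usual way. Further, we can change the effective temperature by considering a one-parameter family of models,

$$P(\sigma; T) = \frac{1}{Z(T)} \exp\left[-\frac{E(\sigma)}{T}\right],$$

with the energy $E(\sigma)$ as in Eq. **4**. Changing *T* is just a way of probing one direction in the parameter space of possible models, and is not a physical temperature; the goal is to see whether there is anything special about the model (at *T* = 1) that describes the real system.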

Results for the heat capacity of our model vs. *T* are shown in Fig. 4. There is a dramatic peak, and, as we look at larger groups of neurons, the peak grows and moves closer to *T* = 1, the model of the real network. The peak grows even after normalizing by *N*, so that the specific heat, or heat capacity per neuron, is growing with *N*, as expected at a critical point. These signatures are clearer in models that provide a more accurate description of the population activity; for details, see *Supporting Information*.
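For any finite list of state energies, the heat capacity as a function of an effective temperature can be computed directly. The sketch below uses the standard statistical mechanics relation $C(T) = \mathrm{Var}(E)/T^2$ under weights proportional to $\exp(-E/T)$, in units where Boltzmann's constant is 1. The toy energy list and the function name are assumptions for illustration and have nothing to do with the fitted retinal models.

```python
import math

def heat_capacity(energies, temperature):
    """C(T) = Var(E) / T^2 under weights exp(-E / T), by exact enumeration."""
    e_min = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    z = sum(weights)
    mean = sum(w * e for w, e in zip(weights, energies)) / z
    mean_sq = sum(w * e * e for w, e in zip(weights, energies)) / z
    return (mean_sq - mean * mean) / temperature ** 2

if __name__ == "__main__":
    toy_energies = [0.0, 1.0, 1.0, 2.5, 3.0, 3.0, 3.0, 4.0]  # hypothetical
    for t in (0.5, 0.75, 1.0, 1.5, 2.0):
        print(f"T = {t:.2f}  C = {heat_capacity(toy_energies, t):.4f}")
```

Dividing the result by the number of neurons gives the specific heat; the question raised in the text is whether its peak grows with *N* and where it sits relative to *T* = 1.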

The temperature is only one axis in parameter space, and, along this direction, there are variations in both the correlations among neurons and their mean spike rates. As an alternative, we consider a family of models in which the strength of correlations changes but spike rates are fixed. To do this, we replace the energy function in Eq. **4** with one in which a parameter *α* controls the strength of correlations, and we adjust all of the single-neuron parameters so that the mean spike rates remain fixed at their measured values.

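At small *α*, the model approaches a population of independent neurons (Fig. 5*A*, *Left*). For large *α*, correlations are stronger than in the data (Fig. 5*A*, *Right*). This is reflected in a distribution of states that cluster around a small number of prototypical patterns, much as in the Hopfield network (4, 5). The entropy vs. energy plot, shown in Fig. 5*B*, singles out the ensemble at *α* = 1 (Fig. 5*A*, *Middle*): Going toward independence (smaller *α*) gives rise to a concave bump at low energies, whereas stronger correlations (larger *α*) also pull the plot away from a straight line. We also see in Fig. 5*D* that there is a peak in the specific heat of the model ensemble near *α* = 1.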

The evidence for criticality that we find here is consistent with extrapolations from the analysis of smaller populations (15, 18). Those predictions were based on the assumption that spike probabilities and pairwise correlations in the smaller populations are drawn from the same distribution as in the full system, and that these distributions are sufficient to determine the thermodynamic behavior (36). Signs of criticality also are observable in simpler models, which match only the distribution of summed activity in the network (22), but less accurate models have weaker signatures of criticality (Fig. S2).

## Couldn’t It Just Be…?

In equilibrium thermodynamics, poising a system at a critical point involves careful adjustment of temperature and other parameters. Finding that the retina seems to be poised near criticality should thus be treated with some skepticism. Here we consider some ways in which we could be misled into thinking that the system is critical when it is not (see also *Supporting Information*).

Part of our analysis is based on the use of maximum entropy models, and one could worry that the inference of these models is unreliable for finite data sets (37, 38). Expanding on the discussion of this problem in ref. 11, we find a clear peak in the specific heat when we learn models for

Although the inference of maximum entropy models is accurate, less interesting models might mimic the signatures of criticality. In particular, it has been suggested that independent neurons with a broad distribution of spike rates could generate a distribution of *N*-neuron activity patterns that mimics these signatures. However, for independent neurons the heat capacity is exactly proportional to *N*, so such models cannot reproduce the growth of the specific heat with *N* (Fig. 4), which is an essential feature of the data and its support for criticality.
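The behavior of independent neurons can be checked numerically. For independent binary units the energy is a sum of single-neuron terms, so the heat capacity is also a sum of single-neuron terms and the heat capacity per neuron cannot grow with population size. The sketch below assumes made-up spike probabilities and hypothetical function names, and compares the closed-form sum with brute-force enumeration.

```python
import math
from itertools import product

def independent_heat_capacity(p_spike, temperature):
    """Heat capacity of independent binary neurons: a sum of single-neuron terms."""
    total = 0.0
    for p in p_spike:
        eps = math.log((1.0 - p) / p)                   # energy cost of a spike at T = 1
        q = 1.0 / (1.0 + math.exp(eps / temperature))   # spike probability at temperature T
        total += (eps / temperature) ** 2 * q * (1.0 - q)
    return total

def brute_force_heat_capacity(p_spike, temperature):
    """Same quantity by enumerating all 2^N states (feasible only for small N)."""
    eps = [math.log((1.0 - p) / p) for p in p_spike]
    energies = [sum(e * s for e, s in zip(eps, state))
                for state in product((0, 1), repeat=len(eps))]
    weights = [math.exp(-e / temperature) for e in energies]
    z = sum(weights)
    mean = sum(w * e for w, e in zip(weights, energies)) / z
    mean_sq = sum(w * e * e for w, e in zip(weights, energies)) / z
    return (mean_sq - mean ** 2) / temperature ** 2

if __name__ == "__main__":
    rates = [0.05, 0.1, 0.2, 0.3]  # hypothetical spike probabilities per time bin
    for copies in (1, 2, 4):
        population = rates * copies
        c = independent_heat_capacity(population, 1.0)
        print(len(population), round(c / len(population), 6))
```

Replicating the same set of rates changes the total heat capacity but leaves the value per neuron unchanged, in contrast to the growth with *N* described in the text.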

In maximum entropy models, the probability distribution over patterns of neural activity is described in terms of interactions between neurons, such as the coupling terms in Eq. **4**; an alternative view is that the correlations result from the response of the neurons to fluctuating external signals. Testing this idea has a difficulty that has nothing to do with neurons: In equilibrium statistical mechanics, models in which spins (or other degrees of freedom) interact with one another are mathematically equivalent to a collection of spins responding independently to fluctuating fields (see *Supporting Information* for details). Thus, correlations always are interpretable as independent responses to unmeasured fluctuations, and, for neurons, there are many possibilities, including sensory inputs. However, the behavior that we see cannot be simply inherited from correlations in the visual stimulus, because we find signatures of criticality in response to movies with very different correlation structures (Fig. S4). Further, the pattern of correlations among neurons is not simply explained in terms of overlaps among receptive fields (Fig. S5), and, at fixed moments in the stimulus movie, neurons with nonzero spike probabilities have correlations across stimulus repetitions that can be even stronger than across the experiment as a whole (Fig. S6).
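The equivalence between interacting and independently driven spins can be illustrated in the simplest case. As an assumption made only for this sketch, take a uniform positive coupling $J/N$ among all elements, so that the interaction depends only on the summed activity $\sum_i \sigma_i$; the general argument is the one referred to in *Supporting Information*. A Gaussian integral identity gives

$$
\exp\left[\frac{J}{2N}\Big(\sum_{i=1}^{N}\sigma_i\Big)^{2}\right]
= \sqrt{\frac{N}{2\pi J}} \int_{-\infty}^{\infty} dh\,
\exp\left[-\frac{N h^{2}}{2J}\right] \prod_{i=1}^{N} e^{h\sigma_i} .
$$

The left side is an interaction term; the right side is a weighted integral over a field $h$ shared by all elements, and for each fixed value of $h$ the integrand factorizes, so that every $\sigma_i$ responds to $h$ independently of the others.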

When we rewrite a model of interacting spins as independent spins responding to fluctuating fields, criticality usually requires that the distribution of fluctuations be very special, e.g., with the variance tuned to a particular value. In this sense, saying that correlations result from fluctuating inputs doesn’t explain our observations. Recently, it has been suggested that sufficiently broad distributions of fluctuations lead generically to critical phenomenology (40). As explained in *Supporting Information*, mean field models have the property that the variance of the effective fields becomes large at the critical point, but more general models do not, and the correlations we observe do not have the form expected from a mean field model. The fact that quantitative changes in the strength of correlations would drive the system away from criticality (Fig. 5*D*) suggests that the distribution of equivalent fluctuating fields must be tuned, rather than merely having sufficiently large fluctuations.

## Discussion

The traditional formulation of the neural coding problem makes an analogy to a dictionary, asking for the meaning of each neural response in terms of events in the outside world (41). However, before we can build a dictionary, we need to know the lexicon, and, for large populations of neurons, this already is a difficult problem: With 160 neurons, the number of possible responses is larger than the number of words in the vocabulary of a well-educated English speaker, and is more comparable to the number of possible short phrases or sentences. In the same way that the distribution of letters in words embodies spelling rules (28), and the distribution of words in sentences encodes aspects of grammar (42) and semantic categories (43), we expect the distribution of activity across neurons to reveal structures of biological significance.

In the small patch of the retina that we consider, no two cells have truly identical input/output characteristics (44). Nonetheless, if we count how many combinations of spiking and silence have a given probability in groups of *N* cells, the resulting relationship simplifies as *N* increases. This relationship between probability and numerosity of states is mathematically identical to the relationship between energy and entropy in statistical physics, and the simplification with increasing *N* suggests that we are seeing signs of a thermodynamic limit.

If we can identify the thermodynamic limit, we can try to place the network in a phase diagram of possible networks. Critical surfaces that separate different phases often are associated with a balance between probability and numerosity: States that are a factor *F* times less probable also are a factor *F* times more numerous. At conventional critical points, this balance occurs only in a small neighborhood of the typical probability, but, in the network of RGCs, it extends across a wide range of probabilities (Fig. 3). In model networks with slightly stronger or weaker correlations among pairs of neurons, this balance breaks down (Fig. 5), and less accurate models have weaker signatures of criticality (Fig. S2).
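The balance between probability and numerosity can be written out in one line. As a sketch, assume the notation $E = -\ln P$ for the energy of a state and $\mathcal{N}(E) = e^{S(E)}$ for the number of states with energy up to $E$. If the entropy equals the energy, then

$$
P \to \frac{P}{F} \;\Longleftrightarrow\; E \to E + \ln F,
\qquad
S(E) = E \;\Rightarrow\; \mathcal{N}(E + \ln F) = e^{E + \ln F} = F\,\mathcal{N}(E),
$$

so lowering the probability by a factor $F$ multiplies the number of states by the same factor, across the whole range over which the linear relation holds.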

The strength of correlations depends on the structure of visual inputs, on the connectivity of the neural circuit, and on the state of adaptation in the system. The fact that we see signatures of criticality in response to very different movies, but not in model networks with stronger or weaker correlations, suggests that adaptation is tuning the system toward criticality. A sudden change of visual input statistics should thus drive the network to a noncritical state, and, during the course of adaptation, the distribution of activity should relax back to the critical surface. This can be tested directly.

Is criticality functional? The extreme inhomogeneity of the probability distribution over states makes it possible to have an instantaneously readable code for events that have a large dynamic range of likelihoods or surprise, and this may be well-suited to the natural environment; it is not, however, an efficient code in the usual sense. Systems near critical points are maximally responsive to certain external signals, and this sensitivity also may be functionally useful. Most of the systems that exhibit criticality in the thermodynamic sense also exhibit a wide range of time scales in their dynamics, so that criticality may provide a general strategy for neural systems to bridge the gap between the microscopic time scale of spikes and the macroscopic time scales of behavior. Critical states are extremal in all these different senses, and more; it may be difficult to decide which is relevant for the organism.

Related signatures of criticality have been detected in ensembles of amino acid sequences for protein families (29), in flocks of birds (33) and swarms of insects (45), and in the network of genes controlling morphogenesis in the early fly embryo (46); there is also evidence that cell membranes have lipid compositions tuned to a true thermodynamic critical point (47). Different, dynamical notions of criticality have been explored in neural (48, 49) and genetic (50, 51) networks, and in the active mechanics of the inner ear (52–54); recent work connects dynamical and statistical criticality, with the retina as an example (55). These results hint at a general principle, but there is room for skepticism. A new generation of experiments should provide decisive tests of these ideas.

## Materials and Methods

Experiments were performed on the larval tiger salamander, *Ambystoma tigrinum tigrinum*, in accordance with institutional animal care standards at Princeton University.

## Acknowledgments

We thank A. Cavagna, D. S. Fisher, I. Giardina, M. Ioffe, S. C. F. van Opheusden, E. Schneidman, D. J. Schwab, J. P. Sethna, and A. M. Walczak for helpful discussions and comments on the manuscript. Research was supported in part by National Science Foundation Grants PHY-1305525, PHY-1451171, and CCF-0939370, by National Institutes of Health Grant R01 EY14196, and by Austrian Science Foundation Grant FWF P25651. Additional support was provided by the Fannie and John Hertz Foundation, by the Swartz Foundation, by the W. M. Keck Foundation, and by the Simons Foundation.

## Footnotes

- ¹To whom correspondence should be addressed. Email: wbialek@princeton.edu.

Author contributions: G.T., T.M., O.M., D.A., S.E.P., M.J.B., and W.B. designed research, performed research, contributed new reagents/analytic tools, analyzed data, and wrote the paper. This is a collaboration between theorists (G.T., T.M., S.E.P., and W.B.) and experimentalists (O.M., D.A., and M.J.B.). All authors contributed to all aspects of the work.

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1514188112/-/DCSupplemental.

Freely available online through the PNAS open access option.

## References

- ↵.
- Anderson PW

- ↵.
- Sethna JP

- ↵
- ↵.
- Hopfield JJ

- ↵.
- Amit DJ

- ↵.
- Hertz J,
- Krogh A,
- Palmer RG

- ↵.
- Ruelle D

- ↵
- ↵
- ↵
- ↵
- ↵.
- Marre O, et al.

- ↵
- ↵
- ↵.
- Tkačik G,
- Schneidman E,
- Berry MJ II,
- Bialek W

- ↵
- ↵.
- Tang A, et al.

- ↵.
- Tkačik G,
- Schneidman E,
- Berry MJ II,
- Bialek W

- ↵.
- Shlens J, et al.

- ↵
- ↵.
- Ganmor E,
- Segev R,
- Schneidman E

- ↵.
- Tkačik G, et al.

- ↵
- ↵.
- Lezon TR,
- Banavar JR,
- Cieplak M,
- Maritan A,
- Fedoroff NV

- ↵.
- Tkačik G

- ↵.
- Bialek W,
- Ranganathan R

- ↵.
- Weigt M,
- White RA,
- Szurmant H,
- Hoch JA,
- Hwa T

- ↵
- ↵.
- Mora T,
- Walczak AM,
- Bialek W,
- Callan CG Jr

- ↵
- ↵.
- Sułkowska JI,
- Morcos F,
- Weigt M,
- Hwa T,
- Onuchic JN

- ↵.
- Bialek W, et al.

- ↵.
- Bialek W, et al.

- ↵
- ↵
- ↵
- ↵
- ↵.
- Marsili M,
- Mastromatteo I,
- Roudi Y

- ↵.
- van Opheusden SCF

- ↵
- ↵.
- Rieke F,
- Warland D,
- de Ruyter van Steveninck R,
- Bialek W

- ↵.
- Pereira F

- ↵.
- Pereira FC,
- Tishby N,
- Lee L

*Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics*, ed Schubert LK (Assoc Comput Linguist, Stroudsburg, PA), pp 183–190

- ↵.
- Becker S,
- Thrun S,
- Obermayer K

- Schneidman E,
- Bialek W,
- Berry MJ II

- ↵
- ↵.
- Krotov D,
- Dubuis JO,
- Gregor T,
- Bialek W

- ↵
- ↵.
- Beggs JM,
- Plenz D

- ↵
- ↵.
- Nykter M, et al.

- ↵
- ↵
- ↵.
- Camalet S,
- Duke T,
- Jülicher F,
- Prost J

- ↵
- ↵

## Article Classifications

- Physical Sciences
- Physics

- Biological Sciences
- Neuroscience