# Sample and population exponents of generalized Taylor’s law

^{a}Laboratory of Ecohydrology, School of Architecture, Civil and Environmental Engineering, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland; ^{b}Department of Aquatic Ecology, Eawag: Swiss Federal Institute of Aquatic Science and Technology, CH-8600 Dübendorf, Switzerland; ^{c}Dipartimento di Fisica ed Astronomia, Università di Padova, I-35131 Padova, Italy; ^{d}Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, CZ-18208 Prague, Czech Republic; ^{e}Dipartimento di Ingegneria Civile, Edile ed Ambientale, Università di Padova, I-35131 Padova, Italy; and ^{f}Laboratory of Populations, The Rockefeller University and Columbia University, New York, NY 10065-6399

Contributed by Andrea Rinaldo, March 27, 2015 (sent for review January 14, 2015; reviewed by Pablo A. Marquet)

## Significance

Taylor’s law (TL) has been verified very widely in the natural sciences, information technology, and finance. The widespread observation of TL suggests that a context-independent mechanism may be at work and has stimulated the search for processes affecting the scaling of population fluctuations with population abundance. We show that limited sampling may explain why TL is often observed to have exponent *b* ≈ 2.

## Abstract

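Taylor’s law (TL) states that the variance *V* of a nonnegative random variable is a power function of its mean *M*; i.e., *V* = *aM*^{*b*}. Sample exponents *b* measured empirically via the scaling of sample mean and variance typically cluster around the value *b* ≈ 2, whereas theoretical models can yield other values of the population exponent *b* pertaining to the mean and variance of population density, depending on details of the growth process. Is the widely reported sample exponent *b* ≈ 2 the result of ecological processes or of statistical ones? We also consider a generalized TL involving the scaling of the *k*th vs. the *j*th cumulants.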

Taylor’s law (TL) (1), also known as fluctuation scaling in physics, is one of the most verified patterns in both the biological (2–6) and physical (7–12) sciences. TL states that the variance of a nonnegative random variable is a power function of its mean. In ecology, the random variable of interest is typically the density *N* of a censused population, and TL can arise in time (i.e., the statistics of *N* are computed over time) or in space (i.e., the statistics are computed over space). The widespread verification of TL has led many authors to suggest the existence of a universal mechanism for its emergence, although there is currently no consensus on what such a mechanism would be. Various approaches have been used in the attempt, ranging from the study of probability distributions compatible with the law (13–15) to phenomenological and mechanistic models (16–20). Although most empirical studies on spatial TL report an observed sample exponent *b* in the range 1–2 (1, 21), mostly around *b* ≈ 2, theoretical work has shown that the population exponent *b* can undergo abrupt transitions following smooth changes in the environmental autocorrelation.

Here, we distinguish between values of *b* derived from empirical fitting (sample exponents) and values obtained via theoretical models that pertain to the probability distribution of the random variable *N* (population exponents). We show that in a broad class of multiplicative growth models, the sample and population exponents coincide only if the number of observed samples or replicates is greater than an exponential function of the duration of observation. Among the relevant consequences, we demonstrate that the sample TL exponent robustly settles on *b* ≈ 2.

## Results

Let us consider multiplicative growth models in Markovian environments (24, 25). Let *t* and assume that the initial density is *Methods*). In our notation, *i* to state *j*; i.e., *Methods*) exact results on both sample and population TL exponents for a broad class of multiplicative processes, including state spaces with size higher than 2 and nonsymmetric transition matrices (*SI Methods*).

By adopting large deviation theory techniques (27, 28) and finite sample size arguments (29), we show (*Methods*) that for any choice of *χ*, the sample mean and variance in a finite set of *R* independent realizations of the process obey TL asymptotically as *e*) where the sample TL holds with different exponents. In the former regime, sample exponents inevitably tend to **9**) at small times to the asymptotic prediction **13**).
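The distinction between sample and population exponents can be explored numerically. The following Python sketch simulates *R* independent realizations of a multiplicative process and fits the slope of log sample variance against log sample mean over time. It assumes a symmetric two-state Markov chain that switches state with probability `lam` and multiplies the density by `r` or `s`; the function names and the parameter values (`r = 2`, `s = 0.25`, `lam = 0.3`, 20 replicates, 300 steps) are illustrative choices, not values taken from the paper.

```python
import math
import random


def simulate_densities(r, s, lam, n_steps, n_reps, rng):
    """Simulate n_reps independent runs of N(t) = N(t-1) * A(t), with N(0) = 1.

    A(t) equals r or s according to a symmetric two-state Markov chain that
    switches state with probability lam at each step (an assumed setup).
    Returns a list over time; each entry is a list over replicates.
    """
    factors = (r, s)
    # A uniform start is the stationary law of a symmetric two-state chain.
    states = [rng.randrange(2) for _ in range(n_reps)]
    dens = [[1.0] * n_reps]
    for _ in range(n_steps):
        dens.append([n * factors[st] for n, st in zip(dens[-1], states)])
        states = [1 - st if rng.random() < lam else st for st in states]
    return dens


def sample_mean_var(values):
    """Sample mean and unbiased sample variance across replicates."""
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / (len(values) - 1)
    return mean, var


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def sample_tl_exponent(dens, t_min):
    """Slope of log(sample variance) vs. log(sample mean) for t >= t_min."""
    log_mean, log_var = [], []
    for values in dens[t_min:]:
        mean, var = sample_mean_var(values)
        log_mean.append(math.log(mean))
        log_var.append(math.log(var))
    return ols_slope(log_mean, log_var)


rng = random.Random(0)
dens = simulate_densities(r=2.0, s=0.25, lam=0.3, n_steps=300, n_reps=20, rng=rng)
b_sample = sample_tl_exponent(dens, t_min=100)
print(f"fitted sample exponent: {b_sample:.3f}")
```

With few replicates and a long observation window, the largest replicate tends to dominate both sample moments, so the fitted slope should lie near 2. Increasing `n_reps` or shortening the window moves the simulation toward the regime in which, according to the text, sample and population exponents agree.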

We derive a generalized TL that involves the scaling of the *k*th moment vs. the *j*th moment of the distribution of *Methods*) show that the generalized TL,*t* for any choice of *j* and *k* (including noninteger values), both for population and for sample moments (the positivity of *k*th and *j*th cumulants) (*SI Methods*). In accordance with the above results on the conventional TL (recovered in this framework with the choice *C* and *D*).
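The scaling between sample moments of different orders can be illustrated in the same spirit. The sketch below works in log space so that moments of any real order (including noninteger ones) can be computed without overflow or underflow. It assumes an i.i.d. environment (each step multiplies by `r` or `s` with probability 1/2), which is only a special case of the Markovian setting; all names and parameter values are hypothetical. Under the heuristic that the largest replicate dominates every sample moment, one would expect a fitted slope near *k*/*j*; this is a property of the sketch, stated as an assumption rather than a result quoted from the text.

```python
import math
import random


def log_sample_moment(log_values, order):
    """Return log of (1/R) * sum_i exp(order * log_values[i]), computed stably."""
    scaled = [order * x for x in log_values]
    top = max(scaled)
    return top + math.log(sum(math.exp(x - top) for x in scaled) / len(scaled))


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def generalized_exponent(log_dens, j, k, t_min):
    """Slope of log <N^k> vs. log <N^j> over times t >= t_min."""
    xs = [log_sample_moment(row, j) for row in log_dens[t_min:]]
    ys = [log_sample_moment(row, k) for row in log_dens[t_min:]]
    return ols_slope(xs, ys)


# Assumed i.i.d. environment: log N performs a random walk with two step sizes.
rng = random.Random(1)
log_r, log_s = math.log(2.0), math.log(0.25)
n_reps, n_steps = 20, 300
log_dens = [[0.0] * n_reps]
for _ in range(n_steps):
    log_dens.append(
        [x + (log_r if rng.random() < 0.5 else log_s) for x in log_dens[-1]]
    )

b_21 = generalized_exponent(log_dens, j=1, k=2, t_min=100)
b_32 = generalized_exponent(log_dens, j=2, k=3, t_min=100)
b_frac = generalized_exponent(log_dens, j=1.5, k=2.5, t_min=100)  # noninteger orders
print(b_21, b_32, b_frac)
```

The exponents fitted here relate raw moments, not cumulants; the text discusses when the two are expected to scale alike.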

In ecological contexts, the number of realizations *R* that determine the possible convergence of sample and population TL exponents could refer, for instance, to independent patches experiencing different realizations of the same climate (24). In an established ecosystem, species have been present for several generations, and one might assume that the system is in the asymptotic regime *t* sample exponents satisfy the relation

A first example is drawn from a long-term census of plots within the Black Rock Forest (BRF) (5). It was shown that the Lewontin–Cohen model (a particular case of the multiplicative model studied here) describes the population dynamics of trees in BRF and provides an interpretation of the TL exponent (5). The interpretation of the six plots as distinct and independent replicates of the Lewontin–Cohen model is supported by statistical analysis (5) and allowed relating the model predictions to the spatial TL. Here, we computed, for each year *t*, the spatial sample moments *Methods*).

A second example uses the data collected by P. den Boer (30), who measured abundances of carabid beetles in various sites across The Netherlands within a 200-km^{2} area for 8 consecutive years. The dataset was shown to support the conventional spatial TL (16). We computed the sample moments of carabid beetles abundance, *t*. In the intraspecific analysis (Fig. 3), linear regressions of *Y* is the total number of years) gave the estimate of the sample exponent *Insets*); for every integer choice of *j* and *k* (here, up to *t* test does not reject the null hypothesis that the sample mean of the values of

The empirical confirmation and the finding that other demographic models predict the generalized TL with *SI Text*) indicate that these predictions are probably insensitive to the details of the dynamics, just as the original TL is quite robust (3, 15, 31).

## Discussion

Understanding to what extent widely reported macroecological patterns are the result of statistical instead of ecological processes is one of the main challenges in ecology (32). Here, we have uncovered a general mechanism that yields TL with the widely observed sample exponent *b* ≈ 2.

Limited sampling efforts might hinder the observation of abrupt transitions in population exponents that were recently discovered for theoretical multiplicative growth processes. Because fluctuations in population abundances strongly affect ecological dynamics, in particular extinction risk, comparable real-world transitions may harm fish populations, forests, and public health. Our calculation of the minimum number of samples required to observe such transitions may help to identify early signals of abrupt biotic change following smooth changes in the environment.

## Methods

### Theoretical Analysis.

Let *π* of the chain is unique and in the symmetric case satisfies

where *δ* is Kronecker’s delta. The random measure *r* appears in a realization of the Markov chain up to time *t*.

where *x* (*r* in a realization of the Markov chain up to time *t* (correspondingly, the proportion of *s* is *u* is a strictly positive vector in **4** depends on

where

The rate function does not depend on the values of the multiplicative factors *r* and *s*. As in ref. 25, we consider the ratio between

See the appendix in ref. 25 for a proof. Then, for the population moments of the population density

where *b* (which depends on *λ*) can thus be computed as

For certain values of *r* and *s*, *λ* (black line in Fig. 1*A*). The existence of such discontinuity was discovered and discussed in ref. 24. An analysis of the critical transition probability is available in *SI Methods* (Figs. S3 and S4). A generalized TL can be derived by adapting Eq. **8** to compute the scaling exponent for any pair of population moments as

Discontinuities can also arise for these population exponents (*SI Methods*).
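As a cross-check on population exponents, one can use a route different from the large deviation formulas of this section. For a finite irreducible chain, the *k*th population moment of a product of state-dependent factors grows asymptotically like the *t*th power of the spectral radius of the transition matrix multiplied by the diagonal matrix of factors raised to the power *k*. The Python sketch below applies this to a symmetric two-state chain with switching probability `lam` and factors `r` and `s` (an assumed parameterization; function names are hypothetical), using the closed-form largest eigenvalue of a 2 × 2 matrix.

```python
import math


def moment_growth_factor(r, s, lam, order):
    """Largest eigenvalue of P * diag(r**order, s**order).

    P = [[1 - lam, lam], [lam, 1 - lam]] is the assumed symmetric transition
    matrix. The eigenvalue gives the asymptotic per-step growth factor of
    the population moment E[N(t)**order].
    """
    a, b = r ** order, s ** order
    trace = (1.0 - lam) * (a + b)
    det = a * b * (1.0 - 2.0 * lam)
    # The discriminant equals (1-lam)^2 (a-b)^2 + 4 a b lam^2, hence >= 0.
    return 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))


def population_tl_exponent(r, s, lam):
    """Ratio of the log growth rates of the variance and of the mean.

    Assumes the variance grows like the second moment, which requires the
    second-moment growth factor to exceed the squared mean growth factor.
    The ratio is undefined when the mean growth factor equals 1.
    """
    rho1 = moment_growth_factor(r, s, lam, 1)
    rho2 = moment_growth_factor(r, s, lam, 2)
    if rho2 <= rho1 ** 2:
        raise ValueError("variance does not grow like the second moment")
    return math.log(rho2) / math.log(rho1)


# Illustrative values only: scan the switching probability for r = 2, s = 1/4.
for lam in (0.3, 0.5, 0.59, 0.61, 0.8):
    print(lam, population_tl_exponent(2.0, 0.25, lam))
```

For these illustrative factors the mean growth factor crosses 1 at `lam = 0.6`, so the ratio changes sign and diverges there. Whether this coincides with the critical transition probability analyzed in *SI Methods* is not verified here.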

Eqs. **9** and **10** hold true when one considers an infinite number of realizations of the multiplicative process, which ensures visiting the whole region *b* that is based on the sample mean and variance calculated over a finite set of *R* realizations of the multiplicative process. We present here a heuristic derivation of the sample exponent. A more rigorous calculation is given in *SI Methods*. We define *x* of *r* in *R* runs of the Markov chain up to time *t* is

With this definition, *r* in *R* realizations of the chain. Analogously, we define *r* are observed with probability *t*, one can adapt Varadhan’s lemma (or Laplace’s method of integration) to obtain, as a function of *t*, the approximate number of replicas *R* needed to explore rare events [i.e., to compute

Inversion of this formula (by taking the logarithm on both sides and expanding *R* realizations of the process can be approximated as

where the dependence on *t* is through **1**. Because *R* the suprema in Eq. **13** are computed over an increasingly narrower set around *t* increases (Fig. 5). Thus, for any finite number of realizations *R*, the sample exponent will approximate *R* (Eq. **12** and Fig. 2), for any choice of *λ*, *r*, and *s*. For example, with **9** are included in **9** is *r* and *s* are such that the population TL exponent *b* displays a discontinuity at *SI Methods*), then the above results give the minimum number of replicates required to observe such discontinuity also in the sample TL exponent.

Analogous considerations hold for the asymptotic sample exponent describing the scaling of the sample moments

which is the analog of Eq. **13** for any pair of sample moments. Fig. S1 *C* and *D* shows that simulation results and theoretical predictions for

A standard saddle-point calculation suggests that the limiting growth rate of the variance is equal to the limiting growth rate of the second moment also for ergodic transition matrices, apart from peculiar cases (see ref. 25 for a discussion of a counterexample). The same argument suggests that the limiting growth rate of the *k*th cumulant equals that of the *k*th moment (*t*. The suggested equivalence between the scaling exponents of cumulants and moments for ergodic *m*-step Markov chains, whose transition matrix is ergodic but not twofold irreducible. However, pathological counterexamples may exist.

Eq. **13** gives the estimated sample exponent of TL asymptotically, ignoring the constant term in the scaling of the variance *V* vs. the mean *M* as *t*, *R*, *λ*, *r*, and *s*) from the population exponent **9** (observed when *t* (thus not neglecting the constant term

(compare Eq. **8**) in Fig. 2*A* and as the sample moments in simulations in Fig. 2*B*.

See *SI Methods* for further details and generalizations.

### Empirical Analysis.

We used the BRF dataset to show that the generalized TL holds with sample exponent

The intraspecific form of TL and the generalized scaling relationship between higher moments (Eq. **2**) were tested using abundance data from 26 species of carabid beetles. We have limited the analysis of the intraspecific TL to the set of species that were present in all sites in each given year. We have followed the researchers who collected the carabid beetles abundance data (30) in excluding species with year samples with zero individuals in at least one of the sites from the statistical analysis. The authors of ref. 30 declared that they were unable to differentiate sites where a species was not present from sites where the density of such species was so low that no catches were realized. For each species, we selected data from a minimum of three to a maximum of six sites (all either woodland or heath) (30) and from a minimum of 4 to a maximum of 6 consecutive years. The precise number of sites and years varied for each species, depending on the number of sites and years in which at least one individual of such species was found in each site. The moments of species abundance were calculated separately for each species and for each available year. Linear regressions of *Y* is the total number of available years for the selected species and *k*th spatial sample moment in year *t*] gave the estimate of the sample exponent

The interspecific form of TL and the generalized scaling relationship for statistical moments (Eq. **2**) were investigated following ref. 16, using the carabid beetles dataset, computing spatial sample moments across similar sites. Data from sites labeled B, C, X, and AE in ref. 30, collected between 1961 and 1966, were used to calculate spatial moments across woodland sites. Data from sites labeled AT, N, Z, and AG in ref. 30, collected between 1963 and 1966, were used to calculate spatial moments across heath sites. As for the intraspecific TL analysis, we have limited the analysis of the interspecific TL to the set of species that were present in all sites in each given year. Spatial moments of carabid beetles abundance were computed for each species individually and separately for each year and site type (woodland or heath). For each year, we calculated the least-squares slope of *A* and Fig. S5 *M* and *N* show the scaling of the *k*th sample moment *A–L* shows the scaling of the *k*th sample moment

See *SI Methods* and Tables S3 and S4 for further details.
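The intraspecific procedure described above (spatial sample moments per year, exclusion of years with a zero count at any site, then a log-log regression across years) can be sketched as follows. This is a minimal Python illustration, not the authors' analysis code: the function names are hypothetical, and the counts are synthetic numbers constructed so that every site scales by the same yearly factor. They are not den Boer's data.

```python
import math


def spatial_moment(counts, order):
    """Sample moment of the given order across sites for one year."""
    return sum(c ** order for c in counts) / len(counts)


def years_with_all_sites_occupied(abundance):
    """Keep only years in which every site has at least one individual."""
    return {
        year: counts
        for year, counts in abundance.items()
        if all(c > 0 for c in counts)
    }


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def intraspecific_exponent(abundance, j, k):
    """Slope of log <N^k> vs. log <N^j> across the retained years."""
    kept = years_with_all_sites_occupied(abundance)
    if len(kept) < 2:
        raise ValueError("need at least two usable years for a regression")
    xs = [math.log(spatial_moment(c, j)) for c in kept.values()]
    ys = [math.log(spatial_moment(c, k)) for c in kept.values()]
    return ols_slope(xs, ys)


# Synthetic counts for one hypothetical species at four sites.
abundance = {
    1961: [3, 10, 1, 25],
    1962: [6, 20, 2, 50],
    1963: [12, 40, 4, 100],
    1964: [0, 5, 5, 5],  # dropped: one site has zero individuals
}
print(intraspecific_exponent(abundance, j=1, k=2))
```

Because the synthetic counts differ between years only by a common factor, the fitted slope equals *k*/*j* up to rounding. Real data would scatter around the regression line, and the per-species estimates would then be compared statistically, as described in the text.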

## Acknowledgments

We thank Dr. Hugo Touchette for discussions and Dr. Markus Fischer for discussions and a careful reading of the manuscript. A.R. and A.G. acknowledge the support provided by the discretionary funds of Eawag: Swiss Federal Institute of Aquatic Science and Technology and by the Swiss National Science Foundation Project 200021_157174. M.F. has been partially supported by Grantová agentura České republiky Grant P201/12/2613. J.E.C. acknowledges the support of US National Science Foundation Grant DMS-1225529 and the assistance of Priscilla K. Rogerson.

## Footnotes

^{1}To whom correspondence may be addressed. Email: andrea.rinaldo@epfl.ch, andrea.giometto@epfl.ch, or marco.formentin@ruhr-uni-bochum.de.

Author contributions: A.G., M.F., A.R., J.E.C., and A.M. designed research, performed research, analyzed data, and wrote the paper.

Reviewers included: P.A.M., Catholic University of Chile.

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1505882112/-/DCSupplemental.

Freely available online through the PNAS open access option.

## References

Only the entries whose author or publisher details are recoverable are listed, numbered by their position in the original list.

- (2) Marquet PA, et al.
- (4) Ramsayer J, Fellous S, Cohen JE, Hochberg ME.
- (5) Cohen JE, Xu M, Schuster WSF.
- (6) Giometto A, Altermatt F, Carrara F, Maritan A, Rinaldo A.
- (10) Caldarelli G.
- (14) Jørgensen B.
- (16) Hanski I.
- (27) den Hollander F. *Fields Institute Monographs* (Am Math Soc, Providence, RI).
- (29) Redner S.
- (30) den Boer P. *Miscellaneous Papers 14* (Landbouwhogeschool Wageningen, Wageningen, The Netherlands).
- (31) Xiao X, Locey KJ, White EP.
- (33) Dembo A, Zeitouni O. *Stochastic Modelling and Applied Probability* (Springer, Berlin Heidelberg).
