# Rainfall statistics, stationarity, and climate change

Contributed by Graham D. Farquhar, January 6, 2018 (sent for review August 23, 2017; reviewed by Hoshin V. Gupta and Alberto Montanari)

## Significance

Precipitation shows large year-to-year variations, and there is interest in whether there have been long-lasting changes. We use a global land-based database (1940–2009) of annual precipitation and find evidence for changes at around 14% of the global land surface. In contrast, around 76% of the global land shows little or no change. Our results emphasize the importance of fully accounting for natural variability when assessing long-term precipitation change.

## Abstract

There is a growing research interest in the detection of changes in hydrologic and climatic time series. Stationarity can be assessed using the autocorrelation function, but this is not yet common practice in hydrology and climate. Here, we use a global land-based gridded annual precipitation (hereafter *P*) database (1940–2009) and find that the lag 1 autocorrelation coefficient is statistically significant at around 14% of the global land surface, implying nonstationary behavior (90% confidence). In contrast, around 76% of the global land surface shows little or no change, implying stationary behavior. We use these results to assess change in the observed *P* over the most recent decade of the database. We find that the changes for most (84%) grid boxes are within the plausible bounds of no significant change at the 90% CI. The results emphasize the importance of adequately accounting for natural variability when assessing change.

Long-term planning for water resources, agriculture, and irrigation and for associated infrastructure is currently based on the ability to adequately characterize variations in the two key observables: streamflow and precipitation (*P*) (1–3). The starting point is to gather the longest reliable instrumental time series to characterize the hydrologic statistics of streamflow and *P* (2). That traditional practice is intrinsic throughout hydrology. To use that historical knowledge requires a further assumption relating to the stationarity of the relevant time series. The assumption has been recently challenged (4), because future streamflow may not be stationary. The objection is well-supported by evidence of anthropogenic impacts on streamflow, such as increasing water extraction for irrigation or land use/cover changes (5–8). One consequence is that the future envelope of variability in streamflow is not necessarily well-represented by the past envelope, which potentially presents a major challenge to the community of hydrologists and water resource engineers (9, 10). A question remains about the stationarity of *P*. Several studies have noted changes in the intensity of short-term precipitation (e.g., subdaily) related to increasing temperatures (11, 12). The implication is that the short-term *P* is not stationary. The long-term *P* of relevance for water resources, agriculture, and irrigation depends on both storm intensity and storm frequency, and whether that is stationary has yet to be addressed. Here, we focus on interannual variations and restrict the analysis to annual *P*.

Recently, it has been widely argued that common climate variables (e.g., *P*) will be stationary if the mean remains constant over climatic (e.g., 30-y) timescales (for example, in ref. 13 and earlier in refs. 14 and 15). However, the formal definition of stationarity goes beyond the mean and involves higher-order statistics (variance and higher-order moments) as well as the autocovariance (or autocorrelation) (16–19). This formal definition, known as strict stationarity, has limited utility in practical applications, because we can only ever have samples from the underlying population. For that reason, practical applications use a different framework known as weak stationarity (sometimes called covariance stationarity), whose definition is restricted to the mean and the autocorrelation (or autocovariance) (16, 17). We adopt this as our practical definition of stationarity. With that in mind, a time series is considered stationary if (*i*) the mean is constant and (*ii*) the autocorrelation only depends on the relative position in the time series (16, 17).

In climate science, we are restricted to sampling from finite periods, and the best that we can do is estimate the mean from the longest available records to provide the baseline against which we can assess the characteristically large year-to-year variations in *P* (20–22). The other part of the definition refers to the autocorrelation. The autocorrelation is computed by shifting the data series in time and calculating the correlation with the original time series. Treating the autocorrelation as a function of time means tracking how it (at any given lag) evolves as time passes. The autocorrelation at lag zero is one by definition. A useful reference is the autocorrelation function for a purely random (i.e., independent in a statistical sense and not implying nondeterminism) time series that, for all other lags, has an autocorrelation that is statistically indistinguishable from zero.
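The shift-and-correlate computation described above can be sketched as follows. This is a minimal illustration, assuming the common estimator that normalizes every lag by the lag-zero sum of squares; the function name `autocorrelation` and the synthetic white-noise series are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    anomalies = x - x.mean()
    denom = np.sum(anomalies ** 2)
    # Correlate the series with a copy of itself shifted by `lag` steps.
    return np.array([
        np.sum(anomalies[: x.size - lag] * anomalies[lag:]) / denom
        for lag in range(max_lag + 1)
    ])

# Synthetic white noise (not observed data): mean and spread are placeholders.
rng = np.random.default_rng(0)
white_noise = rng.normal(loc=644.0, scale=110.0, size=244)
acf = autocorrelation(white_noise, max_lag=30)
```

For a purely random series, `acf[0]` is one and the remaining entries scatter around zero, which is the reference behavior the text describes.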

For example, in the Murray–Darling Basin in southeastern Australia, the autocorrelation of the annual *P* time series (1901–2007) is statistically indistinguishable from zero for lags from 1 to 30 y (23). That result shows that the 106-y record of annual rainfall for the Murray–Darling Basin is indistinguishable from a random process. In a physical sense, it means that there is no memory (17) of *P* from one year to the next. Given the seemingly random nature of annual *P* time series (24–27), we anticipated that the Murray–Darling Basin results might be typical of many regions worldwide.

## Annual Precipitation at the Radcliffe Observatory

To develop a deeper understanding, we use one of the longest instrumental records of annual *P*: the 244-y (1767–2010) record from the Radcliffe Observatory site at Oxford, United Kingdom (28) (Fig. 1*A*). We first estimate the autocorrelation of the time series for lags from 0 to 80 y. [The probability density function of the autocorrelation follows the normal distribution (*SI Appendix*, Fig. S1).] The 90% confidence limits (18, 19) for the autocorrelation of a random series are approximately $\pm 1.645/\sqrt{n}$ (about ±0.105 for $n = 244$), and the estimated autocorrelation is shown against those limits in Fig. 1*B*. The process can be considered random. Note that, within the time series, one can detect shorter periods with apparent trends. For example, a steady increase in *P* from 1801 to 1850 is statistically significant (at the 5% level of significance) according to the Mann–Kendall test.
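For readers unfamiliar with the Mann–Kendall test mentioned above, a minimal sketch follows. It assumes the textbook form of the test (normal approximation with continuity correction, no correction for ties); the paper does not state which variant was used, and the function name `mann_kendall` is illustrative.

```python
import math

def mann_kendall(x):
    """Two-sided Mann-Kendall trend test (no tie correction); returns (S, z, p)."""
    n = len(x)
    # S counts concordant minus discordant pairs over all i < j.
    s = sum(
        (x[j] > x[i]) - (x[j] < x[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p
```

Applying such a test to many overlapping subperiods of a long random series will occasionally flag a "significant" trend by chance, which is consistent with the observation that a subperiod can show a trend while the full record remains indistinguishable from a random process.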

We now calculate the averages over climatic timescales: the commonly used 30-y (Fig. 1*C*) period as well as a 10-y (Fig. 1*D*) period. The results show that the average changes from one period to the next, while the overall time series remains stationary.

The key question is whether the variations in the average *P* over successive 30-y (or 10-y) periods are large enough to be considered statistically significant. We denote the variance of the 244-y series by $\sigma^{2}$ [in (mm yr$^{-1}$)$^{2}$], and the mean ($\mu$) is 644 mm yr$^{-1}$. The time series has no serial correlation (i.e., zero autocorrelation), and sampling theory can be used to estimate the SE. The SE of the 244-y mean (29) is $\sigma/\sqrt{244}$. For the 30-y average, the SE is $\sigma/\sqrt{30}$.

Because of the absence of serial correlation, the SD of the difference between the means is $\sqrt{\sigma^{2}/30 + \sigma^{2}/244}$. The relevant 90% CI (±1.64 times that SD) for the 30-y averages is shown in Fig. 1*C*. That variation is equivalent to 5.5% of the long-term mean $\mu$. (The same logic was used in Fig. 1*D*.) Of the eight successive 30-y periods, one particularly dry period (1887–1916) fell outside the 90% limits, while two other 30-y periods were close to the bounds. For the decadal periods, six averages were on the 90% limits, with the rest falling within the 90% bounds (Fig. 1*D*). Note that use of a 95% CI would strengthen the conclusion (*SI Appendix*, Fig. S2). However, to reduce the probability of making a type II error, we adopt the 90% CI (19). In summary, the long-term Radcliffe Observatory annual *P* data reveal a record that is more or less indistinguishable from a random process with mean $\mu = 644$ mm yr$^{-1}$ and variance $\sigma^{2}$.
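The sampling-theory arithmetic in the two preceding paragraphs can be sketched as below. This is a hedged illustration: it assumes the 5.5% figure refers to ±1.64 SDs of the difference between a 30-y mean and the 244-y mean, and it back-calculates an implied interannual SD from that figure rather than using a value reported by the authors. The names `se_of_mean`, `sd_of_difference`, and `sigma_implied` are illustrative.

```python
import math

MEAN_P = 644.0   # long-term mean quoted in the text, mm/yr
Z90 = 1.64       # two-sided 90% multiplier used in the text

def se_of_mean(sigma, n):
    """Standard error of an n-year mean for a serially uncorrelated series."""
    return sigma / math.sqrt(n)

def sd_of_difference(sigma, n1, n2):
    """SD of the difference between two independent period means."""
    return sigma * math.sqrt(1.0 / n1 + 1.0 / n2)

# Assumption: the 90% half-width equals 5.5% of the long-term mean.
half_width = 0.055 * MEAN_P
sigma_implied = half_width / (Z90 * math.sqrt(1.0 / 30 + 1.0 / 244))

se_244 = se_of_mean(sigma_implied, 244)
se_30 = se_of_mean(sigma_implied, 30)
sd_diff = sd_of_difference(sigma_implied, 30, 244)
```

Under these assumptions the implied interannual SD is on the order of 110 mm/yr, and the SE of a 30-y mean is nearly three times the SE of the 244-y mean, which is why short averaging periods show visibly larger excursions.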

We also examined the observed annual time series of California statewide *P*, as California has recently emerged from a multiyear drought (30). The data were from the National Oceanic and Atmospheric Administration’s National Climatic Data Center (NCDC) “nClimDiv” divisional temperature–precipitation–drought database available at monthly time resolution from January 1895 to the present (31). As shown in *SI Appendix*, Fig. S3, we also reached the same conclusion as for the annual *P* at the Radcliffe Observatory site.

## Autocorrelation of Global Land-Based Precipitation

Is the random characteristic of annual *P* at the Radcliffe Observatory site and in the state of California typical? To examine this issue more broadly, we use the gridded monthly precipitation database from the Global Precipitation Climatology Center (GPCC) Version 5 (32, 33) (spatial resolution of 2.5° × 2.5°) over the global terrestrial surface. We use a spatial land mask that identifies grid boxes containing at least one measurement site (34) (*Methods Summary*). We use the period 1940–2009, over which three observational global land precipitation databases show consistent results (34). We calculate the autocorrelation of annual *P*, with lag 1 results shown in Fig. 2. (*SI Appendix*, Fig. S4 has maps showing the results for lags 1–8, and *SI Appendix*, Fig. S5 shows the overall probability density function of the autocorrelation.) With a 70-y record, the 90% limits for the autocorrelation are ±0.197. That threshold is used (Fig. 2) to distinguish grid boxes that have a lag 1 autocorrelation indistinguishable from zero (gray in Fig. 2) from those showing positive (yellow in Fig. 2) and negative (red in Fig. 2) lag 1 autocorrelations.
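The ±0.197 threshold and the three-way classification used in Fig. 2 can be reproduced with a short sketch. It assumes the limits follow the large-sample white-noise approximation $\pm z/\sqrt{n}$ with $z = 1.645$, which matches the quoted value for a 70-y record; the function names and the example autocorrelation values are hypothetical.

```python
import math
import numpy as np

def white_noise_limit(n_years, z=1.645):
    """Approximate two-sided 90% limit for the autocorrelation of white noise."""
    return z / math.sqrt(n_years)

def classify_lag1(r1, n_years):
    """Label each lag 1 autocorrelation as 'positive', 'negative', or 'none'."""
    limit = white_noise_limit(n_years)
    r1 = np.asarray(r1, dtype=float)
    return np.where(r1 > limit, "positive",
                    np.where(r1 < -limit, "negative", "none"))

# Hypothetical lag 1 values for three grid boxes with 70-y records.
labels = classify_lag1([0.30, -0.25, 0.10], n_years=70)
```

The same threshold applies to every grid box here because all records share the same length; records of different lengths would each need their own limit.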

Most grid boxes (76.3% within the 90% confidence limit and 83.6% within the 95% limit) show no significant lag 1 autocorrelation (Table 1). Note that, with a 90% CI, one would a priori expect 10% of all grid boxes (5% at either tail) to have a significant lag 1 autocorrelation. We found that result for the negative lag 1 autocorrelation, but we also report that around 19% of the grid boxes have a positive autocorrelation (Table 1). This is 14 percentage points more than the expected 5%. Given the general lack of correlation for lags 2–8, the time series at most, but not all, grid boxes can be considered statistically indistinguishable from a random process. To give specific examples of randomness and nonrandomness, we also show the *P* time series at four grid boxes (Fig. 2). Two of those (numbered 1 and 3 in Fig. 2) are from grid boxes where the time series is random. One typical time series (numbered 2 in Fig. 2) is from the western Sahel region in Africa, where the long-term *P* record is quasicyclical, and as a result, there is a positive and statistically significant lag 1 autocorrelation throughout that region. The remaining time series (numbered 4 in Fig. 2) represents the relatively rare occurrence of a statistically significant but negative lag 1 autocorrelation. Close inspection of that time series (from Mozambique) shows that there has been a slight tendency for a wet year to follow a dry year and vice versa.
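The a priori expectation of roughly 5% of grid boxes in each tail can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: it generates independent white-noise series (real grid boxes are spatially correlated, which is ignored here), uses the grid-box count and record length quoted in the paper, and assumes the $\pm 1.645/\sqrt{n}$ limits.

```python
import numpy as np

def lag1_autocorrelation(x):
    """Lag 1 sample autocorrelation along the last axis."""
    a = x - x.mean(axis=-1, keepdims=True)
    return np.sum(a[..., :-1] * a[..., 1:], axis=-1) / np.sum(a ** 2, axis=-1)

rng = np.random.default_rng(42)
n_boxes, n_years = 1987, 70

# Synthetic, serially independent "annual P" anomalies for every grid box.
synthetic = rng.normal(size=(n_boxes, n_years))
r1 = lag1_autocorrelation(synthetic)

limit = 1.645 / np.sqrt(n_years)
frac_positive = np.mean(r1 > limit)
frac_negative = np.mean(r1 < -limit)
```

In such an experiment both fractions come out at a few percent, so an observed positive fraction near 19% stands well above what chance alone would produce, whereas the observed negative fraction does not.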

## Relating Changes in Mean Precipitation to the Variance

The autocorrelation analysis (Fig. 2) showed that the annual *P* was indistinguishable from a random process over most (∼76.3%) of the global land surface. The remaining 23.7% of the grid boxes have a statistically significant lag 1 autocorrelation that, in principle, needs to be accounted for when assessing whether the change is statistically significant (35). The presence of a statistically significant lag 1 autocorrelation means that the number of independent samples is smaller than the number of observations (35). Here, we ignore this (for the relevant 23.7% of grid boxes), with the net effect that the number of grid boxes reported to show no significant change will be an underestimate. With that in mind, for a time series of *P* with variance $\sigma^{2}$ and two periods of $n_{1}$ and $n_{2}$ years, we assume that the variance is the same in both periods (*SI Appendix*, Fig. S6). The change in the mean annual *P* at the 90% CI (1.64 SDs of the difference between the two period means) is

$$\Delta P = \pm 1.64\,\sigma\sqrt{\frac{1}{n_{1}} + \frac{1}{n_{2}}} \tag{1}$$

Note that the multiplier (±1.64) can be adjusted for different CIs as appropriate. For the special case $n_{1} = n_{2} = n$, Eq. **1** becomes

$$\Delta P = \pm 1.64\,\sigma\sqrt{\frac{2}{n}}$$

The result (Eq. **1**) shows that, for a purely random process, we expect a larger difference $\Delta P$ when the variance is larger or the averaging periods are shorter. For the Radcliffe Observatory record with $n = 10$ y, the bound is ±82.3 mm yr$^{-1}$. That translates into a fractional change over successive 10-y periods of (=82.3/644) 12.8%. In other words, we have 90% probability that the average *P* over the next 10 y will be within ±12.8% of the previous 10-y average *P*. If we use 30 y instead of 10 y, the change is ±7.3%, and for a 100-y period, it is ±4.0%. More generally, we have calculated the interannual variance at each grid box and can apply the logic (Eq. **1**) to the global land surface (*SI Appendix*, Fig. S7).
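The fractional changes quoted above can be approximately reproduced with a few lines of arithmetic. The sketch assumes Eq. **1** has the standard form for the difference between two independent means, $\Delta P = \pm 1.64\,\sigma\sqrt{1/n_1 + 1/n_2}$, and back-calculates the interannual SD from the quoted 82.3 mm/yr bound; `delta_p` and `sigma` are illustrative names, not values reported by the authors.

```python
import math

MEAN_P = 644.0  # Radcliffe long-term mean quoted in the text, mm/yr

def delta_p(sigma, n1, n2, z=1.64):
    """Half-width of the CI for the difference between two period means."""
    return z * sigma * math.sqrt(1.0 / n1 + 1.0 / n2)

# Assumption: the 82.3 mm/yr bound corresponds to n1 = n2 = 10 y.
sigma = 82.3 / (1.64 * math.sqrt(2.0 / 10))

for n in (10, 30, 100):
    frac = delta_p(sigma, n, n) / MEAN_P
    print(f"n = {n:3d} y: +/-{100 * frac:.1f}% of the mean")
```

The printed percentages should land close to the ±12.8%, ±7.3%, and ±4.0% stated in the text, with small differences attributable to rounding of the inputs. The bound shrinks only as $1/\sqrt{n}$, so a tenfold longer averaging period narrows it by a factor of about three.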

## Application to Regional Precipitation Change

Traditional analysis has focused on estimating long-term trends in *P* (26, 33, 34, 36). Such trend analyses provide an important overview but do not directly address several practical questions. For example, one question of interest to both water resources planners and to those in agriculture is whether the most recent period (e.g., decade) is significantly different from the long-term instrumental record. For a time series with minimal memory, Eq. **1** enables one to address this question by anticipating the magnitude of possible changes (at a given confidence level). Here, we ask whether the recent *P* (2000–2009) over the global land surface is significantly different from the long-term average. In Fig. 3, we show the change for each grid box as a function of the interannual variance over the period of 1940–2009. We use the theory (Eq. **1**) to overlay the relevant CIs.

At most grid boxes, the observed changes fall within bounds that are expected from a random process given the grid box variance. For example, 83.8% of all grid boxes fall within the 90% CI. Recall that we have been conservative by ignoring the lag 1 autocorrelation in calculating the CI and that the real fraction showing no significant change will be somewhat larger, but we expect it to be less than the CI (90%). The key point is that the range of the observed change in P depends on the grid box-level variance. That, in turn, means that local regions showing the largest trends (figure 1D in ref. 34) will usually have the largest interannual variance of *P* (*SI Appendix*, Fig. S7, *Lower*), while the time series of annual *P* likely remains stationary (Fig. 2).
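To illustrate why ignoring a positive lag 1 autocorrelation makes the bounds too narrow, the sketch below widens the CI using an effective sample size. It assumes the common first-order autoregressive approximation $n_{\mathrm{eff}} = n(1 - r_1)/(1 + r_1)$, which is a standard textbook adjustment and not necessarily the exact procedure of ref. 35; the SD, autocorrelation, and period lengths are hypothetical values chosen for illustration.

```python
import math

def effective_sample_size(n, r1):
    """Approximate number of independent samples for an AR(1)-like series."""
    return n * (1.0 - r1) / (1.0 + r1)

def delta_p(sigma, n1, n2, z=1.64):
    """Half-width of the CI for the difference between two period means."""
    return z * sigma * math.sqrt(1.0 / n1 + 1.0 / n2)

sigma = 100.0  # hypothetical interannual SD, mm/yr
r1 = 0.3       # hypothetical lag 1 autocorrelation

# Bound when every year is treated as independent.
naive = delta_p(sigma, 10, 70)

# Bound when both periods are shortened to their effective lengths.
adjusted = delta_p(
    sigma,
    effective_sample_size(10, r1),
    effective_sample_size(70, r1),
)
```

With a positive autocorrelation the adjusted bound is wider than the naive one, so more grid boxes would fall inside it, in line with the statement that the reported fraction showing no significant change is an underestimate.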

## Discussion and Conclusion

A change in the mean *P* from one 30-y period to the next is a necessary but not sufficient condition for nonstationary behavior. Interestingly, the famous statistical text by Kendall et al. (16) used the annual *P* time series in London (1813–1912), which looks very similar to the Radcliffe *P* (Fig. 1*A*), as a classic example of a stationary process (ref. 16, p. 504).

The formal approach to evaluating whether a time series is stationary begins with an inspection followed by analysis of the autocorrelation function (16–19). Our investigation showed that the autocorrelation function for the 244-y *P* time series from the Radcliffe Observatory in the United Kingdom is indistinguishable from that for a random process (Fig. 1). We then extended this to a 70-y global *P* database and found that the autocorrelation function at most (∼76.3%) grid boxes is also indistinguishable from that for a stationary random process (Fig. 2).

Assume for the moment that the P in each grid box was generated by a purely random process. With a large number of grid boxes (we have 1,987), one can expect that the number showing significant lag 1 autocorrelation will only depend on the CI chosen by the analyst. For example, for a 90% CI, we know, a priori, that 10% (5% at either tail) would show statistically significant lag 1 autocorrelation. For a negative autocorrelation of annual P, the statistical interpretation is that a high value of annual rainfall implies less rainfall in the next year, and the fraction showing a statistically significant negative value is more or less exactly that expected for random process [in Table 1, 4.5% (compare with 5%) at the 90% level and 2.7% (compare with 2.5%) at the 95% level]. However, there are plausible physical interpretations for a positive lag 1 autocorrelation (37, 38), and we observe a higher fraction of statistically significant lag 1 autocorrelation values than expected for a purely random process. For example, we estimate that an additional 14.1% (90% confidence) of the grid boxes have P that is nonrandom (Table 1). Close inspection of the autocorrelation at other lags (*SI Appendix*, Fig. S4) shows that many of those lags are likely noise, because their location varies with the lag. The obvious exception is the Sahel region of Africa, where the positive autocorrelation can persist for lags up to 6 or 7 y (*SI Appendix*, Fig. S4). The P in that region has been subject to large interannual and interdecadal variability over the last 50 y that at least appears to resemble periodic changes (39). With that region aside, the annual P has apparently remained more or less stationary and indistinguishable from a random process over most of the global land surface.

One should not confuse the statistical significance discussed above with the functional significance. Even under a stationary climate, extreme events will occur. For example, in our own experience, the most recent drought in the Murray–Darling Basin (1998–2009) (40, 41) had severe socioeconomic, hydrologic, and biological consequences, and yet, the time series of *P* remained more or less stationary (23). In fact, the randomness typical of *P* time series as reported here means that one may have to wait at least a human lifetime before being confident about a statistically significant change in *P* (20–22). That makes it even more important for hydrologists and climate scientists to rigorously incorporate the variance into assessments of changes in *P*.

## Methods Summary

We used the global land-based P observations from the GPCC Version 5 database (32, 33). The spatial resolution is 2.5°. We follow our earlier work (34) and use the GPCC metadata to define a mask (fixed over the entire period) by identifying all land-based grid boxes that contain at least one measurement site. The mask includes 1,987 grid boxes (about 69% of global land area, excluding Antarctica). We adopted the post-1940 period, because earlier work showed almost identical results in terms of mean and variance between GPCC and six other global P databases over this period (34).

## Acknowledgments

We thank Drs. Alberto Montanari, Hoshin Gupta, Ignacio Rodríguez-Iturbe, Gerard H. Roe, and Hans von Storch for comments that improved the manuscript. This research was supported by National Key Research and Development Program of China Grants 2016YFA0602402 and 2016YFC0401401, Australian Research Council Grant CE1101028, Key Strategic Program Grant ZDRW-ZS-2017-1 of the Chinese Academy of Sciences (CAS), the CAS Pioneer Hundred Talents Program, and the Open Research Fund of State Key Laboratory of Desert and Oasis Ecology in Xinjiang Institute of Ecology and Geography of the CAS.

## Footnotes

^{1}To whom correspondence should be addressed. Email: Graham.Farquhar@anu.edu.au.

Author contributions: F.S., M.L.R., and G.D.F. designed research; F.S., M.L.R., and G.D.F. performed research; F.S., M.L.R., and G.D.F. analyzed data; and F.S. wrote the paper.

Reviewers: H.V.G., University of Arizona; and A.M., University of Bologna.

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1705349115/-/DCSupplemental.

Copyright © 2018 the Author(s). Published by PNAS.

This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).

## References

1. Smith JA. *Handbook of Hydrology*, ed Maidment DR (McGraw-Hill, New York), Chap 3.
2. Mosley MP, McKerchar AI. *Handbook of Hydrology*, ed Maidment DR (McGraw-Hill, New York), Chap 8.
3. Bras RL, Rodriguez-Iturbe I.
4. Milly PCD, et al.
5. Wood EF, et al.
6. (Entry not recoverable.)
7. (Entry not recoverable.)
8. Findell KL, Shevliakova E, Milly PCD, Stouffer RJ.
9. Montanari A, Koutsoyiannis D.
10. Milly PCD, et al.
11. Donat MG, Lowry AL, Alexander LV, O’Gorman PA, Maher N.
12. Luong TM, et al.
13. Arguez A, Vose RS.
14. Houghton JT; IPCC.
15. Knox JC, Kundzewicz ZW.
16. Kendall M, Stuart A, Ord JK.
17. von Storch H, Zwiers FW.
18. Brockwell PJ, Davis RA.
19. Wilks DS. *Statistical Methods in the Atmospheric Sciences*, International Geophysics Series (Academic, San Diego), 3rd Ed, Vol 100, p 395.
20. Morin E.
21. Hawkins E, Sutton R.
22. Giorgi F, Bi X.
23. Sun F, Roderick ML, Lim WH, Farquhar GD.
24. Bunde, et al. (2013) Is there memory in precipitation? *Nat Clim Change* 3:174–175.
25. Hunt BG.
26. Kumar S, Merwade V, Kinter JL, Niyogi D.
27. Pelletier JD, Turcotte DL.
28. Burt TP, Howden NJK.
29. (Entry not recoverable.)
30. Diffenbaugh NS, Daniel LS, Touma D. *Proc Natl Acad Sci USA* 112:3931–3936.
31. US National Climate Data Center.
32. Becker A, et al.
33. Schneider U, et al.
34. Sun F, Roderick ML, Farquhar GD.
35. Zwiers FW, von Storch H.
36. (Entry not recoverable.)
37. Koster RD, et al.
38. Seneviratne SI, et al.
39. Nicholson SE, Tucker CJ, Ba MB.
40. Verdon-Kidd DC, Kiem AS. *Geophys Res Lett* 36:L22707.
41. (Entry not recoverable.)

## Article Classifications

- Physical Sciences
- Earth, Atmospheric, and Planetary Sciences