# Statistical significance of seasonal warming/cooling trends

Contributed by Hans Joachim Schellnhuber, January 31, 2017 (sent for review December 8, 2016; reviewed by Valerie Livina and Igor M. Sokolov)

## Significance

The question whether a seasonal climatic trend (e.g., the increase of spring temperatures in Antarctica in the last decades) is of anthropogenic or natural origin is of great importance because seasonal climatic trends may considerably affect ecological systems, agricultural yields, and human societies. Previous studies assumed that the seasonal records can be treated as independent and are characterized by short-term memory only. Here we show that both assumptions, which may lead to a considerable overestimation of the trend significance, do not apply to temperature data. Combining Monte Carlo simulations with the Holm–Bonferroni method, we demonstrate how to obtain reliable estimates of the statistical significance of seasonal climatic trends and apply our method to representative atmospheric temperature records of Antarctica.

## Abstract

The question of whether a seasonal climate trend (e.g., the increase of summer temperatures in Antarctica in the last decades) is of anthropogenic or natural origin is of great importance for mitigation and adaptation measures alike. The conventional significance analysis assumes that (*i*) the seasonal climate trends can be quantified by linear regression, (*ii*) the different seasonal records can be treated as independent records, and (*iii*) the persistence in each of these seasonal records can be characterized by short-term memory described by an autoregressive process of first order. Here we show that assumption *ii* is not valid, because strong intraannual correlations couple the different seasons. We also show that, even in the absence of correlations, for Gaussian white noise, the conventional analysis leads to a strong overestimation of the significance of the seasonal trends, because multiple testing has not been taken into account. In addition, when the data exhibit long-term memory (which is the case in most climate records), assumption *iii* leads to a further overestimation of the trend significance. Combining Monte Carlo simulations with the Holm–Bonferroni method, we demonstrate how to obtain reliable estimates of the significance of the seasonal climate trends in long-term correlated records. For an illustration, we apply our method to representative temperature records from West Antarctica, which is one of the fastest-warming places on Earth and belongs to the crucial tipping elements in the Earth system.

In the last decades, estimations of the magnitude of deterministic trends in natural records have become an important issue, due to anthropogenic global warming (1). Although the estimation of a trend by linear regression is an easy task, the estimation of its statistical significance and its error bar is complicated, because the natural persistence of the records also becomes an issue.

In the absence of persistence (white noise) as well as in short-term persistent records, the distribution of the trend follows a Student’s *t* distribution from which the significance can be obtained (*Methods*). In many natural records like temperature data, river flows, sea level heights, wind fields, midlatitude cyclones, or Antarctic sea ice extent, the assumption of white noise or short-term memory is not valid, due to strong long-term memory in the data (5–27) (*Methods*).

Here we consider seasonal temperature records. A “season” can be a calendar day (without leap day), a week, a month, or combinations of months like meteorological winter, spring, summer, and autumn. Let us consider a daily mean temperature record with a length of

Seasonal climate trends are of great importance, because they may considerably affect ecological systems, agricultural yields, and human societies, thus creating major challenges for crop rotation management (28), river-borne transportation (29), and power generation (30), as well as for the control of pests and vector-borne diseases (31).

## Results and Discussion

Fig. 1 shows, for illustration, the four seasonal temperature trends at Byrd station between 1957 and 2013 (32), in austral autumn, winter, spring, and summer. The data were obtained from the Byrd Polar and Climate Research Center at The Ohio State University (33). The Byrd station is located inside West Antarctica, which is one of the fastest-warming regions on Earth and belongs to the crucial tipping elements in the Earth system (34, 35). The red line in each panel of Fig. 1 shows the warming trend obtained from linear regression. For the significance analysis, the relative trend *Methods*).

The conventional procedure for evaluating the statistical significance of seasonal climate trends is based on three assumptions (see, e.g., refs. 32 and 36–40): (*i*) The magnitudes of the seasonal trends can be obtained by linear regression as in Fig. 1, (*ii*) the seasonal records are independent of each other, and (*iii*) the persistence in each seasonal record can be characterized by an autoregressive process of first order (AR1) between the same seasons over a year’s distance. Under these assumptions, the significance of seasonal climate trends has been obtained by applying Eqs. **4** and **5** to the underlying seasonal record.
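Assumption *i* can be illustrated with a minimal sketch of a least-squares trend estimate for a single seasonal record. The function name `relative_trend` and the toy record are hypothetical. The sketch also assumes that the relative trend is the total change of the regression line over the record divided by the standard deviation of the residuals around that line; this definition is a working assumption here and is not taken from the equations in *Methods*.

```python
import math

def relative_trend(y):
    """Fit a least-squares line to y (one value per year) and return
    (delta, sigma, x). Assumed definitions: delta is the change of the
    fitted line from the first to the last year, sigma is the standard
    deviation of the residuals (n - 2 degrees of freedom), x = delta / sigma."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    s_ty = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    residuals = [v - (intercept + slope * t) for t, v in enumerate(y)]
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    delta = slope * (n - 1)
    return delta, sigma, delta / sigma

# Toy record (not station data): 57 years, slope 0.05 per year,
# plus an alternating +1/-1 disturbance.
record = [0.05 * t + (1.0 if t % 2 == 0 else -1.0) for t in range(57)]
delta, sigma, x = relative_trend(record)
print(delta, sigma, x)
```

Working with the dimensionless ratio `x` rather than the raw slope makes trends from records with different variability comparable, which is what the significance analysis needs.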

It is obvious that assumption *ii* is not valid in monthly records where subsequent months are coupled such that the lag-1 autocorrelation function is different from zero; this happens, for example, when the record of interest can be described by an AR1 process or is long-term persistent. Assumption *iii* is not valid in long-term persistent records where events separated by large time spans are also correlated. However, as we show next, the conventional treatment cannot be applied even to white noise records where assumptions *ii* and *iii* are trivially satisfied.

To see this point, let us assume that the Byrd record can be modeled by white noise. Under this assumption, following the conventional treatment, the dependence of the significance **4**, for the same length *A*. From the figure, one can read off the *Methods*). One can see that, under the white noise assumption, spring has a highly significant trend, with a

To test if the *B*. One can see that the proper

The reason for the considerable underestimation of the *Multiple Testing Problem*). To demonstrate this, we test, in a gedankenexperiment, the statistical significance of the relative trends of the 365 calendar day subrecords, in a purely white noise surrogate temperature record, with a significance level

The necessity of taking into account multiple testing in sequential datasets is well known in genetics, particularly in genome-wide association studies, where associations between genetic variants like single-nucleotide polymorphisms and traits like diseases are investigated (for details, see ref. 41). Before multiple testing correction became the standard practice, most of the early results of these association studies could not be replicated. Multiplicity is also an issue in climate science when estimating the significance of trends in a large number of stations or grid points (see, e.g., refs. 42 and 43); a challenge is the proper handling of the spatial correlations between the records (44).
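The size of the multiplicity effect can be made concrete with a short worked calculation. The notation is introduced here for illustration only: $m$ is the number of tests, $\alpha$ is the significance level of each single test, and the tests are assumed to be independent, as they are for white noise. The probability that at least one of the $m$ trends appears significant by chance is then

$$
P(\text{at least one false positive}) = 1 - (1 - \alpha)^{m}.
$$

For $\alpha = 0.05$ this gives $1 - 0.95^{4} \approx 0.185$ for four seasons and $1 - 0.95^{12} \approx 0.460$ for 12 months, and for the 365 calendar days it is practically 1. Under these assumptions, a nominal 5% test on the largest of four seasonal trends therefore has a false positive rate of almost 19%.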

There are several procedures that take multiple testings into account (*Methods*). Within the Bonferroni approach (45), one of *A* are multiplied by a factor of 4. The result is shown in Fig. 2*C*. The figure shows that the Bonferroni curve is nearly identical with *B*, showing that the Bonferroni method yields an accurate description of the significance of the maximum seasonal trend in the white noise case where the data exhibit no memory.
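As a minimal sketch of the Bonferroni step described above, the following multiplies each single-test *p* value by the number of tests (4 for the four seasons) and caps the result at 1. The function name and the input values are hypothetical and are not results for any station.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]

# Hypothetical single-test p values for four seasons (illustration only).
seasonal_p = [0.01, 0.04, 0.03, 0.005]
bonferroni_p = bonferroni(seasonal_p)
print(bonferroni_p)
```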

A less conservative method is the Holm–Bonferroni method (46), which yields better upper bounds for the *m* season has the lowest relative trend, in absolute values. Next, one compares the

The result of this analysis is shown in Fig. 2*D*. By definition, the *C*, whereas the

Because our Monte Carlo method also yields upper bounds for the lower-ranked trends, the minimum value of both methods gives the better estimate. One can see that, for the second-largest trend, Holm–Bonferroni yields a smaller
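The Holm–Bonferroni step-down procedure can be sketched in its standard textbook form as adjusted *p* values. The smallest *p* value is multiplied by the number of tests *m*, the next by *m* − 1, and so on, and a running maximum keeps the adjusted values monotonic. The function name and the input values are hypothetical, and the sketch is not claimed to reproduce the exact implementation behind Fig. 2*D*.

```python
def holm_bonferroni(p_values):
    """Holm-Bonferroni adjusted p values, returned in the input order."""
    m = len(p_values)
    # Indices sorted from the smallest to the largest p value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The rank-th smallest p value is multiplied by (m - rank).
        candidate = min(1.0, (m - rank) * p_values[i])
        # An adjusted value can never be smaller than that of a lower rank.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

# Hypothetical single-test p values for four seasons (illustration only).
seasonal_p = [0.01, 0.04, 0.03, 0.005]
holm_p = holm_bonferroni(seasonal_p)
print(holm_p)
```

For the smallest *p* value the result coincides with the Bonferroni value. For all other ranks it is at most as large, which is why the method is less conservative.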

Next we study how the situation changes when the data exhibit long-term memory characterized by a Hurst exponent *Methods*). As for the white noise case, we basically assume that a monthly temperature record characterized by a certain Hurst exponent can be modeled by long-term correlated surrogate data with the same Hurst exponent. In long-term correlated surrogate data, by construction, the Hurst exponents of the 12 monthly subrecords only vary statistically around their mean. To test whether this assumption also holds for temperature records, we have considered the historical millennium run of the Hamburg Atmosphere–Ocean Coupled Circulation Model ECHO-G (see, e.g., ref. 19). We found that, for all grid points considered, the Hurst exponents of the 12 monthly subrecords varied in the same statistical way as for the surrogate data, thus supporting our hypothesis.
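One simple way to generate surrogate data with a prescribed Hurst exponent is sketched below. It uses a truncated moving-average representation of fractionally integrated noise, ARFIMA(0, *d*, 0) with *d* equal to the Hurst exponent minus 1/2. This generator is chosen only because it needs no external library and is an assumption of the sketch; it is not necessarily the generator used for the results reported here. The function names and the truncation length `n_weights` are hypothetical.

```python
import random

def long_term_correlated(n, hurst, n_weights=300, seed=0):
    """Approximate long-term correlated Gaussian record of length n.
    Truncated MA(infinity) form of ARFIMA(0, d, 0) with d = hurst - 0.5;
    hurst = 0.5 reduces to white noise."""
    d = hurst - 0.5
    weights = [1.0]
    for k in range(1, n_weights):
        # Recursion for the coefficients of (1 - B)^(-d).
        weights.append(weights[-1] * (k - 1 + d) / k)
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n + n_weights - 1)]
    return [
        sum(w * noise[t + n_weights - 1 - k] for k, w in enumerate(weights))
        for t in range(n)
    ]

def lag1_autocorrelation(x):
    """Sample autocorrelation at lag 1."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

persistent = long_term_correlated(3000, hurst=0.8)
white = long_term_correlated(3000, hurst=0.5)
rho_persistent = lag1_autocorrelation(persistent)
rho_white = lag1_autocorrelation(white)
print(rho_persistent, rho_white)
```

The lag-1 autocorrelation is only a quick plausibility check. Confirming the Hurst exponent of a surrogate record would require a dedicated estimator such as the DFA2 analysis referred to in *Methods*.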

For determining the statistical significance of the seasonal climate trends, we follow closely the prescription detailed above for the white noise case. First, we generate a large number of long-term correlated records with the same length *i*) the absolute values of their relative trends *ii*) their maximum relative trend. From *i*, we obtain the probability density function of all seasonal relative trends, and, from *ii*, we obtain the probability density function of the maximum seasonal relative trend, as discussed above. From both functions, we derive the corresponding trend significances.
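The two-step Monte Carlo prescription can be sketched as follows. For brevity the sketch uses Gaussian white noise as the surrogate generator. For long-term persistent data, a generator with the required Hurst exponent would be substituted, and the seasonal subrecords would have to be cut from one correlated record rather than drawn independently. All function names, the default parameters, and the definition of the relative trend (total change of the regression line divided by the residual standard deviation) are assumptions of this illustration.

```python
import random

def abs_relative_trend(y):
    """|x| = |delta| / sigma for one seasonal record (one value per year)."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y)) / s_tt
    intercept = y_mean - slope * t_mean
    ss_res = sum((v - intercept - slope * t) ** 2 for t, v in enumerate(y))
    sigma = (ss_res / (n - 2)) ** 0.5
    return abs(slope * (n - 1)) / sigma

def white_noise_seasons(n_years, n_seasons, rng):
    """Stand-in generator: one list of yearly values per season."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_years)] for _ in range(n_seasons)]

def monte_carlo_p_values(observed_x, n_years, n_seasons=4, n_records=2000,
                         seed=0, generator=white_noise_seasons):
    """Return (p_single, p_max): the fraction of surrogate seasonal trends,
    and of per-record maximum trends, that reach observed_x."""
    rng = random.Random(seed)
    all_trends, max_trends = [], []
    for _ in range(n_records):
        xs = [abs_relative_trend(s) for s in generator(n_years, n_seasons, rng)]
        all_trends.extend(xs)        # step (i): every seasonal trend
        max_trends.append(max(xs))   # step (ii): largest trend of the record
    p_single = sum(x >= observed_x for x in all_trends) / len(all_trends)
    p_max = sum(x >= observed_x for x in max_trends) / len(max_trends)
    return p_single, p_max

# Hypothetical observed relative trend of 0.9 in a 57-year record.
p_single, p_max = monte_carlo_p_values(observed_x=0.9, n_years=57)
print(p_single, p_max)
```

In this sketch `p_single` plays the role of the conventional estimate, whereas `p_max` is the quantity relevant for the season with the largest trend. By construction, `p_max` can never exceed `n_seasons` times `p_single`, which is the Bonferroni bound.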

Fig. 3*A* shows the resulting *A* is the analog of Fig. 2*A* for long-term persistent records. The figure shows the expected result that, due to the long-term persistence, the *B*–*D* is fully equivalent to Fig. 2 *B*–*D*. Fig. 3*B* shows that the *C* and *D* describes the application of the Bonferroni method and the Holm–Bonferroni method to the long-term persistent data. Fig. 3*C* shows that, due to the long-term memory, the Bonferroni method slightly exaggerates the *B* are lower than those in Fig. 3*D* and thus give the better estimate.

We also performed a similar analysis for the monthly warming trends at Byrd station. Fig. 4 summarizes our results for the warming trends of (*i*) the four seasons at Byrd station and (*ii*) the three months (March, September, and October) with the largest relative trend. Our final result (Fig. 4, *Bottom*) shows that only spring (SON) has a significant warming trend (

In contrast, when applying the conventional procedure (where the persistence in each seasonal record has been assumed to follow an AR1 process) to the seasonal records, one finds (32) that spring warming is highly significant at the 99.9% significance level (

In addition to the Byrd reconstruction, we have studied the four longest observational temperature records in West Antarctica (McMurdo, Rothera, Faraday-Vernadsky, and Bellingshausen). Fig. 5 summarizes our results for the warming trends of the four seasons and the three months with the largest relative trends at the four West Antarctic stations. The figure shows that, for Rothera (1978–2013) and Bellingshausen (1968–2013), the monthly and seasonal warming trends are not significant. For Faraday-Vernadsky (1951–2013), May warming (

Finally, we also inspected the nine longest observational temperature records in East Antarctica (Halley, Syowa, Mawson, Davis, Mirny, Casey, Dumont d′Urville, Vostok, and Amundsen-Scott). It is remarkable that all monthly and seasonal warming trends were nonsignificant, with

In conclusion, we have shown that previous estimations of seasonal temperature trends strongly overestimated the statistical significance of the trend, because two effects, (*i*) multiple testing and (*ii*) long-term persistence, have been neglected. By using Monte Carlo simulations, we have shown explicitly how both effects must be taken into account. When one aims to study the significance of a seasonal trend in a multirecord case, e.g., temperature data on many grid points, one has to consider two kinds of multiplicity, (*i*) the known spatial multiplicity (42–44) and (*ii*) the seasonal multiplicity considered here.

Our method is valid for all climate records that are characterized by linear long-term memory. As examples, we considered Antarctic temperature records. Our approach can be easily generalized to records characterized by autoregressive processes. For climate records with nonlinear correlations like precipitation and river flows, our approach can be considered as only a first-order approximation. For achieving more reliable results for these cases, one needs an accurate statistical model encompassing the proper linear and nonlinear correlations. At present, such a model is not available.

## Methods

### Significance of Trends: Conventional Method.

We consider the annual record

For assessing if an observed trend in a data set may be due to its natural variability or not, one needs to know the probability

If **2** and **3**,

For uncorrelated Gaussian data (white noise), the distribution *t* distribution,

For short-term persistent records described by AR1, Eq. **4** remains the same, only **5** is only valid for large records where the fluctuations of **4** and **5** can be applied to the seasonal records.

Apart from the purely statistical approach described above, physics-based (“deterministic”) models of the coupled atmosphere–ocean dynamics (AOGCMs) have been used for the detection and quantification of “total” natural climate variability. The simulations have been used as sanity checks for data analysis, because the amplitude of variability needs to be consistent in observations and (ensemble) model runs (50, 51). Note, however, that AOGCM calculations are cumbersome and costly, so achieving (computed) statistical significance is quite a challenge.

### The Multiple Testing Problem.

Often, when analyzing a data set, several null hypotheses

There are several procedures to take multiple testings into account. In the Bonferroni approach (45), the significance level

A less conservative method is the Holm–Bonferroni method (46): First, the

### Long-Term Memory.

In records with long-term memory, the autocorrelation function

In refs. 23, 24, 26, 47, and 54, DFA2 has been applied to the monthly Antarctic records considered here. It has been shown explicitly (23, 24) that, for each record, the fluctuation function

When a record is fully characterized by a certain Hurst exponent **4**, but with different parameters

## Acknowledgments

We thank both reviewers for very helpful and constructive criticisms.

## Footnotes

- ¹To whom correspondence should be addressed. Email: john{at}pik-potsdam.de.

Author contributions: J.L., A.B., and H.J.S. designed research; J.L. and A.B. performed research; and J.L., A.B., and H.J.S. wrote the paper.

Reviewers: V.L., National Physical Laboratory; and I.M.S., Humboldt University.

The authors declare no conflict of interest.

Freely available online through the PNAS open access option.

## References

1. Intergovernmental Panel on Climate Change
2. Bronstein I, et al.
3. Weatherhead EC, et al.
4. Santer BD, et al.
5. Hurst HE
6. Mandelbrot BB, Wallis JR
7. Bloomfield P, Nychka D
8. ↵
9. Malamud BD, Turcotte DL
10. Blender R, Fraedrich K
11. Eichner J, Koscielny-Bunde E, Bunde A, Havlin S, Schellnhuber HJ
12. Livina V, et al.
13. Vyushin D, Zhidkov I, Havlin S, Bunde A, Brenner S
14. ↵
15. Santhanam M, Kantz H
16. Király A, Bartos I, Jánosi I
17. Mudelsee M
18. ↵
19. Rybski DA, Bunde A, von Storch H
20. Franzke C
21. Lovejoy S, Schertzer D
22. Dangendorf S, et al.
23. Bunde A, Ludescher J, Franzke C, Büntgen U
24. Ludescher J, Bunde A, Franzke CE, Schellnhuber HJ
25. Blender R, Raible C, Lunkeit F
26. Yuan N, Ding M, Huang Y, Fu Z
27. Yuan N, Ding M, Ludescher J, Bunde A
28. Troost C
29. Caldwell H, et al.
30. Rübbelke D, Vögele S
31. Rao MS, et al. … *Spodoptera litura* Fab. on peanut during future climate change scenario. PLoS One 10(2):e0116762.
32. Bromwich DH, et al.
33. Byrd Polar and Climate Research Center at The Ohio State University
34. Lenton T, et al.
35. Kriegler E, Hall JW, Held H, Dawson R, Schellnhuber HJ
36. Chapman WL, Walsh JE
37. Monaghan AJ, Bromwich DH, Chapman W, Comiso JC
38. ↵
39. O’Donnell R, Lewis N, McIntyre S, Condon J, et al.
40. Jones PD, Lister HD
41. ↵
42. von Storch H
43. Livezey RE, Chen WY
44. DelSole T, Yang X
45. Bonferroni CE
46. Holm S
47. Bromwich DH, et al.
48. ↵
49. Lennartz S, Bunde A
50. Stocker TF, et al.; Bindoff NL, et al.
51. Imbers J, Lopez A, Huntingford C, Allen MR
52. Lennartz S, Bunde A
53. ↵
54. Tamazian A, Ludescher J, Bunde A

## Article Classifications

- Physical Sciences
- Earth, Atmospheric, and Planetary Sciences