Taylor’s law of fluctuation scaling for semivariances and higher moments of heavy-tailed data

Contributed by Joel E. Cohen, September 28, 2021 (sent for review April 28, 2021; reviewed by Svetlozar Rachev and Johannes Ruf)
November 12, 2021
118 (46) e2108031118

Significance

Many quantities are extremely large extremely rarely. Examples include income, wealth, financial returns, insurance losses, firm size, and city population size; earthquake magnitude, hurricane energy, tornado outbreaks, precipitation, and flooding; and pest outbreaks, infectious epidemics, and forest fires. When such a quantity is modeled as a nonnegative random variable with a heavy upper tail, the probability of an observation larger than some threshold falls as a small power (the “tail index”) of the threshold. When the tail index is small enough, the mean and all higher moments of the random quantity are infinite. Surprisingly, the sample mean and the sample higher moments obey orderly scaling laws, which we prove and apply to estimating the tail index.

Abstract

We generalize Taylor’s law for the variance of light-tailed distributions to many sample statistics of heavy-tailed distributions with tail index α in (0, 1), which have infinite mean. We show that, as the sample size increases, the sample upper and lower semivariances, the sample higher moments, the skewness, and the kurtosis of a random sample from such a law increase asymptotically in direct proportion to a power of the sample mean. Specifically, the lower sample semivariance asymptotically scales in proportion to the sample mean raised to the power 2, while the upper sample semivariance asymptotically scales in proportion to the sample mean raised to the power $(2-\alpha)/(1-\alpha) > 2$. The local upper sample semivariance (counting only observations that exceed the sample mean) asymptotically scales in proportion to the sample mean raised to the power $(2-\alpha^2)/(1-\alpha)$. These and additional scaling laws characterize the asymptotic behavior of commonly used measures of the risk-adjusted performance of investments, such as the Sortino ratio, the Sharpe ratio, the Omega index, the upside potential ratio, and the Farinelli–Tibiletti ratio, when returns follow a heavy-tailed nonnegative distribution. Such power-law scaling relationships are known in ecology as Taylor’s law and in physics as fluctuation scaling. We find the asymptotic distribution and moments of the number of observations exceeding the sample mean. We propose estimators of α based on these scaling laws and the number of observations exceeding the sample mean and compare these estimators with some prior estimators of α.
Heavy-tailed nonnegative random variables with infinite moments, such as nonnegative stable laws with index α in (0,1), have theoretical and practical importance [e.g., Carmona (1), Feller (2), Resnick (3), and Samorodnitsky and Taqqu (4)]. Heavy-tailed nonnegative random variables with some or all infinite moments have been claimed to arise empirically in finance [operational risks in Nešlehová et al. (5)], economics [income distributions in Campolieti (6) and Schluter (7); returns to technological innovations in Scherer et al. (8) and Silverberg and Verspagen (9)], demography [city sizes in Cen (10)], linguistics [word frequencies in Bérubé et al. (11)], and insurance [economic losses from earthquakes in Embrechts et al. (12) and Ibragimov et al. (13)]. Partial reviews are in Carmona (1) and Ibragimov (14).
Brown et al. (15) (hereafter BCD) showed that when a random sample is drawn from a nonnegative stable law with index $\alpha \in (0,1)$, the sample variance is asymptotically (as the sample size $n$ goes to $\infty$) proportional to the sample mean raised to a power that is an explicit function of α (Eqs. 11 and 13). This relationship generalizes to stable laws with infinite moments a widely observed power-law relationship between the variance and the mean in families of distributions with finite population mean and finite population variance. This power-law relationship is commonly known as Taylor’s law in ecology [Taylor (16, 17)] and as fluctuation scaling in physics [Eisler et al. (18)].
To the two ingredients combined by BCD (nonnegative stable laws with infinite moments and Taylor’s law), this paper adds two more ingredients. We establish scaling relationships that generalize the usual Taylor’s law, for light-tailed distributions, to many functions of the sample in addition to the variance, including all positive absolute and central moments, upper and lower semivariances, and several measures of risk-adjusted investment performance such as the Sortino, Sharpe, and Farinelli–Tibiletti ratios. In addition, based on these scaling relationships, we propose several estimators of the index α of a nonnegative stable law with infinite first moment.
Section 1 defines most of the sample functions studied here. Section 2 gives background on Taylor’s law, semivariances, and nonnegative stable laws, including key prior results from BCD. Section 3 establishes that the lower sample semivariance, the upper sample semivariance, the local lower sample semivariance, and the local upper sample semivariance are asymptotically each a power of the sample mean with explicitly given exponents. These results are the core of the paper. When investment returns obey a nonnegative heavy-tailed law with index $\alpha \in (0,1)$, these results reveal the asymptotic behavior of the Sharpe ratio, the Sortino ratio, and the Farinelli–Tibiletti ratio. Section 4 extends these results to higher central and noncentral moments and various indices of volatility. Section 5 analyzes the number of observations from a stable law or an approximately stable (i.e., regularly varying) law that exceed the sample mean. Section 6 proposes and compares estimators of α by simulation. SI Appendix gives all proofs of results stated in the text and additional numerical simulations.

1. Preliminary

Let $\xrightarrow{d}$ mean “converges in distribution to.” Let $\xrightarrow{p}$ mean “converges in probability to.” Let $\xrightarrow{\text{a.s.}}$ mean “converges almost surely to.”
Let $X$ be a real-valued nonnegative random variable. Let $n$ be a positive integer and assume that $n > 1$. For $i = 1, \ldots, n$, let $X_i$ be independent and identically distributed as $X$. For any real $h \ge 0$, the $h$th (raw) sample moment is defined as
$$M_h' := \frac{1}{n}\sum_{i=1}^{n} X_i^h.$$
[1]
Thus, $M_1'$ is the sample mean. For any nonnegative integer $h$, the $h$th sample central moment is defined as
$$M_h := \frac{1}{n}\sum_{i=1}^{n} (X_i - M_1')^h.$$
[2]
Clearly, $M_1 = 0$, and $M_2$ is the sample variance normalized by $n$. The sample variance normalized by $n - 1$ is defined as
$$v_n := \frac{1}{n-1}\sum_{i=1}^{n} (X_i - M_1')^2.$$
[3]
Obviously, $v_n = M_2\, n/(n-1)$ and $v_n/M_2 \xrightarrow{\text{a.s.}} 1$ as $n \to \infty$.
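The lower sample semivariance and the upper sample semivariance are defined as
$$v_n^- := \frac{1}{n-1}\sum_{i:\,X_i \le M_1'} (X_i - M_1')^2, \qquad v_n^+ := \frac{1}{n-1}\sum_{i:\,X_i > M_1'} (X_i - M_1')^2,$$
[4]
so that $v_n = v_n^- + v_n^+$. Define $N_n^-$ as the number of values of $X_i$ that do not exceed the sample mean and $N_n^+$ as the number of values of $X_i$ that (strictly) exceed the sample mean:
$$N_n^- := \#\{i : X_i \le M_1'\}, \qquad N_n^+ := \#\{i : X_i > M_1'\}.$$
[5]
Then, $N_n^- N_n^+ > 0$ unless $X_i = M_1'$ for all $i = 1, \ldots, n$. The local lower sample semivariance and the local upper sample semivariance are defined only when $N_n^- > 0$ and $N_n^+ > 0$, respectively, as
$$v_n^{-*} := \frac{1}{N_n^-}\sum_{i:\,X_i \le M_1'} (X_i - M_1')^2, \qquad v_n^{+*} := \frac{1}{N_n^+}\sum_{i:\,X_i > M_1'} (X_i - M_1')^2.$$
[6]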
The local upper sample semivariance $v_n^{+*}$ is the more mathematically challenging sequence to analyze because it depends on the asymptotic behavior of the number of observations that exceed the sample mean. Our result, Theorem 9, may be of independent interest in the study of heavy-tailed distributions.
For the remainder of this article, we assume two restrictions on $X$ without further restatement. First, we assume that $X$ takes only nonnegative values. Second, to assure that $\mathbb{P}(N_n^+ = 0) = 0$, we assume that $X$ is not atomic [i.e., for all real $a$, we assume that $\mathbb{P}(X = a) = 0$]. Then, $\mathbb{P}(N_n^+ = 0) = 0$ and conversely; for otherwise, if $\mathbb{P}(X = a) > 0$ for some $a$, then $\mathbb{P}(N_n^+ = 0) \ge \{\mathbb{P}(X = a)\}^n > 0$. Under the assumption that $X$ is not atomic, $\mathbb{P}(N_n^- N_n^+ > 0) = 1$, and $v_n^{-*}$ and $v_n^{+*}$ are well defined almost surely (a.s.); also, $v_n^- = N_n^- v_n^{-*}/(n-1)$, $v_n^+ = N_n^+ v_n^{+*}/(n-1)$, and $v_n = (N_n^- v_n^{-*} + N_n^+ v_n^{+*})/(n-1)$ a.s. The assumption that $X$ is not atomic also plays an important role in Theorems 5 and 8(3), Remark 2, and Corollaries 6(3) and 8.
Alternatively, we could assume that $X$ is not constant (i.e., not a degenerate random variable with all probability mass concentrated at a single value). If $X$ is atomic but not a constant, then $\mathbb{P}(N_n^- N_n^+ > 0) \to 1$ as $n \to \infty$, but $\mathbb{P}(N_n^- N_n^+ > 0) \ne 1$. Nevertheless, similar asymptotic results could still be proved.
The infinite sequences of random variables defined in Eqs. 1 to 6 (one random variable for each $n = 1, 2, \ldots$) exist a.s., whether or not $X$ has any finite moments. Our goal here is to show that, if $X$ is a stable distribution (or an approximately stable distribution under Definition 1) with support $(0,\infty)$ and index $\alpha \in (0,1)$, then as $n \to \infty$, the quantities in Eqs. 1 to 6 and other related quantities defined in section 3, when divided by some power $b$ of the sample mean $M_1'$, converge in distribution, in probability, or almost surely, depending on the case. Here, $b$ may depend on α and on which quantity is being examined.
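The definitions in Eqs. 1 to 6 translate directly into code. The following Python sketch is only an illustration of those definitions; the function name `sample_statistics`, the dictionary keys, and the four-point example sample are hypothetical choices, not part of the paper.

```python
from typing import Dict, Sequence


def sample_statistics(x: Sequence[float]) -> Dict[str, float]:
    """Sample quantities of Eqs. 1 to 6 for a sample x with n > 1."""
    n = len(x)
    if n < 2:
        raise ValueError("need a sample of size n > 1")
    mean = sum(x) / n  # sample mean (Eq. 1 with h = 1)
    # Squared deviations, split at the sample mean as in Eq. 4.
    lower = [(xi - mean) ** 2 for xi in x if xi <= mean]
    upper = [(xi - mean) ** 2 for xi in x if xi > mean]
    stats = {
        "mean": mean,
        "v": (sum(lower) + sum(upper)) / (n - 1),  # sample variance, Eq. 3
        "v_minus": sum(lower) / (n - 1),           # lower semivariance, Eq. 4
        "v_plus": sum(upper) / (n - 1),            # upper semivariance, Eq. 4
        "n_minus": len(lower),                     # count at or below the mean, Eq. 5
        "n_plus": len(upper),                      # count above the mean, Eq. 5
    }
    # Local semivariances (Eq. 6) are defined only when the counts are positive.
    if lower:
        stats["v_minus_local"] = sum(lower) / len(lower)
    if upper:
        stats["v_plus_local"] = sum(upper) / len(upper)
    return stats


s = sample_statistics([1.0, 2.0, 3.0, 10.0])
print(s)
```

In this small example the sample mean is 4, so three observations fall at or below the mean and one lies above it, and the variance splits exactly into its lower and upper parts.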

2. Background and Prior Results

Taylor’s law [Taylor (16)] says that the sample variance $v_n$ scales approximately in direct proportion to a nonzero power $b$ (positive or negative) of the sample mean $M_1'$. Taylor’s law is a widely confirmed empirical pattern in ecology and other sciences [Taylor (17)], nearly always with $b > 0$ and often with $b \in (1,2)$. Taylor’s law holds also for the mean and variance of some single-parameter probability distributions, in addition to holding for the sample mean and sample variance. For example, for varying values of the population mean $\mu$, the population variance $\sigma^2$ varies according to Taylor’s law $\sigma^2 = a\mu^b$ with $a = 1$, $b = 1$ for the Poisson distribution and $a = 1$, $b = 2$ for the exponential distribution.
The semivariances, especially the lower, have important applications in agricultural and financial economics [Berck and Hihn (19), Bond and Satchell (20), Hogan and Warren (21), Jin et al. (22), Liagkouras and Metaxiotis (23), Nantell and Price (24), Porter (25), Turvey and Nayak (26), and van de Beek et al. (27)]. We know no prior proofs that the sample semivariances of a nonnegative stable law satisfy Taylor’s law.
Higher moments include skewness and kurtosis in statistics and the Farinelli–Tibiletti ratio in finance. Power-law scaling relationships for moments other than the sample variance are generalized Taylor’s laws [Giometto et al. (28)]. Generalized Taylor’s laws are less widely studied empirically or theoretically.
Every stable random variable $X$ with support $(0,\infty)$ has Laplace transform [Feller (2), pp. 448–449]
$$L(s) := E(e^{-sX}) = e^{-(cs)^{\alpha}},$$
[7]
for $s \ge 0$, $0 < \alpha < 1$, and $c > 0$. We say that $X \overset{d}{=} F(c,\alpha)$ when the distribution of $X$ has Laplace transform Eq. 7, and then we say that $X$ has index α. We have $X \overset{d}{=} F(c,\alpha) \overset{d}{=} cF(1,\alpha)$. Such a heavy-tailed distribution has an infinite mean. Consequently, the sample mean, sample variance, sample semivariances, and sample higher moments are not estimators of population moments, and the normal central limit theorem does not apply.
If $X \overset{d}{=} F(c,\alpha)$ for some $0 < \alpha < 1$, $c > 0$, the survival function of $X$ evaluated at $t \in (0,\infty)$ is defined as $\bar{F}(c,\alpha)(t) := 1 - F(c,\alpha)(t)$. By Feller (2, p. 448), if $0 < \alpha < 1$ and $c > 0$, then as $t \to \infty$,
$$\bar{F}(c,\alpha)(t) \Big/ \frac{c^{\alpha} t^{-\alpha}}{\Gamma(1-\alpha)} \to 1.$$
[8]
Many distributions on $(0,\infty)$ satisfy Eq. 8 but are not of the special form $F(c,\alpha)$ in Eq. 7.
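Eqs. 7 and 8 can be checked by simulation. The Python sketch below draws from $F(1,\alpha)$ using Kanter's representation of a positive stable variable, which is an assumption of this sketch and not the simulation method used later in the paper; the function name `sample_positive_stable`, the seed, the sample size, and the threshold `t` are likewise hypothetical. It compares the empirical tail with $t^{-\alpha}/\Gamma(1-\alpha)$ and the empirical Laplace transform at $s = 1$ with $e^{-1}$.

```python
import math
import random


def sample_positive_stable(alpha: float, rng: random.Random) -> float:
    """One draw from the law with Laplace transform exp(-s**alpha), 0 < alpha < 1.

    Kanter's representation (assumed here): with U uniform on (0, pi) and
    E ~ Exp(1) independent,
    X = sin(alpha*U) / sin(U)**(1/alpha) * (sin((1-alpha)*U) / E)**((1-alpha)/alpha).
    Very small alpha may underflow or overflow in floating point.
    """
    u = math.pi * (1.0 - rng.random())  # lies in (0, pi], so sin(u) > 0 in floats
    e = 0.0
    while e == 0.0:                     # guard against an exact zero exponential draw
        e = rng.expovariate(1.0)
    return (math.sin(alpha * u) / math.sin(u) ** (1.0 / alpha)
            * (math.sin((1.0 - alpha) * u) / e) ** ((1.0 - alpha) / alpha))


rng = random.Random(2021)
alpha, n, t = 0.5, 200_000, 100.0
draws = [sample_positive_stable(alpha, rng) for _ in range(n)]

# Eq. 8 with c = 1: empirical survival probability over its asymptotic form.
empirical_tail = sum(d > t for d in draws) / n
tail_ratio = empirical_tail / (t ** -alpha / math.gamma(1.0 - alpha))

# Eq. 7 with c = 1 at s = 1: the Monte Carlo mean of exp(-X) estimates exp(-1).
laplace_at_1 = math.fsum(math.exp(-d) for d in draws) / n

print(f"tail ratio at t = {t}: {tail_ratio:.3f}")
print(f"mean of exp(-X): {laplace_at_1:.4f}, exp(-1): {math.exp(-1.0):.4f}")
```

A tail ratio near one at a moderately large threshold is what Eq. 8 leads one to expect; the agreement is only approximate at any finite `t` and finite sample size.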
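Definition 1.
$X \overset{d}{\approx} F(c,\alpha)$ and $F_X \overset{d}{\approx} F(c,\alpha)$ both mean that a nonnegative random variable $X$ has a distribution function $F_X$ that satisfies Eq. 8: that is, as $t \to \infty$,
$$\{1 - F_X(t)\} \Big/ \frac{c^{\alpha} t^{-\alpha}}{\Gamma(1-\alpha)} \to 1.$$
[9]
When Eq. 9 holds, we say that $X$ is approximately stable.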
For $\alpha \in (0,1)$ and real $g > \alpha$, $h > \alpha$, define
$$\alpha(g,h) := \frac{g-\alpha}{h-\alpha}, \qquad \alpha^* := \alpha(2,1) = \frac{2-\alpha}{1-\alpha}.$$
[10]
If $g > h$, then $\alpha(g,h) > g/h$. Consequently, $\alpha^* > 2$. If $g < h$, then $\alpha(g,h) < g/h < 1$. Thus if, as we shall prove below, $\alpha(g,h)$ is the exponent $b$ in Taylor’s law for a stable nonnegative law with index $\alpha \in (0,1)$ and if $g \ge 2h$ or $g < h$, then the exponent $b$ must fall outside the interval $(1,2)$ that is commonly (although not universally) observed in many ecological applications [Cohen et al. (29, 30)].
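The inequalities just stated are easy to verify for particular values. The short Python sketch below evaluates the exponent of Eq. 10 in exact rational arithmetic; the function name `taylor_exponent` and the choice $\alpha = 1/2$ are hypothetical and serve only as an example.

```python
from fractions import Fraction


def taylor_exponent(g, h, alpha):
    """Exponent alpha(g, h) = (g - alpha) / (h - alpha) of Eq. 10."""
    if not (0 < alpha < 1 and g > alpha and h > alpha):
        raise ValueError("need 0 < alpha < 1, g > alpha and h > alpha")
    return (g - alpha) / (h - alpha)


half = Fraction(1, 2)
alpha_star = taylor_exponent(2, 1, half)       # g > h: exceeds g/h = 2
below_one = taylor_exponent(1, 2, half)        # g < h: smaller than g/h = 1/2
print(alpha_star, below_one)
```

With $\alpha = 1/2$ the variance exponent $\alpha^*$ equals 3, which lies outside the interval $(1,2)$, in line with the remark above.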
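Among other results, BCD (ref. 15, p. 663, proposition 2) showed that if $X \overset{d}{=} F(1,\alpha)$, then as $n \to \infty$,
$$W_n := \frac{v_n}{(M_1')^{\alpha^*}} \xrightarrow{d} W,$$
[11]
where $E(W_n) = 1-\alpha$, $\operatorname{Var}(W_n) = \{E(W_n)\}^2\{1 + 2\alpha/(n-1)\}$, and the limiting random variable $W$ has $\mathbb{P}(0 < W < \infty) = 1$. $W$ has a finite mean and a finite SD, both of which equal $1-\alpha$. Moreover, for all $h = 1, 2, \ldots$, $E(W_n^h) \to E(W^h)$. The second and third moments of $W$ are
$$E(W^2) = 2\{E(W)\}^2, \qquad E(W^3) = \bigl(6 - \alpha(5-2\alpha)\bigr)\{E(W)\}^3,$$
[12]
while for an exponentially distributed random variable $Y$, $E(Y^3) = 6\{E(Y)\}^3$ (ref. 15, p. 666).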
For general $c > 0$ in Eq. 7, BCD showed that $v_n/(M_1')^{\alpha^*} \xrightarrow{d} c^{-\alpha/(1-\alpha)} W$, where $W$ is the limiting random variable in Eq. 11. Consequently, for any $c > 0$, BCD showed that as $n \to \infty$,
$$\frac{\log v_n}{\log M_1'} \xrightarrow{p} \alpha^*.$$
[13]
Thus, for large $n$, with arbitrarily high probability, $(\log v_n)/(\log M_1')$ will be close to $\alpha^*$, regardless of $c > 0$. This scaling relationship is an asymptotic form of Taylor’s law with exponent $b = \alpha^* > 2$. BCD further argued without detailed proofs that $X \overset{d}{\approx} F(c,\alpha)$ satisfies Eq. 13.
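The slow, logarithmic approach to the limit in Eq. 13 can be seen in a small simulation. The Python sketch below is illustrative only: it draws from $F(1,\alpha)$ through Kanter's representation (an assumption of this sketch, not the paper's simulation method), and the function names, seed, and sample sizes are hypothetical.

```python
import math
import random


def sample_positive_stable(alpha: float, rng: random.Random) -> float:
    """One draw with Laplace transform exp(-s**alpha), via Kanter's representation (assumed)."""
    u = math.pi * (1.0 - rng.random())  # lies in (0, pi]
    e = 0.0
    while e == 0.0:                     # guard against an exact zero exponential draw
        e = rng.expovariate(1.0)
    return (math.sin(alpha * u) / math.sin(u) ** (1.0 / alpha)
            * (math.sin((1.0 - alpha) * u) / e) ** ((1.0 - alpha) / alpha))


def log_variance_over_log_mean(x):
    """Ratio log(v_n) / log(sample mean) appearing in Eq. 13."""
    n = len(x)
    mean = math.fsum(x) / n
    variance = math.fsum((xi - mean) ** 2 for xi in x) / (n - 1)
    return math.log(variance) / math.log(mean)


alpha = 0.5
alpha_star = (2 - alpha) / (1 - alpha)  # limiting exponent, 3 when alpha = 1/2
rng = random.Random(13)
ratios = {}
for n in (1_000, 10_000, 100_000):
    sample = [sample_positive_stable(alpha, rng) for _ in range(n)]
    ratios[n] = log_variance_over_log_mean(sample)
    print(n, round(ratios[n], 3), "limit:", alpha_star)
```

Because the ratio differs from $\alpha^*$ by a term of order $1/\log M_1'$, individual runs can remain visibly away from the limit even for large $n$.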
A common sample statistic used to compare the effectiveness of investments is the well-known Sharpe ratio [Sharpe (31)] $(M_1' - r_f)/v_n^{1/2}$ for the period rates of return of a security, where $r_f$ is a zero-risk reference: for example, the London interbank offered rate. In signal processing, the Sharpe ratio (with $r_f = 0$) is a useful but biased estimator of the signal-to-noise ratio [Miller and Gehr (32)]. In statistics, the reciprocal of the Sharpe ratio (with $r_f = 0$) is called the coefficient of variation. If the period rate of return has a distribution $X \overset{d}{\approx} F(c,\alpha)$, where $0 < c < \infty$ and $0 < \alpha < 1$, then the Sharpe ratio converges in probability to zero as $n \to \infty$. Why? Eq. 11 implies that, as $n \to \infty$, $(M_1')^{\alpha^*}/v_n \xrightarrow{d} 1/W$, so $(M_1')^{\alpha^*/2}/v_n^{1/2} \xrightarrow{d} 1/W^{1/2}$. However, $(M_1')^{\alpha^*/2} = M_1' \times (M_1')^{(\alpha^*/2)-1}$, and because $\alpha^* > 2$ (as noted just after Eq. 10) and $M_1' \to \infty$ a.s. when the mean is infinite, the second factor $(M_1')^{(\alpha^*/2)-1}$ goes a.s. to $\infty$. Therefore, the Sharpe ratio $(M_1' - r_f)/v_n^{1/2}$ must converge in probability to zero. Asymptotically, for large $n$, the Sharpe ratio reveals no information about the distribution.
Inspired by Taylor’s law in Eq. 13, one may consider $\log(M_1' - r_f)/\log v_n$ as a modified financial ratio, which converges to $1/\alpha^* = (1-\alpha)/(2-\alpha)$ in probability. Because $(1-\alpha)/(2-\alpha)$ is decreasing in α over (0, 1), the smaller α is, the heavier the distribution, so the larger the risk. The original Sharpe ratio is quasiconcave, scale invariant, and distribution based [Eling et al. (33)]. The modified ratio is also distribution based and reveals the tail index α for large-enough $n$. Because of the logarithmic transformation, the modified ratio is not scale invariant. However, both numerator and denominator diverge to infinity. The effect of finite scaling becomes negligible for large sample sizes, and hence, the ratio is $F_\alpha$-asymptotically scale invariant. In other words, when $X \overset{d}{\approx} F(c,\alpha)$, the modified ratio is asymptotically invariant with respect to $c$. The modified Sharpe ratio is $F_\alpha$-asymptotically quasiconcave. The proof is in SI Appendix. Thus, asymptotically with large sample size $n$, the modified Sharpe ratio inherits all the properties of the original Sharpe ratio. We discuss this using semivariances and partial moments for the financial ratios in the following sections.

3. Taylor’s Laws for Semivariances

A. Lower Semivariances and Sortino Ratio

The lower semivariance of any nonnegative random variable with infinite expectation is almost surely asymptotic to the square of the sample mean.
Theorem 1
(Taylor’s law for the lower semivariance). Let $X$ be a nonnegative random variable with $E(X) = \infty$. Then, as $n \to \infty$,
$$\frac{v_n^-}{(M_1')^2} \xrightarrow{\text{a.s.}} 1.$$
[14]
This theorem does not assume X is stable or approximately stable.
The Sortino ratio [Sortino and Price (34)] is another sample statistic used to compare the risks and rewards in some period of a set of investments such as individual equities, mutual funds, trading systems, or investment managers. It is defined as $(M_1' - r_f)/s_d$, where $M_1'$ is the sample mean of the period rate of return $X$, $r_f$ is a threshold or reference point or target return, the zero-risk rate of return or minimal acceptable return, which we take to be zero, and $s_d := (v_n^-)^{1/2}$ is the downside risk, equal to the square root of the lower sample semivariance $v_n^-$ of the period rate of return [e.g., Sortino and Price (34) and Rollinger and Hoffman (35)]. Under our assumption that $\mathbb{P}(0 < X < \infty) = 1$, one might interpret $X$ as the ratio of final price to initial price, so that $0 < X < 1$ would represent a loss, while $X > 1$ would represent a gain. The possible use of $n$ instead of $n - 1$ in the denominator of Eq. 4 is immaterial for large samples. Eq. 14 shows that if the period rate of return $X$ is a nonnegative random variable with an infinite mean, then the Sortino ratio converges a.s. to one as $n \to \infty$. When the mean is infinite, asymptotically, for large $n$, the Sortino ratio reveals no information about the distribution.
Similar to our modified Sharpe ratio for heavy-tailed distributions, for the Sortino ratio, we consider the ratio between the logarithm of the sample mean minus $r_f$ and the logarithm of the sample lower semivariance, namely $\log(M_1' - r_f)/\log v_n^-$. Theorem 1 and Slutsky’s theorem imply that a power law with exponent 2 relates the lower semivariance to the sample mean. So Taylor’s law holds between the sample mean and the lower semivariance.
Corollary 1.
Let $X$ be a nonnegative random variable with $E(X) = \infty$. As $n \to \infty$,
$$\frac{\log v_n^-}{\log M_1'} \xrightarrow{\text{a.s.}} 2.$$
[15]
The modified Sortino ratio is $F_\alpha$-asymptotically quasiconcave and $F_\alpha$-asymptotically scale invariant, like the original Sortino ratio; proofs are in SI Appendix. However, from Corollary 1, the limiting value of the modified Sortino ratio is independent of the tail index α.
We now extend Taylor’s law to the local lower semivariance $v_n^{-*}$. The local lower semivariance differs from the lower semivariance by a factor equal to the ratio $N_n^-/n$. We show that $N_n^-/n \to 1$ almost surely if $E(X) = \infty$.
Lemma 1.
Let $X$ be a nonnegative random variable with $E(X) = \infty$. Then, with $N_n^-$ defined in Eq. 5, as $n \to \infty$,
$$\frac{N_n^-}{n} \xrightarrow{\text{a.s.}} 1.$$
[16]
Corollary 1 and Lemma 1 imply that a power law with exponent 2 relates the local lower semivariance to the sample mean.
Corollary 2.
Let $X$ be a nonnegative random variable with $E(X) = \infty$. Then, as $n \to \infty$,
$$\frac{\log v_n^{-*}}{\log M_1'} \xrightarrow{\text{a.s.}} 2.$$
[17]
If X is approximately stable with infinite expectation, then Lemma 1 and Corollaries 1 and 2 imply further results that will be useful later for studying the local upper semivariance and upper semivariance.
Corollary 3.
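Let $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$. Let $\alpha^* := (2-\alpha)/(1-\alpha)$ as defined in Eq. 10. Then, as $n \to \infty$,
$$\frac{v_n^-}{(M_1')^{\alpha^*}} \xrightarrow{\text{a.s.}} 0 \quad\text{and}\quad \frac{v_n^{-*}}{(M_1')^{\alpha^*}} \xrightarrow{\text{a.s.}} 0.$$
[18]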
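Theorem 1, Corollary 1, and Lemma 1 require only a nonnegative law with infinite mean, so they can be illustrated without a stable sampler. The Python sketch below assumes a Pareto law with $\mathbb{P}(X > t) = t^{-\alpha}$ for $t \ge 1$, chosen here only because it is nonnegative, nonatomic, and has infinite mean when $\alpha < 1$; the function name, seed, and sample size are hypothetical.

```python
import math
import random


def lower_semivariance_check(alpha: float, n: int, seed: int) -> dict:
    """Finite-sample versions of the ratios in Eqs. 14 to 16 for a Pareto sample."""
    rng = random.Random(seed)
    # Inverse-transform sampling: (1 - U)**(-1/alpha) has survival function t**(-alpha), t >= 1.
    x = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]
    mean = math.fsum(x) / n
    below = [xi for xi in x if xi <= mean]
    v_minus = math.fsum((xi - mean) ** 2 for xi in below) / (n - 1)
    return {
        "ratio": v_minus / mean ** 2,                       # Eq. 14, limit 1
        "log_ratio": math.log(v_minus) / math.log(mean),    # Eq. 15, limit 2
        "fraction_below": len(below) / n,                   # Eq. 16, limit 1
    }


result = lower_semivariance_check(alpha=0.5, n=20_000, seed=1)
print(result)
```

Nearly all observations fall below the sample mean, which is why the lower semivariance behaves like the square of the mean.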

B. Upper Semivariances

Although the asymptotic values of the ratios in Eqs. 15 and 17 are both two, which is independent of α, if one replaces the lower or local lower semivariances by the upper or local upper semivariances, respectively, Taylor’s law continues to hold, and it depends on α.
Theorem 2.
Let $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$. Then, as $n \to \infty$,
$$\frac{\log v_n^+}{\log M_1'} \xrightarrow{p} \alpha^* \quad\text{and}\quad \frac{\log v_n^{+*}}{\log M_1'} \xrightarrow{p} \alpha^* + \alpha = \frac{2-\alpha^2}{1-\alpha}.$$
[19]
Inspired by Taylor’s law in Eq. 19, one may consider ratios between the logarithm of the sample mean minus $r_f$ and the logarithm of either the sample upper or local upper semivariances, namely $\log(M_1' - r_f)/\log v_n^+$ and $\log(M_1' - r_f)/\log v_n^{+*}$, respectively, which converge in probability to $1/\alpha^* = (1-\alpha)/(2-\alpha)$ and $(1-\alpha)/(2-\alpha^2)$, respectively. Because $(1-\alpha)/(2-\alpha)$ and $(1-\alpha)/(2-\alpha^2)$ are both decreasing in α, the smaller α is, the heavier the distribution is, and the larger these ratios are asymptotically. The asymptotic properties and proofs are in SI Appendix, Proposition D.3.

4. Fluctuation Scaling for Higher Moments

In this section, we show that the sample higher moments are proportional to a power of the sample mean. These relations imply power-law relations between sample higher moments used in financial ratios such as the Farinelli–Tibiletti ratio (36).

A. Higher Sample Moments, Skewness, and Kurtosis

Theorem 3.
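If $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$, and $h > \alpha$, then, as $n \to \infty$,
$$\frac{M_h'}{(M_1')^{\alpha(h,1)}} \xrightarrow{d} \{\Gamma(1-\alpha)\}^{\frac{h-1}{1-\alpha}}\, \frac{U_h}{V^{\alpha(h,1)}},$$
where the random vector $(U_h, V)$ has the joint Laplace transform
$$E(e^{-sU_h - tV}) = \exp\left\{-\int_0^\infty \{r_h(y,s,t)\}^{-\alpha} e^{-y}\,dy\right\},$$
for $s, t, y > 0$, and $r_h(y,s,t)$ is the unique positive root $x$ of the equation $sx^h + tx - y = 0$.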
The ratio in Theorem 3 may not be a practically useful financial ratio since α is usually unknown. However, the following Theorem 4 and its corollaries heavily depend on it. The following remark uses the joint moment-generating function to give the marginal distributions of Uh and V.
Remark 1.
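In the joint Laplace transform defined in Theorem 3, if we set $t = 0$, then $r_h(y,s,0) = (y/s)^{1/h}$ and
$$E(e^{-sU_h}) = \exp\left\{-\int_0^\infty \{(y/s)^{1/h}\}^{-\alpha} e^{-y}\,dy\right\}.$$
Hence, $U_h$ follows the distribution $F(\{\Gamma(1-\alpha/h)\}^{h/\alpha}, \alpha/h)$. On the other hand, if we set $s = 0$, then $r_h(y,0,t) = y/t$ and
$$E(e^{-tV}) = \exp\left\{-\int_0^\infty (y/t)^{-\alpha} e^{-y}\,dy\right\}.$$
Hence, $V$ follows the distribution $F(\{\Gamma(1-\alpha)\}^{1/\alpha}, \alpha)$.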
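The identification of the marginal laws in Remark 1 rests on one gamma integral. As a worked step, assuming $h > \alpha$ so that the integral converges, and writing $c_h$ for the scale parameter (a symbol introduced here only for this calculation), the transform of $U_h$ reduces to the form of Eq. 7 with index $\alpha/h$:

```latex
\begin{aligned}
E\left(e^{-sU_h}\right)
  &= \exp\left\{-s^{\alpha/h}\int_0^\infty y^{-\alpha/h}\, e^{-y}\,dy\right\}
   = \exp\left\{-\Gamma(1-\alpha/h)\, s^{\alpha/h}\right\} \\
  &= \exp\left\{-(c_h s)^{\alpha/h}\right\},
  \qquad c_h = \{\Gamma(1-\alpha/h)\}^{h/\alpha}.
\end{aligned}
```

Setting $h = 1$ in the same calculation gives the transform $\exp\{-\Gamma(1-\alpha)\,t^{\alpha}\}$ of $V$, with scale $\{\Gamma(1-\alpha)\}^{1/\alpha}$ and index α.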
These results follow Albrecher et al. (ref. 37, remark 2.1) by the arguments in their proof. The following theorem shows that Taylor’s law holds for raw moments.
Theorem 4.
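If $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$, $h_1 > \alpha$, and $h_2 > \alpha$, then as $n \to \infty$,
$$\frac{\log M_{h_2}'}{\log M_{h_1}'} \xrightarrow{p} \alpha(h_2,h_1).$$
In particular, for $h > \alpha$, as $n \to \infty$,
$$\frac{\log M_h'}{\log M_1'} \xrightarrow{p} \alpha(h,1).$$
For a positive integer $h > 1$, the ratio between the central moment $M_h$ and the $\alpha(h,1)$ power of the sample mean $M_1'$ converges to a distribution given in Corollary 4.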
Corollary 4.
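If $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$, and $h > 1$ is a positive integer, then as $n \to \infty$,
$$\frac{M_h}{(M_1')^{\alpha(h,1)}} \xrightarrow{d} \{\Gamma(1-\alpha)\}^{\frac{h-1}{1-\alpha}}\, \frac{U_h}{V^{\alpha(h,1)}},$$
where the random vector $(U_h, V)$ is specified in Theorem 3.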
Theorem 5.
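If $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$, and $h > 1$ is a positive integer, then as $n \to \infty$,
$$\frac{\log |M_h|}{\log M_1'} \xrightarrow{p} \alpha(h,1).$$
For any positive integers $h_1 > 1$ and $h_2 > 1$, as $n \to \infty$,
$$\frac{\log |M_{h_2}|}{\log |M_{h_1}|} \xrightarrow{p} \alpha(h_2,h_1).$$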
For the raw moments, we have generalized Theorem 3 for the ratio of two raw moments with orders both larger than α.
Theorem 6.
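If $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$, and both $h_1, h_2 > \alpha$, then as $n \to \infty$,
$$\frac{M_{h_2}'}{(M_{h_1}')^{\alpha(h_2,h_1)}} \xrightarrow{d} \{\Gamma(1-\alpha)\}^{\frac{h_2-h_1}{h_1-\alpha}}\, \frac{U_{h_2}}{(U_{h_1})^{\alpha(h_2,h_1)}},$$
where $(U_{h_1}, U_{h_2})$ has the joint Laplace transform
$$E(e^{-sU_{h_2} - tU_{h_1}}) = \exp\left\{-\int_0^\infty \{r_{h_2,h_1}(y,s,t)\}^{-\alpha} e^{-y}\,dy\right\},$$
with $y > 0$, $s > 0$, $t > 0$, and $r_{h_2,h_1}(y,s,t)$ is the unique positive root $x$ of $sx^{h_2} + tx^{h_1} - y = 0$. Moreover, as $n \to \infty$,
$$\frac{\log M_{h_2}'}{\log M_{h_1}'} \xrightarrow{p} \alpha(h_2,h_1).$$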
Corollary 5.
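If $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$, and $h_2 \ge h_1 > 1$ are positive integers, then as $n \to \infty$,
$$n^{\frac{h_1-h_2}{h_1}}\, \frac{M_{h_2}}{(M_{h_1})^{h_2/h_1}} \xrightarrow{d} \frac{U_{h_2}}{(U_{h_1})^{h_2/h_1}},$$
where $(U_{h_1}, U_{h_2})$ is defined in Theorem 6.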
Remark 2.
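From Corollary 5, it is clear that the skewness $M_3/(v_n)^{3/2}$ and the kurtosis $M_4/(v_n)^2$ diverge to infinity, yet the scaled skewness and the scaled kurtosis have distributions, asymptotically as $n \to \infty$,
$$\frac{M_3}{n^{1/2}(v_n)^{3/2}} \xrightarrow{d} \frac{U_3}{(U_2)^{3/2}} \quad\text{and}\quad \frac{M_4}{n(v_n)^2} \xrightarrow{d} \frac{U_4}{(U_2)^2},$$
where the joint distributions of $(U_2, U_3)$ and $(U_2, U_4)$ are defined in Theorem 6. The limiting distribution of $M_4/\{n(v_n)^2\}$ matches the result derived in Cohen et al. (ref. 38, equation 3.9). Moreover, by Slutsky’s theorem, as $n \to \infty$,
$$\frac{\log |M_3|}{\log[(v_n)^{3/2}]} \xrightarrow{p} \frac{2}{3}\,\alpha(3,2) \quad\text{and}\quad \frac{\log M_4}{\log[(v_n)^2]} \xrightarrow{p} \frac{1}{2}\,\alpha(4,2).$$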

B. Central Lower and Local Lower Partial Moments

Definition 2.
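Define $c^+ := \max\{0, c\}$ for $c \in \mathbb{R}$. For $h > 0$, define
$$M_h^- := \frac{1}{n}\sum_{i=1}^{n} [(M_1' - X_i)^+]^h, \qquad M_h^{-*} := \frac{n M_h^-}{N_n^-}.$$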
Theorem 7.
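Let $X$ be a nonnegative random variable with $E(X) = \infty$, and let $h > 0$. Then, as $n \to \infty$,
$$\frac{M_h^-}{(M_1')^h} \xrightarrow{\text{a.s.}} 1 \quad\text{and}\quad \log M_h^- - h \log M_1' \xrightarrow{\text{a.s.}} 0.$$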
Corollary 6.
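Let $X$ be a nonnegative random variable with $E(X) = \infty$. Then, as $n \to \infty$,
1) $M_1^-/M_1' \xrightarrow{\text{a.s.}} 1$;
2) for $h > 1$, $M_h^-/(M_1')^{\alpha(h,1)} \xrightarrow{\text{a.s.}} 0$;
3) for $h > 0$,
$$\frac{\log M_h^-}{\log M_1'} \xrightarrow{\text{a.s.}} h \quad\text{and}\quad \frac{\log M_h^{-*}}{\log M_1'} \xrightarrow{\text{a.s.}} h.$$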

C. Central Upper Moments and Local Upper Moments

Definition 3.
For $h > 0$, define the $h$th central upper moments and central local upper moments:
$$M_h^+ := \frac{1}{n}\sum_{i=1}^{n} [(X_i - M_1')^+]^h, \qquad M_h^{+*} := \frac{n M_h^+}{N_n^+}.$$
Theorem 8
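(central upper moments). Let $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$. Then, as $n \to \infty$,
1) for $0 < h < 1$, $M_h^+/(M_1')^h \xrightarrow{p} 0$;
2) for $h \ge 1$,
$$\frac{M_h^+}{(M_1')^{\alpha(h,1)}} \xrightarrow{d} \{\Gamma(1-\alpha)\}^{\frac{h-1}{1-\alpha}}\, \frac{U_h}{V^{\alpha(h,1)}},$$
where the random vector $(U_h, V)$ has the joint Laplace transform defined in Theorem 3;
3) for $h \ge 1$,
$$\frac{\log M_h^+}{\log M_1'} \xrightarrow{p} \alpha(h,1) \quad\text{and}\quad \frac{\log M_h^{+*}}{\log M_1'} \xrightarrow{p} \frac{h-\alpha^2}{1-\alpha}.$$
[20]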

D. Omega Index, Upside Potential Ratio, and Farinelli–Tibiletti Ratio

Farinelli and Tibiletti (36) extended the Sharpe ratio to an index including asymmetrical information on the volatilities above and below the benchmark $r_f \in \mathbb{R}$. Their index $\Phi_{FT}$ is defined by
$$\Phi_{FT}(r_f,p,q) := \frac{\bigl\{E[\{(X - r_f)^+\}^p]\bigr\}^{1/p}}{\bigl\{E[\{(r_f - X)^+\}^q]\bigr\}^{1/q}}.$$
The Omega index, introduced by Cascon et al. (39), is $\Phi_{FT}(r_f,1,1)$, with $p = q = 1$. The upside potential ratio, introduced by Sortino et al. (40), is $\Phi_{FT}(r_f,1,2)$, with $p = 1$ and $q = 2$. The ratio $\Phi_{FT}(r_f,p,q)$ may not be well defined since the expectations may not exist for the heavy-tailed distributions. However, one can define an empirical version of the Farinelli–Tibiletti ratio by
$$\Phi_{FT}^{n}(r_f,p,q) := \frac{\left[\frac{1}{n}\sum_{i=1}^{n} \{(X_i - r_f)^+\}^p\right]^{1/p}}{\left[\frac{1}{n}\sum_{i=1}^{n} \{(r_f - X_i)^+\}^q\right]^{1/q}}.$$
The following corollary shows that both $\Phi_{FT}^{n}(r_f,p,q)$ and $\Phi_{FT}^{n}(M_1',p,q)$ converge to $\infty$ in probability.
Corollary 7.
If $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$, $r_f > 0$, $p > 1$, and $q > 1$, then as $n \to \infty$, $\Phi_{FT}^{n}(r_f,p,q) \xrightarrow{p} \infty$ and $\Phi_{FT}^{n}(M_1',p,q) \xrightarrow{p} \infty$.
A modification of the usual Farinelli–Tibiletti ratio might have the ratio of the logarithm of the numerator to the logarithm of the denominator in $\Phi_{FT}^{n}(r_f,p,q)$. However, for a fixed $r_f > 0$, the numerator converges to infinity in probability, while the denominator is bounded above with probability one. Therefore, this ratio diverges to infinity.
We propose as an alternative to the Farinelli–Tibiletti ratio:
$$\Phi_{FT}^{\log}(p,q) := \frac{p \log M_q^-}{q \log M_p^+},$$
which is the ratio of the logarithm of the denominator to that of the numerator in $\Phi_{FT}^{n}(M_1',p,q)$. The following corollary describes generalized Taylor’s laws for the ratio of the logarithm of the upper central moment to the logarithm of the lower central moment.
Corollary 8.
If $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$, $p \ge 1$, and $q \ge 1$, then as $n \to \infty$,
$$\frac{\log M_p^+}{\log M_q^-} \xrightarrow{p} \frac{p-\alpha}{q(1-\alpha)}.$$
Corollary 8 implies that
$$\Phi_{FT}^{\log}(p,q) \xrightarrow{p} \frac{p(1-\alpha)}{p-\alpha},$$
which is decreasing in α for $p \ge 1$, $q \ge 1$. Therefore, the smaller α is, the heavier the distribution is, and the larger the risk is. Our modified Farinelli–Tibiletti ratio $\Phi_{FT}^{\log}(p,q)$ is asymptotically scale invariant and distribution based, like the original Farinelli–Tibiletti ratio, and satisfies $F_\alpha$-asymptotic quasiconcavity (SI Appendix).
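The empirical Farinelli–Tibiletti ratio and the modified logarithmic ratio are simple functions of the upper and lower partial moments. The Python sketch below is a hedged illustration of the definitions in this section; the function names and the four-point sample of returns are hypothetical, and the code assumes the threshold lies strictly inside the range of the data so that neither partial moment is zero.

```python
import math
from typing import Sequence


def upper_partial_moment(x: Sequence[float], threshold: float, p: float) -> float:
    """Mean of ((x_i - threshold)^+)^p over the sample."""
    return sum(max(xi - threshold, 0.0) ** p for xi in x) / len(x)


def lower_partial_moment(x: Sequence[float], threshold: float, q: float) -> float:
    """Mean of ((threshold - x_i)^+)^q over the sample."""
    return sum(max(threshold - xi, 0.0) ** q for xi in x) / len(x)


def empirical_ft_ratio(x: Sequence[float], r_f: float, p: float, q: float) -> float:
    """Empirical Farinelli-Tibiletti ratio with benchmark r_f."""
    return (upper_partial_moment(x, r_f, p) ** (1.0 / p)
            / lower_partial_moment(x, r_f, q) ** (1.0 / q))


def modified_ft_log_ratio(x: Sequence[float], p: float, q: float) -> float:
    """Modified ratio p*log(M_q^-) / (q*log(M_p^+)), with the sample mean as benchmark."""
    mean = sum(x) / len(x)
    return (p * math.log(lower_partial_moment(x, mean, q))
            / (q * math.log(upper_partial_moment(x, mean, p))))


returns = [1.0, 2.0, 3.0, 10.0]                           # sample mean is 4
omega_at_mean = empirical_ft_ratio(returns, 4.0, 1, 1)    # Omega index, p = q = 1
upside_potential = empirical_ft_ratio(returns, 4.0, 1, 2) # p = 1, q = 2
modified = modified_ft_log_ratio(returns, 2, 2)
print(omega_at_mean, upside_potential, modified)
```

With the sample mean as benchmark and $p = q = 1$, the upper and lower first partial moments coincide, so the Omega index equals one for any sample.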

5. Number of Observations Exceeding Sample Mean of Stable Law

A. Asymptotic Distributions and Moments of $N_n^+/n^{\alpha}$

In a sample of size $n$ from an approximately stable law with index $\alpha \in (0,1)$, asymptotically the number of observations above the sample mean scales as $n^{\alpha}$ and has a distribution given by Theorem 9. To prove this result, we use Einmahl (ref. 41, corollary 2.1) together with SI Appendix, Lemma C.1.
Theorem 9.
If $X \overset{d}{\approx} F(1,\alpha)$, $0 < \alpha < 1$, and $U \overset{d}{=} F(1,\alpha)$, then as $n \to \infty$,
$$\frac{N_n^+}{n^{\alpha}} \xrightarrow{d} V := \frac{U^{-\alpha}}{\Gamma(1-\alpha)}.$$
The asymptotic moments of $N_n^+/n^{\alpha}$ are the moments of $V$ defined in Theorems 9 and 10.
Theorem 10.
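Let $U \overset{d}{=} F(1,\alpha)$, $0 < \alpha < 1$, $V := U^{-\alpha}/\Gamma(1-\alpha)$, and $\varepsilon \overset{d}{=} \mathrm{Exp}(1)$ (an exponential random variable with mean and parameter 1), where $\varepsilon$ is independent of $U$.
1) $U^{-\alpha}\varepsilon^{\alpha} \overset{d}{=} \mathrm{Exp}(1)$.
2) For integer $K > 0$,
$$E[U^{-K\alpha}] = \frac{K!}{\Gamma(1+K\alpha)}, \qquad E[V^K] = \frac{K!}{\Gamma(1+K\alpha)\{\Gamma(1-\alpha)\}^K}.$$
Specifically, when $K = 1$, then $E[U^{-\alpha}] = \{\Gamma(1+\alpha)\}^{-1}$ and $E[V] = \{\Gamma(1+\alpha)\Gamma(1-\alpha)\}^{-1}$; when $K = 2$, then $E[U^{-2\alpha}] = 2\{\Gamma(1+2\alpha)\}^{-1}$, $E[V^2] = 2\{\Gamma(1+2\alpha)\{\Gamma(1-\alpha)\}^2\}^{-1}$. Hence
$$\operatorname{Var}(U^{-\alpha}) = \frac{2}{\Gamma(1+2\alpha)} - \frac{1}{\{\Gamma(1+\alpha)\}^2}, \qquad \operatorname{Var}(V) = \frac{1}{\{\Gamma(1-\alpha)\}^2}\operatorname{Var}(U^{-\alpha}).$$
3) $\mathrm{SD}(V) < E[V]$. For example, when $\alpha = 1/2$, $E[V^2] = 2/\pi$, $E[V] = 2/\pi$, $\operatorname{Var}(V) = \frac{2}{\pi}\left(1 - \frac{2}{\pi}\right)$. Numerically, $\mathrm{SD}(V) \approx 0.48097$, $E[V] \approx 0.63662$, where here “$\approx$” means the numerical approximation is inexact.
4) For $K \ge 2$, $E[V^K] < K!\,(E[V])^K$.
5) $V \le_{st} \varepsilon$ [i.e., by the definition of the stochastic ordering $\le_{st}$, $\mathbb{P}(V > t) \le \mathbb{P}(\varepsilon > t)$ for all $t \in \mathbb{R}$].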
Part 1 of Theorem 10 is not well known. The moment results in part 2 of Theorem 10 are derived using fractional calculus by Wolfe (42). Because the logarithm of the moment-generating function of a nonnegative random variable is a convex function of the moment (by Artin’s theorem) [Marshall and Olkin (ref. 43, theorem B.8)], it follows that $\log E(U^{-x\alpha}) = \log\Gamma(1+x) - \log E(W^x)$ is concave in $x \in [1,\infty)$.
The distribution of $U^{-\alpha}$ approximates the standard exponential distribution $\mathrm{Exp}(1)$ when $\alpha \to 0$.
Corollary 9.
Let $U \overset{d}{=} F(1,\alpha)$. Then, as $\alpha \to 0$,
$$U^{-\alpha} \xrightarrow{d} \mathrm{Exp}(1).$$
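The moment formulas of Theorem 10 are explicit in gamma functions, so the numerical values quoted in part 3 can be reproduced directly. The Python sketch below does so for $\alpha = 1/2$; the function name `moment_v` is hypothetical.

```python
import math


def moment_v(k: int, alpha: float) -> float:
    """E[V**k] = k! / (Gamma(1 + k*alpha) * Gamma(1 - alpha)**k), Theorem 10, part 2."""
    return math.factorial(k) / (math.gamma(1.0 + k * alpha) * math.gamma(1.0 - alpha) ** k)


alpha = 0.5
mean_v = moment_v(1, alpha)                            # equals 2/pi when alpha = 1/2
sd_v = math.sqrt(moment_v(2, alpha) - mean_v ** 2)     # square root of Var(V)
print(f"E[V] = {mean_v:.5f}, SD(V) = {sd_v:.5f}")

# Part 4: E[V**K] < K! * E[V]**K for K >= 2, checked here on a small grid of alpha.
for a in (0.1, 0.3, 0.5, 0.7, 0.9):
    for k in range(2, 6):
        assert moment_v(k, a) < math.factorial(k) * moment_v(1, a) ** k
```

The inequality in part 4 says that every moment of $V$ of order two or more is smaller than the corresponding moment of an exponential variable with the same mean.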

6. Numerical Experiments

A. Tail Estimators

The preceding results describe the asymptotic ratio of the logarithm of various forms of the sample variance, such as the ordinary sample variance $v_n$, the upper semivariance $v_n^+$, the local upper semivariance $v_n^{+*}$, and the lower semivariance $v_n^-$, to the logarithm of the sample mean when a random sample is from an approximately stable $F(1,\alpha)$ satisfying Eq. 9. Most of these ratios (apart from that for the lower semivariance) depend asymptotically only on α. Based on these results, we propose estimators of the index α. We define the ratios $R_1$, $R_2$, $R_3$, and $R_L$ where
$$R_1 := \frac{\log v_n}{\log M_1'} \xrightarrow{p} \frac{2-\alpha}{1-\alpha}, \quad R_2 := \frac{\log v_n^+}{\log M_1'} \xrightarrow{p} \frac{2-\alpha}{1-\alpha}, \quad R_3 := \frac{\log v_n^{+*}}{\log M_1'} \xrightarrow{p} \frac{2-\alpha^2}{1-\alpha}, \quad R_L := \frac{\log v_n^-}{\log M_1'} \xrightarrow{\text{a.s.}} 2.$$
The results generalize to $F(c,\alpha)$ for $c > 0$ because as noted after Eq. 9, $X/c \overset{d}{\approx} F(1,\alpha)$ if and only if $X \overset{d}{\approx} F(c,\alpha)$ for $c > 0$. Applying the continuous mapping theorem to the above results for the variance, the upper semivariance, and the local upper semivariance yields three consistent estimators of α:
$$B_1 := \frac{2-R_1}{1-R_1}, \qquad B_2 := \frac{2-R_2}{1-R_2}, \qquad B_3 := \frac{R_3 - \sqrt{R_3^2 - 4(R_3-2)}}{2}.$$
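The three estimators are obtained by inverting the limiting exponents: solving $R = (2-\alpha)/(1-\alpha)$ for α gives $(2-R)/(1-R)$, and solving $R_3 = (2-\alpha^2)/(1-\alpha)$ gives the smaller root of $\alpha^2 - R_3\alpha + (R_3 - 2) = 0$. The Python sketch below implements these inversions and checks that they recover α from the exact limits; the function names are hypothetical, and the sample-based function assumes a positive sample whose mean differs from 1 so that its logarithm is nonzero.

```python
import math
from typing import Dict, Sequence


def alpha_from_variance_ratio(r: float) -> float:
    """Invert r = (2 - alpha) / (1 - alpha); used for both B1 and B2."""
    return (2.0 - r) / (1.0 - r)


def alpha_from_local_ratio(r3: float) -> float:
    """Smaller root of alpha**2 - r3*alpha + (r3 - 2) = 0; this is B3."""
    return (r3 - math.sqrt(r3 * r3 - 4.0 * (r3 - 2.0))) / 2.0


def taylor_estimators(x: Sequence[float]) -> Dict[str, float]:
    """B1, B2, B3 computed from one sample (assumes mean > 0 and mean != 1)."""
    n = len(x)
    mean = math.fsum(x) / n
    upper = [(xi - mean) ** 2 for xi in x if xi > mean]
    lower_sum = math.fsum((xi - mean) ** 2 for xi in x if xi <= mean)
    log_mean = math.log(mean)
    r1 = math.log((lower_sum + math.fsum(upper)) / (n - 1)) / log_mean  # variance
    r2 = math.log(math.fsum(upper) / (n - 1)) / log_mean                # upper semivariance
    r3 = math.log(math.fsum(upper) / len(upper)) / log_mean             # local upper semivariance
    return {
        "B1": alpha_from_variance_ratio(r1),
        "B2": alpha_from_variance_ratio(r2),
        "B3": alpha_from_local_ratio(r3),
    }


# Round trip: feeding the exact limits back into the inversions recovers alpha.
for a in (0.1, 0.5, 0.9):
    assert abs(alpha_from_variance_ratio((2 - a) / (1 - a)) - a) < 1e-9
    assert abs(alpha_from_local_ratio((2 - a * a) / (1 - a)) - a) < 1e-9

estimates = taylor_estimators([1.0, 2.0, 3.0, 10.0])
print(estimates)
```

The discriminant in `alpha_from_local_ratio` equals $(R_3 - 2)^2 + 4$, so the square root is always real, and the smaller root is always below one.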
The Hill estimator [Hill (44)] is a traditional tail-index estimator, which requires the largest $k$ observations where $k \to \infty$ and $k/n \to 0$ as $n \to \infty$. However, $k$ depends on the unknown parameters such as α and the series representation of the survival function [Hall (45)]. In practice, the number $k$ is based on the “stable” point in the Hill plot, which may not always be available. Gomes and Guillou (46) give a comprehensive review.
Theorem 9 implies that $N_n^+/n$ converges to zero in probability, which motivates the choice of $k = N_n^+ + 1$ in the Hill estimator:
$$\left(\frac{1}{k}\sum_{i=n-k+1}^{n} \log(X_{(i)}) - \log(X_{(n-k+1)})\right)^{-1},$$
where $X_{(i)}$ is the $i$th-order statistic, $1 \le i \le n$. We evaluate this choice of $k = N_n^+ + 1$ in the Hill estimator, denoted by HI.N, numerically. We also replace the smallest $(n-k)$ order statistics in the original Hill estimator by the sample mean $M_1'$ to obtain a new Hill-type estimator:
$$\text{HI.M} := \left(\frac{1}{N_n^+}\sum_{X_i > M_1'} \log(X_i/M_1')\right)^{-1}.$$
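From Bergström (47), the survival function of the stable law for $0 < \alpha < 1$ is
$$\bar{F}(1,\alpha)(x) = \int_x^\infty -\frac{1}{\pi}\sum_{k=1}^{\infty} \frac{(-1)^k}{k!}\,(\sin \pi\alpha k)\,\frac{\Gamma(\alpha k+1)}{t^{\alpha k+1}}\,dt = \frac{1}{\pi}\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k!}\,(\sin \pi\alpha k)\,\frac{\Gamma(\alpha k)}{x^{\alpha k}} = Cx^{-\alpha}\left[1 + Dx^{-\alpha} + o(x^{-\alpha})\right],$$
where $C > 0$ and $D \ne 0$. From Hall (45), it is optimal to choose $k$ tending to infinity at a rate of order $n^{2\alpha/(2\alpha+\alpha)} = n^{2/3}$. We also consider this choice $k = n^{2/3}$ for another Hill-type estimator, denoted by HI.Opt, and we compare the behavior with other estimators.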
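The three Hill-type estimators differ only in the choice of threshold. The Python sketch below implements them as written above, under two assumptions that are not in the text: $k = n^{2/3}$ is rounded to the nearest integer, and the demonstration sample is Pareto with $\mathbb{P}(X > t) = t^{-\alpha}$, $t \ge 1$, rather than exactly stable. The function names and seed are hypothetical.

```python
import math
import random
from typing import Dict, Sequence


def hill(x: Sequence[float], k: int) -> float:
    """Hill estimator from the k largest order statistics (positive data, 1 < k <= n)."""
    top = sorted(x)[-k:]                      # X_(n-k+1) <= ... <= X_(n)
    log_threshold = math.log(top[0])          # log X_(n-k+1)
    return 1.0 / (math.fsum(math.log(xi) - log_threshold for xi in top) / k)


def hill_variants(x: Sequence[float]) -> Dict[str, float]:
    """HI.N, HI.M and HI.Opt for one positive sample."""
    n = len(x)
    mean = math.fsum(x) / n
    exceed = [xi for xi in x if xi > mean]    # the N_n^+ observations above the mean
    return {
        "HI.N": hill(x, len(exceed) + 1),
        "HI.M": 1.0 / (math.fsum(math.log(xi / mean) for xi in exceed) / len(exceed)),
        "HI.Opt": hill(x, round(n ** (2.0 / 3.0))),  # rounding is an assumption of this sketch
    }


rng = random.Random(0)
alpha = 0.5
pareto = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(10_000)]
estimates = hill_variants(pareto)
print(estimates)
```

For an exactly Pareto sample the Hill estimator has no threshold bias, so this demonstration checks the implementation rather than reproducing the bias pattern reported for stable samples.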
In our simulations, we generate $10^4$ independent random samples, each with sample size $n$, from $F(1,\alpha)$ by using the rstable function from the R package stabledist with arguments for the tail-index parameter alpha $= \alpha$, the skewness parameter beta $= 1$, the scale parameter gamma $= |1 - i\tan(\pi\alpha/2)|^{-1/\alpha}$, the location parameter delta $= 0$, and parameterization pm $= 1$. Setting pm $= 1$ specifies that we use the parameterization of stable laws in Samorodnitsky and Taqqu (4). For each random sample, we calculate the six estimators $B_1$, $B_2$, $B_3$, HI.N, HI.M, and HI.Opt. Then, we estimate the bias as the average of the $10^4$ differences between each estimator of α and the true α. We estimate the mean squared error (MSE) as the average of $10^4$ squared differences between each estimator of α and the true α.
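The bias and MSE computation just described can be sketched at a much smaller scale. The Python code below is an illustration only and does not reproduce the reported tables: it uses Kanter's representation instead of rstable to draw from $F(1,\alpha)$, evaluates only $B_3$, and uses far fewer and smaller samples; all function names, the seed, and the replicate counts are hypothetical.

```python
import math
import random


def sample_positive_stable(alpha: float, rng: random.Random) -> float:
    """One draw with Laplace transform exp(-s**alpha), via Kanter's representation (assumed)."""
    u = math.pi * (1.0 - rng.random())  # lies in (0, pi]
    e = 0.0
    while e == 0.0:                     # guard against an exact zero exponential draw
        e = rng.expovariate(1.0)
    return (math.sin(alpha * u) / math.sin(u) ** (1.0 / alpha)
            * (math.sin((1.0 - alpha) * u) / e) ** ((1.0 - alpha) / alpha))


def b3_estimate(x) -> float:
    """B3 from the local upper semivariance and the sample mean."""
    mean = math.fsum(x) / len(x)
    upper = [(xi - mean) ** 2 for xi in x if xi > mean]
    r3 = math.log(math.fsum(upper) / len(upper)) / math.log(mean)
    return (r3 - math.sqrt(r3 * r3 - 4.0 * (r3 - 2.0))) / 2.0


def bias_and_mse(alpha: float, n: int, replicates: int, seed: int):
    """Monte Carlo bias and MSE of B3: averages of errors and of squared errors."""
    rng = random.Random(seed)
    errors = []
    for _ in range(replicates):
        sample = [sample_positive_stable(alpha, rng) for _ in range(n)]
        errors.append(b3_estimate(sample) - alpha)
    bias = math.fsum(errors) / replicates
    mse = math.fsum(err * err for err in errors) / replicates
    return bias, mse


bias, mse = bias_and_mse(alpha=0.5, n=1_000, replicates=200, seed=7)
print(f"bias = {bias:.4f}, MSE = {mse:.5f}")
```

Replacing `b3_estimate` with any other estimator of α gives the corresponding bias and MSE under the same design.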
In Table 1 for bias and Table 2 for MSE, the sample size is $n = 10^4$. According to the bias estimates in Table 1, $B_1$ tends to underestimate α, while $B_2$ and $B_3$ reduce the bias from $B_1$ by introducing the upper semivariance, which focuses more on larger numbers. $B_3$ has smaller bias than $B_2$ for most of the α except $\alpha = 0.7$ and $0.8$. In Table 2, $B_3$ has smaller MSE than $B_1$ and $B_2$. Estimators HI.N and HI.M do not perform as well as $B_3$.
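Table 1
Bias ($\times 10^3$; average of [estimate minus true α]) for tail-index estimators $B_1$, $B_2$, $B_3$, HI.N, HI.M, HI.Opt, and MHB3 with sample size $n = 10^4$ from $F(1,\alpha)$

| α | $B_1$ | $B_2$ | $B_3$ | HI.N | HI.M | HI.Opt | MHB3 |
|---|---|---|---|---|---|---|---|
| 0.1 | −5.24 | −3.87 | −3.00 | 10.25 | 135.16 | −0.92 | −5.82 |
| 0.2 | −11.96 | −6.88 | −3.79 | −9.31 | 73.52 | −1.73 | −9.65 |
| 0.3 | −19.43 | −8.55 | −2.38 | −25.60 | 30.89 | −2.05 | −12.03 |
| 0.4 | −27.72 | −9.75 | 0.63 | −32.82 | 4.87 | −1.54 | −13.44 |
| 0.5 | −35.03 | −8.91 | 5.96 | −29.40 | −5.30 | 1.42 | −12.56 |
| 0.6 | −43.76 | −10.44 | 9.41 | −24.21 | −8.21 | 6.67 | −10.26 |
| 0.7 | −50.27 | −11.28 | 12.19 | −10.06 | 0.13 | 19.37 | −3.26 |
| 0.8 | −53.49 | −12.55 | 11.48 | 31.58 | 37.82 | 51.80 | 7.30 |
| 0.9 | −50.31 | −13.69 | 5.46 | 204.26 | 208.27 | 153.44 | 5.45 |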
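Table 2
MSE ($\times 10^3$) (mean squared [estimate minus true α]) for tail-index estimators $B_1$, $B_2$, $B_3$, HI.N, HI.M, HI.Opt, and MHB3 with sample size $n = 10^4$ from $F(1,\alpha)$

| α | $B_1$ | $B_2$ | $B_3$ | HI.N | HI.M | HI.Opt | MHB3 |
|---|---|---|---|---|---|---|---|
| 0.1 | 0.14 | 0.15 | 0.11 | 2.61 | 20.06 | 0.02 | 0.10 |
| 0.2 | 0.53 | 0.58 | 0.35 | 4.73 | 9.13 | 0.09 | 0.31 |
| 0.3 | 1.13 | 1.23 | 0.71 | 7.33 | 6.31 | 0.19 | 0.57 |
| 0.4 | 1.86 | 1.96 | 1.16 | 9.15 | 6.39 | 0.34 | 0.85 |
| 0.5 | 2.60 | 2.66 | 1.76 | 9.12 | 6.76 | 0.54 | 1.15 |
| 0.6 | 3.47 | 3.20 | 2.32 | 7.93 | 6.13 | 0.84 | 1.53 |
| 0.7 | 4.15 | 3.38 | 2.60 | 6.46 | 5.30 | 1.52 | 1.94 |
| 0.8 | 4.32 | 3.05 | 2.28 | 6.97 | 6.67 | 4.35 | 2.13 |
| 0.9 | 3.59 | 2.10 | 1.33 | 57.51 | 58.26 | 26.36 | 1.33 |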
The estimator HI.Opt with the optimal choice of k=n2/3 for the Hill estimator has the smallest bias, when α0.6, and MSE, when α0.7. However, B3 from Taylor’s law of the local semivariance has better performance, especially much smaller bias, than HI.Opt for α0.8. Since HI.Opt tends to overestimate α, especially when α0.7, we defined the estimator MHB3 to be the minimum of B3 and HI.Opt. This MHB3 not only reduces the bias dramatically but also improves the MSE of B3 for α close to 1.
The advantages of B3 and MHB3 gradually vanish as the sample size increases because $k=n^{2/3}$ is an asymptotically optimal choice. However, for sample sizes smaller than $10^4$, B3 and MHB3 can improve on HI.Opt even more. More comparisons are in SI Appendix for sample sizes $n=10^2$, $10^3$, and $10^5$. On the other hand, although the behavior of B1, B2, and B3 depends on c in F(c,α), one sees similar patterns in bias and MSE. B3 and MHB3 still have better bias and MSE for α ≥ 0.8 for small sample sizes. More comparisons are in SI Appendix for F(2,α) and F(0.5,α).
Tables in SI Appendix also show that both bias and MSE decrease when sample size increases, as expected of consistent estimators and as proved in Corollary 1.

B. Asymptotic Distribution of $N_n^+/n^\alpha$

To illustrate Theorem 9, we generate $10^3$ independent random samples from F(1,α) with sample size $n=10^6$ and calculate $N_n^+/n^\alpha$ for each random sample. We use the $10^3$ values of $N_n^+/n^\alpha$ to estimate the distribution of $N_n^+/n^\alpha$. To estimate the distribution of $U^{-\alpha}/\Gamma(1-\alpha)$, we generate $10^3$ independent random values $U_1,\ldots,U_{10^3}$ from F(1,α) and calculate the corresponding $U_i^{-\alpha}/\Gamma(1-\alpha)$ for $i=1,\ldots,10^3$. Then, we use the $10^3$ values of $U_i^{-\alpha}/\Gamma(1-\alpha)$ to estimate the distribution of $U^{-\alpha}/\Gamma(1-\alpha)$. The histograms and quantile–quantile plots of $N_n^+/n^\alpha$ and $U^{-\alpha}/\Gamma(1-\alpha)$ with α=0.25 and α=0.5 are in Figs. 1 and 2, respectively. The histograms mostly overlap. The P values of the two-sample Kolmogorov–Smirnov (KS) test are 0.1995 and 0.9135, respectively. These observations support the convergence of $N_n^+/n^\alpha$ in distribution.
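A scaled-down version of this comparison can be sketched in Python. It is an illustration only, under these assumptions: F(1,α) is taken to be the law with Laplace transform $\exp(-s^\alpha)$ and is sampled with Kanter's representation instead of rstable, the sample size and number of replications are far smaller than in the paper, and the code reports only the two-sample KS statistic (not a P value). All function names are hypothetical.

```python
import math
import random


def sample_one_sided_stable(alpha, rng):
    """One draw with Laplace transform exp(-s**alpha), via Kanter's formula."""
    u = math.pi * (1.0 - rng.random())
    e = rng.expovariate(1.0)
    return (math.sin(alpha * u) / math.sin(u) ** (1.0 / alpha)
            * (math.sin((1.0 - alpha) * u) / e) ** ((1.0 - alpha) / alpha))


def scaled_exceedance_count(alpha, n, rng):
    """N_n^+ / n^alpha, where N_n^+ counts observations above the sample mean."""
    xs = [sample_one_sided_stable(alpha, rng) for _ in range(n)]
    mean = sum(xs) / n
    return sum(1 for x in xs if x > mean) / n ** alpha


def limit_draw(alpha, rng):
    """One draw of U^(-alpha) / Gamma(1 - alpha) with U from the same law."""
    u = sample_one_sided_stable(alpha, rng)
    return u ** (-alpha) / math.gamma(1.0 - alpha)


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


rng = random.Random(2021)
alpha, n, reps = 0.5, 2000, 200   # far smaller than the paper's n = 10^6
simulated = [scaled_exceedance_count(alpha, n, rng) for _ in range(reps)]
limiting = [limit_draw(alpha, rng) for _ in range(reps)]
print(ks_statistic(simulated, limiting))
```

A small KS statistic indicates that the two empirical distributions are close; reproducing the reported P values would require the paper's sample sizes and a KS test routine.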
Fig. 1
Histogram and quantile–quantile plot of $N_n^+/n^\alpha$ and $U^{-\alpha}/\Gamma(1-\alpha)$ for α=0.25. The P value of the KS test is 0.1995.
Fig. 2
Histogram and quantile–quantile plot of $N_n^+/n^\alpha$ and $U^{-\alpha}/\Gamma(1-\alpha)$ for α=0.50. The P value of the KS test is 0.9135.
As expected, the speed of convergence of $N_n^+/n^\alpha$ in Theorem 9 depends on α. Similarly, the speeds of convergence of the moment ratios in Theorems 3 and 6 depend on both α and the orders of the moments. We discuss the sample sizes required to see the convergence in distribution in Theorems 3, 6, and 9 in SI Appendix. In our simulation results, smaller α and higher-order moments result in faster convergence in distribution for the ratios of the moments.
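The normalization by $n^\alpha$ in Theorem 9 can be motivated by a back-of-envelope calculation. The sketch below is a heuristic, not the proof. It assumes that F(1,α) has Laplace transform $\exp(-s^\alpha)$, so that the sum $S_n$ of $n$ observations has the same distribution as $n^{1/\alpha}U$ with $U$ from F(1,α), and that the tail satisfies $P(X>x)\sim x^{-\alpha}/\Gamma(1-\alpha)$ as $x\to\infty$. It ignores the dependence between the observations and their sample mean.

```latex
\[
N_n^+ \approx n\,P\!\left(X > \frac{S_n}{n}\right)
\approx n\,\frac{\left(n^{1/\alpha-1}U\right)^{-\alpha}}{\Gamma(1-\alpha)}
= \frac{n^{\alpha}\,U^{-\alpha}}{\Gamma(1-\alpha)},
\qquad\text{so}\qquad
\frac{N_n^+}{n^{\alpha}} \approx \frac{U^{-\alpha}}{\Gamma(1-\alpha)}.
\]
```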

Notes

*
Fα-asymptotic scale invariance is defined in SI Appendix, section D.
Fα-asymptotic quasiconcavity is defined in SI Appendix, section D.

Data Availability

Computer code has been deposited in GitHub (https://github.com/cftang9/TLHM). Readers can generate the tables and figures using the R code there.

Acknowledgments

J.E.C. thanks Roseanne Benjamin for help during this work. S.C.P.Y. acknowledges financial support from Hong Kong General Research Fund Grants HKGRF-14300717 “New Kinds of Forward-Backward Stochastic Systems with Applications,” HKGRF-14300319 “Shape-Constrained Inference: Testing for Monotonicity,” and HKGRF-14301321 “General Theory for Infinite Dimensional Stochastic Control: Mean Field and Some Classical Problems.”

Supporting Information

Appendix 01 (PDF)

References

1. R. Carmona, “Heavy tail distributions” in Statistical Analysis of Financial Data in R (Springer, New York, NY, ed. 2, 2014), chap. 2, 69–120.
2. W. Feller, An Introduction to Probability Theory and Its Applications (John Wiley & Sons, Inc., New York, NY, 1971), vol. 2.
3. S. I. Resnick, Heavy-Tail Phenomena: Probabilistic and Statistical Modeling (Springer Science & Business Media, 2007).
4. G. Samorodnitsky, M. S. Taqqu, Stable Non-Gaussian Random Processes (Chapman & Hall, New York, NY, 1994).
5. J. Nešlehová, P. Embrechts, V. Chavez-Demoulin, Infinite mean models and the LDA for operational risk. J. Oper. Risk 1, 3–25 (2006).
6. M. Campolieti, Heavy-tailed distributions and the distribution of wealth: Evidence from rich lists in Canada, 1999–2017. Physica A 503, 263–272 (2018).
7. C. Schluter, Top incomes, heavy tails, and rank-size regressions. Econometrics 6, 10 (2018).
8. F. M. Scherer, D. Harhoff, J. Kukies, “Uncertainty and the size distribution of rewards from innovation” in Capitalism and Democracy in the 21st Century, D. C. Mueller, U. Cantner, Eds. (Springer, 2001), pp. 181–206.
9. G. Silverberg, B. Verspagen, The size distribution of innovations revisited: An application of extreme value statistics to citation and value measures of patent significance. J. Econom. 139, 318–339 (2007).
10. Y. Cen, “City size distribution, city growth and urbanisation in China,” PhD thesis, University of Birmingham, Birmingham, United Kingdom (2015).
11. N. Bérubé, M. Sainte-Marie, P. Mongeon, V. Larivière, Words by the tail: Assessing lexical diversity in scholarly titles using frequency-rank distribution tail fits. PLoS One 13, e0197775 (2018).
12. P. Embrechts, C. Klüppelberg, T. Mikosch, Modelling Extremal Events: For Insurance and Finance (Springer Science & Business Media, 2013), vol. 33.
13. R. Ibragimov, D. Jaffee, J. Walden, Nondiversification traps in catastrophe insurance markets. Rev. Financ. Stud. 22, 959–993 (2009).
14. R. Ibragimov, Portfolio diversification and value at risk under thick-tailedness. Quant. Finance 9, 565–580 (2009).
15. M. Brown, J. E. Cohen, V. H. de la Peña, Taylor’s law, via ratios, for some distributions with infinite mean. J. Appl. Probab. 54, 657–669 (2017).
16. L. R. Taylor, Aggregation, variance and the mean. Nature 189, 732–735 (1961).
17. R. A. J. Taylor, Taylor’s Power Law: Order and Pattern in Nature (Elsevier Academic Press, Cambridge, MA, 2019).
18. Z. Eisler, I. Bartos, J. Kertész, Fluctuation scaling in complex systems: Taylor’s law and beyond. Adv. Phys. 57, 89–142 (2008).
19. P. Berck, J. M. Hihn, Using the semivariance to estimate safety-first rules. Am. J. Agric. Econ. 64, 298–300 (1982).
20. S. A. Bond, S. E. Satchell, Statistical properties of the sample semivariance. Appl. Math. Finance 9, 219–239 (2002).
21. W. W. Hogan, J. M. Warren, Toward the development of an equilibrium capital-market model based on semivariance. J. Financ. Quant. Anal. 9, 1–11 (1974).
22. H. Jin, H. Markowitz, X. Y. Zhou, A note on semivariance. Math. Finance 16, 53–61 (2006).
23. K. Liagkouras, K. Metaxiotis, The constrained mean-semivariance portfolio optimization problem with the support of a novel multiobjective evolutionary algorithm. J. Softw. Eng. Appl. 6, 22–29 (2013).
24. T. J. Nantell, B. Price, An analytical comparison of variance and semivariance capital market theories. J. Financ. Quant. Anal. 14, 221–242 (1979).
25. R. B. Porter, Semivariance and stochastic dominance: A comparison. Am. Econ. Rev. 64, 200–204 (1974).
26. C. G. Turvey, G. Nayak, The semivariance-minimizing hedge ratio. J. Agric. Resour. Econ. 28, 100–115 (2003).
27. C. Z. van de Beek, H. Leijnse, P. J. J. F. Torfs, R. Uijlenhoet, Climatology of daily rainfall semi-variance in The Netherlands. Hydrol. Earth Syst. Sci. 15, 171–183 (2011).
28. A. Giometto, M. Formentin, A. Rinaldo, J. E. Cohen, A. Maritan, Sample and population exponents of generalized Taylor’s law. Proc. Natl. Acad. Sci. U.S.A. 112, 7755–7760 (2015).
29. J. E. Cohen, M. Xu, W. S. F. Schuster, Allometric scaling of population variance with mean body size is predicted from Taylor’s law and density-mass allometry. Proc. Natl. Acad. Sci. U.S.A. 109, 15829–15834 (2012).
30. J. E. Cohen, M. Xu, W. S. F. Schuster, Stochastic multiplicative population growth predicts and interprets Taylor’s power law of fluctuation scaling. Proc. Biol. Sci. 280, 20122955 (2013).
31. W. F. Sharpe, Mutual fund performance. J. Bus. 39, 119–138 (1966).
32. R. E. Miller, A. K. Gehr, Sample size bias and Sharpe’s performance measure: A note. J. Financ. Quant. Anal. 13, 943–946 (1978).
33. M. Eling, S. Farinelli, D. Rossello, L. Tibiletti, One-size or tailor-made performance ratios for ranking hedge funds? J. Deriv. Hedge Funds 16, 267–277 (2011).
34. F. A. Sortino, L. N. Price, Performance measurement in a downside risk framework. J. Invest. 3, 59–64 (1994).
35. T. N. Rollinger, S. T. Hoffman, Sortino: A ‘Sharper’ Ratio (Red Rock Capital, Chicago, IL, 2013).
36. S. Farinelli, L. Tibiletti, Sharpe thinking in asset ranking with one-sided measures. Eur. J. Oper. Res. 185, 1542–1547 (2008).
37. H. Albrecher, S. A. Ladoucette, J. L. Teugels, Asymptotics of the sample coefficient of variation and the sample dispersion. J. Stat. Plan. Inference 140, 358–368 (2010).
38. J. E. Cohen, R. A. Davis, G. Samorodnitsky, Heavy-tailed distributions, correlations, kurtosis and Taylor’s Law of fluctuation scaling. Proc. R. Soc. Lond. A Math. Phys. Sci. 476, 20200610 (2020).
39. A. Cascon, C. Keating, W. F. Shadwick, An Introduction to Omega (The Finance Development Centre, Fuqua-Duke University, 2002).
40. F. A. Sortino, R. Van Der Meer, A. Plantinga, The Dutch triangle. J. Portfol. Manage. 26, 50–57 (1999).
41. J. H. J. Einmahl, The empirical distribution function as a tail estimator. Stat. Neerl. 44, 79–82 (1990).
42. S. J. Wolfe, “On moments of probability distribution functions” in Fractional Calculus and Its Applications, B. Ross, Ed. (Springer-Verlag, Berlin, Germany, 1975), pp. 306–316.
43. A. W. Marshall, I. Olkin, Inequalities: Theory of Majorization and Its Applications (Academic Press, New York, NY, 1979).
44. B. M. Hill, A simple general approach to inference about the tail of a distribution. Ann. Stat. 3, 1163–1174 (1975).
45. P. Hall, On some simple estimates of an exponent of regular variation. J. R. Stat. Soc. B 44, 37–42 (1982).
46. M. I. Gomes, A. Guillou, Extreme value theory and statistics of univariate extremes: A review. Int. Stat. Rev. 83, 263–292 (2015).
47. H. Bergström, On some expansions of stable distribution functions. Ark. Mat. 2, 375–378 (1952).

Information & Authors

Information

Published in

Proceedings of the National Academy of Sciences
Vol. 118 | No. 46
November 16, 2021
PubMed: 34772810

Submission history

Accepted: September 29, 2021
Published online: November 12, 2021
Published in issue: November 16, 2021

Change history

November 23, 2021: The SI Appendix has been updated.

Keywords

  1. stable law
  2. semivariance
  3. Pareto
  4. Taylor’s law
  5. power law

Notes

Reviewers: S.R., Texas Tech University; and J.R., London School of Economics.

Authors

Affiliations

Department of Statistics, Columbia University, New York, NY 10027;
Laboratory of Populations, The Rockefeller University, New York, NY 10065-6399;
Earth Institute, Columbia University, New York, NY 10027;
Department of Statistics, Columbia University, New York, NY 10027;
Department of Statistics, University of Chicago, Chicago, IL 60637;
Chuan-Fa Tang2,1 [email protected]
Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX 75080;
Sheung Chi Phillip Yam2,1 [email protected]
Department of Statistics, Chinese University of Hong Kong, Hong Kong

Notes

2
To whom correspondence may be addressed. Email: [email protected], [email protected], [email protected], or [email protected].
Author contributions: M.B., J.E.C., C.-F.T., and S.C.P.Y. designed research, performed research, contributed new reagents/analytic tools, analyzed data, and wrote the paper.
1
M.B., J.E.C., C.-F.T., and S.C.P.Y. contributed equally to this work.

Competing Interests

The authors declare no competing interest.
