On the path to extinction

  1. Peter Jagers*,,
  2. Fima C. Klebaner, and
  3. Serik Sagitov*
  1. *Department of Mathematical Sciences, Chalmers University of Technology, SE-412 96 Gothenburg, Sweden; and
  2. School of Mathematical Sciences, Monash University, Clayton, Victoria 3800, Australia
  1. Communicated by David Cox, University of Oxford, Oxford, United Kingdom, January 4, 2007 (received for review September 15, 2006)

Abstract

Populations can die out in many ways. We investigate one basic form of extinction, stable or intrinsic extinction, caused by individuals on the average not being able to replace themselves through reproduction. The archetypical such population is a subcritical branching process, i.e., a population of independent, asexually reproducing individuals, for which the expected number of progeny per individual is less than one. The main purpose is to uncover a fundamental pattern of nature. Mathematically, this emerges in large systems, in our case subcritical populations, starting from a large number, x, of individuals. First we describe the behavior of the time to extinction T: as x grows to infinity, it behaves like the logarithm of x, divided by r, where r is the absolute value of the Malthusian parameter. We give a more precise description in terms of extreme value distributions. Then we study population size partway (or u-way) to extinction, i.e., at times uT, for 0 < u < 1, e.g., u = 1/2 gives halfway to extinction. (Note that mathematically this is no stopping time.) If the population starts from x individuals, then for large x, the proper scaling for the population size at time uT is x into the power u − 1. Normed by this factor, the population u-way to extinction approaches a process, which involves constants that are determined by life span and reproduction distributions, and a random variable that follows the classical Gumbel distribution in the continuous time case. In the Markov case, where an explicit representation can be deduced, we also find a description of the behavior immediately before extinction.

General branching processes are defined through individuals who give birth independently of each other. We limit ourselves to single-type such populations, all individuals having the same reproduction and, more generally life-path, distribution. Benchmark cases of such processes are the simple Galton–Watson, birth-and-death, and Markov branching processes. Even though such Markovian structures dominate the probabilistic literature, more general processes (allowing aging, various distributions of life spans and fertility periods, as well as repeated litters of varying sizes) are relevant for biological modelling (1). A presentation from a mathematical point of view can be found in ref. 2 (multi-type populations) or ref. 3 (single-type processes).

A multi-type framework would have been still more general but also far less accessible. The not-so-mathematical reader will undoubtedly note that even this presentation is demanding at certain points. However, conclusions should be clear, and they, rather than the work to derive them, form the message of this paper: intrinsic stable extinction follows a simple and beautiful pattern. The purpose is thus general understanding, rather than paving the way for specific applications. For those, multi-type theory may be useful, but at the bitter end they will tend to need custom-made models of narrow relevance.

A (single-type) branching process is subcritical if the expected number m of children per individual is less than one. Let μ(a) denote the expected number of children by age a. Clearly, m = μ(∞). In the lattice case, births can occur only at multiples of some time unit, e.g., yearly. For simplicity, our focus is on the nonlattice case.

We consider so-called Malthusian processes. These are defined by the requirement that there exists a number α, the Malthusian parameter, such that Formula This is always fulfilled in the critical (m = 1) and supercritical (1 < m < ∞), and under very mild conditions (always met with in the real world) also in subcritical cases. Then, α < 0 and since −α will play an important role, we shall write r = −α > 0. In the case of Galton–Watson processes, r = |1n m|.

Let Ztx be a shorthand for the number of individuals alive at time t, provided the population started from x individuals at time 0, taken as newborn then, for preciseness. Zt without a suffix, is Z t 1, the case of one ancestor. If L denotes the life span distribution, any nonlattice Malthusian process will then satisfy Formula (ref. 3, p. 156). We write C for the proportionality constant in this, i.e., 𝔼[Zt] ∼ Ce αt = Ce−rt and recall that in the lattice case the enumerator integral is replaced by a sum. (To the nonmathematical reader, the importance of this is not the special formula for C, but rather the fact that the constant can be calculated from demographic facts.) For Galton–Watson and Markov branching processes C = 1, and expressions are exact, 𝔼[Zt] = mt for integer t in the Galton–Watson case.

The expected population size thus has the same asymptotic form in all three cases, α >, =, < 0. For variances, though, asymptotics take different forms in the sub- and supercritical cases. Assume that m < 1 and that Malthusianness is strengthened to a second order property: let ξ denote the reproduction point process, so that μ = 𝔼[ξ], and write Formula it is required that 𝔼[ξ̂(α)2] < ∞. Then, Formula for some constant B. This can be checked from the convolution equation holding for second moments (ref. 3, p. 136). In the more often studied supercritical case, the leading term is proportional to e t.

We impose second order Malthusianness throughout and assume that the life span distribution L and reproduction measure μ satisfy Formula respectively. As a consequence, the so called x log x-condition holds, guaranteeing the Kolmogorov and Yaglom theorems on survival chances and conditional population size given nonextinction (see ref. 4, p. 170, for the case of age-dependent processes and ref. 3, p. 159, and ref. 5, for general processes). In the nonlattice formulation, they tell us that Formula for some 0 < c < 1, Formula exist, Σk bk = 1, and Formula It is worth noting that these two theorems do not require the full strength of branching process independence assumptions (see ref. 6 for population-size-dependent branching).

Even though we shall not embark upon the subtleties of measuring the population in other ways than by just counting its individuals, it turns out that the proof of our results is considerably facilitated by use of counting with a particular random characteristic (ref. 3, p. 167 ff.). The characteristic we shall use will be one that records all events to come in an individual's life and daughter process. It is thus not “individual” like the ones treated in op. cit. but still well behaved (7).

The first and second moment results quoted above for processes just counting the number of individuals alive, clearly extend to processes counted with characteristics χ such that ert𝔼[χ(t)] is directly Riemann integrable, and so do the Kolmogorov-Yaglom theorems (5).

The Time to Extinction

Now consider processes with x ancestors, Ztx.

By 2 the time to extinction, Formula satisfies Formula for some ct tending to c, as t → ∞. As a consequence, Formula where γx → Euler's γ.

To verify 6 note that for any fixed number v, 0 < v < ∞ Formula On the other hand, Formula where suptv |ct − c| < ε for any ε > 0, given that v is sufficiently large. Thus it remains to observe that as x → ∞ Formula where o(1), denotes a remainder term, vanishing as x → ∞.

Extreme value theory handily delivers distributional forms of this, though again distinction should be made between continuous-time and lattice cases. In the former, Kolmogorov's theorem yields the exponential tail that implies a classical Gumbel limit (ref. 8, p. 17). Indeed, for each initial number x, define ηx by Formula Then, as x → ∞, Formula This extends Pakes's (9) result for Markov branching processes.

In Between Dawn and Demise

Let ZuTxx denote the population size at some time uTx, 0 ≤ u ≤ 1, between its inception at size Z 0 x = x and extinction. We consider nonlattice processes, and to ease exposition, write tx = ln x, so that Txtx/r, by 7 and 8. Since 𝔼[Ztx] ∼ xCe−rt, as t → ∞, also xu−1 𝔼[Z u(tx+t)/r x] ∼ Ce−ut, x → ∞, and it is natural to guess that x1−u provides the right norming and that Formula in distribution, as the initial population size x → ∞, at least for fixed 0 < u < 1.

A first check of this conjecture can be made by simulation. Consider a binary splitting Markov branching process, i.e., birth-and-death process, with the probability p 0 = 0.75 for zero and p 2 = 0.25 for two offspring, and expected life span = 1. For birth-and-death, as pointed out, C = 1 and further b can be shown to equal p 0/(2p 0 − 1), which is 1.5 in the present case.

Fig. 1 displays the simulation results. The two upper panels show that the process ZuTxx, 0 < u < 1 displays much more of variation than does Ztx. The lower panels indicate that the amount of variation is just right to allow nontrivial normalisation. They also point at convergence towards a process with a rather constant mean and a variance increasing with u.

Fig. 1.

Markov branching with p 0 = 0.75, p 2 = 0.25, life expectancy = 1. (Upper) Forty simulations with initial number x = 1,000, population size on the vertical axis, time (Left) and proportion of time u to extinction (Right) horizontally. (Lower) Forty simulations of normed size with x = 1,000 (Left) and x = 10,000 (Right), proportion u of time to extinction on the horizontal axis.


The expectation and variance of the proposed limiting population size are Formula in terms of the classical Gamma function. Thus, for b/C close to one (it cannot be smaller), the Gamma function exerts a strong influence on the behavior of expected size, and since Γ(1) = Γ(2) = 1 the overall picture is that of a fairly constant mean. For large b/C it increases practically exponentially. This may seem intriguing, but is explained by the norming through x 1−u. The limiting variance always increases with u, from 0 to b 2.

Fig. 2 shows these expectations and standard deviations for different values of b assuming C = 1. The situation for b = 1.5 thus seems to fit simulations.

Fig. 2.

Mean (upper line) and standard deviation (lower line) of the limit process bue−uη for b = 1.1, 1.5, 5, 0 ≤ u < 1 on the horizontal axis. The upper right panel shows 40 simulations of (1.5)u e−uη, to be compared with the 40 simulations in Fig. 1.


The Path-to-Extinction Theorem and Its Proof

Our proof of the convergence follows a three-step programme, which requires acquaintance with weak convergence theory (10). The reader interested in the basic pattern of extinction, rather than mathematical technique is thus advised to jump to Theorem 1.

  1. For fixed 0 < u < 1, write ζx(t) = xu−1 Z u(tx+t)/r x and prove that Formula as x → ∞, the arrow indicating weak convergence in the Lindvall–Skorohod topology extended to the whole real line.

  2. Consider ζx at the random argument τx : = ln c + ηx → τ : = ln (C/b) + η, where η is Gumbel by 7 and 8, and check conditions for ζxx) → Ce−uτ = C1−u bu e−uη in distribution for fixed u.

  3. Study the asymptotics of ζxx)(u) = xu−1 ZuTxx as a function of u varying inside the unit interval.

Step 1.

We already noted that Formula Similarly, Formula We can conclude that for any fixed t, ζx(t) → Ce−ut in mean square. The finite dimensional convergence Formula is obvious.

Turning to tightness, we note that for any v ∈ ℝ and a > 0 Formula But the total progeny Y of a population from one newborn ancestor clearly dominates the maximum of the numbers ever simultaneously present, and the sum of the maxima majorises the supremum of the sum. Since 𝔼[Y] = 1/(1 − m), it follows that Formula We can conclude that Formula for some constant K.

Now consider the population counted with a characteristic χ (ref. 3, p. 167) that for any h > 0 gives to each individual present the value of

  1. an indicator of dying the next h/r time units plus

  2. the total number of progeny of the individual born within the same next time span of length h/r.

Denote the process thus counted by Dhx(t). Clearly, for any t ∈ ℝ, Formula Write δ = h/r, and denote the total progeny within t time units stemming from one newborn individual by Yt. Then, Formula For nonlattice processes it follows that, as t → ∞, Formula If we assume that reproduction is L 2-Lipschitz in the sense that the second moment of the number of births between ages t and t + h, vh(t), satisfies Formula then a similar, but prohibitive, analysis of the convolution equation for second moments, shows that also Formula This Lipschitz-property is broadly satisfied, e.g., if the number of children in short intervals is bounded. Indeed, if B is a bound, then vh(t) ≤ B(μ(t + h) − μ(t), so that such qualities of the reproduction function μ transfer.

By Chebyshev's inequality, therefore, for a suitable K, Formula for h little enough and K′ some constant.

Tightness follows along the lines of the Corollaries in ref. 10 (p. 83 and p. 142). The process ζx(t) thus has the weak nonrandom continuous limit Ce−ut, t ∈ ℝ, as x → ∞. The modification from there is that those corollaries concern processes on the unit interval, whereas our processes are defined on the whole line, with the natural Lindvall–Skorohod topology of convergence on all finite subintervals.

Step 2.

In the nonlattice case, the pair (τx, ζx) is also weakly convergent and by the continuity of the exponential limit function and the Gumbel limit, Formula in distribution, as claimed (see ref. 10, p. 151, or ref. 11, Section 2.2).

Step 3.

The preceding two steps can now be repeated for any linear combination Formula yielding the finite dimensional distributional convergence Formula where η is standard Gumbel.

We summarize the situation for nonlattice populations:

Theorem 1.

Consider subcritical Malthusian general, single-type, nonlattice branching processes with finite reproduction variances. Assume conditions 1 and 9. Then the finite dimensional distributional convergence 10 holds.

It remains to note the discontinuity at the endpoints of the unit interval in that xu−1 ZuTxx equals 1 at u = 0 and 0 at u = 1. The former disappears in the Markov case, where age distribution does not matter and C = 1. The latter mirrors the drastic fall of population size from u close to but < 1 to u = 1, at which we proceed to have a closer look.

Markov Branching Processes on the Eve of Extinction

Markov branching processes, i.e., splitting processes (children are only born at their mother's death) where individuals have exponentially distributed life spans, have been much used in mathematical biology, even though they are of limited biological relevance, since exponential life spans imply the absence of aging. Being nonlattice, Markov branching processes are covered by our general results on extinction time and path to extinction.

However, they also allow a direct approach. The latter is mathematically interesting, using a conditioning argument at a nonstopping time. For the Markov case, it recovers Theorem 1 under slightly relaxed conditions, and adds new knowledge because it renders it possible to study the process immediately before extinction, viewing the discontinuity from the left, as u → 1, through a magnifying glass, as it were (Theorem 2).

Now, let Ztx be a Markov branching process in which particles have exponentially distributed lifetimes with parameter a. Then Formula exactly. Hence, as pointed out, C = 1; also r = a(1 − m), as any textbook would tell you.

Further, consider a function v(t) < t. A total probability argument yields Formula In words, the last equation holds because, conditionally on Z v(t) x = y, the probability of the original population, initiated by x ancestors, dying out at dt is the same as the probability that a population of y individuals at v(t) dies out after time tv(t). Note that this argument applies quite generally to Markov processes. What is special in the branching case is the simple relation Formula valid for any positive integer y and helpful in calculating an integral representation of the probability generating function Formula Analysis of the latter for the case v(t) = ut, 0 < u < 1 yields Theorem 1 for the Markov branching case, as mentioned under somewhat less restrictive conditions: the finite reproduction moment can be replaced by the x log x condition Formula where p k is the probability of splitting into k children.

In the present context, it is more interesting that the case v(t) = tu, 0 < u < t, yields distributional convergence of ZTxxu, as x → ∞, the limiting process Yu, thus displaying the properties of extinct populations on the eve of their disappearance.

Indeed, write πy, y = 1, 2, … for the unique solution (up to multiplicative factors) of the system Formula the stationary measure of the process (ref. 12, p. 24). Then, the following can be proved to hold:

Theorem 2.

Consider a subcritical Markov branching process Ztx starting from x individuals, having parameters a and {pk} satisfying the x log x condition, as above. Then, as x → ∞, Formula in distribution for fixed u > 0, and finite-dimensionally. The limiting process is Markov. It starts from Y 0 = 1, stays at any position y = 1, 2, 3, … an exponentially distributed time, with parameter ay, and then jumps to a position z with probability Formula In the long run, it increases exponentially: as u → ∞, Formula in distribution. Here X is an exponentially distributed random variable, with mean b.

Acknowledgments

We thank a referee for taking readability seriously. This work has been supported by the Swedish and Australian Research Councils.

Footnotes

  • To whom correspondence should be addressed. E-mail: jagers{at}chalmers.se
  • Author contributions: P.J. designed research; P.J., F.C.K., and S.S. performed research; P.J., F.C.K., and S.S. contributed new reagents/analytic tools; and P.J., F.C.K., and S.S. wrote the paper.

  • The authors declare no conflict of interest.

  • Freely available online through the PNAS open access option.

References

« Previous | Next Article »Table of Contents
OPEN ACCESS ARTICLE