## New Research In

### Physical Sciences

### Social Sciences

#### Featured Portals

#### Articles by Topic

### Biological Sciences

#### Featured Portals

#### Articles by Topic

- Agricultural Sciences
- Anthropology
- Applied Biological Sciences
- Biochemistry
- Biophysics and Computational Biology
- Cell Biology
- Developmental Biology
- Ecology
- Environmental Sciences
- Evolution
- Genetics
- Immunology and Inflammation
- Medical Sciences
- Microbiology
- Neuroscience
- Pharmacology
- Physiology
- Plant Biology
- Population Biology
- Psychological and Cognitive Sciences
- Sustainability Science
- Systems Biology

# Mammalian phylogeny reveals recent diversification rate shifts

Edited by David M. Hillis, University of Texas at Austin, Austin, TX, and approved March 2, 2011 (received for review November 9, 2010)

## Abstract

Phylogenetic trees of present-day species allow investigation of the rate of evolution that led to the present-day diversity. A recent analysis of the mammalian phylogeny challenged the view of explosive mammalian evolution after the Cretaceous–Tertiary (K/T) boundary (65 Mya). However, due to lack of appropriate methods, the diversification (speciation minus extinction) rates in the more recent past of mammalian evolution could not be determined. In this paper, I provide a method that reveals that the tempo of mammalian evolution did not change until ∼33 Mya. This constant period was followed by a peak of diversification rates between 33 and 30 Mya. Thereafter, diversification rates remained high and constant until 8.55 Mya. Diversification rates declined significantly at 8.55 and 3.35 Mya. Investigation of mammalian subgroups (marsupials, placentals, and the six largest placental subgroups) reveals that the diversification rate peak at 33–30 Mya is mainly driven by rodents, cetartiodactyla, and marsupials. The recent diversification rate decrease is significant for all analyzed subgroups but eulipotyphla, cetartiodactyla, and primates. My likelihood approach is not limited to mammalian evolution. It provides a robust framework to infer diversification rate changes and mass extinction events in phylogenies, reconstructed from, e.g., present-day species or virus data. In particular, the method is very robust toward noise and uncertainty in the phylogeny and can account for incomplete taxon sampling.

It has been a long-standing question whether the rise of present-day mammals began following the mass extinction event at the K/T boundary (1, 2). The hypothesis of a significant rise in mammals following the K/T boundary was challenged when the mammalian phylogeny, including 4,510 present-day species, became available (3). This phylogeny is ∼83% complete at the species level (4). It allowed detection of diversification rate (speciation rate minus extinction rate) shifts throughout the evolutionary past of almost all present-day mammals. No shift was detected at the K/T boundary; however, a peak in diversification rates at ∼93 Mya was detected (3).

Estimating the time and amount of diversification rate changes is a key toward understanding evolutionary patterns (5, 6). For example, reliable diversification rate estimates can have the power to decide if species richness is typically due to intrinsic causes (the ancestor evolved a new feature) or extrinsic causes (the environment changed) or a complex combination of both. Already in 1996, Sanderson and Donoghue (ref. 5, p. 1) wrote “Few issues in evolutionary biology have received as much attention over the years [ … ] as those involving evolutionary rate.” However, until now there has been no general framework available to estimate diversification rates and their changes through time.

Detecting diversification rate changes is typically done by detecting changes in the slope of the lineages-through-time (LTT) plot of the present-day species (3, 7–9, 10). The LTT plot of the mammals and the main mammalian subgroups are displayed in Fig. 1. For constant speciation rate λ and constant extinction rate μ, the slope of the semilog LTT plot is λ − μ in the distant past and λ in the recent past (11), with the change in slope occurring ∼1/(λ − μ) time units ago. This change in slope is also called the “pull of the present” (7). A departure from a straight line in the distant past indicates a diversification rate change.

I argue that a slope change before the pull of the present slope change is not necessarily caused by a diversification rate change. Suppose a very high speciation rate λ_{1} and a very high extinction rate μ_{1} in the distant past, until say 25 Mya. The LTT plot of the species extant at 25 Mya has first a constant slope λ_{1} − μ_{1}, followed by slope λ_{1} when approaching 25 Mya (pull of the present, where the present is 25 Mya). Suppose more recent than 25 Mya, there was a constant speciation rate λ_{0} and no extinction, meaning that all species alive at 25 Mya have present-day species descending. This observation yields an LTT plot of the present-day species with slope λ_{1} − μ_{1} in the distant past, λ_{1} close to 25 Mya, and λ_{0} more recent than 25 Mya. The diversification rate analysis based on slope changes in LTT plots for the time period before 25 Mya (as done for the mammalian phylogeny) (3) would have detected a period with diversification rate λ_{1} − μ_{1} followed by a period with diversification rate λ_{1} before 25 Mya, whereas there was actually only one period with diversification rate λ_{1} − μ_{1} before 25 Mya.

An approach that estimates simultaneously all rates throughout the evolution of a considered phylogeny, instead of looking for local slope changes, is required to avoid such misleading results.

## A New Likelihood Approach

I formulate an evolutionary model, the *birth–death-shift* process, where the speciation and extinction rates can change through time (*Methods*). When no rate change occurs, my model simplifies to the well-studied constant rate birth–death processes (12–16). By deriving the likelihood function (*Methods*, Theorem 2.7) of a phylogenetic tree under the birth–death-shift model, maximum-likelihood time intervals together with each interval's diversification rate (speciation rate minus extinction rate) and turnover (extinction rate divided by speciation rate) can be estimated. Likelihood-ratio tests are used to decide how many shifts are best supported. Using simulations (17), I show in *SI Text* that my analytical likelihood approach is powerful in detecting rate shifts accurately and is robust toward noise in data, such as uncertainty in divergence dates and unresolved polytomies. My likelihood parameter estimation method is available as an R package on CRAN (18).

## Tempo of Mammalian Diversification

Using this unique likelihood approach, I estimated the maximum-likelihood speciation and extinction rates together with the shift times for the mammalian phylogeny (3) (*SI Text S1*). I detect four significant diversification rate shifts throughout mammalian evolution (Fig. 2 and Tables S1, S2, and S3). My method reveals that diversification rates were low (0.05 per million years) until ∼33 Mya. Then a diversification rate peak (0.16) lasted for 3 Myr. After this peak, diversification rate dropped to 0.10 and remained constant until ∼8.55 Mya. The rate declined at 8.55 Mya to a value of 0.06 and at 3.35 Mya to a value of 0.02 (Table S3). I further show that there is no support for a rate shift at the K/T boundary (*SI Text S1*).

The turnover of mammals is estimated to be close to 0, but the confidence intervals are wide (Fig. 2). The difficulty of estimating turnover has been already observed for small trees (19–21). However, I verify through simulations that in large trees, turnover close to 0 can be distinguished from high turnover (*SI Text S2.1*), meaning that the mammals had a low turnover throughout their evolution. The turnover was higher during the diversification peak at 33−30 Mya, meaning that both mammalian speciation and extinction accelerated.

I estimated diversification rate shifts also for the mammalian subgroups placentals and marsupials, as well as for the six largest placental subgroups: rodents, chiroptera (bats), eulipotyphla (shrews, moles, hedgehogs), cetartiodactyla (whales), carnivores, and primates (Fig. 3). At the 99% level, placentals, rodents and cetartiodactyla show a significant increase in diversification rates ∼33 Mya; however, only rodents and cetartiodactyla show a peak (i.e., the diversification rates drop again significantly within a few million years). When relaxing the likelihood-ratio test to reject at the 95% level, marsupials also show a significant peak ∼33−30 Mya (Fig. 3, dotted line). When allowing for one more shift in placentals (corresponding to an 83% level instead of a 99% level), placentals also show a peak ∼33−30 Mya (Fig. 3, dotted line). Placentals show a diversification rate peak ∼12 Mya that is mainly driven by the diversification rate peak of the eulipotyphla. Cetartiodactyla is the only group with a diversification rate peak ∼8 Mya, and chiroptera is the only group with a diversification rate peak ∼3.35 Mya. All analyzed groups but eulipotyphla, cetartiodactyla, and primates show a significant decrease in rates in the very recent past.

I verify that the likelihood method is very robust for inferring diversification rates by performing a variety of simulations (*SI Text S2*). In particular, all four shifts in the mammalian phylogeny can be recovered accurately (Fig. 4).

## Discussion

### Mammalian Evolution.

The mammalian phylogeny reveals no major diversification rate shifts for mammals or mammalian subgroups before 33 Mya. In particular, I do not find a significant shift in rates at 93 Mya unlike previous work (3). As there were only 10 lineages present at 93 Mya, I argue that it is hard to distinguish between a stochastic effect and a real change in rates. I confirm the previous finding (3) that there were no major shifts in rates at the K/T boundary.

The previous work (3) based on the mammalian phylogeny did not detect the peak at 33−30 Mya, which my results suggest. I argue that their method based on LTT plot slope changes is not powerful in detecting rate changes when a large number of species are present (which is the case at 33 Mya): The method looks simply at slope changes, ignoring that small slope changes might be significant when many species are present, whereas large slope changes might be stochastic factors when few species are present and might therefore be nonsignificant.

Up to now, the recent past (the last 25 Myr) could not be analyzed on the basis of the mammalian phylogeny, as the available methods could not distinguish between the pull of the present and a real rate change (11, 3). Two studies (22, 23) based on fossil data investigated the diversification rates of mammals between 30 and 5 Mya. Both studies found a major increase in species richness during 17−14 Mya. When I allow for at least six shifts, a diversification rate increase ∼15.85 Mya (Fig. S1) is detected. Note, however, that the additional shifts are nonsignificant (*P* value = 0.28).

I detect a decline in diversification rates for all mammals and most mammalian subgroups ∼3.35 Mya. To my knowledge, no existing study investigated the diversification rates of mammals during this time.

My study provides maximum-likelihood diversification rate estimates on the basis of the whole mammalian phylogeny. However, it remains controversial why the rates were shifting.

A study on the response of mammals to global warming (ref. 24, p. 1) states “Faunal turnover nears 100% and species diversity may increase when warm temperatures last hundreds of thousands to millions of years, because speciation takes place and faunal changes initiated by a variety of shorter-term processes accumulate.” The two studies discussed above (22, 23) find accelerated diversification during only a single warm period, 17–14 Mya. My study does not suggest higher diversification rates during long warm periods. In particular, during the Early-Eocene climatic optimum at 53–51 Mya, when temperatures reached a long-term maximum, I detect no diversification rate increase. Further, I do not detect a significant rate increase during the warmest part of the Mid-Miocene climatic optimum at 17–14 Mya.

On the other hand, I find an increase in diversification rate and turnover during cooling. Around 34 Mya, the earth climate changed from warm to cool (25) and remained cool until today. Antarctica began to be glaciated, grasslands expanded, presumably many herbivores increased considerably in population sizes, and, as I show, mammals underwent rapid diversification at 33−30 Mya, in particular rodents, cetartiodactyla, and marsupials. To disentangle whether increased mammalian diversification rates were caused by climate cooling or expanding grassland or neither, I suggest analyzing nonmammalian herbivore phylogenies with my maximum-likelihood method. Note that the increased diversification rate for cetartiodactyla indicates that not only grassland expansion caused the rate increase in mammals.

The statistically most significant rate shift observed in my study, at 3.35 Mya, occurred in an era of high tectonic activity: At 3.5 Mya, the great American biotic interchange took place as the new land bridge between North and South America was formed. However, I find a decrease in diversification rate, as opposed to an increase in rate suggested during times of high tectonic activity (23).

My results are based on the previously published phylogeny of mammals (3). This phylogeny was inferred using a supertree approach that combines reconstructed mammalian subtrees into one phylogeny of all mammals. The accuracy of such supertree approaches is limited (26), and the analyzed mammalian tree will certainly be refined in the future, possibly having an effect on some of the conclusions I draw. In particular, the mammalian supertree is poorly resolved in the recent past, during which I detect two significant rate shifts (3.35 and 8.55 Mya). However, because I show, using simulations, that my method is robust toward unbiased uncertainties in speciation time estimates and unresolved polytomies, I argue that my observed general rate pattern (including the recent shifts) actually reflects the pattern of mammalian evolution. As my method is available in the R package TreePar (18), my hypotheses can be easily checked and corrected with new mammalian phylogenies being inferred. In particular, new mammalian phylogenies will reveal if my assumption of the uncertainties in Bindinda's mammalian phylogeny being unbiased is appropriate.

My diversification rate estimates do not provide any support for the hypotheses of accelerated diversification during warming or during an era of high tectonic activity. However, with the increasing number of reconstructed phylogenies, my likelihood method can be used to obtain a more complete picture of the evolutionary past of present-day species. Such analyses can reveal common diversification patterns across different groups of species and, together with paleontological and paleoclimatic data, can help link environmental patterns and diversification patterns.

### Advantages, Extensions, and Limitations of the Presented Likelihood Method.

My likelihood method for detecting rate shifts opens the possibility for analyzing all kinds of phylogenetic trees more accurately. The method is not tailored specifically to species phylogenies but can also be used for, e.g., virus phylogenies. The method can detect in addition to rate shifts also mass extinction events (*Methods*).

My approach, which uses one general model throughout evolutionary time and infers the maximum-likelihood rate estimates, has several advantages over methods using only parts of the phylogeny by looking at local slope changes in the LTT plot:

*i*) As discussed in the Introduction, slope changes do not necessarily correspond to rate changes; my method accounts for such slope changes without predicting a false rate change.*ii*) Rates in the recent and distant past can be inferred, for both complete phylogenies and incomplete phylogenies assuming random sampling.*iii*) Simulations reveal that the likelihood method is robust to inaccuracies in speciation time estimates. In particular, unresolved nodes (polytomies) do not pose a problem, whereas the available regression methods require well-dated edges (3).*iv*) When half of the lineages speciate in a short time interval, a rate shift is suggested by standard regression methods. However, if the number of lineages is small, the speciation pattern can easily be explained by stochastic effects. Likelihood methods using the whole tree account for such stochastic effects.*v*) Regression methods for estimating diversification rates require setting some parameters a priori (3) (e.g., basis dimension, gamma parameter, and time steps) and it is unclear how different settings influence the results. A likelihood framework requires no input besides the dated tree.

Previous studies derived likelihood approaches to infer diversification rates under special cases of the birth–death-shift process. Under a model without rate changes, i.e. constant rates throughout the whole evolutionary time, a likelihood approach is well established (14). The maximum-likelihood parameter estimation for the constant rate birth–death process was generalized to changing birth and death rates (14). However, only integral equations were available, which have to be evaluated numerically. Numerical integration has been done, e.g., on an *Enallagma* phylogeny (27). However, their 90% complete trees contained only 20 species and they allowed for only one rate shift. For big phylogenetic trees where several rate shifts can be expected, direct analytical methods become crucial to analyze the data.

A previous study (6) assumes the birth–death-shift model with one rate shift (and constraints on the shift time and rates). The parameters are estimated assuming that the tree before the rate shift time is independent of the tree after the rate shift time. However, this assumption is not valid, because extinction in a later interval affects species that appeared in an earlier interval and survived until the later interval.

Further, a mass extinction event produces an LTT plot with a constant slope interrupted by a plateau (11, 28). However, results were purely based on simulations as no likelihood function was available. The likelihood approach developed in this study accounts for mass extinction events.

My simulations revealed that diversification rates can be estimated with high confidence. However, the turnover cannot be estimated accurately, even in the case of no rate shift. For obtaining very accurate extinction rates, fossil data are necessary (29).

Models have been proposed where speciation and extinction rates are more complex than under the birth–death-shift model. For example, extinction rates might be heritable (30) (e.g., if intrinsic factors determine speciation rate and these intrinsic factors are heritable), or speciation rates might be density dependent (31). Up to now, such models can be investigated only with simulations. Maximum-likelihood methods to infer rates under these more complex models will require new analytic techniques. Once such methods become available, statistical tests can be applied to address the validity of models with changing rates at discrete points in time (presented in this paper) or models with continuously changing rates through time.

The presented likelihood equations can of course also be used in a Markov chain Monte Carlo (MCMC) approach to estimate the posterior parameter distribution for a given phylogeny, if one is willing to assume prior distributions for the parameters.

Also, the birth–death-shift likelihood can be used as an a priori distribution for reconstructing phylogenies in an MCMC framework. This method will be in particular interesting for reconstructing virus phylogenies. When using the birth–death-shift model for viruses, the birth rate corresponds to the transmission rate of a virus. As the transmission rate undergoes major declines when new treatment strategies become available (as treated patients usually cannot transmit), a model with rate shifts is very appropriate for viruses.

## Methods

### 1) Birth–Death-Shift Model.

To detect shifts of diversification rates in a phylogeny, we need to define an evolutionary model accounting for such shifts. I define the model very generally, allowing for rate shifts as well as mass extinction events. In this way, the method is applicable to a wide range of datasets.

For modeling the shift of rates and mass extinction events, we define a *birth–death-shift process* as follows. The vector *t* = (*t*_{0}, … , *t _{m}*), with 0 =

*t*

_{0}<

*t*

_{1}< … <

*t*, determines the times (before the present) of rate shift and mass extinction events, where

_{m}*t*

_{0}= 0 denotes the present. We set

*t*

_{m+}_{1}= ∞. The vector λ = (λ

_{0}, … , λ

*), where λ*

_{m}*> 0 (*

_{i}*i*= 0, 1, … ,

*m*) defines the speciation rates. λ

*is the speciation rate in time interval (*

_{i}*t*,

_{i}*t*

_{i+}_{1}]. The vector μ = (μ

_{0}, … , μ

*), where λ*

_{m}*> μ*

_{i}*≥ 0 (*

_{i}*i*= 0, 1, … ,

*m*) defines the extinction rates. μ

*is the extinction rate in time interval (*

_{i}*t*,

_{i}*t*

_{i+}_{1}]. The vector ρ = (ρ

_{0}, … , ρ

*), where 1 ≥ ρ*

_{m}*> 0 (*

_{i}*i*= 0, 1, … ,

*m*) defines the survival probabilities: ρ

*for*

_{i}*i*> 0 is the probability of surviving a mass extinction event at time

*t*. Note that ρ

_{i}*= 1 corresponds to a rate shift from λ*

_{i}*, μ*

_{i}*to λ*

_{i}

_{i−}_{1}, μ

_{i−}_{1}, but no extinction at time

*t*. The probability ρ

_{i}_{0}is the probability of sampling an extant species. ρ

_{0}= 1 corresponds to complete sampling of extant species.

The birth–death-shift process starts with one species at time *x*_{0} in the past and is stopped at time *t*_{0} = 0. At time point *t*, where *t _{i+}*

_{1}>

*t*>

*t*, each species gives birth to a new species with rate λ

_{i}*; the two subtending lineages are denoted*

_{i}*l*and

*r*such that we can distinguish between them. Each species dies with rate μ

*. At time*

_{i}*t*, each species survives the mass extinction with probability ρ

_{i}*. The process is stopped at time 0, the present. Each present-day species is sampled with probability ρ*

_{i}_{0}.

Such a process induces a tree (Fig. 5, *Left*). Suppressing all extinct and all nonsampled lineages yields the *sampled tree with root edge* (Fig. 5, *Right*). The *sampled tree* is obtained by deleting the edge subtending the origin. Note that the sampled tree is a binary tree with each leaf having the same distance from the root. Each internal vertex has a lineage *l* and a lineage *r* descending. In the sampled tree, we subdivide the edge from *t*_{s} to *t*_{e} with time *t*_{s} > *t _{i}* >

*t*

_{e}at time

*t*. So at time

_{i}*t*, we have a degree-two vertex. The

_{i}*n*− 1 speciation times in a sampled tree with

*n*present-day species are

*x*

_{1}>

*x*

_{2}> … >

*x*

_{n−}_{1}.

Note that a phylogeny inferred from data does not have the orientation labels *l* and *r* but each leaf has a unique label. We considered oriented trees and calculate their probability density because of convenient notation. The probability density of a labeled tree (each leaf is assigned a unique label, and the orientations *l* and *r* are dropped) follows from the equations for an oriented tree by multiplying with as each labeling and each orientation is equally likely (32). Note that in maximum-likelihood parameter estimations on a fixed tree, the factor is constant, and therefore it can be discarded. Therefore, throughout the rest of the paper, we consider only oriented trees.

### 2) Deriving the Likelihood of a Tree.

The aim of this section is to calculate the probability density of a sampled tree T under the birth–death-shift process. We do so by first calculating the probability of a species having 0 descendants [*2.1) Probability of 0 extant descendants*]. Using this probability, we calculate the density of a tree conditioned on the time of origin [*2.2) Probability density of a tree conditioned on x _{0}*] and the time of the most recent common ancestor [

*Section 2.3) Probability density of a tree conditioned on the time of the most recent common ancestor*].

Having the general expression for the probability density of any sampled tree allows us to calculate the probability density of a data phylogeny (which is a particular sampled tree). This probability density is maximized over the parameter space to obtain the maximum-likelihood parameter estimates for the phylogeny.

#### 2.1) Probability of 0 extant descendants.

**Theorem 2.1.** *The probability that a species alive at time t before today has no sampled extant descendants with t _{i}* ≤

*t*≤

*t*

_{i}_{+1}(

*i*= 0, … ,

*m with t*

_{m}_{+1}= ∞)

*is*

*with*

*and c*

_{0}= 1 − ρ

_{0}.

*Proof*. The master equation for the probability is, for *t _{i}* ≤

*t*≤

*t*

_{i+}_{1},

By plugging Eq. **1** into the master equation, one can verify that it is a solution of the master equation. For showing the uniqueness, define

As are continuous, uniqueness follows by the uniqueness theorem for ordinary differential equations (e.g., ref. 33, p. 211). □

**Remark 2.2.** Note that for *i* = 0, we havewhich was already established (14).

In the next section, we need Eq. **1** to calculate the density function of a sampled tree T.

#### 2.2) Probability density of a tree conditioned on x_{0}.

First note that the time of the start of the process, *x*_{0}, is a parameter of the birth–death-shift process, and therefore we cannot state a tree probability density independent of *x*_{0} unless we assume a prior distribution for *x*_{0}. In the following, we simply condition on the time *x*_{0} being fixed. In the subtending section, we then calculate the probability density of a tree conditioned on the time of the most recent common ancestor of the extant and sampled species: This tree can be considered as two trees with time of origin at *x*_{0}. The time of the most recent common ancestor of a clade is the age of the clade. This value is known for well-studied clades and therefore conditioning on this value is reasonable and generally done when estimating rates (19).

For *t _{i}* ≤

*t*≤

*t*

_{i+}_{1}, let

*g*

_{i}_{,e}(

*t*) be the probability density that the species corresponding to edge

*e*at time

*t*evolved between

*t*and the present as observed in T.

**Theorem 2.3.** *The unique solution of the ordinary differential equation for g _{i}*

_{,e}(

*t*)

*is*

*with*

*Note that* *for all i* ≥ 0.

*Proof*. The master equation for *g _{i}*

_{,e}(

*t*) along an edge with starting time

*t*

_{s}and ending time

*t*

_{e}is, with

*t*

_{s}≥

*t*≥

*t*

_{e},

with the initial values at *t* = *t*_{e},

By plugging Eq. **2** into the master equation, one can verify that it is a solution of the master equation. For showing the uniqueness, define

As are continuous, uniqueness follows by the uniqueness theorem for ordinary differential equations (e.g., ref. 33, p. 211).□

**Corollary 2.4.** *Let* *be the probability that a species alive at time t before today has one sampled extant descendant with t _{i}* ≤

*t*≤

*t*

_{i}_{+1}(

*i*= 1, … ,

*m with t*

_{m+}_{1}= ∞).

*Then*

*Proof*. The proof follows from Theorem 2.3 by noting that is the probability of observing a tree with one sampled individual. □

**Remark 2.5.** For *i* = 0, we have

which was established in Remark 3.2 in ref. 16.

The density of a sampled tree with root edge, T_{or}, given the time of origin, *f* [T_{or} | *t*_{or} = *x*_{0}], can be calculated by recursively using the formula for *g _{e,i}*(

*t*), starting with

*t*=

*x*

_{0}.

For obtaining a closed-form solution, note that each *x _{i}* with

*i*> 0 is twice a starting point and once an ending point of an edge.

*x*

_{0}is only a starting point. The leaves are only ending points.

Further, for *t _{i+}*

_{1}>

*t*≥

*t*, define

_{i}*l*(

*t*) :=

*i*. Further, let

*n*be the number of edges in T

_{i}_{or}that go through time

*t*. Using this notation, we obtain

_{i}Therefore, we established the following theorem.

**Theorem 2.6.** *The density of a sampled tree with root edge*, T_{or}, *conditioned on the time of origin being x*_{0}, *is*

#### 2.3) Probability density of a tree conditioned on the time of the most recent common ancestor.

We next provide the density for an oriented tree conditioned on the time of the most recent common ancestors of the extant sampled species. This formula is used for the maximum-likelihood parameter estimation in phylogenies.

**Theorem 2.7.** *The density of a sampled tree* T, *conditioned on the time of the most recent common ancestor being x*_{1}, *is*

*Proof*. When conditioning on the time of the most recent common ancestor of the sampled species, we implicitly condition that the two descendants at the time of the most recent common ancestor have sampled descendants, and we call *S* the event that a species has at least one sampled extant descendant. Let be the left subtree of T and be the right subtree of T descending from the most recent common ancestor of T. So

This establishes the theorem. □

### 3) Implementation of the Maximum-Likelihood Method.

The probability density in Theorem 2.7 is used for maximum-likelihood rate estimation in phylogenetic trees inferred from (species) data. A model with *m* rate shifts (or mass extinction events) can be tested against a model with *m* + 1 rate shifts (or mass extinction events) using the likelihood-ratio test. The method is implemented in the R package TreePar (18). For the maximization of the likelihood function, I use the R package subplex (34) available on CRAN. The likelihood estimations were performed on the ETH Brutus cluster.

The maximization performs well for fixed rate shift times, but terminates in local optima instead of the global optimum when estimating simultaneously the rate shift times and the speciation and extinction rates. I therefore use a fine grid on shift times and optimize the speciation and extinction rates for the fixed rate shift times on the grid.

When analyzing the mammalian phylogeny, I first allow for one shift and estimate rates in 0.1-Myr steps between 2.05 and 100.05 Mya. Then I allow for two shifts by fixing the maximum-likelihood time shift from the one-shift estimation and using for the second shift again a grid of 0.1-Myr steps between 2.05 and 100.05 Mya. I proceed in this way to add more shifts. Note that estimating all time shifts simultaneously would be favorable over the greedy approach, but it is computationally not feasible. I show through simulations that the greedy approach recovers the rate shifts reliably (*SI Text S2*).

I showed in a previous paper (17) that it is not possible to distinguish a mass extinction event from constant rates interrupted by a period of stasis (i.e., basically no speciation and extinction), as both events leave a similar footprint on the phylogeny. Distinguishing between mass extinction and rate shift events requires additional paleontological data. If only the phylogenetic tree is available, then for each time interval, we need to know one of the three parameters λ* _{i}*, μ

*, and ρ*

_{i}*or we have to assume dependencies, like λ*

_{i}*= λ*

_{i}

_{i−}_{1}. Note that requiring one of the three parameters λ

*, μ*

_{i}*, and ρ*

_{i}*to be known is already necessary if no shift and no mass extinction occur (35).*

_{i}In the mammal phylogeny, I investigate whether and when rate shift events occurred; i.e., I set ρ* _{i}* = 1 for all

*i*. I investigate the accuracy of the maximum-likelihood rate estimates through simulating trees with

*n*species and then estimating their rates. For the simulations, I use the R package TreeSim (36).

Trees with polytomies can be analyzed with my method. Polytomies are considered by the method as a random binary resolution with edges of length 0 (note that each binary resolution of the polytomy has the same likelihood).

## Acknowledgments

I thank Alexandre Antonelli, Marcelo Sanchez, Helen Alexander, Jan Engelstaedter, Roland Regoes, Olin Silander, the editor, and the two anonymous reviewers for very helpful comments. Funding for this work came from Eidgenössiche Technische Hochschule, Zurich.

## Footnotes

- ↵
^{1}E-mail: tanja.stadler{at}env.ethz.ch.

Author contributions: T.S. designed research, performed research, contributed new reagents/analytic tools, analyzed data, and wrote the paper.

The author declares no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1016876108/-/DCSupplemental.

## References

- ↵
- Asher RJ,
- et al.

- ↵
- ↵
- ↵
- Wilson D,
- Reeder D

- ↵
- ↵
- ↵
- Nee S,
- Holmes EC,
- May RM,
- Harvey PH

- Paradis E

- ↵
- Pybus OG,
- Harvey PH

- ↵
- ↵
- ↵
- ↵
- Bailey N

- ↵
- Nee SC,
- May RM,
- Harvey PH

- ↵
- ↵
- ↵
- Stadler T

- ↵
- Stadler T

- ↵
- ↵
- ↵
- ↵
- Barnosky A,
- Carrasco M

- ↵
- Finarelli J,
- Badgley C

- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- Alroy J

- ↵
- Rabosky DL

- ↵
- Rabosky DL,
- Lovette IJ

- ↵
- Ford D,
- Matsen FA,
- Stadler T

- ↵
- King A,
- Billingham J,
- Otto S

- ↵
- King A

- ↵
- ↵
- Stadler T

## Citation Manager Formats

## Sign up for Article Alerts

## Article Classifications

- Biological Sciences
- Evolution