Phylogenomics reveals rapid, simultaneous diversification of three major clades of Gondwanan frogs at the Cretaceous–Paleogene boundary
- aState Key Laboratory of Biocontrol, College of Ecology and Evolution, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China;
- bDepartment of Natural History, Florida Museum of Natural History, University of Florida, Gainesville, FL 32611;
- cDepartment of Integrative Biology and Biodiversity Collections, University of Texas, Austin, TX 78712;
- dMuseum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720
See allHide authors and affiliations
Contributed by David B. Wake, June 2, 2017 (sent for review March 22, 2017; reviewed by S. Blair Hedges and Jonathan B. Losos)

Significance
Frogs are the dominant component of semiaquatic vertebrate faunas. How frogs originated and diversified has long attracted the attention of evolutionary biologists. Here, we recover their evolutionary history by extensive sampling of genes and species and present a hypothesis for frog evolution. In contrast to prior conclusions that the major frog clades were established in the Mesozoic, we find that ∼88% of living frogs originated from three principal lineages that arose at the end of the Mesozoic, coincident with the Cretaceous–Paleogene (K–Pg) mass extinction event that decimated nonavian dinosaurs 66 Mya. The K–Pg extinction events played a pivotal role in shaping the current diversity and geographic distribution of modern frogs.
Abstract
Frogs (Anura) are one of the most diverse groups of vertebrates and comprise nearly 90% of living amphibian species. Their worldwide distribution and diverse biology make them well-suited for assessing fundamental questions in evolution, ecology, and conservation. However, despite their scientific importance, the evolutionary history and tempo of frog diversification remain poorly understood. By using a molecular dataset of unprecedented size, including 88-kb characters from 95 nuclear genes of 156 frog species, in conjunction with 20 fossil-based calibrations, our analyses result in the most strongly supported phylogeny of all major frog lineages and provide a timescale of frog evolution that suggests much younger divergence times than suggested by earlier studies. Unexpectedly, our divergence-time analyses show that three species-rich clades (Hyloidea, Microhylidae, and Natatanura), which together comprise ∼88% of extant anuran species, simultaneously underwent rapid diversification at the Cretaceous–Paleogene (K–Pg) boundary (KPB). Moreover, anuran families and subfamilies containing arboreal species originated near or after the KPB. These results suggest that the K–Pg mass extinction may have triggered explosive radiations of frogs by creating new ecological opportunities. This phylogeny also reveals relationships such as Microhylidae being sister to all other ranoid frogs and African continental lineages of Natatanura forming a clade that is sister to a clade of Eurasian, Indian, Melanesian, and Malagasy lineages. Biogeographical analyses suggest that the ancestral area of modern frogs was Africa, and their current distribution is largely associated with the breakup of Pangaea and subsequent Gondwanan fragmentation.
A robust, reliable phylogeny is essential to understand the role of macroevolutionary processes in generating biodiversity. However, resolution of evolutionary relationships among certain groups has been persistently difficult because of sparse genotypic and phenotypic data. Frogs (Anura) are one such example; they are one of the most diverse groups of tetrapods, and currently comprise 6,775 described species, 446 genera, and 55 families (1) that are well represented on all continents. They exhibit great adaptive diversity within a highly constrained phenotype estimated to be 200 My old. Evolutionary convergence in body form, life history, and behavioral traits is widespread in frogs, including forms reflecting different microhabitat use by arboreal, aquatic, and fossorial species. These features make frogs a challenging but fascinating model for addressing fundamental questions of morphological, developmental, and biogeographical evolution. However, despite intensive molecular phylogenetic studies (2⇓⇓⇓⇓–7), areas of uncertainty and disagreement persist among clades that are crucial for interpreting broad-scale macroevolutionary patterns. In addition, a general consensus on divergence times of the major anuran lineages is also lacking (7, 8).
The poor resolution for many nodes in anuran phylogeny is likely a result of the small number of molecular markers traditionally used for these analyses. Previous large-scale studies used 6 genes (∼4,700 nt) (4), 5 genes (∼3,800 nt) (5), 12 genes (6) with ∼12,000 nt of GenBank data (but with ∼80% missing data), and whole mitochondrial genomes (∼11,000 nt) (7). In the larger datasets (e.g., ref. 6), most data (>50%) are from the 12S and 16S mitochondrial ribosomal genes. The limited amount of data also causes a wide range of estimates of divergence times for many nodes in the tree. For example, age estimates for the last common ancestor of extant Neobatrachia, often referred to as “modern frogs” and containing 95% of extant anuran species, span ∼100 Mya (5, 7⇓⇓⇓–11). Furthermore, divergences time estimates among the earliest neobatrachian clades, such as the Heleophrynidae, Myobatrachidae, Calyptocephalellidae, Nasikabatrachidae, and Sooglossidae, range from the Late Jurassic to early Cretaceous (∼150–100 Mya) and have wide CIs (5, 7⇓⇓⇓–11). In addition to these species-poor groups of neobatrachians, there are two species-rich clades: Ranoidea (39% of extant anuran species, mostly Old World) and Hyloidea (54%; mostly New World). The estimated ages of each clade range from the Late Jurassic to the end of the Cretaceous, spanning ∼100 My, and relationships of family-level taxa within each clade remain poorly resolved.
In this study, we increased gene sampling by using a recently developed nuclear marker toolkit (12). Our new data include ∼88,000 nt of aligned sequences from 95 nuclear protein-coding genes covering 164 species (156 anuran species and 8 outgroups) from 44 of 55 frog families; to our knowledge, this is the largest source of new data for anuran phylogenetics. In addition, we enlarged this dataset to a total of 301 anuran species by incorporating previously published RAG1 and CXCR4 sequences so that all 55 extant frog families were included. Our goal was to propose a robust hypothesis of phylogenetic relationships and divergence times of the major lineages. Our results resolve previously intractable relationships, generate divergence times with narrow CIs, and provide perspectives on the evolutionary history and historical biogeography of frogs.
Results and Discussion
Data Characteristics.
We assembled a de novo 164-species dataset by using 95 nuclear genes (Table S1) and 88,302 nt from 156 frog species and 8 outgroups; this matrix is 89.6% complete. To increase coverage of anuran families, we added sequences of RAG1 and CXCR4 from GenBank of 145 additional anuran species. This 309-species dataset contains 88,386 nt and is 48.2% complete. The 164-species and 309-species matrices are available from the Dryad Digital Repository (doi:10.5061/dryad.12546).
Descriptive statistics for the 95 loci used in this study
Higher-Level Phylogenetic Relationships of Frogs.
Maximum-likelihood (ML) and Bayesian analyses of concatenated genes in the 164-species dataset produced identical trees except for two nodes with low support (Figs. S1 and S2). The ASTRAL species tree differed from the ML tree at eight poorly supported nodes (Fig. S3). Overall support is high: 94% of 155 nodes within frogs have a bootstrap value (BS) ≥70% (Fig. S1), and 97% have Bayesian posterior probabilities (BPPs) ≥0.95 (Fig. S2). The ASTRAL species-tree method produced BSs ≥70% at 84% of the nodes (Fig. S3). Although ML analysis of the 309-species dataset has weaker support (78% of nodes have BSs ≥70%; Fig. S4), the basic topology of the ML tree is similar. Therefore, we used the ML tree as our primary hypothesis (Fig. 1A) for estimating the chronogram.
Time-calibrated phylogenetic tree of frogs and the pattern of net diversification rate across time. (A) Evolutionary chronogram based on 95% nuclear genes and 20 fossil age constraints. Gray bars represent the 95% credibility interval of divergence time estimates. Divergence time estimates and corresponding 95% credibility intervals for all nodes are provided in Table S2. Note that the initial diversification of the three major frog clades: Hyloidea (blue), Microhylidae (purple), and Natatanura (green) took place simultaneously near the KPB (dashed red line). (B) Rate-through-time plot of extant frogs indicates an increase in diversification rate at the end of the Cretaceous.
Archaeobatrachian relationships are identical to those of most recent studies (refs. 2, 5, 6, 13; but see refs. 4, 10), including an analysis of mitogenomes of 90 anuran species (7). Relationships among the early branches of Neobatrachia are identical to the mitogenomic phylogeny (7). We corroborate the placement of Heleophrynidae, which is from extreme southern Africa, as the sister taxon to all other neobatrachian frogs (Fig. 1A), as found by some authors (4⇓⇓–7, 9), but not by others (10, 11). We find that Sooglossidae (known only from the Seychelles Islands) is the sister taxon of Ranoidea (BS = 100%, BPP = 1.0; Figs. S1–S3) in the 164-species analyses, and that Sooglossidae + Nasikabatrachidae (known only from the Western Ghats of peninsular India) form the Sooglossoidea, which is the sister group of Ranoidea (BS = 100%; Fig. S4) in the 309-species analyses. Sooglossoidea is placed as the sister taxon of all other neobatrachians (11), all neobatrachian frogs to the exclusion of Heleophrynidae (6), Hyloidea + Myobatrachidae + Calyptocephalellidae (4), and Ranoidea (3, 5, 7), but no previous studies recovered the placement of Sooglossoidea with strong statistical support, including a mitogenome phylogeny (7).
Taxa within the superfamily Ranoidea have been consistently grouped into three clades (4⇓⇓–7): Microhylidae, Afrobatrachia (i.e., epifamily Brevicipitoidae, which includes Brevicipitidae, Hemisotidae, Hyperoliidae, and Arthroleptidae), and Natatanura (Fig. 1A). We found that Microhylidae is the sister group of Afrobatrachia + Natatanura (BS = 72%, PP = 1.0; Figs. S1 and S2); in contrast, many previous studies placed Microhylidae as the sister group of Afrobatrachia (4⇓–6, 14) or of Natatanura (7). Relationships within the Afrobatrachia mirror those found in other studies (5⇓–7, 14).
Natatanura is a large clade of extant anurans (24% of species) and mainly found in the Old World. Our ML and Bayesian topologies of Natatanura are identical. All nodes in the Bayesian tree have a BPP of 1.0, and only three nodes in the ML tree have BSs <90%. The 309-species topology is identical, but with low support among the deeper branches, likely because of missing data. Notably, we found that endemic African continental lineages (Conrauidae, Odontobatrachidae, Petropedetidae, Phrynobatrachidae, Ptychadenidae, Pyxicephalidae) form a clade that is the sister group to the clade of the remaining North American, Eurasian, Melanesian, and Malagasy lineages (Ceratobatrachidae, Dicroglossidae, Mantellidae, Rhacophoridae, and Ranidae; Fig. 1A and Figs. S1–S4). This African clade has low bootstrap support (56%) but high Bayesian support (1.0). The clade of the remaining non-African families is strongly supported (BS = 100%), and the internal branches are strongly supported (BS = 100%, BPP = 1.0), although they are short. In other studies, this group of African lineages is not monophyletic (6, 7, 10, 14, 15). The phylogenetic position of the continental African lineages has important biogeographic significance (as detailed later).
Relationships among the subfamilies of Microhylidae (one of the largest anuran families, including 8.8% of all species), which have significant radiations on most continents and the large islands Madagascar and New Guinea, have proven difficult to resolve (4, 6, 7, 15⇓⇓⇓–19). In contrast, our Bayesian tree has strong support; all 10 of the deepest nodes have a BPP of 1.0 (Fig. S2). The ML topology is identical, but 3 of 10 nodes have a BS <90% (Fig. S1). ML analysis of our 309-species dataset (Fig. S4) recovered the same topology, but support is weaker. Nonetheless, our tree is more strongly supported than others except for a phylogenomic analysis (19) of 66 anchored loci for 48 taxa and 7 Sanger-sequenced loci for 142 taxa.
Significantly, our study resolves relationships among one of the most diverse clades in the anuran phylogeny: Hyloidea, which contains 54% of extant anuran species. ML and Bayesian analyses strongly support 53 of 55 of nodes in Hyloidea (BS > 80% and PP = 1.0; Figs. S1 and S2). This is particularly unexpected because even the most species- and character-rich studies of hyloids (5⇓–7) have recovered poorly resolved topologies. Our expanded 309-species topology (Fig. S4) is similar to our primary tree, even though many nodes have lower support, which is consistent with the higher degree of missing data for this dataset of more species.
Two arrangements within hyloids are noteworthy. The deepest divergences in Hyloidea are among southern South American taxa: Rhinodermatidae in the temperate beech forests of Chile, and Alsodidae and Batrachylidae in Patagonia (Figs. S1–S4). Similar relationships were previously reported (5, 20), albeit with weak support and a smaller sample of species. These relationships support a southern South American origin of Hyloidea. Moreover, most previous studies (4, 6, 7, 21) supported Terrarana, a large New World tropical clade (15% of extant anuran species), rather than southern South America groups, as the sister group of all other hyloids. This difference in the placement of Terrarana may result from long-branch attraction of mtDNA sequences, because Terrarana species apparently have higher rates of mtDNA evolution than other hyloid lineages (22).
In summary, our analyses corroborate many of the deeper neobatrachian nodes inferred by other studies, but with greater support (all BS values = 100%). Furthermore, we find strong support for many shallower nodes that are weakly supported in other studies; only 4 of 155 nodes in our Bayesian tree have a posterior probability <1.0 and only 9 of 155 nodes have bootstrap support <75%. Relationships among the deepest branches in the Hyloidea, Microhylidae, and Natatanura, for which previous studies have conflicting results, are now strongly supported in Hyloidea and Natatanura, and most nodes are well-supported in Microhylidae. Some clades that were not well-supported in the mitogenomic phylogeny (7) are rejected in this study. Our findings include the position of southern South American taxa as the earliest branches of hyloids, and the identification of a clade of endemic African taxa as the sister group of all other Natatanura.
A New Timescale for Extinction and Diversification.
Our extensive sample of nuclear loci produced not only a strongly supported phylogeny for frogs but also younger divergence time estimates with smaller CIs than previous studies. The estimated divergence times in the 164-species and the 309-species data sets are similar (Figs. S5 and S6); the average time deviation among comparable nodes between the two analyses is ∼5%. Following a thorough literature review, we selected 20 fossil calibration points, of which 13 are within crown Anura (Fig. S7). Jackknife removal of each fossil resulted in highly congruent sets of date estimates. The overall average time differences were less than 1% except for the two outlier fossils (the salamander Chunerpeton and the salientian Triadobatrachus), and even the two outliers only led to overall average time differences of 3–5.4% (Fig. S8).
For this discussion, we focus on the 164-species tree (Fig. 1A). Overall, our divergence times are notably younger than those found by most other studies (5, 10, 23, 24) (Fig. 1A and Table S2), and especially much younger than those in the mitogenome phylogeny (7). For example, we estimate the last common ancestor of crown-group Anura to be during the Upper Triassic at 210.0 Mya (95% CI, 199.1–220.4 Mya), and the age of crown Neobatrachia in the Lower Cretaceous at 142.1 Mya (95% CI, 132.8–149.8 Mya). In contrast, many other estimates place the age of crown Anura between the Permian and Middle Triassic and the origin of Neobatrachia between Upper Triassic and Upper Jurassic (Fig. S9).
Detailed node information of the concatenated analyses and divergence analyses
A striking pattern is the synchronous origin of three species-rich neobatrachian clades—Hyloidea, Microhylidae, and Natatanura—at the Cretaceous–Paleogene (K–Pg) boundary (KPB;Fig. 1A, dashed red line). The diversification synchronicity of the three frog clades still existed when all calibration constraints in frogs were excluded (Fig. S10), suggesting that this pattern is unlikely the result of the choice of calibration points. The K–Pg boundary (KPB), dated precisely at 66 Mya (25), marks one of Earth's great extinctions, largely attributed to the impact of the Chicxulub bolide. However, other factors such as climate warming and the Indian Deccan volcanism, possibly related to the bolide impact, likely contributed to this extinction event (26). Although a near-KPB origin was separately reported for Hyloidea (5, 8, 10, 20, 27) and Microhylidae (10, 16) (Fig. S9), our study found that these three major clades originated at the same time, very near the KPB. The contemporaneous origin of the three large clades is highlighted in our results by the narrow 95% CIs for each and their overlap with the KPB. Nine of the 10 deepest nodes of Hyloidea have relatively narrow CIs that overlap the KPB (9.6–14.0 My; Fig. 1A and Table S2). Similarly, within Microhylidae, the CIs of the four deepest nodes encompass the KPB, and the CIs are similarly narrow (11.1–11.9 My). Finally, in Natatanura, the CIs of the six deepest nodes all overlap the KPB, also with similarly narrow CIs (10.4–10.7 My). The use of internal calibration points for these clades is unlikely to be driving this pattern because there was only one internal calibration point (within the Natatanura) used within these three clades.
There are several reasons why our study may have produced younger divergence times compared with previous studies. First, we included diverse outgroups, including lungfish, coelacanth, salamanders, caecilians, and amniotes. Second, we used a new set of calibration points rather than uncritically recycling calibration points from other studies; these calibration points have been carefully reexamined based on the fossil record. The choice of calibration points has significant impact on divergence time estimation for frogs and may be an important reason that older divergence times were obtained in previous studies (27). Third, our phylogenetic estimate is based on a much larger dataset (88 kb; 62% of sites are variable) derived solely from 95 nuclear loci (rather than primarily from mitochondrial loci) that exceeds other studies by at least sevenfold. We attribute the greater precision (i.e., smaller CIs) to the large amount of information within our dataset. A similar increase in precision was found for plethodontid salamanders by using the same nuclear marker toolkit (28).
Rate-through-time analysis indicates that a surge in net diversification rates of frogs occurred immediately following the KPB (Fig. 1B), which suggests a clear impact of the KPB extinction on frog diversification. The rapidity of the diversification is reflected in the short branches of the deepest nodes in each clade. In Hyloidea, all but one branch connecting the 10 deepest nodes are <4.6 My in duration. In Microhylidae, the branches connecting the four deepest nodes are <4.0 My long. In Natatanura, the branches connecting the six deepest nodes are <2.1 My. Even though these branches are short, they are strongly supported. Another line of evidence demonstrates three mass extinctions preceding explosive speciation. The branches subtending the Hyloidea, Natatanura, and Microhylidae are long: 47.3, 32.1, and 33.8 My, respectively (Fig. 1A and Table S2). The lack of other extant lineages originating from these three stem lineages corroborates our suggestion that an extinction event simultaneously decimated these lineages near the KPB.
Even though the fossil record indicates mass extinction at the KPB for several taxa, including birds, squamates, and mammals, and subsequent radiation post-KPB (29⇓–31), molecular data often indicate that many clades had diversified, at least in the sense of lineage splitting, before the KPB event (32⇓⇓⇓–36). Unlike nonavian dinosaurs, whose demise is well documented in the rocks, the fossil record of frogs is so far largely uninformative about survival/extinction across the KPB (37⇓⇓–40). In the present study, the three parallel combinations of precise node ages overlapping the KPB, preceded by a long branch and followed by very short but strongly supported branches, indicate three extinctions followed by rapid divergence. These three radiations, which are coincident with the late Cretaceous extinction, account for 88% of extant frog species. This molecular phylogenetic perspective strongly suggests that frogs experienced a major extinction at the KPB.
Historical Biogeography of Frogs.
We performed a biogeographic analysis on the 309-species data set using BioGeoBEARS (41). The DEC+J model performs significantly better than the DEC model (Akaike information criterion; Table S3), indicating the importance of the J parameter, which models long-distance or “jump” dispersal. Our interpretation is that dispersal into a new area is accompanied by near-instantaneous speciation. The practical effect of adding the J parameter to the model is that ancestral ranges often comprise one area rather than several.
Models and parameters of ancestral range estimation of frogs
The most recent common ancestor (MRCA) of extant frogs was distributed in Eurasia + North America + Australia or North America + Australia (Fig. 2 and Fig. S11), a Pangaean origin. In contrast, neobatrachians (i.e., modern frogs) originated in Gondwanaland, most likely in Africa (relative probability, 93.6%; Fig. 2). The major neobatrachian lineages are also Gondwanan in origin: Hyloidea in South America and Ranoidea, Afrobatrachia, and Natatanura in Africa (relative probability, >90%; Fig. 2).
Ancestral-area estimates for 69 terminal taxa (families, subfamilies, and genera) of extant frogs using the DEC+J model in BioGeoBEARS. Circles on nodes represent the set of possible ancestral areas, and the color is associated with the area legends. The probabilities are given next to circles for the most probable ancestral area. Circles without values indicate that the probability of the ancestral area is >99%. Three important landmass breakup events are indicated: (A) the break-up of Pangea with division into Laurasia and Gondwana in the Middle Jurassic coincident with the origin of neobatrachian frogs; (B) the separation of Africa and South America in the Early Cretaceous coincident with the divergence of Ranoidea and Hyloidea as well as between the African and New World pipids; and (C) the separation of the Seychelles and India in the Late Cretaceous coincident with the divergence between the Sooglossidae and Nasikabatrachidae. Anuran taxa that contain at least some arboreal species are indicated in green and a tree icon (we do not imply that the last common ancestor of each of these families was arboreal). Note that all clades containing arboreal frogs originated after the KPB.
Breakup of land masses may be associated with three divergence events in the evolution of frogs. The first of these was the split of neobatrachians from ancestral groups (Fig. 2A), dated at 172–181 Mya, which agrees with the breakup of Pangaea into Laurasia and Gondwana in the Early Jurassic (∼180 Mya) (42). This breakup event, which isolated the MRCA of Neobatrachia in Gondwana, is associated with the initial diversification of the clade. The second breakup event caused the split between the two major lineages of neobatrachians, Procoela (Hyloidea and Myobatrachoidea, mostly South American) and Diplasiocoela (Ranoidea + Sooglossoidea, mostly African), as well as the split between the African and South American pipids (Fig. 2B). Spreading of the South Atlantic Ocean sea floor started in the Early Cretaceous (135 Mya), but the final physical separation between Africa and South America took place approximately 105 Mya (42). Our estimates for timing of these divergences, 125–130 Mya and ∼120 Mya, respectively, are highly congruent with this rifting process. The third breakup event took place when the Sooglossidae (Seychelles) and the Nasikabatrachidae (India) diverged (Fig. 2C). The Seychelles/India land mass continued to exist until 66 Mya, when new rifting severed the Seychelles from India (43). We estimate Sooglossidae split from Nasikabatrachidae at ∼66 Mya, which is considerably younger than the previous estimates (77–130 Mya) (10, 11, 23), in congruence with the geological event.
The MRCA of Hyloidea and of Myobatrachidae + Calyptocephalellidae occurred in South America (Fig. 2). The divergence between Myobatrachidae (Australia) and Calyptocephalellidae (South America) and the split between Phyllomedusinae (South America) and Pelodryadinae (Australia and New Guinea) took place ∼100 Mya and ∼50 Mya, respectively (Fig. 2). During this period, South America and Australia were distantly separated but connected intermittently via Antarctica (42). From the late Cretaceous to the early Tertiary, the Earth experienced a period of global warming (44). Climate data from plant fossils, sediments, and geochemical indicators show that the mean annual temperature of the Antarctic Peninsula region was 10–20 °C from 100 to 50 Mya (45), which is sufficiently warm to allow dispersal of frogs through this region. Thus, the origin of the Australian myobatrachids and pelodryadine hylids is most likely explained by dispersal from South America to Australia through Antarctica, and later extinction in Antarctica because of the formation of ice sheets. The role of Antarctica in dispersal of frogs among Gondwanan land masses has received little attention (but see ref. 46), but our time-calibrated phylogeny predicts that paleontological research in the Cretaceous and early Cenozoic of Antarctica could shed light on the early evolution and dispersal, especially of hyloids.
The nearly worldwide distribution of Microhylidae presents a longstanding and challenging biogeographic puzzle. Vicariant origin and long-distance oceanic dispersal have been proposed to explain the current distribution of this family (10, 15, 16, 18, 23). Which scenario dominates interpretation depends largely on the time and location of origin for the Microhylidae. Our analyses place the initial divergence of the Microhylidae at ∼66 Mya, with a relative probability of 79.5% for an African origin (Fig. 2). By this time, Gondwana was already highly fragmented and Africa was separated from South America and Madagascar by ocean. Accordingly, overseas dispersal of the major microhylid lineages on Gondwanan landmasses is required.
The origin and diversification of the second large clade, Natatanura, is controversial. Two hypotheses have been proposed. The Out-of-Africa hypothesis (47), argues that the origin of the Natatanura lies in Africa, with subsequent dispersal to other continents. The Out-of-India hypothesis (48) postulates that the ancestor of Natatanura originated on the Indian plate and was confined there until the plate collided with Asia approximately 55 Mya. We find that Natatanura originated in Africa (relative probability, 92.5%; Fig. 2). Considering that Natatanura is the sister group of Afrobatrachia (apparently of African origin) and the basal split within the Natatanura phylogeny separates all African continental lineages from other Asian, Indian, and Madagascar lineages (Fig. 2), an African origin seems more likely. The endemic Indian natatanurans (Nyctibatrachidae, Micrixalidae, and Ranixalidae) are nested within Asian lineages, and the divergences between them and their closest Asian relatives occurred between 55 and 60 Mya (Fig. 2), which is consistent with the timing of the India–Asia collision (∼55 Mya) (49). Intriguingly, the divergence between the Asian tree frogs (Rhacophoridae) and the Malagasy mantellids also happened during this period (Fig. 2), implying that the Indian plate served as a stepping stone for the long-distance dispersal from Asia to Madagascar.
The three long branches of similar duration leading to Hyloidea, Microhylidae, and Natatanura suggest that the K–Pg mass extinction event may have triggered explosive radiations by emptying ecological space. Notably, all terminal taxa (family and subfamily ranks) with arboreal species originate after the K–Pg boundary (Fig. 2). The rebounding of forests after the massive loss of vegetation (50) at the KPB may have provided new ecological opportunities for the subsequent radiation of largely arboreal groups such as Hylidae, Centrolenidae, and Rhacophoridae. The observation that no lineages of frogs originating before the KPB (archaeobatrachians, Heleophrynidae, Myobatrachoidea, and Sooglossoidea) have truly arboreal species, and that all origins of arboreality (e.g., within hyloids or natatanurans) follow the KPB, supports the hypothesis that the K–Pg mass extinction shaped the current diversity of frogs.
Materials and Methods
Taxon Sampling and Data Collection.
Our sampling included 156 frog species from 44 of the 55 recognized families (1). Eight outgroup species, including two salamanders, one caecilian, one bird, one crocodile, one mammal, and two lobe-finned fishes, were used in all phylogenetic analyses. Total DNA was isolated from frozen or ethanol-preserved tissues (liver or muscle) by proteinase K digestion followed by standard salt extraction protocol. Previously published PCR primers and PCR protocols (12) were used to amplify 95 unlinked nuclear protein-coding genes (including RAG1 and CXCR4) from the DNA extracts in 96-well plates. The amplification products were sequenced using a next-generation sequencing (NGS) strategy as described by Feng et al. (51). Briefly, all amplification products from a single species were pooled together and purified. The amplification product pool of a sample was then randomly sheared to small fragments (200–500 bp), ends were repaired, and a species-specific barcode linker was added. All indexed amplification product pools were mixed together. A sequencing library was constructed with the pooled DNA by using the TruSeq DNA Sample Preparation kit and sequenced on a Illumina HiSEq.2500 sequencer. Approximately 3 GB of Illumina HiSeq paired-end 90-bp reads were obtained. These reads were bioinformatically sorted by barcode sequence and assembled into consensus sequences. NGS protocol of library construction and bioinformatic analyses have been described previously (51). All sequences were compared by using BLAST against GenBank to ensure that the target sequences were amplified. The remaining sequences were further checked for frame shift or stop codons. GenBank accession numbers for the new sequences are given in Dataset S1.
GenBank Data.
To provide a comprehensive phylogeny at the family level, we retrieved RAG1 and CXCR4 sequences from GenBank for 145 additional species of frogs (Dataset S1). These data plus our new sequences comprise a combined dataset that contains 301 frog species from all recognized anuran families.
Phylogenetic Analyses.
The 95 nuclear protein coding genes were aligned by using the ClustalW algorithm as implemented in MEGA v6 (52). Ambiguously aligned regions were culled using GBlocks v.0.91b (53) with the “codon” model (−t = c), smaller block (−b4 = 3), and all gaps allowed (−b5 = a). All refined alignments were then concatenated into a concatenated supermatrix. Two supermatrices were built: 164-species dataset (156 frogs and 8 outgroups) and 309-species dataset (301 frogs and 8 outgroups). PartitionFinder v.1.1.1 (54) was used to select models and partitioning schemes for the two supermatrices according to the Bayesian information criterion. A total of 285 data partitions were selected as the best partitioning scheme that corresponded to the three separate codon positions for each of the 95 genes (Table S4). The ML tree was estimated by using RAxML version 8.0 (55) with the GTR + Γ + I model assigned to each partition. Support for nodes in the ML tree was assessed with a rapid bootstrap analysis (option –f a) with 1,000 replicates. The Bayesian tree was inferred using MrBayes 3.2 (56) using the models and partitions identified by PartitionFinder. Two Markov chain Monte Carlo (MCMC) runs were performed with one cold chain and three heated chains (temperature set to 0.1) for 50 million generations and sampled every 1,000 generations. Chain stationarity was visualized by plotting likelihoods against the generation number by using TRACER v1.6 (beast.bio.ed.ac.uk/Tracer). The effective sample sizes were greater than 200 for all parameters after the first 10% of generations were discarded. Species tree analysis without gene concatenation was performed by using the Accurate Species TRee ALgorithm (ASTRAL) (57) under the coalescent model. The individual input gene trees were inferred from partitioned ML analyses by using RAxML with the same GTR + Γ + I model assigned to each codon position of each gene. The species tree analysis was conducted by using ASTRAL under the multilocus bootstrapping option with 200 replicates (−r = 200).
Comparisons of partitioning schemes using AIC, AICc, and BIC
Divergence Time Analyses.
Divergence time estimation was conducted by using the program MCMCTREE in the PAML package (58). The ML topology was used as the reference tree. Twenty calibration points were used to calibrate the clock (Fig. S7). The ML estimates of branch lengths for each of the 95 nuclear genes were obtained by using BASEML (in PAML) programs under the GTR + Γ model. Based on the mean estimate from the 95 genes using a strict molecular clock with a 450-Mya root age (the divergence between Latimeria and Protopterus; ref. 59), the prior for the overall substitution rate (rgene gamma) was set at G (1, 11.96, 1). The prior for the rate-drift parameter (sigma2 gamma) was set at G (1, 4.5, 1). The 20 calibration points were specified with soft boundaries by using 2.5% tail probabilities above and below their limits; this is a built-in function of MCMCTREE. The independent rate model (clock = 2 in MCMCTREE) was used to specify the rate priors for internal nodes. The MCMC run was first executed for 10,000,000 generations as burn-in, then sampled every 1,000 generations until a total of 10,000 samples was collected. Two MCMC runs using random seeds were compared for convergence, and similar results were found.
Rate-Through-Time Analysis.
We investigated the diversification tempo of frogs using the program Bayesian Analysis of Macroevolutionary Mixtures (BAMM), v2.5 (60). The 309-species chronogram was used as the input tree. To account for incomplete taxon sampling, the time tree was pruned to family level, and the sampling fraction of each family was calculated based on the number of species following AmphibiaWeb. The BAMM analysis was run for 100 million generations at a temperature increment parameter of 0.01 and sampled event data every 1,000 generations. The first 20% samples were removed as burn-in. The rate-through-time plot was summarized and visualized by using BAMMtools (61) from the remaining 80% event data samples.
Biogeographic Analyses.
Based on the current distribution pattern of frogs, we defined seven biogeographic areas: Africa, Eurasia (Europe and Asia with exception of Indian plate), India (including Sri Lanka and the Seychelles), Madagascar, North America (northern Mexico, United States, and Canada), South America, and Australia (Australia, New Zealand, and New Guinea; Fig. 2). Connectivity of these biogeographic areas was modeled with three dispersal probability categories: 0.01 for well-separated areas, 0.5 for moderately separate areas, and 1.0 for well-connected areas. Area connectivity and dispersal probability were modeled in seven time slices: 0–30 Mya, 30–66 Mya, 66–90 Mya, 90–120 Mya, 120–160 Mya, 160–200 Mya, and 200–270 Mya (Table S5).
BioGeoBEARS dispersal multipliers for seven slices
Biogeographic analyses were performed by using BioGeoBEARS (41). We used the 309-species chronogram generated by our divergence time analyses as the input phylogeny. The current distribution of each frog species was assigned based on data from AmphibiaWeb (Table S6). The maximum number of ancestral areas allowed at each node was set to four. We compared the DEC and DEC+J models to determine the influence of founder-event dispersal on biogeographic patterns. The AIC criterion selected the DEC+J as the best-fitting model (Table S3), and this was subsequently used to infer the most likely biogeographic history of anurans.
Taxonomic information and geographical distribution used for biogeographical analysis
Acknowledgments
This work was supported by National Natural Science Foundation of China Grants 31672266 and 31372172 (to P.Z.), National Youth Talent Support Program Grant W02070133 (to P.Z.), National Science Fund for Excellent Young Scholars of China Grant 31322049 (to P.Z.), and National Science Foundation Grant DEB-1202609 (to D.C.B.) and DEB-1441652 (to D.B.W.).
Footnotes
- ↵1To whom correspondence may be addressed. Email: wakelab{at}berkeley.edu, catfish{at}utexas.edu, or zhangp35{at}mail.sysu.edu.cn.
Author contributions: D.C.B., D.M.H., D.B.W., D.C.C., and P.Z. designed research; D.C.B., D.M.H., D.B.W., D.C.C., and P.Z. designed and carried out taxon sampling; D.C.B. and D.C.C. selected and vetted calibration points; Y.-J.F. and D.L. performed laboratory research; Y.-J.F., D.L., and P.Z. analyzed data; and Y.-J.F., D.C.B., D.L., D.M.H., D.B.W., D.C.C., and P.Z. wrote the paper.
Reviewers: S.B.H., Temple University; and J.B.L., Harvard University.
The authors declare no conflict of interest.
Data deposition: The sequences reported in this paper have been deposited in the GenBank database. For a list of accession numbers, see Dataset S1.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1704632114/-/DCSupplemental.
Freely available online through the PNAS open access option.
References
- ↵.
- University of California, Berkeley
- ↵.
- Roelants K,
- Bossuyt F
- ↵
- ↵
- ↵.
- Roelants K, et al.
- ↵
- ↵.
- Zhang P, et al.
- ↵.
- Hedges SB,
- Kumar S
- Bossuyt F,
- Roelants K
- ↵
- ↵.
- Frazão A,
- da Silva HR,
- Russo CA
- ↵
- ↵.
- Shen XX,
- Liang D,
- Feng YJ,
- Chen MY,
- Zhang P
- ↵
- ↵.
- Bossuyt F,
- Brown RM,
- Hillis DM,
- Cannatella DC,
- Milinkovitch MC
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵.
- Heinicke MP, et al.
- ↵
- ↵.
- Hedges SB,
- Duellman WE,
- Heinicke MP
- ↵.
- Pyron RA
- ↵.
- Pyron RA
- ↵.
- Renne PR, et al.
- ↵.
- Keller G
- ↵.
- Duellman WE,
- Marion AB,
- Hedges SB
- ↵.
- Shen XX, et al.
- ↵.
- Longrich NR,
- Bhullar BAS,
- Gauthier JA
- ↵.
- Longrich NR,
- Tokaryk T,
- Field DJ
- ↵.
- O’Leary MA, et al.
- ↵.
- Cooper A,
- Penny D
- ↵.
- Benton MJ
- ↵.
- Meredith RW, et al.
- ↵.
- dos Reis M, et al.
- ↵.
- dos Reis M,
- Donoghue PCJ,
- Yang Z
- ↵.
- Blain HA,
- Canudo JI,
- Cuenca-Bescós G,
- López-Martínez N
- ↵.
- Fastovsky DE,
- Bercovici A
- ↵
- ↵.
- Mercier GK,
- Demar DG,
- Wilson GP
- ↵.
- Matzke NJ
- ↵.
- Blakey RC
- ↵
- ↵.
- Zachos J,
- Pagani M,
- Sloan L,
- Thomas E,
- Billups K
- ↵
- ↵.
- Goin CJ,
- Goin OB
- ↵.
- Vial JL
- Savage JM
- ↵.
- Bossuyt F,
- Milinkovitch MC
- ↵
- ↵.
- Vajda V,
- Raine JI,
- Hollis CJ
- ↵.
- Feng YJ,
- Liu QF,
- Chen MY,
- Liang D,
- Zhang P
- ↵.
- Tamura K,
- Stecher G,
- Peterson D,
- Filipski A,
- Kumar S
- ↵.
- Castresana J
- ↵.
- Lanfear R,
- Calcott B,
- Ho SYW,
- Guindon S
- ↵.
- Stamatakis A
- ↵.
- Ronquist F, et al.
- ↵.
- Mirarab S, et al.
- ↵.
- Yang Z
- ↵.
- Benton MJ, et al.
- ↵
- ↵
Citation Manager Formats
Article Classifications
- Biological Sciences
- Evolution