Research Article

Placing confidence limits on the molecular age of the human–chimpanzee divergence

Sudhir Kumar, Alan Filipski, Vinod Swarna, Alan Walker, and S. Blair Hedges
PNAS December 27, 2005 102 (52) 18842-18847; https://doi.org/10.1073/pnas.0509585102
Contributed by Alan Walker, November 8, 2005


Abstract

Molecular clocks have been used to date the divergence of humans and chimpanzees for nearly four decades. Nonetheless, this date and its confidence interval remain to be firmly established. In an effort to generate a genomic view of the human–chimpanzee divergence, we have analyzed 167 nuclear protein-coding genes and built a reliable confidence interval around the calculated time by applying a multifactor bootstrap-resampling approach. Bayesian and maximum likelihood analyses of neutral DNA substitutions show that the human–chimpanzee divergence is close to 20% of the ape–Old World monkey (OWM) divergence. Therefore, the generally accepted range of 23.8–35 millions of years ago for the ape–OWM divergence yields a range of 4.98–7.02 millions of years ago for human–chimpanzee divergence. Thus, the older time estimates for the human–chimpanzee divergence, from molecular and paleontological studies, are unlikely to be correct. For a given ape–OWM divergence time, the 95% confidence interval of the human–chimpanzee divergence ranges from –12% to +19% of the estimated time. Computer simulations suggest that the 95% confidence intervals obtained by using a multifactor bootstrap-resampling approach contain the true value with >95% probability, whether deviations from the molecular clock are random or correlated among lineages. Analyses revealed that the use of amino acid sequence differences is not optimal for dating human–chimpanzee divergence and that the inclusion of additional genes is unlikely to narrow the confidence interval significantly. We conclude that tests of hypotheses about the timing of human–chimpanzee divergence demand more precise fossil-based calibrations.

  • Bayesian analysis
  • molecular clock
  • hominid
  • fossil
  • primate

Determining the age of the most recent common ancestor of humans and their closest African ape relatives has been the subject of scientific inquiry for over a century. This age is important for assessing the evolutionary rate of morphological and molecular changes in humans, assigning key fossils in hominoid phylogeny, and estimating when the common ancestor of all humans lived. The paleontological approach has been hampered by the paucity of fossils of some lineages. The first chimpanzee fossil, for example, was only just reported in 2005 (1). As recently as the mid-1960s (2), the African apes were considered to be distant relatives of the human lineage, but subsequent molecular phylogenetic analyses have shown that the chimpanzee and human are sister species and have led to a revision of the age of their divergence (3–13). Currently, the earliest unequivocal upright hominids at 4.2 millions of years ago (Ma) provide the minimum age for human–chimpanzee divergence (14), and some recently proposed early hominids dated between 5 and slightly more than 6 Ma are thought to provide an estimate close to the actual species divergence (15–19). However, a divergence time of 12.5 Ma for humans and chimpanzees has been entertained recently as well (20), and so the fossil record and its interpretation give a range of 4.2–12.5 Ma.

To employ molecular data to test alternative hypotheses derived from the fossil record, we require credible confidence intervals (C.I.s) of the molecular age of human and chimpanzee divergence. However, studies using molecular data have yielded disparate values as well (3–13 Ma), because of differences in the number of genes used, types of substitutions (synonymous, noncoding, and nonsynonymous) analyzed, calibration points used, and statistical methods used (3–13). Furthermore, current estimates of C.I.s of molecular divergence times fail to consider a comprehensive set of factors contributing to variance, such as a limited number of genes (gene sampling error), a limited number of sites for each gene (variance contributed by sequence divergence-estimation procedures), rate differences among lineages, and inherent uncertainty in the time used for calibrating lineage-specific and relaxed molecular clocks (21, 22). These factors point to the need to use a large number of genes and a closely positioned calibration point to reduce the statistical variance of the estimated time and increase our power in testing alternative fossil-based hypotheses.

Therefore, we have assembled a data set containing the largest number of nuclear genes available to estimate the timing of human–chimpanzee divergence in reference to the ape–Old World monkey (OWM) divergence. To build credible C.I.s, we have developed a multifactor bootstrap-resampling (MBR) approach that can incorporate the variances mentioned above. Furthermore, we have conducted computer simulations under random and correlated deviations from a constant-rate model to examine the statistical validity of C.I.s generated by the MBR approach when using a large number of genes for a few species to date very recent speciation events. This evaluation is particularly important for the analysis of closely related species, because tests of rate constancy are often powerless in rejecting genes in which different lineages evolve at different rates (5, 23). In the following, we have also compared results obtained by using different types of point substitutions in the same set of protein-coding genes, including amino acid sequence differences, nucleotide substitutions at the second and third codon positions, and neutral substitutions at 4-fold-degenerate sites (with and without hypermutable CpG positions).

Materials and Methods

Data Collection. To maximize the number of useable genes and employ a close primate calibration species for estimating human–chimpanzee divergence time, we assembled 167 orthologous protein sequence sets for human (Homo sapiens), chimpanzee (Pan troglodytes), macaque (Macaca mulatta), and mouse (Mus musculus), considering their phylogenetic relationships (Fig. 1a ) (for details see Data Collection in Supporting Text, which is published as supporting information on the PNAS web site). Observed distributions of protein lengths (Fig. 1b ) and rates (Fig. 1c , open bars) show considerable variation, as expected. The protein sequence alignments were used as guides (for codon boundaries) to generate coding DNA sequence alignments and evolutionary rates at the third codon positions (Fig. 1c , solid bars). We identified all third codon positions that were 4-fold-degenerate in all species for each gene for analysis as well. Because CpG dinucleotides mutate 7–10 times faster than other dinucleotides, 4-fold-degenerate sites were separated into those that were involved in CpG dinucleotides and those that were not (24, 25).

Fig. 1.

Characteristics of the data. (a) Phylogenetic relationships of human, chimpanzee, macaque, and mouse. (b) Histogram showing the length distribution of proteins used. (c) Distributions of the average evolutionary rates of amino acids (open bars) and third codon positions (solid bars) for 167 protein-coding genes analyzed in this study. Evolutionary rates were estimated assuming a 90 Ma date for primate–rodent divergence (5, 22, 46). (d) Schematic showing how lineage-specific molecular clocks were used to estimate the human–chimpanzee divergence time by using the ML distances between species pairs. The human–chimpanzee divergence time is given by the fraction ([h + c]/2)/(a + [h + c]/2) of the time assumed for ape–OWM divergence. (e) Distribution of evolutionary rates among sites for amino acids (dashed curve; shape parameter = 0.83) and third codon positions (solid curve; shape parameter = 1.65) obtained in ML analyses of concatenated sequences (53,008 codons) of 167 protein-coding genes. In each case, modeling the rate variation among sites by using a gamma distribution produced a significantly better fit (P < 0.005).

Estimation of Divergence Times Using Local Clocks. We used a lineage-specific method with maximum likelihood (ML) estimates of sequence divergence generated by using paml 3.14 (26) and the Bayesian timing method implemented in the multidivtime software (21). For the ML method (referred to as the ML-distance approach), we concatenated all sequences into a supergene, estimated branch lengths in the four-species tree using paml 3.14, and inferred divergence times in a lineage-specific manner (Fig. 1d ). To estimate evolutionary distances in paml 3.14, a general time reversible (GTR) model with a gamma (Γ) distribution describing the rate variation among sites (GTR+Γ) was used for DNA sequences and a Jones–Taylor–Thornton (JTT)+Γ model with a discrete gamma distribution (with five rate categories) was used for the amino acid sequences. In the Bayesian analysis, we used the same supergene described above, with a F84+Γ model for DNA and a JTT+Γ model for amino acid sequences, based on the most sophisticated models available in multidivtime software (21).
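The lineage-specific calculation described in the Fig. 1d caption reduces to a single ratio of branch lengths. The Python sketch below illustrates it. The function name is hypothetical, the branch lengths are made-up values rather than estimates from the 167-gene data set, and `a` is taken to be the ancestral ape branch between the ape–OWM and human–chimpanzee nodes, which is an assumption because Fig. 1d is not reproduced here.

```python
def hc_divergence_time(h, c, a, ape_owm_time):
    """Human-chimpanzee divergence time from a lineage-specific clock.

    h, c: branch lengths from the human-chimpanzee ancestor to each species
    a: ancestral branch length between the ape-OWM and human-chimpanzee nodes
       (assumed interpretation of the symbol in the Fig. 1d caption)
    ape_owm_time: calibration time assumed for the ape-OWM divergence (Ma)
    """
    tip_depth = (h + c) / 2.0               # average depth of the human-chimpanzee node
    fraction = tip_depth / (a + tip_depth)  # ([h + c]/2) / (a + [h + c]/2)
    return fraction * ape_owm_time


# Hypothetical branch lengths (substitutions per site), for illustration only.
example_time = hc_divergence_time(h=0.011, c=0.009, a=0.040, ape_owm_time=23.8)
print(round(example_time, 2))
```

Because only branches on the ape side of the tree enter the ratio, the rate on the macaque lineage does not affect the estimate in this sketch.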

Estimation of C.I.s. Although the ML-distance and Bayesian methods provide C.I.s and standard errors for estimated times, they do not simultaneously incorporate the variances introduced by the estimation of genetic distances from individual genes, sampling error due to the use of a limited number of genes, rate variation among evolutionary lineages (in ML-distance), and the distribution of uncertainty in calibration times. To do this, we developed an MBR approach for generating C.I.s in which these variances can be explicitly incorporated when one uses ML, Bayesian, or other methods as estimators of time. We chose a bootstrap, rather than an analytical, approach because it is less dependent on specific assumptions about the distribution of the data, and analytical formulations are often too cumbersome to incorporate phylogenetic correlations (27). The algorithm for this process and its properties are given in Supporting Text. We also conducted computer simulations to evaluate the statistical accuracy of the MBR method in generating an appropriate C.I. when used in conjunction with the ML-distance and Bayesian methods for a large number of genes and for only four species (see Supporting Text).
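The MBR algorithm itself is specified in Supporting Text and is not reproduced here. As a rough illustration of the general idea only, the Python sketch below resamples two of the factors named above, genes and calibration time, and reads a percentile interval off the replicates. It omits site resampling within genes and rate variation among lineages. All names and input values are hypothetical.

```python
import random


def mbr_interval(genes, draw_calibration, n_boot=2000, seed=0):
    """Percentile bootstrap C.I. for the human-chimpanzee divergence time.

    genes: list of (n_sites, h, c, a) tuples, one per gene (hypothetical inputs)
    draw_calibration: function taking a random.Random and returning an
        ape-OWM divergence time in Ma
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Factor 1: gene sampling error (resample genes with replacement).
        sample = [rng.choice(genes) for _ in genes]
        total_sites = sum(n for n, _, _, _ in sample)
        # Site-weighted mean branch lengths, mimicking a concatenated supergene.
        h = sum(n * gh for n, gh, _, _ in sample) / total_sites
        c = sum(n * gc for n, _, gc, _ in sample) / total_sites
        a = sum(n * ga for n, _, _, ga in sample) / total_sites
        tip_depth = (h + c) / 2.0
        # Factor 2: calibration uncertainty (one draw per replicate).
        estimates.append(tip_depth / (a + tip_depth) * draw_calibration(rng))
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return lower, upper


# Toy data: 50 hypothetical genes with noisy branch lengths.
toy_rng = random.Random(1)
toy_genes = [
    (toy_rng.randint(100, 600),
     toy_rng.uniform(0.006, 0.014),
     toy_rng.uniform(0.006, 0.014),
     toy_rng.uniform(0.030, 0.050))
    for _ in range(50)
]
low, high = mbr_interval(toy_genes, lambda rng: 23.8)
print(round(low, 2), round(high, 2))
```

Passing a different `draw_calibration` function, for example one that samples from a distribution over plausible fossil dates, widens the interval to reflect calibration uncertainty.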

Results and Discussion

We first used the ML-distance method in which a lineage-specific molecular clock is applied to estimate the human–chimpanzee divergence time (see Fig. 1 and Materials and Methods). This method requires a priori knowledge of at least one species divergence time to calibrate the local molecular clock. We took a conservative approach and used a minimum date of 23.8 Ma (boundary of the Oligocene and Miocene) for the ape–OWM divergence (28) (see Supporting Text). We focused on the minimum time of divergence because the inference of the true (mean) time of divergence of any two lineages requires a more robust fossil record for calibration than is generally available for most groups of organisms, and especially these primates. Most past molecular clock studies have also estimated minimum divergence times for this reason (29). In recent years, many methods have been developed that attempt to estimate mean times of divergence by specifying both minimum and maximum calibration points. However, whereas most minimum calibrations are robust, this is not true for maximum calibrations or the implicit assumption about the probability density assumed for fossil calibration uncertainty (29). It is difficult to provide convincing evidence that a phylogenetic divergence event did not occur earlier than a particular point in time. Errors in assignment of maximum constraints, whether too young or too old, will potentially bias the mean time estimate.

Analysis of Third Codon Positions. A likelihood ratio analysis of third codon positions from 167 genes (53,008 codons) indicated significant rate variation among sites (Fig. 1e ), so we used a GTR+Γ model in all further analyses. Using a 23.8 Ma minimum date for ape–OWM divergence, we obtained an ML-distance estimate of 4.74 Ma, which is older than the age of 4.2 Ma for the earliest unequivocal postcranial evidence of upright hominids (14) but younger than some recently proposed early hominids dated from between 5 and ≈6 Ma (15–19).

However, the absolute dates derived above depend directly on the time used to calibrate the molecular clock, and it is important to focus on the size of the C.I. relative to the estimate, rather than on the point estimate of time alone. Using the MBR approach, we obtain a 95% C.I. of 3.88–5.65 Ma for the minimum date. This C.I. spans –18% to +19% of the estimated time. Our computer simulations under equal, random, and correlated models of rate evolution show that the 95% C.I. generated by MBR indeed contains the true value ≥95% of the time when all third positions are resampled, regardless of gene boundaries (Fig. 2).
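The relative bounds quoted above follow directly from the interval and the point estimate. A minimal Python check, with a hypothetical helper name, is:

```python
def relative_interval(estimate, lower, upper):
    """Express C.I. bounds as signed percentages of the point estimate."""
    return (100.0 * (lower - estimate) / estimate,
            100.0 * (upper - estimate) / estimate)


# ML-distance estimate and MBR 95% C.I. for third codon positions (Ma).
low_pct, high_pct = relative_interval(4.74, 3.88, 5.65)
print(f"{low_pct:+.0f}% to {high_pct:+.0f}%")  # -18% to +19%
```

Reporting the interval in this relative form lets it be applied to any calibration time, because the point estimate scales linearly with the calibration.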

Fig. 2.

The percent of simulation replicates in which the 95% C.I. generated by the MBR method includes the simulated true value using the ML-distance (filled bars) and Bayesian approaches (open bars). (See Supporting Text for a description of the rate-variation models and how the data were parameterized.) We simulated 1,000 167-gene equivalent data sets under each rate-variation regime and conducted MBR analyses to obtain C.I.s for the ML-distance (GTR+Γ) and Bayesian (F84+Γ) approaches. The results show that 95% C.I.s contain the true value more often than required, particularly for the ML-distance method. This finding shows that modeling rate variation among lineages (as in the Bayesian method) should lead to narrower C.I.s when the underlying model for correlated evolutionary rates among lineages is satisfied (constant and correlated rate cases), as compared with the ML-distance approach, in which rate variation is incorporated into the C.I. instead of being modeled. Interestingly but not unexpectedly, Bayesian and ML methods perform similarly when the assumptions of the Bayesian model are not satisfied (random rate-variation case). This finding is consistent with a study showing that a Bayesian approach with a lognormal rate-variation model does not accommodate uncorrelated rate variation (47).

Of all third codon positions, the 4-fold-degenerate sites are expected to be under the least amount of natural selection. We divided these sites into those that were involved in CpG dinucleotides and those that were not, because the 4-fold-degenerate sites involved in CpG dinucleotide configurations mutate much faster (24, 25). Analysis of 19,311 non-CpG 4-fold-degenerate sites produced an estimate of 4.75 Ma, which is almost identical to that given by third codon positions, but the 95% C.I. was wider (–25% to +31%). Because the data set size (in terms of the number of sites) was the largest for the third codon positions, we considered only third codon positions in all further DNA sequence analyses.

Analysis of Nonsynonymous Substitutions. Unexpectedly, ML-distance analyses of amino acids and second codon positions yielded estimates of human–chimpanzee divergence that were >40% larger than those obtained from the third codon positions and 4-fold-degenerate sites. In fact, neither the estimate obtained using amino acids (6.80 Ma), nor the third-position estimate (4.74 Ma) fell within the 95% confidence limit of the other (4.78–8.86 Ma and 3.88–5.65 Ma for amino acids and third positions, respectively). This finding is surprising because both DNA and protein substitutions are expected to provide concordant results, especially when using a large number of protein-coding genes.

A cursory examination of the time estimates based on the rate of amino acid substitutions revealed a tendency toward larger estimates for very slowly evolving proteins. The 40 most slowly evolving proteins (based on evolutionary divergence of human and mouse proteins) provide an estimate more than three times higher than the 40 fastest evolving proteins. In contrast, the third codon position estimates are similar whether obtained from genes corresponding to proteins evolving with low, medium, or fast rates and coincide with the estimates obtained from fast-evolving proteins (Fig. 3). These patterns suggest a disproportionate inflation of sequence divergence for closely related species as compared with distantly related species, an effect that would be the greatest for the most slowly evolving proteins. The skewed inflation of sequence divergence estimates for closely related sequences may be attributed to sequencing errors, presence of ancestral polymorphisms (30), persistence of slightly deleterious polymorphisms (31, 32), and masking of deleterious mutations in heterozygotes (33). Although these factors affect all amino acids and third codon positions, the most slowly evolving sequences (e.g., highly conserved proteins) are affected the most relative to their rate and the time of sequence divergence.

Fig. 3.

Dependence of human–chimpanzee divergence time estimate on the rate of amino acid sequence evolution when using amino acids (open symbols) and third codon positions (solid symbols). Each value represents the time estimate obtained from an analysis of 40 concatenated proteins and is plotted at the median amino acid distance between human and mouse for that set of proteins. Proteins were sorted in ascending order based on the human–mouse amino acid sequence divergence, and ML-distance analyses were performed on amino acids (JTT+Γ) and third codon positions (GTR+Γ) on sets of 40 proteins in a sliding window that shifted by one protein at a time. The gray region shows the additional time obtained in computer simulations when sequencing errors and within-species polymorphisms contributed three amino acid sequence changes per 1,000 aa, and ML-distance analyses were conducted on the resulting amino acid sequence alignment following the procedure described above. The human–chimpanzee divergence times estimated are minimum times because a 23.8 Ma date for ape–OWM divergence was assumed (see text); however, the shape of the distribution remains unchanged irrespective of the clock-calibration time used. Similar results were obtained when using the Bayesian methods (results not plotted).

Our computer simulations showed that if the cumulative effect of nonneutral polymorphism due to the factors described above produced, on average, three differences per 1,000 aa, then an overestimation pattern similar in magnitude and trend to that observed for the real data would be obtained (Fig. 3, gray area). Furey et al. (34) have already reported sequencing error rates of 0.5 bp per 1,000 nucleotides for human GenBank data, which translates to one error per 1,000 aa (because >75% of base pair errors in three codon positions result in an error at the amino acid sequence level). Although genomic sequences have a lower error rate, within-species comparisons in humans indicate a 1 in 1,000 bp difference at the genome level between individuals (35). This variation contributes nonfixed differences between species and affects the estimate of human–chimpanzee divergence in the same way as the sequencing errors. The sequencing error rate for available chimpanzee sequences is higher than that for humans (4 times versus 10 times coverage; see Ensembl, www.ensembl.org), and chimpanzees are known to have a much higher within-population polymorphism (33), both of which would lead to an overestimation in the present case.
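A small numerical sketch may help show why a fixed number of spurious differences inflates estimates most for slowly evolving proteins. The model below is an assumption made for illustration: it adds a constant pairwise error to the human–chimpanzee distance under a strict clock, and it is not the simulation described in Supporting Text. The evolutionary rates and the assumed true time fraction of 0.20 are hypothetical inputs.

```python
def aa_errors_per_1000(nt_errors_per_1000=0.5, fraction_changing_aa=0.75):
    """Convert a per-nucleotide error rate to a per-amino-acid error rate.

    Each codon has three nucleotides; the text states that >75% of base
    errors change the encoded amino acid, so 0.75 is a lower bound.
    """
    return nt_errors_per_1000 * 3 * fraction_changing_aa


def inflated_time(rate, true_fraction=0.20, ape_owm_time=23.8,
                  extra_diffs_per_site=0.003):
    """Apparent human-chimpanzee time when spurious differences are added.

    rate: substitutions per site per million years (hypothetical)
    extra_diffs_per_site: spurious human-chimpanzee differences per site
        (3 per 1,000 aa, the value used in the simulation described above)
    """
    t_hc = true_fraction * ape_owm_time
    # The pairwise error is split between the human and chimpanzee branches.
    tip_depth = rate * t_hc + extra_diffs_per_site / 2.0
    ancestral = rate * (ape_owm_time - t_hc)   # ancestral branch, unaffected
    return tip_depth / (ancestral + tip_depth) * ape_owm_time


slow = inflated_time(rate=0.0001)   # slowly evolving protein
fast = inflated_time(rate=0.0020)   # rapidly evolving protein
print(round(aa_errors_per_1000(), 3), round(slow, 2), round(fast, 2))
```

In this toy model the same three spurious differences per 1,000 sites make up a much larger share of the true human–chimpanzee distance for the slow protein, so its apparent divergence time is inflated far more than that of the fast protein.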

In addition to the above factors, the fixation of lineage-specific amino acid substitutions due to positive selection (36) may also make the use of amino acid sequences problematic. However, it is unlikely to be the major contributing factor in the present case, because none of the slowest-evolving proteins in our data set were found to be involved in host defense, immunity, or olfaction (36, 37). On the other hand, the most rapidly evolving proteins, which show estimates similar to those obtained from the third codon positions, contain a large proportion of genes known to be candidates for positive selection (which show nonsynonymous to synonymous substitution ratios of >0.5). In any event, because of the observed artifact mentioned above, we do not further consider the estimates based on amino acid sequence analysis.

Bayesian Inference of the Minimum Time for Human–Chimpanzee Divergence. The relative rate analysis shows that human third codon positions are 17% more divergent than those of chimpanzee and that the 4-fold-degenerate non-CpG sites are evolving ≈20% more slowly in hominoids than in OWM. Both of these rate differences are insignificant [P > 0.05 in Tajima's test (23)] and are in agreement with those reported elsewhere (33). The difference between hominoids and OWM is similar in magnitude to that reported in an analysis of large data sets by Yi et al. (38) and Kumar and Subramanian (39). It is one-half of that reported by Steiper et al. (40), whose estimates of relative rates depend on fossil-based divergence times for within-hominoid and within-cercopithecoid species, although an outgroup was not used in their analyses.

In any case, the observation of any rate differences in the hominoid–OWM comparison means that the ancestral human–chimpanzee lineage may not be evolving at the same rate as those leading to humans and chimpanzees, violating the assumptions made in the ML-distance approach. Therefore, we applied the Bayesian analysis implemented in multidivtime (21), which models rate variation among lineages, to generate a better estimate of human–chimpanzee divergence. We used 23.8 Ma for the ape–OWM divergence time as the root-to-tip mean (RTTM = 23.8) in the multidivtime Bayesian analyses to make results directly comparable to the ML-distance method (see Supporting Text for parameters used for Bayesian analysis). This produced an estimate of 4.98 Ma for the third codon positions and of 5.17 Ma for the 4-fold-degenerate non-CpG sites. These estimates are marginally higher than the ML-distance estimates, but the discrepancy becomes smaller when we compare the ratio of the times of the target (human–chimpanzee) and calibration (ape–OWM) divergence events. Even though we provide an explicit time (RTTM) for the ape–OWM split, multidivtime infers time estimates for both human–chimpanzee and ape–OWM divergences. These estimates give consistent ratios of 0.21 (4.98/24.00) and 0.21 (5.17/24.37) for the third codon positions and the 4-fold-degenerate non-CpG sites, respectively. In the ML-distance method, the time ratio is 0.20 for the third codon positions (4.74/23.8) and also 0.20 (4.75/23.8) for the 4-fold-degenerate non-CpG sites. Therefore, the time ratios for the Bayesian estimates are only 5% higher than those obtained using the ML approach. Bayesian methods also produced a narrower 95% C.I. (–12% to +19%) as compared with the C.I. (–18% to +19%) of the ML-distance approach for the third codon position data. Our computer simulations confirmed that the MBR C.I.s contain the true value with a frequency of >95% in the Bayesian approach as well (Fig. 2). Because the C.I.s generated using the MBR approach with the Bayesian analysis are smaller, we use them below.
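The time ratios quoted above can be recomputed from the reported estimates. The short Python snippet below does so; the dictionary labels are descriptive names chosen here, and the numbers are those given in the text.

```python
estimates = {
    # label: (human-chimpanzee time, ape-OWM time), both in Ma
    "Bayesian, third positions": (4.98, 24.00),
    "Bayesian, 4-fold non-CpG": (5.17, 24.37),
    "ML-distance, third positions": (4.74, 23.8),
    "ML-distance, 4-fold non-CpG": (4.75, 23.8),
}

# Ratio of target (human-chimpanzee) to calibration (ape-OWM) divergence time.
ratios = {label: hc / owm for label, (hc, owm) in estimates.items()}
for label, ratio in ratios.items():
    print(f"{label}: {ratio:.2f}")
```

Working with ratios removes the dependence on the absolute calibration date, which is why the Bayesian and ML-distance results look closer in this form than as absolute times.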

Comparison with Previous Studies. Our minimum estimates of 4.74 and 4.98 Ma from ML-distance and Bayesian methods, respectively, based on the minimum 23.8 Ma date for ape–OWM divergence, are smaller than many previous molecular time estimates. There are three primary reasons for this difference. First, many studies have presented (and sometimes preferred) the higher divergence times estimated by using the protein sequences or by using all three codon positions (e.g., refs. 5 and 10). However, as shown above, the inclusion of slowly evolving amino acid sequences (or first and second codon positions) is likely to bias time estimates for the human–chimpanzee divergence. This pattern is also seen in the Bayesian analysis of amino acid sequences, which produces an ≈60% larger time estimate as compared with the third codon positions, with a 95% C.I. that does not overlap with that obtained from the analysis of third codon positions.

Second, different studies have used vastly different divergence times (20–35 Ma) for the ape–OWM split to calibrate molecular clocks. A comparison of the estimated and assumed time ratios resolves this discrepancy. For example, our ratios of 0.20–0.21 are consistent with the 0.21 value reported in analyses conducted by using >150,000 bp of noncoding data by Steiper et al. (40), 97 protein-coding genes by Wildman et al. (9), and complete mitochondrial DNA by Schrago et al. (ref. 41; see also ref. 7). Therefore, in the absence of a firmly established calibration date, it is better to consider all results as ratios of human–chimpanzee to ape–OWM divergence times.

Third, most previous studies were based on the analysis of fewer genes. In those cases, the true C.I. accounting for the variance from gene sampling, multiple-hit correction, and evolutionary rate differences among species is expected to be rather large. This effect is evident from the lower and upper limits of the 95% C.I.s obtained by using smaller random subsets of 167 genes (Fig. 4). When the number of genes is <10 (e.g., ref. 5), the 95% C.I. is nearly as large as the time estimate itself. The C.I. declines to ≈50% if 50 genes are used (e.g., refs. 7 and 10). With the use of 100 or more genes, uncertainties contributed by rate variation among lineages become predominant, and the C.I. width declines only slowly. Thus, little is to be gained by adding data from more genes, but using more species may help narrow the 95% C.I. width in the future.

Fig. 4.

The relationship of the lower and upper limits of the 95% C.I.s (as a percentage of the estimated time) with the number of genes analyzed by using the ML-distance approach (filled circles) and the Bayesian analysis (open circles). Each point is an average based on computed C.I.s of 10 random subsets of genes. The dashed line illustrates the trend of the Bayesian interval. Similar results were obtained by using the data generated by computer simulation based on the evolutionary parameters of the third positions in different genes.

Incorporating Multiple Fossil Bounds in the Bayesian Inference. Bayesian methods allow more fossil constraints to be used in species divergence time estimation, such as a minimum constraint of 4.2 Ma (14) for the human–chimpanzee divergence. After adding this constraint and keeping RTTM = 23.8, we found that the human–chimpanzee divergence time increased by 53%. However, the ratio of the estimated human–chimpanzee and ape–OWM divergence times was 0.21, which is the same ratio obtained without placing the minimum bound on the human–chimpanzee divergence. Setting RTTM (time for the common ancestor of human, chimpanzee, and macaque) to 35 Ma in the Bayesian analyses again produced the same time ratio (0.21), even though the estimate of human–chimpanzee time became even larger.

To determine which estimates are likely to be correct, we conducted analyses with and without the human–chimpanzee constraint of 4.2 Ma for the simulated data under equal, random, and correlated evolutionary rate variation. [In the random rate case, the rate at each branch is randomly selected from a uniform distribution of rates so as to introduce a ±40% random noise in evolutionary rate independently for each gene. In the correlated rate, the assignment of lineage-specific rate uses a stationary lognormal distribution in which the rates vary from branch to branch as a random walk for a given gene, so that rates drift up or down along the branches of any lineage (42, 43). See Supporting Text.] In all cases, time ratios were recovered with high accuracy, but the absolute times estimated were strongly dependent on the calibration constraints used in the Bayesian analyses, as observed in the analysis of the real data.
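The two rate-variation regimes described in the bracketed note can be sketched as follows. This is an illustrative Python approximation, not the simulation code used in the study: the function names, the step size `sigma`, and the base rate are hypothetical, and the correlated model is written as a plain random walk on the log scale without any correction to keep the distribution stationary.

```python
import math
import random


def random_rates(base_rate, n_branches, rng, noise=0.40):
    """Independent branch rates drawn uniformly within +/-40% of the base rate."""
    return [base_rate * rng.uniform(1.0 - noise, 1.0 + noise)
            for _ in range(n_branches)]


def correlated_rates(base_rate, n_steps, rng, sigma=0.2):
    """Rates along one lineage as a random walk on the log scale.

    Each branch starts from its parent's rate and drifts up or down.
    sigma is a hypothetical step size; the study's settings are in
    Supporting Text.
    """
    rates = []
    log_rate = math.log(base_rate)
    for _ in range(n_steps):
        log_rate += rng.gauss(0.0, sigma)
        rates.append(math.exp(log_rate))
    return rates


rng = random.Random(7)
uncorrelated = random_rates(base_rate=1.0, n_branches=5, rng=rng)
drifting = correlated_rates(base_rate=1.0, n_steps=5, rng=rng)
print([round(r, 2) for r in uncorrelated])
print([round(r, 2) for r in drifting])
```

In the first regime a branch's rate carries no information about its neighbors, whereas in the second a descendant branch tends to resemble its parent, which is the assumption built into the Bayesian rate model.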

At this stage, it is important to revisit the distinction between estimating minimum and true times of species divergence (see discussions in ref. 29). Clearly, the exclusive use of minimum calibration times, derived from the fossil record, will produce only minimum time estimates of the target divergence and its C.I., irrespective of the sophistication of the methods used (29). Focusing on minimum time estimates provides at least two advantages. For one, asserting a minimum time constitutes a definite, falsifiable statement, namely that a divergence occurred before some specified date. Alternatively, we could specify upper and lower bounds on calibration times (44) in an effort to estimate “true” divergence time, but doing so would require the incorporation of invariably subjective specifications of calibration time uncertainty. The results of our application of Bayesian methodologies to real and simulated data, discussed above, indicate that the absolute times of human–chimpanzee divergence inferred by using a four-species construct are strongly affected by the upper and lower bounds used, but these methods are successful in correctly estimating ratios of time estimates for different nodes in the phylogeny.

In fact, the estimated C.I.s from Bayesian analyses are unlikely to bracket the “true” divergence time unless a probability distribution can be assigned over the range of times inferred from the fossil record (29). For example, if we have established minimum and maximum values for calibration time but have reason to believe from fossil evidence that the true divergence time is closer to the former than the latter, then the time distribution may be triangular or lognormal (29). In other cases, the best we can do is to associate a standard deviation with the calibration time and, if necessary, assume a normal or uniform distribution. In some instances we may have evidence pointing toward a certain shape of distribution, in which case the MBR method can be used to explicitly specify this distribution. For example, if we assume a linearly declining probability distribution with 23.8 Ma as the minimum (and most probable) estimate of the ape–OWM divergence and 35 Ma as its maximum, we obtain a 95% C.I. of 4.17–7.13 Ma. In this case, the C.I., rather than the actual date, should be reported, because the central value of the estimated time is determined largely by the ad hoc form of the distribution used rather than by any biological knowledge.
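
A linearly declining calibration distribution of this kind can be sampled as sketched below. This is a hedged illustration of the prior on the calibration time only. It does not implement the MBR method and is not expected to reproduce the 4.17–7.13 Ma interval, which also depends on the molecular data. The function names and the sample size are assumptions made for this example.

```python
import random


def sample_calibration_times(minimum, maximum, n, seed=0):
    """Draw calibration times whose density declines linearly from minimum to maximum.

    random.triangular(low, high, mode) with mode == low gives a density that is
    highest at `low` and falls linearly to zero at `high`.
    """
    rng = random.Random(seed)
    return [rng.triangular(minimum, maximum, minimum) for _ in range(n)]


def percentile_interval(values, lower=0.025, upper=0.975):
    """Return the empirical (lower, upper) quantiles of values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[round(lower * last)], ordered[round(upper * last)]


# Ape-OWM calibration: 23.8 Ma minimum (most probable), 35 Ma maximum.
calibrations = sample_calibration_times(23.8, 35.0, n=100_000)
low, high = percentile_interval(calibrations)
```

Because the density is concentrated near the minimum, most sampled calibration times fall in the younger part of the 23.8–35 Ma range, which illustrates how the assumed shape of the distribution pulls the central value of any derived estimate.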

Conclusions

Even when mean or true times of divergence between species are difficult to estimate with molecular clocks, because of uncertainties in fossil calibrations, minimum and relative times of divergence can still be obtained. Here we have shown that the divergence of chimpanzees and humans is such a case, because of the current lack of robust calibration points. Our genome-scale analyses show that the minimum time of that divergence was very close to one-fifth of the ape–OWM divergence, with a 95% C.I. from –12% to +19% of that time. This result can be used to test hypotheses that arise in the literature. For example, if claims of late Miocene (≈6 Ma) fossil hominids (15–17) are correct, then the ape–OWM divergence occurred at least 24–25 Ma, based on the 95% C.I. (Table 1). This date extends the age of that latter divergence only slightly from the accepted 23.8 Ma. Conversely, if fossils of hominoids or OWM were discovered and dated to 27 Ma, this would imply a minimum time of divergence of between 4.9 and 6.6 Ma for chimpanzees and humans. Because large gaps in the fossil record are not implied in either case, both scenarios would be compatible with current understanding of the primate fossil record. In contrast, the recent suggestion that some ape-like fossil teeth from Africa might be from the African ape clade and that the human–chimpanzee divergence time might be even older than 12.5 Ma (20) seems to be extremely unlikely. This is because extrapolation of the upper 95% confidence limit from Table 1, using this date, implies a minimum ape–OWM divergence time of ≈55 Ma, >30 million years earlier than supported by current fossil evidence. Similarly old molecular clock dates (13.5 Ma, based on mitochondrial DNA and using nonprimate calibration points) for the human–chimpanzee divergence (12, 45) present the same problem.
By enabling tests such as these, new fossil discoveries and better radiometric dating of existing fossils provide the best prospects for improved understanding of the timing of events in hominid evolution.
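
The kind of test described in this section amounts to scaling one divergence time by the estimated ratio and its confidence limits. The sketch below shows that arithmetic under stated assumptions: it uses the rounded figures quoted in the text (a ratio of one-fifth with limits of –12% and +19%), so its output only approximates the values in Table 1, and the function names are hypothetical.

```python
def human_chimp_bounds(ape_owm_ma, ratio=0.20, rel_low=-0.12, rel_high=0.19):
    """Scale an ape-OWM time (Ma) to (lower, point, upper) human-chimp minimum times."""
    point = ratio * ape_owm_ma
    return point * (1.0 + rel_low), point, point * (1.0 + rel_high)


def min_ape_owm_implied(human_chimp_ma, ratio=0.20, rel_high=0.19):
    """Youngest ape-OWM time (Ma) compatible with a given human-chimp time.

    Dividing by the upper confidence limit of the ratio gives the most
    conservative (youngest) implied ape-OWM divergence.
    """
    return human_chimp_ma / (ratio * (1.0 + rel_high))


# Forward direction: a 23.8-Ma ape-OWM calibration.
lower, point, upper = human_chimp_bounds(23.8)

# Reverse direction: a hypothetical 6-Ma hominid fossil.
implied_ape_owm = min_ape_owm_implied(6.0)
```

Running the reverse calculation with a much older human–chimpanzee date shows why such claims are hard to reconcile with the fossil record: the implied ape–OWM divergence grows in direct proportion to the assumed date.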

Table 1. Confidence limits on the human–chimpanzee divergence time for different calibration times for ape–OWM divergence

Acknowledgments

We thank Graziela Valente for assistance in data preparation and Drs. Anne Stone, Brian Verrelli, and Sankar Subramanian for comments on an earlier version of the manuscript. Thanks are also due to Drs. Jeffrey Thorne, Alan Cooper, and Claudia Russo for critically important comments. Financial support was provided by the National Institutes of Health (to S.K.) and the National Aeronautics and Space Administration Astrobiology Institute (to S.B.H.).

Footnotes

  • ‡ To whom correspondence may be addressed at: The Biodesign Building A-240, Arizona State University, Tempe, AZ 85287-5301. E-mail: s.kumar{at}asu.edu.

  • ** To whom correspondence may be addressed. E-mail: axw8{at}psu.edu.

  • § S.K. and A.F. contributed equally to this work.

  • Author contributions: S.K., A.W., and S.B.H. designed research; A.F. performed research; S.K. contributed new analytic tools; V.S. analyzed data; and A.W., S.B.H., and S.K. wrote the paper.

  • Conflict of interest statement: No conflicts declared.

  • Abbreviations: ML, maximum likelihood; C.I., confidence interval; MBR, multifactor bootstrap resampling; Ma, millions of years ago; RTTM, root-to-tip mean; OWM, Old World monkey; GTR, general time reversible; JTT, Jones–Taylor–Thornton.

  • Freely available online through the PNAS open access option.

  • Copyright © 2005, The National Academy of Sciences

References

  1. McBrearty, S. & Jablonski, N. G. (2005) Nature 437, 105–108. pmid:16136135
  2. Simons, E. L. & Pilbeam, D. R. (1965) Folia Primat. 3, 81–152.
  3. Sarich, V. M. & Wilson, A. C. (1967) Science 158, 1200–1203. pmid:4964406
  4. Easteal, S. & Herbert, G. (1997) J. Mol. Evol. 44, Suppl. 1, S121–S132. pmid:9071020
  5. Kumar, S. & Hedges, S. B. (1998) Nature 392, 917–920. pmid:9582070
  6. Chen, F. C., Vallender, E. J., Wang, H., Tzeng, C. S. & Li, W. H. (2001) J. Hered. 92, 481–489. pmid:11948215
  7. Stauffer, R. L., Walker, A., Ryder, O. A., Lyons-Weiler, M. & Hedges, S. B. (2001) J. Hered. 92, 469–474. pmid:11948213
  8. Hasegawa, M., Thorne, J. L. & Kishino, H. (2003) Genes Genet. Syst. 78, 267–283. pmid:14532706
  9. Wildman, D. E., Uddin, M., Liu, G., Grossman, L. I. & Goodman, M. (2003) Proc. Natl. Acad. Sci. USA 100, 7181–7188. pmid:12766228
  10. Glazko, G. V. & Nei, M. (2003) Mol. Biol. Evol. 20, 424–434. pmid:12644563
  11. Uddin, M., Wildman, D. E., Liu, G., Xu, W., Johnson, R. M., Hof, P. R., Kapatos, G., Grossman, L. I. & Goodman, M. (2004) Proc. Natl. Acad. Sci. USA 101, 2957–2962. pmid:14976249
  12. Arnason, U., Gullberg, A., Burguete, A. S. & Janke, A. (2000) Hereditas 133, 217–228. pmid:11433966
  13. Kumar, S. (2005) Nat. Rev. Genet. 6, 654–662. pmid:16136655
  14. Leakey, M. G., Feibel, C. S., McDougall, I., Ward, C. & A. W. (1998) Nature 393, 62–66. pmid:9590689
  15. Brunet, M., Guy, F., Pilbeam, D., Mackaye, H. T., Likius, A., Ahounta, D., Beauvilain, A., Blondel, C., Bocherens, H., Boisserie, J. R., et al. (2002) Nature 418, 145–151. pmid:12110880
  16. Haile-Selassie, Y. (2001) Nature 412, 178–181. pmid:11449272
  17. Senut, B., Pickford, M., Gommery, D., Mein, P., Cheboi, K. & Coppens, Y. (2001) C. R. Acad. Sci. 332, 137–144.
  18. Haile-Selassie, Y., Asfaw, B. & White, T. D. (2004) Am. J. Phys. Anthropol. 123, 1–10. pmid:14669231
  19. Haile-Selassie, Y., Suwa, G. & White, T. D. (2004) Science 303, 1503–1505. pmid:15001775
  20. Pickford, M. & Senut, B. (2005) Anthropol. Sci. 113, 95–102.
  21. Thorne, J. L. & Kishino, H. (2002) Syst. Biol. 51, 689–702. pmid:12396584
  22. Hedges, S. B. & Kumar, S. (2003) Trends Genet. 19, 200–206. pmid:12683973
  23. Tajima, F. (1993) Genetics 135, 599–607. pmid:8244016
  24. Bird, A. P. (1980) Nucleic Acids Res. 8, 1499–1504. pmid:6253938
  25. Subramanian, S. & Kumar, S. (2003) Genome Res. 13, 838–844. pmid:12727904
  26. Yang, Z. (1997) Comput. Appl. Biosci. 13, 555–556. pmid:9367129
  27. Nei, M. & Kumar, S. (2000) Molecular Evolution and Phylogenetics (Oxford Univ. Press, New York).
  28. Remane, J., Cita, M. B., Dercourt, J., Bouysse, P., Repetto, F. & Faure-Muret, A. (2002) International Stratigraphic Chart (International Union of Geological Sciences, Paris).
  29. Hedges, S. B. & Kumar, S. (2004) Trends Genet. 20, 242–247. pmid:15109778
  30. Nei, M. (1987) Molecular Evolutionary Genetics (Columbia Univ. Press, New York).
  31. Ho, S. Y., Phillips, M. J., Cooper, A. & Drummond, A. J. (2005) Mol. Biol. Evol. 22, 1561–1568. pmid:15814826
  32. Penny, D. (2005) Nature 436, 183–184. pmid:16015312
  33. Mikkelsen, T. S., Hillier, L. W., Eichler, E. E., Zody, M. C., Jaffe, D. B., Yang, S., Enard, W., Hellmann, I. & Lindblad-Toh, K. (2005) Nature 437, 69–87. pmid:16136131
  34. Furey, T. S., Diekhans, M., Lu, Y., Graves, T. A., Oddy, L., Randall-Maher, J., Hillier, L. W., Wilson, R. K. & Haussler, D. (2004) Genome Res. 14, 2034–2040. pmid:15489323
  35. Brookes, A. J. (1999) Gene 234, 177–186. pmid:10395891
  36. Nielsen, R., Bustamante, C., Clark, A. G., Glanowski, S., Sackton, T. B., Hubisz, M. J., Fledel-Alon, A., Tanenbaum, D. M., Civello, D., White, T. J., et al. (2005) PLoS Biol. 3, e170. pmid:15869325
  37. Clark, A. G., Glanowski, S., Nielsen, R., Thomas, P., Kejariwal, A., Todd, M. J., Tanenbaum, D. M., Civello, D., Lu, F., Murphy, B., et al. (2003) Cold Spring Harbor Symp. Quant. Biol. 68, 471–477. pmid:15338650
  38. Yi, S., Ellsworth, D. L. & Li, W. H. (2002) Mol. Biol. Evol. 19, 2191–2198. pmid:12446810
  39. Kumar, S. & Subramanian, S. (2002) Proc. Natl. Acad. Sci. USA 99, 803–808. pmid:11792858
  40. Steiper, M. E., Young, N. M. & Sukarna, T. Y. (2004) Proc. Natl. Acad. Sci. USA 101, 17021–17026. pmid:15572456
  41. Schrago, C. G. & Russo, C. A. (2003) Mol. Biol. Evol. 20, 1620–1625. pmid:12832653
  42. Kishino, H., Thorne, J. L. & Bruno, W. J. (2001) Mol. Biol. Evol. 18, 352–361. pmid:11230536
  43. Aris-Brosou, S. & Yang, Z. (2002) Syst. Biol. 51, 703–714. pmid:12396585
  44. Thorne, J. L. & Kishino, H. (2005) in Statistical Methods in Molecular Evolution, ed. Nielsen, R. (Springer, New York), pp. 233–256.
  45. Arnason, U., Gullberg, A., Janke, A. & Xu, X. (1996) J. Mol. Evol. 43, 650–661. pmid:8995062
  46. Springer, M. S., Murphy, W. J., Eizirik, E. & O'Brien, S. J. (2003) Proc. Natl. Acad. Sci. USA 100, 1056–1061. pmid:12552136
  47. Ho, S. Y., Phillips, M. J., Drummond, A. J. & Cooper, A. (2005) Mol. Biol. Evol. 22, 1355–1363. pmid:15758207
Placing confidence limits on the molecular age of the human–chimpanzee divergence
Sudhir Kumar, Alan Filipski, Vinod Swarna, Alan Walker, S. Blair Hedges
Proceedings of the National Academy of Sciences Dec 2005, 102 (52) 18842-18847; DOI: 10.1073/pnas.0509585102


Copyright © 2021 National Academy of Sciences. Online ISSN 1091-6490