Research Article

Recent acceleration of human adaptive evolution

John Hawks, Eric T. Wang, Gregory M. Cochran, Henry C. Harpending, and Robert K. Moyzis
  1. *Department of Anthropology, University of Wisconsin, Madison, WI 53706;
  2. ‡Department of Algorithm Development and Data Analysis, Affymetrix, Inc., Santa Clara, CA 95051;
  3. §Department of Anthropology, University of Utah, Salt Lake City, UT 84112; and
  4. ¶Department of Biological Chemistry and Institute of Genomics and Bioinformatics, University of California, Irvine, CA 92697


PNAS December 26, 2007 104 (52) 20753-20758; https://doi.org/10.1073/pnas.0707650104
  • For correspondence: jhawk@wisc.edu harpend@xmission.com rmoyzis@uci.edu
  1. Contributed by Henry C. Harpending, August 13, 2007 (received for review May 24, 2007)


Abstract

Genomic surveys in humans identify a large amount of recent positive selection. Using the 3.9-million HapMap SNP dataset, we found that selection has accelerated greatly during the last 40,000 years. We tested the null hypothesis that the observed age distribution of recent positively selected linkage blocks is consistent with a constant rate of adaptive substitution during human evolution. We show that a constant rate high enough to explain the number of recently selected variants would predict (i) site heterozygosity at least 10-fold lower than is observed in humans, (ii) a strong relationship of heterozygosity and local recombination rate, which is not observed in humans, (iii) an implausibly high number of adaptive substitutions between humans and chimpanzees, and (iv) nearly 100 times the observed number of high-frequency linkage disequilibrium blocks. Larger populations generate more new selected mutations, and we show the consistency of the observed data with the historical pattern of human population growth. We consider human demographic growth to be linked with past changes in human cultures and ecologies. Both processes have contributed to the extraordinarily rapid recent genetic evolution of our species.

  • HapMap
  • linkage disequilibrium
  • Neolithic
  • positive selection

Human populations have increased vastly in numbers during the past 50,000 years or more (1). In theory, more people means more new adaptive mutations (2). Hence, human population growth should have increased the rate of adaptive substitutions: an acceleration of new positively selected alleles.

Can this idea really describe recent human evolution? There are several possible problems. Only a small fraction of all mutations are advantageous; most are neutral or deleterious. Moreover, as a population becomes more and more adapted to its current environment, new mutations should be less and less likely to increase fitness. Because species with large population sizes reach an adaptive peak, their rate of adaptive evolution over geologic time should not greatly exceed that of rare species (3).

But humans are in an exceptional demographic and ecological transient. Rapid population growth has been coupled with vast changes in cultures and ecology during the Late Pleistocene and Holocene, creating new opportunities for adaptation. The past 10,000 years have seen rapid skeletal and dental evolution in human populations and the appearance of many new genetic responses to diets and disease (4).

In such a transient, large population size increases the rate and effectiveness of adaptive responses. For example, natural insect populations often produce effective monogenic resistance to pesticides, whereas small laboratory populations under similar selection develop less effective polygenic adaptations (5). Chemostat experiments on Escherichia coli show a continued response to selection (6), with continuous and repeatable responses in large populations but variable and episodic responses in small populations (7). These results are explained by a model in which smaller population size limits the rate of adaptive evolution (8). A population that suddenly increases in size has the potential for rapid adaptive change. The best analogy to recent human evolution may be the rapid evolution of domesticates such as maize (9, 10).

Human genetic variation appears consistent with a recent acceleration of positive selection. A new advantageous mutation that escapes genetic drift will rapidly increase in frequency, more quickly than recombination can shuffle it with other genetic variants (11). As a result, selection generates long-range blocks of linkage disequilibrium (LD) across tens or hundreds of kilobases, depending on the age of the selected variant and the local recombination rate. The expected decay of LD with distance surrounding a recently selected allele provides a powerful means of discriminating selection from other demographic causes of extended LD, such as bottlenecks and admixture (9, 12).

The important reason for this increase in discrimination is the vastly different genomic scale that LD-based approaches use compared with previous methods (scales of millions of bases rather than thousands of bases). LD methods use polymorphism distance and order information and frequency to search for selection, unlike all previous methods (9, 12). Previous methods, therefore, have difficulty defining selection unambiguously from other population architectures on the kb scale usually examined. On the megabase (Mb) scale examined by LD approaches, however, extensive modeling and simulations indicate that other demographic causes of extensive LD can be discriminated easily from those caused by adaptive selection (9). Further, current LD approaches restrict comparisons to a set of frequencies and inferred allele ages for which neutral explanations are essentially implausible.

Previously, we applied the LD decay (LDD) test to SNP data from Perlegen and the HapMap (13), finding evidence for recent selection on ≈1,800 human genes. We refer to these as ascertained selected variants (ASVs). The probabilistic LDD test searches for the expected decay of adjacent SNPs surrounding a recently selected allele. Importantly, the method is insensitive to local recombination rate, because local rate influences the extent of LD surrounding both alleles, while the method looks for LD differences between alleles. Further, the method relies only on high heterozygosity SNPs for analysis, exactly the type of data obtained for the HapMap project.

The number of ASVs detected encompasses some 7% of human genes and is consistent with the proportion found in another survey using a related approach (12). Because LD decays quickly over time, most ASVs are quite recent (14), compared with other approaches that detect selection over longer evolutionary time scales (15, 16). Many human genes are now known to have strongly selected alleles in recent historical times, such as lactase (17, 18), CCR5 (19, 20), and FY (21). These surveys show that such genes are very common. This observation is surprising: in theory, such strongly selected variants should be rare (2, 3). The observed distribution seems to reflect an exceptionally rapid rate of adaptive evolution.

But the hypothesis that genomic data show a high recent rate of selection must overcome two principal objections: (i) The LDD test might miss older selection and (ii) a high constant rate of adaptive substitution might also explain the large number of ASVs. The first objection is addressed by recalculating the LDD test on a 3-fold larger dataset, because higher SNP density is needed to detect older selected alleles with comparable sensitivity. We test the second objection by considering a constant rate as the null hypothesis then deriving and testing genomic consequences.

Results

Finding Old Alleles.

The original Perlegen and HapMap datasets were relatively small (1.6 million and 1.0 million SNPs, respectively). The low SNP density limited the power of LD methods to detect older selection events, particularly in high-recombination areas of the genome (9). Likewise, a related study of selection (12) was biased toward newer alleles by requiring multiple adjacent SNPs to exhibit extended LD. Older selected alleles, where LDD is more rapid, would be rejected with this approach. Neither of those previous studies (9, 12) attempted to quantitate the numbers of selected events over an extended time frame, but were merely initial searches for recent extended LD at individual alleles, the most sensitive method to detect recent adaptive change. Both found abundant evidence for recent selection.

Therefore, we have now recomputed the LDD test on the newly released 3.9-million HapMap genotype dataset (13). By varying the LDD test search parameters, we can now statistically detect alleles with more rapid LDD (and hence older inferred ages) (9). For all parameters used, the detection threshold was set at an average log likelihood (ALnLH) > 2.6 SD (≥99.5th percentile) from the genome average. Again, this LDD threshold is a stringent cutoff for the detection of genomic outliers, because the high number of selective events are included in the genome average (9). The probabilistic LDD test does not require the calculation of inferred haplotypes (9), so it is not a daunting computational task to calculate ALnLH values for the HapMap 3.9 million SNPs genotyped in 270 individuals: 90 European ancestry (CEU), 90 African (Yoruba) ancestry (YRI), 45 Han Chinese (CHB), and 45 Japanese (JPT).
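The correspondence between a 2.6 SD cutoff and the 99.5th percentile can be checked with a short sketch. It assumes, purely for illustration, that the scores are approximately normally distributed around the genome average; the function names `normal_cdf` and `flag_outliers` are hypothetical and are not part of the LDD software.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def flag_outliers(scores, n_sd=2.6):
    """Return indices of scores more than n_sd SDs above the mean.

    Illustrative only: the mean and SD are taken over all scores,
    mirroring the idea that the genome average includes selected sites.
    """
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    if sd == 0:
        return []
    return [i for i, x in enumerate(scores) if (x - mean) / sd > n_sd]


# Percentile corresponding to a one-sided 2.6 SD cutoff (normality assumed).
upper_tail_percentile = 100.0 * normal_cdf(2.6)
```

Under the normality assumption, a one-sided cutoff of 2.6 SD leaves slightly less than 0.5% of values above the threshold, which agrees with the stated percentile.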

This analysis uncovered only 12 new SNPs (in six clusters) not originally detected in the CEU population (9) and 466 new SNPs representing 206 independent clusters in the YRI population. A total of 2,803 (CEU), 2,367 (CHB), 2,783 (JPT), and 3,486 (YRI) selection events were found. As noted (9), many inferred selected sites have faster LDD in YRI samples (with older coalescence times), resulting in lower background LD and more previously unobserved variants. The denser HapMap dataset provided better resolution of LDD (i.e., rapid decay can be reliably detected from background LD only with high density). The 3.9-million HapMap dataset discovered more ASVs, but only an incremental increase in the CEU and a (≈7%) increase in YRI values. This finding indicates that most events (defined by the LDD test) coalescing to ages up to 80,000 years ago have been detected, and any ascertainment bias against older selection is very slight within the given frequency range.

Ancient selected alleles are also more likely to be near or at fixation than recent alleles. Just as we excluded rare alleles, we also excluded high-frequency alleles (i.e., >78%) in our age distribution. But the number of such high-frequency alleles provides another test of the hypothesis that the LDD test has missed older events. We modified the LDD test to find these high-frequency “near-fixed” alleles and found only 50 candidates. Other studies have likewise found few near-fixed alleles (22, 23). These studies also show that very few ASVs are shared between HapMap samples; most are population-specific (9, 12). In our data, only 509 clusters are shared between CEU and YRI samples; many of these are likely to have been under balancing selection [supporting information (SI) Appendix]. The small number of near-fixed events and the small number of shared events are strong evidence that the LDD test has not missed a large number of ancient selected alleles.

Allele Ages.

We used a modification of described methods (24–26) to estimate an allele age (coalescence time) for each selected cluster. We focused on the HapMap populations with the largest sample sizes, which were the YRI and CEU samples. Similar results were obtained for the CHB and JPT populations (data not shown).

Fig. 1 presents histograms of these age estimates. The YRI sample shows a modal (peak) age of ≈8,000 years ago, assuming 25-year generations; the CEU sample shows a peak age of ≈5,250 years ago, both values consistent with earlier work (9, 12). The difference in peak age likely explains why weaker tests have found stronger evidence of selection in European ancestry samples (27, 28), unlike the current study.

Fig. 1.

Age distribution of ascertained selected alleles. Each point represents the number of variants dated to a single 10-generation bin. Fitted curves are the number of ascertained variants predicted by Eq. 2 under a constant population size and constant s̄ = 0.022 for YRI and s̄ = 0.034 for CEU. The distribution drops to zero approaching the present, because all alleles have frequencies >22% today. The 2,965 (YRI) and 2,246 (CEU) selection ages shown have had 509 alleles removed that are likely examples of ongoing balanced selection (SI Appendix). Including these alleles in the analysis does not change the overall conclusion of acceleration of selection.

Rate Estimation.

Using the diffusion model of positive selection (29), we estimated the adaptive substitution rate consistent with the observed age distribution of ASVs. For the YRI data, this estimate is 0.53 substitutions per year. For the CEU data, this estimate is 0.59 substitutions per year. The average fitness advantage of new variants (assuming dominant effects) is estimated as 0.022 for the YRI distribution and 0.034 for the CEU distribution. Curves obtained by using these estimated values fit the observed data well (Fig. 1). The higher estimated rate for Europeans emerges from the more recent modal age of variants. For further analyses, we used the lower rate estimated from the YRI sample as a conservative value.

Predictions of Constant Rate.

We can derive four predictions from the rate of adaptive substitution, each of which refutes the null hypothesis of constant rate:

  1. The null hypothesis predicts that the average nucleotide diversity across the genome should be vastly lower than observed. Recurrent selected substitutions greatly reduce the diversity of linked neutral alleles by hitchhiking or pseudohitchhiking (30, 31). Using an approximation for site heterozygosity under pseudohitchhiking (30, 32) we estimated the expected site heterozygosity under the null hypothesis as 3.5 × 10⁻⁵ (SI Appendix). This value is less than one-tenth the observed site heterozygosity, which is between 4.0 × 10⁻⁴ and 6.0 × 10⁻⁴ in human populations (13, 33, 34).

  2. Hitchhiking is more important in regions of low recombination, so the null hypothesis predicts a strong relationship between nucleotide diversity and local recombination rate. The null hypothesis predicts a 10-fold increase in diversity across the range of local recombination rates represented by human gene regions. Empirically, diversity is slightly correlated with local recombination rate, but the relationship is weak and may be partly explained by mutation rate (13, 35).

  3. The annual rate of 0.53 adaptive substitutions consistent with the YRI data predicts an implausible 6.4 million adaptive substitutions between humans and chimpanzees. In contrast, there are only ≈40,000 amino acid substitutions separating these species, and only ≈18 million total substitutions (36). This amount of selection, amounting to >1/3 of all substitutions, or 100 times the observed number of amino acid substitutions, is implausible.

  4. The null hypothesis predicts that many selected alleles should be found between 78% and 100% frequency. Positively selected alleles follow a logistic growth curve, which proceeds very rapidly through intermediate frequencies. Because selected alleles spend relatively little time in the ascertainment range, the ascertained blocks should be the “tip of the iceberg” of a larger number of recently selected blocks at or near fixation. For example, the ASVs in the YRI dataset have a modal age of ≈8,000 years ago. Based on the diffusion model for selection on an additive gene, ascertained variants should account for only 18% of the total number of selected variants still segregating. In contrast, 41% of segregating variants should be at frequencies >78%. Dominant alleles (which have a higher fixation probability) progress even more slowly at frequencies >78%, so that additivity is the more conservative assumption. Empirically, few such near-fixed variants with high LD scores have been found in the human genome (13). Modifying the LDD algorithm to specifically search for high-frequency “fixed” alleles found only 50 potential sites, in contrast to the >5,000 predicted by the constant rate model. Although it is possible that the rapid LDD expected for older selected alleles near fixation may not be detected as efficiently by the LDD test, two other surveys have also found small numbers of such events (22, 23). This difference of two orders of magnitude is a strong refutation of the null hypothesis.
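The magnitudes quoted in predictions 1, 3, and 4 can be reproduced with simple arithmetic. The sketch below is a sanity check, not the authors' calculation: the divergence time of 6 million years and the doubling for two lineages are assumptions introduced here to recover the quoted 6.4 million figure, and all variable names are illustrative.

```python
# Values quoted in the text.
RATE_PER_YEAR = 0.53          # adaptive substitutions per year (YRI estimate)
TOTAL_SUBSTITUTIONS = 18e6    # approximate human-chimpanzee substitutions
AA_SUBSTITUTIONS = 40_000     # approximate amino acid substitutions
H_NULL = 3.5e-5               # site heterozygosity expected under the null
H_OBSERVED_LOW = 4.0e-4       # lower bound of observed heterozygosity
NEAR_FIXED_OBSERVED = 50      # near-fixed candidates found
NEAR_FIXED_PREDICTED = 5_000  # lower bound predicted by the constant-rate model

# ASSUMPTIONS (not stated in the text): divergence ~6 million years ago,
# with substitutions accumulating independently on both lineages.
DIVERGENCE_YEARS = 6.0e6
LINEAGES = 2

predicted_adaptive = RATE_PER_YEAR * DIVERGENCE_YEARS * LINEAGES
fraction_of_total = predicted_adaptive / TOTAL_SUBSTITUTIONS
ratio_to_aa = predicted_adaptive / AA_SUBSTITUTIONS
heterozygosity_fold = H_OBSERVED_LOW / H_NULL
near_fixed_fold = NEAR_FIXED_PREDICTED / NEAR_FIXED_OBSERVED

print(f"predicted adaptive substitutions: {predicted_adaptive / 1e6:.2f} million")
print(f"fraction of all substitutions:    {fraction_of_total:.2f}")
print(f"ratio to amino acid changes:      {ratio_to_aa:.0f}x")
print(f"heterozygosity shortfall:         {heterozygosity_fold:.1f}-fold")
print(f"near-fixed allele shortfall:      {near_fixed_fold:.0f}-fold")
```

Under these assumptions the predicted count exceeds one-third of all substitutions and is more than 100 times the number of amino acid changes, in line with the figures given in the list above.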

Population Growth.

The rate of adaptive evolution in human populations has indeed accelerated within the past 80,000 years. The results above demonstrate the extent of acceleration: the recent rate must be one to two orders of magnitude higher than the long-term rate to explain the genomewide pattern.

Population growth itself predicts an acceleration effect, because the number of new mutations increases as a linear product of the number of individuals (2), and exponential growth increases the fixation probability of new adaptive mutations (37). We considered the hypothesis that the magnitude of human population growth might explain a large fraction of the recent acceleration of new adaptive alleles. To test this hypothesis, we constructed a model of historic and prehistoric population growth, based on historical and archaeological estimates of population size (1, 38, 39).

Population growth in the Upper Paleolithic and Late Middle Stone Age began by 50,000 years ago. Several archaeological indicators show long-term increases in population density, including more small-game exploitation, greater pressure on easily collected prey species like tortoises and shellfish, more intense hunting of dangerous prey species, and occupation of previously uninhabited islands and circumarctic regions (40). Demographic growth intensified during the Holocene, as domestication centers in the Near East, Egypt, and China underwent expansions commencing by 10,000 to 8,000 years ago (41, 42). From these centers, population growth spread into Europe, North Africa, South Asia, Southeast Asia, and Australasia during the succeeding 6,000 years (42, 43). Sub-Saharan Africa bears special consideration, because of its initial large population size and influence on earlier human dispersals (44). Despite the possible early appearance of annual cereal collection and cattle husbandry in North Africa, sub-Saharan Africa has no archaeological evidence for agriculture before 4,000 years ago (42). West Asian agricultural plants like wheat did poorly in tropical sun and rainfall regimes, while animals faced a series of diseases that posed barriers to entry (45). As a consequence, some 2,500 years ago the population of sub-Saharan Africa was likely <7 million people, compared with European, West Asian, East Asian, and South Asian populations approaching or in excess of 30 million each (1). At that time, the sub-Saharan population grew at a high rate, with the dispersal of Bantu populations from West Africa and the spread of pastoralism and agriculture southward through East Africa (46, 47). 
Our model based on archaeological and historical evidence includes large long-term African population size, gradual Late Pleistocene population growth, an early Neolithic transition in West Asia and Europe, and a later rise in the rate of growth in sub-Saharan Africa coincident with agricultural dispersal (Fig. 2).

Fig. 2.

Historic and prehistoric population size estimates for human populations (SI Appendix). Key features are the larger ancestral African population size and the earlier Neolithic growth in core agricultural areas.

As shown in Fig. 3, the demographic model predicts the recent peak ages of the African and European distributions of selected variants, at a much lower average selection intensity than the constant population size model. In particular, the demographic model readily explains the difference in age distributions between YRI and CEU samples: the YRI sample has more variants dating to earlier times when African populations were large compared with West Asia and Europe, whereas earlier Neolithic growth in West Asia and Europe led to a pulse of recent variants in those regions. The data that falsify the constant rate model, such as the observed genomewide heterozygosity value and the probable number of human–chimpanzee adaptive substitutions, are fully consistent with the demographic model.

Fig. 3.

Tip of the iceberg. Both the demographic and constant-rate models can account for the age distribution of ascertained variants (CEU data shown), but they differ greatly in the expected number of variants above the ascertainment frequency (fixed or near-fixed). The demographic model predicts a low long-term substitution rate and few alleles >78%, consistent with the observed data.

Discussion

Our simple demographic model explains much of the recent pattern, but some aspects remain unexplained. Although the small number of high-frequency variants (between 78% and 100%) is much more consistent with the demographic model than a constant rate of change, it is still relatively low, even considering the rapid acceleration predicted by demography. Demographic change may be the major driver of new adaptive evolution, but the detailed pattern must involve gene functions and gene–environment interactions.

Cultural and ecological changes in human populations may explain many details of the pattern. Human migrations into Eurasia created new selective pressures on features such as skin pigmentation, adaptation to cold, and diet (25, 26, 28). Over this time span, humans both inside and outside of Africa underwent rapid skeletal evolution (48, 49). Some of the most radical new selective pressures have been associated with the transition to agriculture (4). For example, genes related to disease resistance are among the inferred functional classes most likely to show evidence of recent positive selection (9). Virulent epidemic diseases, including smallpox, malaria, yellow fever, typhus, and cholera, became important causes of mortality after the origin and spread of agriculture (50). Likewise, subsistence and dietary changes have led to selection on genes such as lactase (18).

It is sometimes claimed that the pace of human evolution should have slowed as cultural adaptation supplanted genetic adaptation. The high empirical number of recent adaptive variants would seem sufficient to refute this claim (9, 12). It is important to note that the peak ages of new selected variants in our data do not reflect the highest intensity of selection, but merely our ability to detect selection. Because of the recent acceleration, many more new adaptive mutations should exist than have yet been ascertained, occurring at a faster and faster rate during historic times. Adaptive alleles with frequencies <22% should then greatly outnumber those at higher frequencies. To the extent that new adaptive alleles continued to reflect demographic growth, the Neolithic and later periods would have experienced a rate of adaptive evolution >100 times higher than that which characterized most of human evolution. Cultural changes have reduced mortality rates, but variance in reproduction has continued to fuel genetic change (51). In our view, the rapid cultural evolution during the Late Pleistocene created vastly more opportunities for further genetic change, not fewer, as new avenues emerged for communication, social interactions, and creativity.

Materials and Methods

The 3.9-million HapMap release was obtained from the International HapMap Project website (www.hapmap.org). The LDD test (9) was applied to all four HapMap population datasets. Briefly, by examining individuals homozygous for a given SNP, the fraction of inferred recombinant chromosomes (FRC) at adjacent polymorphisms can be directly computed without the need to infer haplotype, a computationally daunting task on such large datasets. The test uses the expected increase with distance in FRC surrounding a selected allele to identify such alleles. Importantly, the method is insensitive to local recombination rate, because local rate will influence the extent of LD surrounding all alleles, while the method looks for LD differences between alleles. By using a large sliding window (ranging from 0.25 to 1.0 Mb in the current study), and by explicitly acknowledging the expected LD structure of selected alleles, the LDD test can distinguish selection from other population genetic/demographic mechanisms that result in large LD blocks (9).
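The idea of computing FRC directly from homozygotes can be illustrated with a minimal sketch. It is an interpretation rather than the published LDD implementation: it assumes that, among individuals homozygous at the target SNP, the more common allele at a neighboring SNP marks the nonrecombinant background and the less common allele marks inferred recombinants. The function name `fraction_recombinant` and the genotype encoding (a pair of allele characters per individual) are hypothetical.

```python
from collections import Counter


def fraction_recombinant(target_genotypes, neighbor_genotypes, allele):
    """Estimate FRC at a neighboring SNP without phasing.

    target_genotypes, neighbor_genotypes: lists of (allele, allele) tuples,
    one entry per individual, in the same individual order.
    allele: the target-SNP allele whose homozygotes are examined.

    Returns None when no individual is homozygous for `allele`.
    """
    counts = Counter()
    for (t1, t2), (n1, n2) in zip(target_genotypes, neighbor_genotypes):
        # In a target homozygote, both chromosomes carry `allele`, so both
        # neighbor alleles sit on that background and no phasing is needed.
        if t1 == allele and t2 == allele:
            counts[n1] += 1
            counts[n2] += 1
    total = sum(counts.values())
    if total == 0:
        return None
    majority = counts.most_common(1)[0][1]
    # ASSUMPTION: chromosomes not carrying the majority neighbor allele
    # are treated as inferred recombinants.
    return (total - majority) / total


target = [("A", "A"), ("A", "A"), ("A", "G"), ("G", "G"), ("A", "A")]
neighbor = [("C", "C"), ("C", "T"), ("T", "T"), ("T", "T"), ("C", "C")]
frc_a = fraction_recombinant(target, neighbor, "A")  # 1 of 6 chromosomes
```

Because only homozygotes contribute, heterozygous individuals are skipped, which is what removes the need for haplotype inference in this sketch.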

A modification of the LDD test was conducted on the CEU and YRI datasets, to find selected alleles near fixation. Unlike the normal LDD test, all SNPs >78% frequency (the cutoff used for primary analysis of these data) were queried, using the same sliding windows as the normal test. Unlike the standard test, however, the requirement that the alternative allele be no more than 1 SD from the genome average was not implemented (9). Ninety-three clusters were identified in the CEU population and 85 were identified in the YRI population (with 65 overlaps), a total of 113 fixed events. Unlike normal LDD screens (9), half of these observed fixed events determined by long-range LD were in extreme centromeric or telomeric regions, which have no recombination or high recombination, respectively (13, 52). The interpretation of extended LD in these regions is ambiguous, therefore, because low recombination maintains large LD blocks (centromeres), and well-documented high telomere–telomere exchange homogenizes these regions (52). Removing these centromeric and telomeric regions in which LD is likely to be the result of mechanisms different from selection yields ≈50 regions of potential fixation.

Clustering.

The LDD test produces “clusters” of SNPs with the signature of selection, because of the extensive LD surrounding these alleles (9). Each cluster is likely to represent a single selection event, and hence we have attempted to minimize potential overcounting by cluster analysis. Using a simple nearest-neighbor technique, we assign a 10-kb radius to each selected SNP. Each pass through the data produces a new set of centroids, and cluster membership is reassigned to the nearest centroid. A SNP that lies >20 kb away from the nearest centroid is considered a new cluster, with it being the sole member. Using larger window sizes (up to 100 kb) reduces the number of independent clusters (by approximately half), however, at the cost of “fusing” likely independent events (data not shown). We believe the 10-kb window, therefore, is a conservative first-pass clustering of the observed selection events.

Each selected SNP identified by the LDD test was sorted and mapped to its physical location on human chromosomes (University of California Santa Cruz Human Genome 17). We iterate through the SNP list, starting with the most distal, and a SNP and its closest neighbor (within 10 kb radius) are clustered together with a new centroid (average) i computed. To be included as part of the ith cluster, the next SNP on the sorted SNP list must fall within 20 kb of the ith cluster. If it is within 20 kb of both an upstream and downstream cluster, to be integrated in the ith cluster it must have a distance to the ith centroid closer than the next closest centroid (i + 1). Otherwise, a new centroid and cluster is initiated. This task is repeated for all SNPs identified by the LDD test.
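The clustering procedure can be sketched in simplified form. The version below is a single left-to-right pass over sorted positions that adds a SNP to the current cluster when it lies within 20 kb of that cluster's centroid and otherwise starts a new cluster. It deliberately omits the initial 10-kb pairing step and the repeated reassignment passes described above, so it is an approximation under those simplifications; the function name `cluster_snps` is hypothetical.

```python
def cluster_snps(positions, join_radius=20_000):
    """Group SNP positions (in bp, on one chromosome) into clusters.

    A SNP joins the most recent cluster if it lies within `join_radius`
    of that cluster's centroid (mean position); otherwise it seeds a new
    cluster with itself as the sole member.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters:
            members = clusters[-1]
            centroid = sum(members) / len(members)
            if pos - centroid <= join_radius:
                members.append(pos)
                continue
        clusters.append([pos])
    return clusters


example = cluster_snps([500_000, 105_000, 100_000, 215_000, 112_000, 200_000])
# example == [[100000, 105000, 112000], [200000, 215000], [500000]]
```

Because positions are processed in sorted order, only the most recent cluster can be within range in this sketch, so the comparison between upstream and downstream centroids described in the text does not arise.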

Allele Age Calculations.

Coalescence times (commonly referred to as allele ages) were calculated by methods described (24–26). Briefly, information contained in neighboring SNPs and the local recombination frequency is used to infer age. The genotyped population is binned (at the SNP under inferred selection, the target SNP) into the major and minor alleles (9). While every neighboring SNP gives information on the age of the target SNP, a single recombination event carries all of the downstream neighbors to an equal or higher FRC. Hence, our algorithm moves away (positively and negatively) from the target SNP and computes allele age only when a higher FRC level is reached in a neighboring SNP. A single neighboring SNP with no neighbors within 20 kb is not used for computation. This method is consistent with the theoretical and experimental expectations of LDD surrounding selected alleles (9).

For neighboring SNPs, allele age is computed by using:

$$t = \frac{1}{\ln(1 - c)} \ln\left(\frac{x_t - y}{1 - y}\right) \qquad [1]$$

where t = allele age (in generations), c = recombination rate (calculated at the distance to the neighboring SNP), $x_t$ = frequency in generation t, and y = frequency on ancestral chromosomes. This method is a method-of-moments estimator (24), because the estimate results from equating the observed proportion of nonrecombinant chromosomes with the proportion expected if the true value of t is the estimated value. It requires no population genetic or demographic assumptions, only the exponential decay of initially perfect LD because of recombination. Estimates are obtained until FRC reaches 0.3, to avoid allele age calculations of lower reliability. We assume the ancestral allele is always the allele with neutral or genome average LDD ALnLH scores (9). Average regional recombination rates were obtained by querying data from ref. 53 in the University of California Santa Cruz database (http://genome.ucsc.edu). Regions with <0.1 cM/Mb average recombination rate were excluded. All allele age estimates are averages of the individual calculations at the target SNP (26).
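A minimal numerical sketch of this estimator follows. It assumes that initially perfect LD decays exponentially, so that the expected frequency of the linked marker allele on selected chromosomes after t generations is y + (1 − y)(1 − c)^t; solving for t gives the expression used in the code. The function name and the example parameter values are illustrative only.

```python
import math


def allele_age_generations(x_t: float, y: float, c: float) -> float:
    """Method-of-moments allele age from the decay of LD.

    x_t: frequency of the linked marker allele on selected chromosomes now
    y:   frequency of that marker allele on ancestral chromosomes
    c:   recombination rate between target and neighboring SNP
    """
    if not 0.0 < c < 1.0:
        raise ValueError("c must lie strictly between 0 and 1")
    if not 0.0 <= y < x_t <= 1.0:
        raise ValueError("require 0 <= y < x_t <= 1")
    return math.log((x_t - y) / (1.0 - y)) / math.log(1.0 - c)


# Round trip with hypothetical values: simulate decay, then recover the age.
c, y, true_t = 0.001, 0.2, 300
x_now = y + (1.0 - y) * (1.0 - c) ** true_t
estimated_t = allele_age_generations(x_now, y, c)
estimated_years = estimated_t * 25  # 25-year generations, as in the text
```

The round trip recovers the simulated age, which is the sense in which the estimator equates the observed and expected proportions of nonrecombinant chromosomes.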

Estimating the Rate of Adaptive Substitutions.

Under the null hypothesis of a constant rate of adaptive substitution, the age distribution of ASVs can estimate the mean fitness advantage (s̄) of new selected variants. The empirical distribution of fitness effects of adaptive substitutions is not known. On theoretical grounds, this distribution is expected to approximate a negative exponential (3). Other studies have assumed this distribution or a gamma distribution with similar shape (54–56), and selected mutations in laboratory organisms appear to fit this theoretical model (57, 58). In these expressions, s is the selection coefficient favoring a new mutation, and s̄ is the mean selection coefficient among the set of all advantageous mutations. We assume that adaptive alleles are dominant in effect, which allows the highest fixation probability (59) and the most rapid increase in frequencies and is therefore conservative (less dominance requires a higher substitution rate to explain the observed distribution). The value of s̄ is not known, and we are concerned with finding the single value that creates the best fit of the population size prediction to the observed data. We assumed a negative exponential distribution of s, in which $\Pr[s] = e^{-s/\bar{s}}$. The number of ascertained new adaptive variants originating in any single generation t is given by the equation:

$$n_t = \int_{a}^{b} 2N_t\nu \cdot 2s \cdot e^{-s/\bar{s}}\,ds \qquad [2]$$

Here, $n_t$ is that number, ν is the rate of adaptive mutations per genome per generation, $N_t$ is the effective population size in generation t, and the limits a and b bound the range of s over which variants are ascertained. This integral derives from the expectation of adaptive mutations in a diploid population (here, 2Nν) multiplied by the fixation probability 2s for each, again assuming dominant fitness effect. Under the null hypothesis, the population size $N_t$ is constant across all generations, so the expected number of new adaptive mutations (ascertained and nonascertained) is likewise constant.
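
The expectation described in this paragraph can be evaluated directly. The Python sketch below assumes the integrand is the product stated in the text, (2Nν) × (2s) × e^(−s/s̄), integrated over a range of s from a lower to an upper bound, and compares a closed form with a simple midpoint-rule check. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def expected_ascertained(n_eff, nu, s_bar, a, b):
    """Closed form of the integral from a to b of
    (2 * n_eff * nu) * (2 * s) * exp(-s / s_bar) ds, for finite a and b."""
    def antiderivative(s):
        # d/ds of this expression is s * exp(-s / s_bar)
        return -s_bar * (s + s_bar) * math.exp(-s / s_bar)
    return 4.0 * n_eff * nu * (antiderivative(b) - antiderivative(a))


def expected_total(n_eff, nu, s_bar):
    """Same integrand taken over 0 to infinity, which is 4 * N * nu * s_bar**2."""
    return 4.0 * n_eff * nu * s_bar ** 2


def expected_ascertained_numeric(n_eff, nu, s_bar, a, b, steps=10_000):
    """Midpoint-rule check of the closed form."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        s = a + (i + 0.5) * h
        total += 2.0 * n_eff * nu * 2.0 * s * math.exp(-s / s_bar)
    return total * h


if __name__ == "__main__":
    # Hypothetical values, not estimates from the paper.
    args = dict(n_eff=10_000, nu=1e-4, s_bar=0.03)
    print(expected_ascertained(a=0.01, b=0.1, **args))
    print(expected_ascertained_numeric(a=0.01, b=0.1, **args))
    print(expected_total(**args))
```

Because the integrand is linear in both N and ν, holding N constant under the null hypothesis makes the total over 0 to infinity constant across generations, as the text states.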

We considered the range of s between value a, yielding a current mean frequency of 0.22, and value b, yielding a current mean frequency of 0.78, as derived from the diffusion approximation for dominant advantageous alleles (60). The parameter ν is constant in effect across all generations, while the number of ascertained variants originating in each generation varies with the range of s placing new alleles in the ascertainment range. We applied a hill-climbing algorithm to find the best-fit value of s̄ for the empirical distribution of block ages, allowing ν to vary freely. With an estimate for s̄, the rate of adaptive mutations, ν, can be estimated as the value that satisfies Eq. 2. This value is also sufficient to estimate the expected number of substitutions per generation, which is the value of the integral in Eq. 2 over the range 0 to infinity (in our analyses, the vast majority had 0.01 ≤ s ≤ 0.1). For the YRI data, assuming dominant fitness effects, the resulting estimate of adaptive substitution rate is 13.25 per generation, or 0.53 per year.
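
The fitting step can be pictured with a generic one-dimensional hill-climbing routine. The Python sketch below is not the authors' implementation: the misfit function is a stand-in (the paper does not specify its fit criterion here), and the names `hill_climb` and `toy_misfit` are hypothetical. The final lines only restate the arithmetic linking the two quoted rates, which together imply a generation time of 25 years.

```python
def hill_climb(misfit, x0, step, tol=1e-9, lower=0.0):
    """Minimize a one-dimensional misfit by local search.

    Tries one step down and one step up from the current value, moves to
    the first candidate that improves the fit, and halves the step when
    neither improves it. Candidates at or below `lower` are skipped.
    """
    x, best = x0, misfit(x0)
    while step > tol:
        moved = False
        for candidate in (x - step, x + step):
            if candidate <= lower:
                continue
            value = misfit(candidate)
            if value < best:
                x, best, moved = candidate, value, True
                break
        if not moved:
            step /= 2.0
    return x


def toy_misfit(s_bar):
    """Stand-in objective with its minimum at s_bar = 0.03 (hypothetical)."""
    return (s_bar - 0.03) ** 2


if __name__ == "__main__":
    print(hill_climb(toy_misfit, x0=0.1, step=0.01))

    # Arithmetic behind the quoted rates: 13.25 per generation and
    # 0.53 per year correspond to 25 years per generation.
    years_per_generation = 25.0
    print(13.25 / years_per_generation)
```

In a real analysis the stand-in objective would be replaced by a measure of disagreement between the predicted and observed age distributions, with ν re-estimated for each trial value of s̄.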

Acknowledgments

We thank Alan Fix, Dennis O'Rourke, Kristen Hawkes, Alan Rogers, Chad Huff, Milford Wolpoff, Balaji Srinivasan, and five anonymous reviewers for comments and discussions. This work was supported by grants from the U.S. Department of Energy, the National Institute of Mental Health, and the National Institute on Aging (to R.K.M.), the Unz Foundation (to G.M.C.), the University of Utah (to H.C.H.), and the Graduate School of the University of Wisconsin (to J.H.).

Footnotes

  • †To whom correspondence may be addressed. E-mail: jhawk{at}wisc.edu, harpend{at}xmission.com, or rmoyzis{at}uci.edu
  • Author contributions: J.H., E.T.W., and G.M.C. contributed equally to this work; J.H., E.T.W., G.M.C., and R.K.M. designed research; J.H., E.T.W., G.M.C., H.C.H., and R.K.M. performed research; E.T.W. and R.K.M. contributed new reagents/analytic tools; J.H., E.T.W., G.M.C., and H.C.H. analyzed data; and J.H. and R.K.M. wrote the paper.

  • The authors declare no conflict of interest.

  • This article contains supporting information online at www.pnas.org/cgi/content/full/0707650104/DC1.

  • Freely available online through the PNAS open access option.

  • © 2007 by The National Academy of Sciences of the USA

References

  1. Biraben J-N (2003) Population Sociétés 394:1–4.
  2. Fisher RA (1930) The Genetical Theory of Natural Selection (Clarendon, Oxford).
  3. Orr HA (2003) Genetics 163:1519–1526.
  4. Armelagos GJ, Harper KN (2005) Evol Anthropol 14:68–77.
  5. Roush RT, McKenzie JA (1987) Annu Rev Entomol 32:361–380.
  6. Lenski RE, Travisano M (1994) Proc Natl Acad Sci USA 91:6808–6814.
  7. Wick LM, Weilenmann H, Egli T (2002) Microbiology 148:2889–2902.
  8. Wahl LM, Krakauer DC (2000) Genetics 156:1437–1448.
  9. Wang ET, Kodama G, Baldi P, Moyzis RK (2006) Proc Natl Acad Sci USA 103:135–140.
  10. Wright SI, Bi IV, Schroeder SG, Yamasaki M, Doebley JF, McMullen MD, Gaut BS (2005) Science 308:1310–1314.
  11. Kim Y, Nielsen R (2004) Genetics 167:1513–1524.
  12. Voight BF, Kudaravalli S, Wen X, Pritchard JK (2006) PLoS Biol 4:e72.
  13. The International HapMap Consortium (2005) Nature 437:1299–1320.
  14. Przeworski M (2002) Genetics 160:1179–1189.
  15. Bustamante CD, Fledel-Alon A, Williamson S, Nielsen R, Hubisz MT, Glanowski S, Tanenbaum DM, White TJ, Sninsky JJ, Hernandez RD, et al. (2005) Nature 437:1153–1157.
  16. Pollard KS, Salama SR, King B, Kern AD, Dreszer T, Katzman S, Siepel A, Pedersen JS, Bejerano G, Baertsch R, et al. (2006) PLoS Genet 2:e168.
  17. Hollox EJ, Poulter M, Zvarek M, Ferak V, Krause A, Jenkins T, Saha N, Kozlov AI, Swallow DM (2001) Am J Hum Genet 68:160–172.
  18. Bersaglieri T, Sabeti PC, Patterson N, Vanderploeg T, Schaffner SF, Drake JA, Rhodes M, Reich DE, Hirschhorn JN (2004) Am J Hum Genet 74:1111–1120.
  19. Novembre J, Galvani AP, Slatkin M (2005) PLoS Biol 3:e339.
  20. Sabeti PC, Walsh E, Schaffner SF, Varilly P, Fry B, Hutcheson HB, Cullen M, Mikkelsen TS, Roy J, Patterson N, et al. (2005) PLoS Biol 3:e378.
  21. Hamblin MT, Thompson EE, Di Rienzo A (2002) Am J Hum Genet 70:369–383.
  22. Williamson S, Hubisz MJ, Clark AG, Payseur BA, Bustamante CD, Nielsen R (2007) PLoS Genet 3:e90.
  23. Kimura R, Fujimoto A, Tokunaga K, Ohashi J (2007) PLoS One 2:e286.
  24. Slatkin M, Rannala B (2000) Annu Rev Genom Hum Genet 1:225–249.
  25. Ding Y-C, Chi H-C, Grady DL, Morishima A, Kidd JR, Kidd KK, Flodman P, Spence MA, Schuck S, Swanson JM, et al. (2002) Proc Natl Acad Sci USA 99:309–314.
  26. Wang E, Ding Y-C, Flodman P, Kidd JR, Kidd KK, Grady DL, Ryder OA, Spence MA, Swanson JM, Moyzis RK (2004) Am J Hum Genet 74:931–944.
  27. Kayser M, Brauer S, Stoneking M (2003) Mol Biol Evol 20:893–900.
  28. Akey JM, Eberle MA, Rieder MJ, Carlson CS, Shriver MD, Nickerson DA, Kruglyak L (2004) PLoS Biol 2:e286.
  29. Wright S (1969) The Theory of Gene Frequencies, Evolution and the Genetics of Populations (Univ Chicago Press, Chicago), Vol 2.
  30. Gillespie JH (2000) Genetics 155:909–919.
  31. Kim Y (2006) Genetics 172:1967–1978.
  32. Betancourt AJ, Kim Y, Orr HA (2004) Genetics 168:2261–2269.
  33. Wang D, Fan J, Siao C, Berno A, Young P, Sapolsky R, Ghandour G, Perkins N, Winchester E, Spencer J, et al. (1998) Science 280:1077–1081.
  34. Stephens JC, Schneider JA, Tanguay DA, Choi J, Acharya T, Stanley SE, Jiang R, Messer CJ, Chew A, Han J-H, et al. (2001) Science 293:489–493.
  35. Hellmann I, Ebersberger I, Ptak SE, Pääbo S, Przeworski M (2003) Am J Hum Genet 72:1527–1535.
  36. The Chimpanzee Sequencing and Analysis Consortium (2005) Nature 437:69–87.
  37. Otto SP, Whitlock MC (1997) Genetics 146:723–733.
  38. Coale AJ (1974) Sci Am 231:40–52.
  39. Weiss K (1984) Hum Biol 56:637–649.
  40. Stiner MC, Munro ND, Surovell TA (2000) Curr Anthropol 41:39–73.
  41. Bar-Yosef O, Belfer-Cohen A (1992) in Transitions to Agriculture in Prehistory, eds Gebauer AB, Price TD (Prehistory Press, Madison, WI), pp 21–48.
  42. Bellwood P (2005) First Farmers: The Origins of Agricultural Societies (Blackwell, Oxford).
  43. Price TD, ed (2000) Europe's First Farmers (Cambridge Univ Press, Cambridge, UK).
  44. Relethford JH (1999) Evol Anthropol 8:7–10.
  45. Gifford-Gonzalez D (2000) Afr Archaeol Rev 17:95–139.
  46. Hanotte O, Bradley DG, Ochieng JW, Verjee Y, Hill EW, Rege JEO (2002) Science 296:336–339.
  47. Diamond J, Bellwood P (2003) Science 300:597–603.
  48. Frayer DW (1977) Am J Phys Anthropol 46:109–120.
  49. Larsen CS (1995) Annu Rev Anthropol 24:185–213.
  50. McNeill W (1976) Plagues and Peoples (Doubleday, Garden City, NY).
  51. Crow JF (1966) BioScience 16:863–867.
  52. Riethman HC, Xiang Z, Paul S, Morse E, Hu X-L, Flint J, Chi H-C, Grady DL, Moyzis RK (2001) Nature 409:948–951.
  53. Kong A, Gudbjartsson DF, Sainz J, Jonsdottir GM, Gudjonsson SA, Richardsson B, Sigurdardottir S, Barnard J, Hallbeck B, Masson G, et al. (2002) Nat Genet 31:241–247.
  54. Keightley PD, Lynch M (2003) Evolution (Lawrence, Kans) 57:683–685.
  55. Shaw FH, Geyer CJ, Shaw RG (2002) Evolution (Lawrence, Kans) 56:453–463.
  56. Elena SF, Ekunwe L, Hajela N, Oden SA, Lenski RE (1998) Genetica 102/103:349–358.
  57. Imhof M, Schlötterer C (2001) Proc Natl Acad Sci USA 98:1113–1117.
  58. Kassen R, Bataillon T (2006) Nat Genet 38:484–488.
  59. Haldane JBS (1927) Trans Cambridge Philos Soc 23:19–41.
  60. Ewens WJ (2004) Mathematical Population Genetics (Cambridge Univ Press, Cambridge, UK).

Article Classifications

  • Social Sciences
  • Anthropology
  • Biological Sciences
  • Anthropology