Coalescence and genetic diversity in sexual populations under selection

Richard A. Neher, Taylor A. Kessinger, and Boris I. Shraiman
PNAS September 24, 2013 110 (39) 15836-15841; https://doi.org/10.1073/pnas.1309697110
Richard A. Neher
(a) Evolutionary Dynamics and Biophysics Group, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
  • For correspondence: richard.neher@tuebingen.mpg.de
Taylor A. Kessinger
(a) Evolutionary Dynamics and Biophysics Group, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
Boris I. Shraiman
(b) Kavli Institute for Theoretical Physics and (c) Department of Physics, University of California, Santa Barbara, CA 93116
Edited* by Herbert Levine, Rice University, Houston, TX, and approved August 12, 2013 (received for review May 22, 2013)


Significance

Many populations are genetically diverse, and genomes of individuals can differ at millions of loci, some of which affect the fitness of the organism. Although recombination will separate distant loci rapidly, nearby loci are inherited together and stay linked for long times. Selected alleles at linked loci influence each other’s dynamics in complex ways that are poorly understood. We present an analysis of the coupled histories of linked loci subject to selection and recombination and make predictions for the resulting genetic diversity. We show that simple patterns emerge from the collective effect of many loci and that these patterns can be used to infer evolutionary parameters from sequence data.

Abstract

In sexual populations, selection operates neither on the whole genome, which is repeatedly taken apart and reassembled by recombination, nor on individual alleles that are tightly linked to the chromosomal neighborhood. The resulting interference between linked alleles reduces the efficiency of selection and distorts patterns of genetic diversity. Inference of evolutionary history from diversity shaped by linked selection requires an understanding of these patterns. Here, we present a simple but powerful scaling analysis identifying the unit of selection as the genomic “linkage block” with a characteristic length, $\xi_b$, determined in a self-consistent manner by the condition that the rate of recombination within the block is comparable to the fitness differences between different alleles of the block. We find that an asexual model with the strength of selection tuned to that of the linkage block provides an excellent description of genetic diversity and the site frequency spectra compared with computer simulations. This linkage block approximation is accurate for the entire spectrum of strength of selection and is particularly powerful in scenarios with many weakly selected loci. The latter limit allows us to characterize coalescence, genetic diversity, and the speed of adaptation in the infinitesimal model of quantitative genetics.

  • Hill–Robertson interference
  • genealogy
  • Bolthausen–Sznitman coalescent

In asexual populations, different genomes compete for survival, and the fate of most new mutations depends more on the total fitness of the genome they reside in than on their own contribution to fitness. As a result, beneficial mutations on one genetic background can be lost to competition with other backgrounds, an effect known as “clonal interference” (1–3); likewise, deleterious mutations in very fit genomes can fix. This interference is reduced by recombination and disappears when recombination is rapid enough such that selection can act independently on different loci. Many eukaryotes recombine their genetic material by crossing-over of homologous chromosomes. As a result, distant loci evolve independently but nearby tightly linked loci remain coupled. Such interference, known as Hill–Robertson interference, reduces the efficacy of selection (4, 5) and reduces levels of neutral variation. Neutral diversity is indeed correlated with local recombination rates in several species, suggesting that linked selection is an important evolutionary force (6, 7). One typically distinguishes background selection against deleterious mutations (8, 9) from sweeping beneficial mutations, which lead to hitchhiking (10, 11). Both of these processes reduce diversity at linked loci and probably contribute to the observed correlation (12). Another piece of evidence for the importance of linked selection comes from the weak correlation between levels of genetic diversity and the population size (13). Whereas classic neutral models predict that diversity should increase linearly with the population size (14), in models dominated by selection, the diversity depends only weakly on the population size (3). Hence, linked selection could explain this “paradox of variation” (15).

From the perspective of a neutral allele, any random association with genetic backgrounds of different fitness results in fluctuations of its allele frequency. To distinguish this source of stochasticity from genetic drift, Gillespie (11) coined the term “genetic draft.” Whereas genetic draft is understood well when caused by strongly selected mutations whose dynamics are deterministic at high frequencies (5, 16, 17), the cumulative effect of many weak effect mutations has mainly been addressed using simulations (18, 19). Many populations harbor substantial heritable phenotypic variation, which, in an unknown way, depends on a large number of polymorphisms in the genome. The majority of these polymorphisms are likely to have small effects on phenotypes and fitness. Collectively, they can still dominate phenotypic variation (20) and possibly fitness variation. This limit is known as the infinitesimal model in quantitative genetics. Quantitative genetics, however, typically ignores linkage between loci and the maintenance of genetic diversity (21, 22).

Here, we characterize the structure of genealogies, genetic diversity, and the rate of adaptation in sexual populations in the limit of numerous weakly selected alleles. We build on recent progress in our understanding of genealogies in adapting asexual populations (23–25), and we will first review these results briefly. We will then present a scaling argument that reduces the problem of coalescence within a sexually reproducing population to an asexual population with suitably scaled parameters. This correspondence allows us to predict levels of genetic diversity, coalescence time scales, and site frequency spectra. Our results hold regardless of whether the polymorphisms originated as weakly deleterious or beneficial mutations, and thus cover weak effect background selection as well as adaptation. We confirm the validity of the mapping to the asexual model by comparing its predictions with numerical simulations of evolving sexual populations. We use this approximation to demonstrate that in the limit of numerous weakly selected mutations, the rate of adaptation scales as the square root of recombination rate.

Results

In asexual populations, all loci share the same genealogical history and the fate of a lineage depends on the fitness of the entire genome. If fitness depends on a large number of polymorphic loci with comparable effects, the fitness distribution in the population will be roughly Gaussian and the fittest individuals are Graphic ahead of the fitness mean, where Graphic is the total fitness variance in the population (2, 26, 27). In large asexual populations, only individuals in the high fitness nose have an appreciable chance to contribute to future generations. It will take those individuals roughly Graphic generations to dominate the population. Hence, the probability that two randomly chosen individuals had a common ancestor Graphic generations ago is of order 1 (i.e., their ancestral lineages have likely coalesced). A more thorough analysis of coalescence in adapting asexual populations can be found in studies by Neher and Hallatschek (23) and by Desai et al. (24). In small populations with Graphic, coalescence is dominated by neutral processes (nonheritable fluctuations in offspring number known as genetic drift). The average number of generations back to the most recent common ancestor of any pair of extant genomes, also known as the pair coalescence time, is given by:Embedded Imagewhere c is a constant of order 1 that captures deviations from Gaussianity that depend on details of the model. For the infinitesimal model studied here, Graphic (23).
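The scaling argument of this paragraph can be summarized in symbols. The block below is only a sketch: it assumes that $\sigma^2$ denotes the total fitness variance, $N$ the population size, $T_c$ the takeover time of the fittest individuals, and $\langle T_2 \rangle$ the pair coalescence time. The $\sqrt{2 \log N}$ factor is the usual extreme-value estimate for the largest of $N$ Gaussian draws; the exact arguments of the logarithms are assumptions, not quotations.

```latex
% Lead of the fittest individuals over the mean (Gaussian fitness distribution, N individuals)
x_{\max} - \bar{x} \approx \sigma \sqrt{2 \log N}

% Time for the mean, moving at rate \sigma^2, to catch up with the nose
T_c \approx \frac{x_{\max} - \bar{x}}{\sigma^2} \approx \sigma^{-1} \sqrt{2 \log N}

% Pair coalescence time: drift-dominated versus selection-dominated regime (assumed form of Eq. 1)
\langle T_2 \rangle \approx
\begin{cases}
  N, & N\sigma \ll 1 \\
  c\,\sigma^{-1} \sqrt{2 \log N\sigma}, & N\sigma \gg 1
\end{cases}
```

The point of the second regime is that the coalescence time depends on $N$ only through a logarithm, which is why diversity becomes nearly insensitive to population size once selection dominates.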

In an attempt to extend applicability of the neutral coalescent, one sometimes defines an effective population size, Graphic, equal to Graphic regardless of whether coalescence is neutral or not (28). By definition, a neutral model with Graphic predicts the same levels of genetic diversity, but the statistical properties of the genealogies dominated by selection are quite different and cannot be papered over simply by redefining the population size. We will therefore avoid the term Graphic and stick to Graphic. For the approximately neutral case, Graphic, the coalescent tree is of the Kingman type (14). As Graphic increases, coalescence is more and more driven by the amplification of fit genomes, which generates a very skewed offspring number distribution over time scales of order Graphic. As a result, the genealogies resemble the Bolthausen–Sznitman coalescent (BSC) (25, 29) with very different statistical properties. Two representative coalescent trees sampled from asexual populations, one neutral and one rapidly adapting, are shown in Fig. 1A.

Fig. 1.

Coalescence in neutral and adapting populations. (A) Typical coalescent tree from neutral (Left) and adapting (Right) asexual populations. In adapting populations, coalescent trees branch asymmetrically and contain approximate multiple mergers. (B) Illustration of asexual blocks in sexual populations. The sketch depicts a representative chromosome at the bottom with polymorphisms indicated as balls. Different loci within segments shorter than $\xi_b$ share most of their genealogical history (i.e., have trees similar to the one indicated in the center of the segment, where TMRCA is the time to the most recent common ancestor). Coalescence within this segment of length $\xi_b$ is either neutral or driven by the fitness differences between different haplotypes spanning these segments. (Inset) Fitness distribution of these haplotype blocks. Distant parts of the chromosome are in linkage equilibrium, and the tree changes as one moves along the chromosome. The succession of changing trees is the ancestral recombination graph.

Sexual Populations and Recombination.

In contrast to asexual evolution, recombination decouples different loci in sexual populations: the further apart, the more rapidly. The typical length of the segment that is not disrupted decreases with time as

$$\xi(t) = \frac{L}{1 + \rho L t} \approx \frac{1}{\rho t}, \qquad [2]$$

where ρ is the cross-over rate and L is the length of the chromosome. The second approximation is justified whenever $\rho L t \gg 1$. If polymorphisms affecting fitness are spread evenly across the genome and are dense (the infinitesimal model), we expect that different segregating haplotypes in a region of length $\xi$ harbor fitness variation proportional to the segment length:

$$\sigma_\xi^2 = \sigma^2 \, \frac{\xi}{L}. \qquad [3]$$

This fitness variance shrinks with time as the block length decreases. Although initial fitness differences between blocks are large, they are chopped into smaller blocks so rapidly that selection has no time to amplify the fittest of these early large blocks. However, the rate at which blocks are chopped up decreases as they get shorter, and, at some point, the rate of chopping them up is outweighed by the amplification of the fittest blocks by selection. The latter happens when fitness differences between haplotypes of this block are comparable to the recombination rate. More precisely, the relevant block length is the length that survives over the time scale of coalescence, $\langle T_2 \rangle$. In large enough populations, the time scale of coalescence itself is determined by these fitness differences via Eq. 1. In contrast to asexual populations, only the fitness variance, $\sigma_b^2$, within the linkage block of length $\xi_b$ is relevant, rather than the total variance $\sigma^2$ (Fig. 1B). Using $t = \langle T_2 \rangle$ in Eq. 2, we find

$$\xi_b = \frac{1}{\rho \langle T_2 \rangle} = \frac{\sigma_b}{c \rho \sqrt{2 \log N\sigma_b}}. \qquad [4]$$

Linkage disequilibrium (LD) should decay over this length scale. Substituting $\xi_b$ into Eq. 3 yields

$$\sigma_b = \frac{\sigma^2}{c \rho L \sqrt{2 \log N\sigma_b}}. \qquad [5]$$

Hence, the time scales of coalescence and neutral diversity are given by the inverse of the fitness variance per map length, $\sigma^2/(\rho L)$, with a logarithmic correction (see also refs. 9, 30 for the case of strongly selected mutations).

To arrive at this result, we have assumed that $N\sigma_b \gg 1$. If this condition is not satisfied, local coalescence will be approximately neutral. In this case, $\langle T_2 \rangle \approx N$ and the LD extends over $(N\rho)^{-1}$ nucleotides. Empirically, we observe a smooth and rapid cross-over between these two regimes (below and Fig. 2). The condition for draft dominance, $N\sigma_b \gg 1$, is more stringent in sexual populations than in asexual populations, in which it is $N\sigma \gg 1$. In other words, recombination reduces interference and results in drift-dominated coalescence over a larger parameter range.
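The self-consistency condition for the linkage block has no closed-form solution, because the block fitness SD appears inside a logarithm. The following Python sketch solves it by fixed-point iteration. It assumes that Eq. 5 reads $\sigma_b = \sigma^2 / (c \rho L \sqrt{2 \log N\sigma_b})$ and that the block coalescence time is $c\,\sigma_b^{-1}\sqrt{2 \log N\sigma_b}$. The function name, the default `c=1.0`, and the sharp threshold used to fall back to the neutral regime are illustrative assumptions, not part of the original analysis.

```python
import math


def block_parameters(N, sigma2, rho, L, c=1.0, tol=1e-12, max_iter=200):
    """Return (sigma_b, xi_b, T2) for a linkage block.

    N      : population size
    sigma2 : total fitness variance sigma^2
    rho    : cross-over rate per site per generation
    L      : chromosome length in sites
    c      : order-one constant from the asexual coalescence time (assumed 1 here)
    """
    A = sigma2 / (c * rho * L)   # fitness variance per map length, divided by c
    sigma_b = A                  # initial guess: ignore the logarithmic correction
    for _ in range(max_iter):
        arg = N * sigma_b
        if arg <= math.e:
            # Draft is not dominant: treat local coalescence as neutral.
            # (Hypothetical hard switch; the text describes a smooth cross-over.)
            T2 = float(N)
            return None, 1.0 / (rho * T2), T2
        new = A / math.sqrt(2.0 * math.log(arg))
        converged = abs(new - sigma_b) <= tol * sigma_b
        sigma_b = new
        if converged:
            break
    T2 = c * math.sqrt(2.0 * math.log(N * sigma_b)) / sigma_b
    xi_b = 1.0 / (rho * T2)      # block length that survives for T2 generations
    return sigma_b, xi_b, T2


if __name__ == "__main__":
    sigma_b, xi_b, T2 = block_parameters(N=1e6, sigma2=0.01, rho=1e-6, L=1e6)
    print(f"sigma_b = {sigma_b:.3e}, xi_b = {xi_b:.3e} sites, T2 = {T2:.3e} generations")
```

The iteration is a contraction near the solution whenever $\log N\sigma_b$ exceeds about one half, so it converges in a handful of steps in the draft-dominated regime.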

Fig. 2.

Coalescence in sexual populations. The figure shows the average pair coalescence time $\langle T_2 \rangle$ relative to the neutral expectation as a function of $N\sigma_b$ determined using Eq. 5. For $N\sigma_b \ll 1$, $\langle T_2 \rangle \approx N$, whereas $\langle T_2 \rangle \sim \sigma_b^{-1}$ otherwise.

We now predict that the results for genetic diversity in the asexual coalescent apply with $\sigma_b^2$ as the local fitness variance and that linkage disequilibrium between common loci extends over a distance $\xi_b$. We will validate these predictions by forward simulations of different population models.

Constant Selection in the Infinitesimal Model.

We first consider a model of a population whose fitness variance is set by external (environmental) factors in which the selected trait depends on many weak effect polymorphisms and de novo mutations (Materials and Methods). This model might be a first approximation to scenarios where selection pressures are dictated by a changing environment, an evolving immune system, or a breeder who imposes a constant artificial selection. We simulate our population using a discrete generation model with an approximately constant population size and a finite number of sites in the genome as implemented in FFPopSim (31) (Materials and Methods). We track the genealogy of a locus in the center of the chromosome, which allows us to study properties of representative coalescent trees.

After allowing the population to equilibrate, we sample the evolving population in roughly Graphic intervals and measure Graphic, the site frequency spectrum (SFS), and the LD between polymorphisms at intermediate frequencies Graphic. We perform these simulations for many combinations of parameters. For each combination, we calculate Graphic according to Eq. 5. Fig. 2 shows that the average pair coalescence time Graphic approaches N for Graphic and that it is proportional to Graphic (with logarithmic corrections) for Graphic as predicted.

In addition to a reduction in genetic diversity, we predict that the local genealogies will resemble samples from the BSC rather than the Kingman coalescent whenever Graphic. Fig. 3 shows a collection of SFSs colored by the Graphic. With increasing Graphic, the SFS smoothly interpolates between the expectations for the Kingman coalescent and the BSC. As soon as the SFS starts deviating from the prediction of the Kingman coalescent, Tajima’s D turns negative. For large Graphic, we find a nonmonotonic SFS with a steep divergence Graphic characteristic of the BSC.
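The link between the shape of the SFS and the sign of Tajima’s D can be made concrete with a short calculation. The sketch below implements the standard Tajima’s D statistic from an unfolded SFS and applies it to two illustrative spectra: the Kingman expectation (counts proportional to 1/i) and a hypothetical spectrum with an excess of rare variants (counts proportional to 1/i²). The second spectrum is not the BSC prediction; it only shows that an excess of rare alleles pushes D below zero. The function and variable names are illustrative.

```python
import math


def tajimas_d(sfs):
    """Tajima's D from an unfolded site frequency spectrum.

    sfs[i - 1] is the number of sites at which the derived allele is
    carried by i of the n sampled sequences, for i = 1 .. n - 1.
    """
    n = len(sfs) + 1
    S = sum(sfs)                                   # number of segregating sites
    pairs = n * (n - 1) / 2.0
    # Mean pairwise difference: a site at count i separates i * (n - i) pairs.
    pi = sum(i * (n - i) * x for i, x in enumerate(sfs, start=1)) / pairs
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


n = 20
kingman_sfs = [100.0 / i for i in range(1, n)]          # neutral expectation
rare_excess_sfs = [100.0 / i ** 2 for i in range(1, n)]  # hypothetical skewed spectrum

d_kingman = tajimas_d(kingman_sfs)
d_skewed = tajimas_d(rare_excess_sfs)

if __name__ == "__main__":
    print("D, Kingman-shaped SFS:", d_kingman)
    print("D, rare-variant excess:", d_skewed)
```

For the 1/i spectrum the two diversity estimators in the numerator coincide, so D vanishes up to rounding; any shift of weight toward singletons makes the pairwise estimator smaller than the segregating-sites estimator and D negative.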

Fig. 3.

SFSs, normalized by Graphic, for a large number of parameter combinations. Color indicates the value of Graphic. For large Graphic, the SFSs display the nonmonotonicity characteristic of the BSC (dashed line), whereas the SFSs are described well by the prediction from Kingman’s coalescent (solid line) if Graphic. The BSC curve serves as a guide to the eye because its normalization depends on Graphic.

Another important feature of diversity in sexual populations is the genomic distance across which loci share much of their genealogy. This can be quantified by measuring the correlations between loci (LD) at different distances. In order for our picture to be consistent, the extent of LD should be approximately equal to Graphic. We measured LD as Graphic for different distances d and plot it against Graphic (Fig. 4). As predicted, LD decays over the length Graphic.
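As an illustration of this measurement, the sketch below computes LD between pairs of biallelic loci and averages it in bins of rescaled distance. It assumes that LD is quantified by the usual squared allelic correlation r² and that haplotypes are stored as lists of 0/1 alleles; both the data layout and the function names are assumptions made for the example and are not taken from the simulation code used in the study.

```python
import math


def r_squared(locus_a, locus_b):
    """Squared correlation r^2 between two biallelic loci (lists of 0/1 alleles)."""
    n = len(locus_a)
    p_a = sum(locus_a) / n
    p_b = sum(locus_b) / n
    p_ab = sum(a * b for a, b in zip(locus_a, locus_b)) / n
    D = p_ab - p_a * p_b                      # linkage disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")                   # undefined for monomorphic loci
    return D * D / denom


def mean_r2_by_distance(haplotypes, positions, xi_b, bin_width=0.5):
    """Average r^2 in bins of rescaled distance d / xi_b.

    haplotypes : list of haplotypes, each a list of 0/1 alleles per locus
    positions  : genomic coordinate of each locus
    xi_b       : block length used to rescale distances
    Returns {left edge of bin (in units of xi_b): mean r^2}.
    """
    sums, counts = {}, {}
    n_loci = len(positions)
    columns = [[h[i] for h in haplotypes] for i in range(n_loci)]
    for i in range(n_loci):
        for j in range(i + 1, n_loci):
            r2 = r_squared(columns[i], columns[j])
            if math.isnan(r2):
                continue
            b = int(abs(positions[j] - positions[i]) / xi_b / bin_width)
            sums[b] = sums.get(b, 0.0) + r2
            counts[b] = counts.get(b, 0) + 1
    return {b * bin_width: sums[b] / counts[b] for b in sorted(sums)}


# Toy data: loci 0 and 1 are perfectly associated, locus 2 is independent of both.
toy_haplotypes = [[0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]]
toy_profile = mean_r2_by_distance(toy_haplotypes, positions=[0, 10, 100], xi_b=20.0)

if __name__ == "__main__":
    print(toy_profile)
```

Plotting such profiles from different simulations against the rescaled distance is what allows them to be compared on a single curve.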

Fig. 4.

Correlation length along the genome. The figure shows LD, quantified as average Graphic, between pairs of loci at different distances (the curves are normalized to their value at zero distance). The x axis shows the distance between loci d rescaled by Graphic determined using Eq. 2, with t equal to the measured pair coalescence time. After this rescaling, the distance dependence of all simulations follows approximately the same master curve, which shows that LD extends for Graphic.

Frequent Small Effect Mutations.

In the model studied above, fitness variance was set by external factors. We now consider a model where the fitness variance and diversity are set by a balance between frequent novel mutations of small effect and the removal of variation by selection (i.e., fixation or loss of alleles). This type of model has been studied for asexual populations (26, 32). Using these results, we expect that the fitness variance within a block of length $\xi_b$ is given by

$$\sigma_b^2 = \frac{1}{2} \, \mu \, \xi_b \, \langle s^2 \rangle \, \langle T_2 \rangle. \qquad [6]$$

Here, μ is the mutation rate and $\langle s^2 \rangle$ is the second moment of the distribution of mutational effects. Note that in this infinitesimal limit, it is irrelevant whether mutations are deleterious or beneficial; only the second moment of the fitness effect distribution is important. The quantity $\mu \xi_b \langle s^2 \rangle$ is the “diffusion” constant of haplotype fitness in the absence of selection. Eq. 6 implies that fitness variation accumulates over the time it takes a few lineages to dominate the population, which is approximately given by half the pair coalescence time (23). Substituting Eq. 2 with $t = \langle T_2 \rangle$ into Eq. 6, we find

$$\sigma_b^2 = \frac{\mu \langle s^2 \rangle}{2\rho}. \qquad [7]$$

Remarkably, the fitness variance of the effectively asexual blocks is simply the ratio of the variance injection per nucleotide, $\mu \langle s^2 \rangle$, and the cross-over rate (at least when $N\sigma_b \gg 1$). The coalescence time cancels. We therefore find for $\langle T_2 \rangle$

$$\langle T_2 \rangle \approx c \, \sqrt{\frac{2\rho}{\mu \langle s^2 \rangle}} \, \sqrt{2 \log N\sigma_b}, \qquad [8]$$

where c is again a constant of order 1. In the limit where coalescence is driven by selection, the total rate of adaptation is

$$v = \sigma^2 = \frac{L}{\xi_b} \, \sigma_b^2 \approx c L \sqrt{\rho \, \mu \langle s^2 \rangle \log N\sigma_b}. \qquad [9]$$
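The chain of substitutions in this section can be checked numerically. The sketch below assumes that the block variance is $\sigma_b^2 = \mu \langle s^2 \rangle / (2\rho)$, that the block coalescence time is $c\,\sigma_b^{-1}\sqrt{2 \log N\sigma_b}$, that the surviving block length is $1/(\rho \langle T_2 \rangle)$, and that the total fitness variance is the block variance multiplied by the number of blocks, $L/\xi_b$. The function name, the dictionary keys, and the parameter values are illustrative.

```python
import math


def mutation_driven_block(N, mu, s2, rho, L, c=1.0):
    """Block variance, coalescence time, block length, and total variance.

    N   : population size
    mu  : mutation rate per site per generation
    s2  : second moment <s^2> of the mutational effect distribution
    rho : cross-over rate per site per generation
    L   : chromosome length in sites
    c   : order-one constant (assumed 1 here)
    """
    sigma_b2 = mu * s2 / (2.0 * rho)          # variance injection over cross-over rate
    sigma_b = math.sqrt(sigma_b2)
    if N * sigma_b <= 1.0:
        raise ValueError("N * sigma_b <= 1: coalescence is drift dominated")
    T2 = c * math.sqrt(2.0 * math.log(N * sigma_b)) / sigma_b
    xi_b = 1.0 / (rho * T2)                   # block length surviving for T2 generations
    v = sigma_b2 * L / xi_b                   # total fitness variance = rate of adaptation
    return {"sigma_b2": sigma_b2, "T2": T2, "xi_b": xi_b, "v": v}


base = mutation_driven_block(N=1e6, mu=1e-5, s2=1e-4, rho=1e-7, L=1e6)
more_sex = mutation_driven_block(N=1e6, mu=1e-5, s2=1e-4, rho=4e-7, L=1e6)

if __name__ == "__main__":
    print("block variance:", base["sigma_b2"])
    print("v(4 rho) / v(rho):", more_sex["v"] / base["v"])
```

Quadrupling the cross-over rate raises the rate of adaptation by slightly less than a factor of two, the square-root dependence on recombination reduced by the weak logarithmic correction.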

These results apply to steadily adapting populations (i.e., scenarios where beneficial mutations dominate), populations suffering from a mutational meltdown, or populations where the two processes balance. We simulate the last of these scenarios using a model with recurrent mutations such that the population settles into a dynamic equilibrium where the fixation of beneficial mutations is roughly canceled out by that of deleterious mutations (33). The predictions for neutral diversity, LD, and the SFS match the simulation results very well. Fig. S1 shows plots analogous to Figs. 2–4. The prediction for the total fitness variance, Eq. 9, is compared with the simulation results in Fig. 5. We investigated additional models to demonstrate the robustness of the conclusions regarding model assumptions and simulation method. Fig. S2 shows neutral diversity, LD, and SFS for a model in which unique beneficial mutations are injected at sites that become monomorphic. Fig. S3 shows results for a bona fide infinite sites model of chromosomes that accumulate beneficial or deleterious mutations. In all these cases, the observed diversity agrees well with Eq. 8 and the SFS shows the expected cross-over from the Kingman to the BSC predictions as $N\sigma_b$ increases.

Fig. 5.

Total fitness variation due to frequent weak effect mutations in a model where deleterious and beneficial mutations balance each other. The color shows the average number of cross-overs per simulated segment. There is a residual dependence on ρ due to large corrections to the asymptotic behavior.

Loosely Linked Loci.

Our analysis has focused on the effect of fitness variation in short effectively asexual blocks. As discussed above, the total strength of selection σ can be much larger than the fitness differences within effectively asexual blocks Graphic. However, a particular locus only remains linked to distant polymorphisms for a short time, and the contribution of these distant loci averages out. For our focus on the effect of tightly linked loci to be valid, the integral contribution of such loosely linked loci to drift and draft should be small compared with the effect of fitness variation Graphic within the segment. Loosely linked loci are amenable to a perturbative analysis known as quasilinkage equilibrium (34, 35). In the study by Neher and Shraiman (35), it is shown that the stochastic dynamics of the allele frequency Graphic at locus i due to loosely linked loci is described by the following Langevin equation:Embedded Imagewhere Graphic is the LD between loci i and j, Graphic is the fitness effect of the derived allele at locus j, and Graphic is random noise with autocorrelation function Graphic, representing genetic drift. If the two loci are loosely linked (i.e., the cross-over rate Graphic between them is much larger than the effect of selection on either of them), Graphic is also a fluctuating quantity. The autocorrelation function of Graphic is (35)Embedded ImageGiven this autocorrelation, we can now integrate over fluctuations due to genetic drift and loosely linked selected loci to obtain a renormalized diffusion coefficient (a reduced Graphic). Reproducing equation 44 of ref. 35, we haveEmbedded ImageThis result is similar to results in other studies (9, 30, 36) in that it shows that the level of drift is increased by a factor that depends on the square of the ratio of selection and linkage, averaged over the genome.

If we now consider the integral effect of all loci further away than ξ, it is always dominated by the closest loci, so that Graphic (obtained as a continuum approximation to the sum in Eq. 12, Graphic). Hence, provided that Graphic, a condition that obtains when fitness variation at distant loci is sufficiently small or the loci are sufficiently distant, their effect can be accounted for by a simple rescaling of the effective population size (17); this is the “weak draft” regime. Note, however, that the recombination rate between distant loci is ultimately limited by the outcrossing rate and that distant loci can have substantial effects in facultatively sexual populations (17, 37).

The negligible effect of loosely linked loci is a consequence of two types of averaging that are apparent in Eq. 11. First, the associations between these distant loci are transient and average out over time. This manifests itself in the decay time of Graphic in Eq. 11. Second, different individuals carry different alleles at these distant loci; hence, their fitness effect is averaged over different descendents. As a consequence, the autocorrelation in Eq. 11 is proportional to Graphic . Together, these two averages result in the Graphic contribution of loosely linked loci.

For the more tightly linked loci (i.e., Graphic), the behavior crosses over to the “strong draft” regime. This cross-over length scale Graphic is controlled entirely by the “local” quantities: the recombination rate per base pair ρ and the local fitness variance density. Furthermore, Graphic is generally larger than Graphic, with Graphic. This ratio corresponds to the reduction in the block size during the span of time between local selection effects first coming into play and the coalescence time. In the limit of Graphic, recombination events within the Graphic block must be reckoned with, but for more realistic population sizes, we have shown above that focusing on the Graphic-sized asexual segment captures the effects of strong draft quite well.

Length Distribution of Segments Identical by Descent.

The structure of genealogies has implications for the length Graphic of segments identical by descent (IBD) in pairs of individuals. Their distribution, Graphic, is directly related to the distribution of pair coalescence times, Graphic, via the relation Graphic. In neutrally evolving populations of constant size, pair coalescence times are exponentially distributed with mean Graphic. Consequently, the length of IBD segments is distributed as Graphic and has a long, slowly decaying tail. If Graphic, coalescence is accelerated on average but predominantly happens after lineages have reached the upper tail of the fitness distribution of different alleles of a linkage block. Hence, the distribution of pair coalescence times is peaked at Graphic rather than being exponential (compare with figure 3 of ref. 23). This shift in the distribution of Graphic with relatively rare very recent coalescence has the consequence that Graphic is approximately exponential. Long IBD segments are therefore much less likely than in the neutral case with the same Graphic.
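The contrast between the two IBD length distributions can be illustrated with a small Monte Carlo sketch. It assumes the standard simplification that, for a pair with coalescence time T, an IBD segment has an exponentially distributed length with rate 2ρT (two lineages, each exposed to cross-overs for T generations). Neutral coalescence times are drawn from an exponential distribution, and the selection-dominated case is mimicked by a gamma distribution with the same mean but a pronounced peak. The gamma shape, the parameter values, and all names are hypothetical stand-ins, not the distributions derived in ref. 23.

```python
import random


def ibd_tail_fraction(sample_T, rho, threshold, n=20000, seed=0):
    """Fraction of simulated IBD segments longer than `threshold`.

    sample_T  : function taking a random.Random and returning a coalescence time
    rho       : cross-over rate per site per generation
    threshold : segment length (in sites) defining a "long" segment
    """
    rng = random.Random(seed)
    longer = 0
    for _ in range(n):
        T = max(sample_T(rng), 1e-12)            # guard against a zero rate below
        length = rng.expovariate(2.0 * rho * T)  # segment length given T
        if length > threshold:
            longer += 1
    return longer / n


mean_T = 1000.0   # same mean pair coalescence time in both scenarios
rho = 1e-8


def neutral_T(rng):
    # Exponentially distributed coalescence times (constant-size neutral population).
    return rng.expovariate(1.0 / mean_T)


def peaked_T(rng):
    # Hypothetical peaked distribution with the same mean (gamma with shape 20).
    return rng.gammavariate(20.0, mean_T / 20.0)


threshold = 5.0 / (2.0 * rho * mean_T)   # five times the typical segment length
neutral_tail = ibd_tail_fraction(neutral_T, rho, threshold)
peaked_tail = ibd_tail_fraction(peaked_T, rho, threshold)

if __name__ == "__main__":
    print("long-segment fraction, neutral:", neutral_tail)
    print("long-segment fraction, peaked :", peaked_tail)
```

Under these assumptions the neutral case keeps a slowly decaying, power-law-like tail because very recent coalescence is common, whereas the peaked case has a nearly exponential tail and far fewer long segments.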

Discussion

In most sexual populations, the histories of different chromosomes or loci far apart on a chromosome are weakly correlated. Nearby loci, however, are more tightly linked, which results in correlated histories and LD. Because the density of heterozygous sites is Graphic and the length scale of LD is Graphic, the typical number of SNPs in one linkage block is Graphic. If n is much larger than 1, and a sizeable fraction of those SNPs affect fitness, different haplotypes segregating within such a block will display a broad distribution in local fitness with a variance that we have denoted by Graphic. Neutral alleles linked to haplotypes drawn from this distribution will be affected by linked selection. This, in turn, results in genealogies different from standard neutral models but similar to the BSC characteristic of rapidly adapting asexual populations (23, 38).

In regions of high recombination in obligately outcrossing species, the number of polymorphisms per linkage block, n, is of order 1 and linked selection will mainly result from the occasional strong selective sweep (39). However, recombination rates vary by orders of magnitude across the genome (40), and Graphic in low recombination regions. In those regions, the cumulative effect of many weakly selected polymorphisms is expected to be important. This holds in particular for species that outcross rarely, such as many plants, nematodes, yeasts, and viruses (41⇓⇓–44). This type of linked selection will overwhelm genetic drift if Graphic. The fitness variance per block is given by Graphic, where Graphic is the second moment of the effect distribution of polymorphisms. Hence, we require Graphic. Provided n is large enough, even nominally neutral Graphic polymorphisms collectively dominate the dynamics of haplotypes of length Graphic. In this infinitesimal limit, the nature of linked selection is irrelevant and our results apply to any mix of deleterious and beneficial mutations as long as the effects of individual mutations are weak and their number is large.

Relation to Previous Work.

Most previous work on genetic draft and selective interference considered mutations with strong effects that behave deterministically at high frequencies, whereas we focus on weak effect mutations. Reduction of genetic diversity by sweeping beneficial mutations was first discussed by Maynard Smith (10) (also refs. 11, 45⇓–47). In these models, genetic diversity is determined by the typical waiting time between two successive selective sweeps close enough to affect a given locus. Similarly, deleterious mutations reduce diversity at linked sites. Assuming that mutations have a large detrimental effect on fitness and happen with rate μ per site, it was shown (9, 36) that the reduction of genetic diversity is a function of Graphic. As in our analysis here, the strongest effect on genetic diversity comes from tightly linked loci. Our analysis of loosely linked loci is similar to the work by Santiago and Caballero (30). The latter, however, breaks down at tight linkage, and the cross-over to the asexual behavior is essential for a consistent description in the limit of many weakly selected loci. This limit has mainly been studied using computer simulations (18, 19, 48), and few analytical results are available.

Weissman and Barton (17) investigated the rate of adaptation and its effect on diversity using scaling arguments similar to the one presented here. In their model, adaptation is driven by individual selective sweeps. The duration of a sweep explicitly sets the time scale Graphic on which coalescence happens. In this model, the speed of adaptation is proportional to the map length. In contrast, our model assumes many weak effect mutations, and the time scale of coalescence is set by Graphic, which is self-consistently determined and itself depends on model parameters, such as ρ and Graphic. We can recover their result for the rate of adaptation by setting Graphic and Graphic. With these assumptions, we obtain Graphic instead of Eq. 9. The model used by Weissman and Barton (17) applies to a limit where, at most, one strongly selected and sweeping mutation falls into one linkage block. The basic properties of genealogies and SFSs are expected to be qualitatively similar in the limit of one sweep per block. If the contribution from weak mutations is negligible while sweeps are common, the coalescence properties will be dominated by sweeps at different distances. This limit has been studied by Durrett and Schweinsberg (49) and also results in a multiple merger coalescent.

Other types of models are appropriate if the rate of outcrossing is small compared with the SD in fitness (37, 38, 50) or if recombination proceeds via horizontal transfer of short pieces of DNA (37, 51). In these cases, one finds a very strong dependence of the rate of adaptation on the rate of outcrossing or horizontal transfer. Rare recombination has the potential to increase fitness variance dramatically because many loci are in strong LD.

In summary, we have characterized the effect of dense, weakly selected polymorphisms on genetic diversity, which might be the source of much of the phenotypic variability we observe (20, 22). Our analysis provides a consistent genealogical framework for the infinitesimal model of quantitative genetics. This limit of weakly selected mutations has so far eluded analytical understanding. We derived equations that relate the mutational input and the rate of recombination to neutral diversity and the site frequency spectra. Because genetic diversity (neutral or not) is directly accessible in population resequencing experiments, our results should be of practical relevance when interpreting such data. Furthermore, one is often interested in identifying particular mutations that arose in response to specific environmental challenges. If successful, those mutations tend to be of large effect and fall outside the scope of our model. Importantly, strong adaptations only perturb a fraction of the genome [more precisely, a segment of length Graphic, where s is the selection coefficient]. Our model provides the background on top of which such singular adaptations can be sought, and understanding the statistical patterns of diversity and linkage within this null model is essential for reliable inference.

Materials and Methods

We use a model with discrete generations, haploid individuals, an approximately constant population size, and a finite number of sites in the genome, as implemented in FFPopSim (31). We simulate a fraction of a chromosome of length L, with per site cross-over rate ρ. If Graphic, no recombination happens in most cases. In addition to forward simulation, we track the genealogy of a central locus, which allows us to measure pair coalescence times, the Graphic, and the neutral SFS directly (this functionality is implemented in a more recent release of FFPopSim; http://code.google.com/p/ffpopsim). For all parameters, we produce equilibrated populations by simulating for 10 Graphic. Subsequent measurements of population parameters start from these equilibrated populations and sample the population roughly twice every Graphic, as estimated from our theoretical arguments. All scripts associated with this paper can be obtained from http://git.tuebingen.mpg.de/reccoal.
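The kind of model described above can be illustrated with a minimal sketch. This is not FFPopSim and does not use its API. It assumes biallelic sites coded 0/1, additive log-fitness, two parents for every offspring, and exact Wright-Fisher resampling, so that N is exactly constant rather than approximately constant. The names recombine, log_fitness, and next_generation are hypothetical.

```python
import math
import random

def recombine(parent_a, parent_b, rho, rng):
    """Build one offspring genome; between adjacent sites the template
    strand switches with probability rho (per-site cross-over rate)."""
    child = []
    source, other = parent_a, parent_b
    for site in range(len(parent_a)):
        if site > 0 and rng.random() < rho:
            source, other = other, source
        child.append(source[site])
    return child

def log_fitness(genome, effects):
    """Additive log-fitness: sum of effects of the alleles coded as 1 (assumption)."""
    return sum(s * g for s, g in zip(effects, genome))

def next_generation(population, effects, rho, rng):
    """One discrete generation: parents are drawn with probability
    proportional to exp(log-fitness), then recombined."""
    weights = [math.exp(log_fitness(g, effects)) for g in population]
    offspring = []
    for _ in range(len(population)):
        mother, father = rng.choices(population, weights=weights, k=2)
        offspring.append(recombine(mother, father, rho, rng))
    return offspring
```

With L sites, the expected number of cross-overs per offspring in this sketch is (L - 1) * rho, so most offspring are unrecombined copies of one parent when that product is well below 1.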

Constant Selection.

To maintain a constant fitness variance Graphic, we rescale the selection coefficients associated with individual loci in each generation accordingly. Mutations are introduced into a random individual whenever a locus becomes monomorphic [i.e., the previously introduced mutation is lost or has fixed (38)]. This allows us to simulate a large number of sites efficiently in a limit where the overall mutation rate is small compared with Graphic. In this way, we keep all L loci polymorphic without using a high mutation rate, which would result in frequent recurrent mutations. We simulate a grid of parameters with N taking the values Graphic, σ taking the values Graphic, and Graphic taking five logarithmically spaced values between Graphic and Graphic. For the analysis, simulations were filtered so that Graphic and Graphic. To prevent invalid logarithms, Graphic was replaced by Graphic in Eq. 5.
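The two bookkeeping steps of this scheme, rescaling the effects to a target fitness spread and reseeding monomorphic loci, might look as follows. This is a sketch under stated assumptions, not the authors' implementation: genomes are lists of 0/1 alleles, fitness is additive, the target is expressed as an SD called sigma, and the function names are hypothetical.

```python
import random
import statistics

def rescale_effects(population, effects, sigma):
    """Return effects multiplied by a common factor so that the population
    SD of additive fitness equals sigma (assumed target)."""
    fitness = [sum(s * g for s, g in zip(effects, genome)) for genome in population]
    sd = statistics.pstdev(fitness)
    if sd == 0.0:
        # No fitness variation to rescale; leave the effects unchanged.
        return list(effects)
    return [s * sigma / sd for s in effects]

def reseed_monomorphic(population, rng):
    """At every locus where all individuals carry the same allele, flip the
    allele in one randomly chosen individual (in place). Returns the loci hit."""
    reseeded = []
    for locus in range(len(population[0])):
        alleles = {genome[locus] for genome in population}
        if len(alleles) == 1:
            carrier = rng.randrange(len(population))
            population[carrier][locus] = 1 - population[carrier][locus]
            reseeded.append(locus)
    return reseeded
```

Because fitness is linear in the effects, a single common factor is enough to hit the target spread; the relative sizes of the effects are left unchanged.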

Dynamic Balance.

In this set of simulations, we simulate a genome consisting of finite sites in a constant fitness landscape where mutations at each locus have a small effect s. Mutations are injected at random with rate μ at each locus. In contrast to the models above, where mutations are injected only when a locus is monomorphic, we allow recurrent and back mutation to make the dynamic balance state possible. The grid of parameters used was Graphic, Graphic, Graphic, Graphic, and Graphic logarithmically spaced between s and 1.0. For the analysis, simulations were filtered such that Graphic, Graphic, and Graphic.
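A minimal sketch of the mutation step and of a logarithmically spaced parameter grid is given below. It assumes biallelic 0/1 sites, so that a second mutation at the same site in the same lineage acts as a back mutation; the names mutate and log_spaced are hypothetical and the grid values in the usage note are illustrative only, not those of the paper.

```python
import random

def mutate(population, mu, rng):
    """Flip each allele of each individual independently with probability mu
    (in place). Repeated flips at a site model recurrent and back mutation."""
    flips = 0
    for genome in population:
        for locus in range(len(genome)):
            if rng.random() < mu:
                genome[locus] = 1 - genome[locus]
                flips += 1
    return flips

def log_spaced(low, high, count):
    """Return count values from low to high with a constant ratio between
    neighbors (requires count >= 2 and 0 < low < high)."""
    ratio = (high / low) ** (1.0 / (count - 1))
    return [low * ratio ** k for k in range(count)]
```

For example, log_spaced(s, 1.0, 5) with a hypothetical effect size s would give five values spanning the range from s to 1.0 in equal multiplicative steps.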

Acknowledgments

We thank Fabio Zanini for stimulating discussions and help with FFPopSim and Guy Sella for very useful comments on the manuscript. This work is supported by European Research Council Starting Grant HIVEVO 260686 (to R.A.N.) and, in part, by National Science Foundation Grant PHY11-25915 (to Kavli Institute for Theoretical Physics). B.I.S. acknowledges support from National Institutes of Health Grant R01 GM086793.

Footnotes

  • 1 To whom correspondence should be addressed. E-mail: richard.neher{at}tuebingen.mpg.de.
  • Author contributions: R.A.N. and B.I.S. designed research; R.A.N., T.A.K., and B.I.S. performed research; R.A.N. and T.A.K. analyzed data; and R.A.N., T.A.K., and B.I.S. wrote the paper.

  • The authors declare no conflict of interest.

  • * This Direct Submission article had a prearranged editor.

  • This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1309697110/-/DCSupplemental.

Freely available online through the PNAS open access option.

References

  1. Gerrish PJ, Lenski RE (1998) The fate of competing beneficial mutations in an asexual population. Genetica 102-103(1-6):127–144.
  2. Desai MM, Fisher DS (2007) Beneficial mutation selection balance and the effect of linkage on positive selection. Genetics 176(3):1759–1798.
  3. Neher RA (2013) Genetic draft, selective interference, and population genetics of rapid adaptation. Annu Rev Ecol Evol Syst, 44, in press.
  4. Hill WG, Robertson A (1966) The effect of linkage on limits to artificial selection. Genet Res 8(3):269–294.
  5. Barton NH (1995) Linkage and the limits to natural selection. Genetics 140(2):821–841.
  6. Begun DJ, Aquadro CF (1992) Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356(6369):519–520.
  7. Cutter AD (2006) Nucleotide polymorphism and linkage disequilibrium in wild populations of the partial selfer Caenorhabditis elegans. Genetics 172(1):171–184.
  8. Charlesworth B, Morgan MT, Charlesworth D (1993) The effect of deleterious mutations on neutral molecular variation. Genetics 134(4):1289–1303.
  9. Hudson RR, Kaplan NL (1995) Deleterious background selection with recombination. Genetics 141(4):1605–1617.
  10. Maynard Smith J, Haigh J (1974) The hitch-hiking effect of a favourable gene. Genet Res 23(1):23–35.
  11. Gillespie JH (2000) Genetic drift in an infinite population. The pseudohitchhiking model. Genetics 155(2):909–919.
  12. Hudson RR (1994) How can the low levels of DNA sequence variation in regions of the drosophila genome with low recombination rates be explained? Proc Natl Acad Sci USA 91(15):6815–6818.
  13. Leffler EM, et al. (2012) Revisiting an old riddle: What determines genetic diversity levels within species? PLoS Biol 10(9):e1001388.
  14. Kingman J (1982) On the genealogy of large populations. J Appl Probab 19A:27–43.
  15. Lewontin RC (1974) The Genetic Basis of Evolutionary Change (Columbia Univ Press, New York).
  16. Walczak AM, Nicolaisen LE, Plotkin JB, Desai MM (2012) The structure of genealogies in the presence of purifying selection: A fitness-class coalescent. Genetics 190(2):753–779.
  17. Weissman DB, Barton NH (2012) Limits to the rate of adaptive substitution in sexual populations. PLoS Genet 8(6):e1002740.
  18. McVean GA, Charlesworth B (2000) The effects of Hill-Robertson interference between weakly selected mutations on patterns of molecular evolution and variation. Genetics 155(2):929–944.
  19. Gordo I, Navarro A, Charlesworth B (2002) Muller’s ratchet and the pattern of variation at a neutral locus. Genetics 161(2):835–848.
  20. Yang J, et al. (2010) Common SNPs explain a large proportion of the heritability for human height. Nat Genet 42(7):565–569.
  21. Bulmer MG (1980) The Mathematical Theory of Quantitative Genetics (Oxford Univ Press, Oxford).
  22. Lynch M, Walsh B (1998) Genetics and Analysis of Quantitative Traits (Sinauer, Sunderland, MA).
  23. Neher RA, Hallatschek O (2013) Genealogies of rapidly adapting populations. Proc Natl Acad Sci USA 110(2):437–442.
  24. Desai MM, Walczak AM, Fisher DS (2013) Genetic diversity and the structure of genealogies in rapidly adapting populations. Genetics 193(2):565–585.
  25. Brunet E, Derrida B, Mueller AH, Munier S (2007) Effect of selection on ancestry: An exactly soluble case and its phenomenological generalization. Phys Rev E Stat Nonlin Soft Matter Phys 76(4 Pt 1):041104.
  26. Tsimring LS, Levine H, Kessler DA (1996) RNA virus evolution via a fitness-space model. Phys Rev Lett 76(23):4440–4443.
  27. Rouzine IM, Wakeley J, Coffin JM (2003) The solitary wave of asexual evolution. Proc Natl Acad Sci USA 100(2):587–592.
  28. Charlesworth B (2009) Fundamental concepts in genetics: Effective population size and patterns of molecular evolution and variation. Nat Rev Genet 10(3):195–205.
  29. Bolthausen E, Sznitman A-S (1998) On Ruelle’s probability cascades and an abstract cavity method. Communications in Mathematical Physics 197:247–276.
  30. Santiago E, Caballero A (1998) Effective size and polymorphism of linked neutral loci in populations under directional selection. Genetics 149(4):2105–2117.
  31. Zanini F, Neher RA (2012) FFPopSim: An efficient forward simulation package for the evolution of large populations. Bioinformatics 28(24):3332–3333.
  32. Cohen E, Kessler DA, Levine H (2005) Front propagation up a reaction rate gradient. Phys Rev E Stat Nonlin Soft Matter Phys 72(6 Pt 2):066126.
  33. Goyal S, et al. (2012) Dynamic mutation-selection balance as an evolutionary attractor. Genetics 191(4):1309–1319.
  34. Kimura M (1965) Attainment of quasi linkage equilibrium when gene frequencies are changing by natural selection. Genetics 52(5):875–890.
  35. Neher R, Shraiman B (2011) Statistical genetics and evolution of quantitative traits. Rev Mod Phys 83:1283–1300.
  36. Nordborg M, Charlesworth B, Charlesworth D (1996) The effect of recombination on background selection. Genet Res 67(2):159–174.
  37. Neher RA, Shraiman BI, Fisher DS (2010) Rate of adaptation in large sexual populations. Genetics 184(2):467–481.
  38. Neher RA, Shraiman BI (2011) Genetic draft and quasi-neutrality in large facultatively sexual populations. Genetics 188(4):975–996.
  39. Sella G, Petrov DA, Przeworski M, Andolfatto P (2009) Pervasive natural selection in the Drosophila genome? PLoS Genet 5(6):e1000495.
  40. Comeron JM, Ratnappan R, Bailin S (2012) The many landscapes of recombination in Drosophila melanogaster. PLoS Genet 8(10):e1002905.
  41. Bomblies K, et al. (2010) Local-scale patterns of genetic variability, outcrossing, and spatial structure in natural stands of Arabidopsis thaliana. PLoS Genet 6(3):e1000890.
  42. Barrière A, Félix M-A (2005) High local genetic diversity and low outcrossing rate in Caenorhabditis elegans natural populations. Curr Biol 15(13):1176–1184.
  43. Neher RA, Leitner T (2010) Recombination rate and selection strength in HIV intra-patient evolution. PLoS Comput Biol 6(1):e1000660.
  44. Tsai IJ, Bensasson D, Burt A, Koufopanou V (2008) Population genomics of the wild yeast Saccharomyces paradoxus: Quantifying the life cycle. Proc Natl Acad Sci USA 105(12):4957–4962.
  45. Braverman JM, Hudson RR, Kaplan NL, Langley CH, Stephan W (1995) The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140(2):783–796.
  46. Barton N (1998) The effect of hitch-hiking on neutral genealogies. Genet Res 72:123–133.
  47. Kaplan NL, Hudson RR, Langley CH (1989) The “hitchhiking effect” revisited. Genetics 123(4):887–899.
  48. Messer PW, Petrov DA (2013) Frequent adaptation and the McDonald-Kreitman test. Proc Natl Acad Sci USA 110(21):8615–8620.
  49. Durrett R, Schweinsberg J (2005) A coalescent model for the effect of advantageous mutations on the genealogy of a population. Stochastic processes and their applications 115:1628–1657.
  50. Rouzine IM, Coffin JM (2005) Evolution of human immunodeficiency virus under selection and weak recombination. Genetics 170(1):7–18.
  51. Cohen E, Kessler DA, Levine H (2005) Recombination dramatically speeds up evolution of finite populations. Phys Rev Lett 94(9):098102.
Coalescence and genetic diversity in sexual populations under selection
Coalescence in sexual populations
Richard A. Neher, Taylor A. Kessinger, Boris I. Shraiman
Proceedings of the National Academy of Sciences Sep 2013, 110 (39) 15836-15841; DOI: 10.1073/pnas.1309697110

Copyright © 2019 National Academy of Sciences. Online ISSN 1091-6490