# The effect of gene interactions on the long-term response to selection


Edited by Joseph Felsenstein, University of Washington, Seattle, WA, and approved March 7, 2016 (received for review September 22, 2015)

## Significance

The role of gene interactions in the response to selection has long been a controversial subject; whereas some have claimed they are not relevant for adaptation, others have argued that their long-term effects are of high significance. In this manuscript, we derive simple and general predictions for the effect of gene interactions on the long-term response to selection in two extreme regimes. We show that, when the dynamics of allele frequencies are dominated by genetic drift, the long-term response is surprisingly simple, depending only on the initial components of the trait variance, regardless of the detailed genetic architecture. In the opposite regime, when selection dominates the dynamics of allele frequencies, the long-term response depends only on the genotype−phenotype map.

## Abstract

The role of gene interactions in the evolutionary process has long been controversial. Although some argue that they are not of importance, because most variation is additive, others claim that their effect in the long term can be substantial. Here, we focus on the long-term effects of genetic interactions under directional selection assuming no mutation or dominance, and that epistasis is symmetrical overall. We ask by how much the mean of a complex trait can be increased by selection and analyze two extreme regimes, in which either drift or selection dominates the dynamics of allele frequencies. In both scenarios, epistatic interactions affect the long-term response to selection by modulating the additive genetic variance. When drift dominates, we extend Robertson’s [Robertson A (1960) *Proc R Soc Lond B Biol Sci* 153(951):234−249] argument to show that, for any form of epistasis, the total response of a haploid population is proportional to the initial total genotypic variance. In contrast, the total response of a diploid population is increased by epistasis, for a given initial genotypic variance. When selection dominates, we show that the total selection response can only be increased by epistasis when some initially deleterious alleles become favored as the genetic background changes. We find a simple approximation for this effect and show that, in this regime, it is the structure of the genotype−phenotype map that matters and not the variance components of the population.

The relation between an organism’s genotype and its phenotype is immensely complicated, yet quantitative genetics predicts the correlations between relatives, and the response to selection over tens of generations, based primarily on an additive model. How do interactions between genes affect the response to selection? This question is not easy to answer: Although the additive model can be represented by a few parameters, there are an enormous number of possible relationships between genotype and phenotype. Some insight may come from studying specific well-understood systems, for example, regulation of gene expression by binding of transcription factors, or folding of RNA molecules. Here, we take the alternative approach, seeking statistical regularities, while making minimal assumptions about the nature of epistasis.

There has been a long-standing debate about the effect of genetic interactions on adaptation (1–7). Although some claim that they are unimportant because they contribute very little to the total genetic variance of a population, and consequently to its short-term response, others claim that their long-term effects can be substantial. At the heart of the debate is the distinction between what has been called physiological and statistical epistasis. The former is independent of the state of the population, whereas the latter is the statistical contribution of gene interactions to the trait variance, which depends on the allele frequencies (8, 9).

In the absence of new mutations, the total response to selection is limited by the initial standing variation. Selection acts on the variation present in the population, and will typically act to reduce it; the extent to which it does so depends on the allele frequencies, and on the relation between genotype and phenotype (i.e., the “genetic architecture”). Interactions between genes (epistasis) are a key component of this genetic architecture, and their effect on the response to selection has been controversial. On the one hand, artificially selected populations are usually well approximated by the infinitesimal model, which, in its simplest form, assumes infinitely many loci of small and additive effect (10, 11). This suggests that gene interactions do not play an important role. On the other hand, biology can hardly be additive, and genomes are finite. If the trait depended on a small number of additive alleles, these would quickly fix—contradicting the robust observation of sustained response to selection. One possibility is that epistasis sustains additive genetic variance for longer: Alleles that were initially deleterious or near-neutral may acquire favorable effects as the genetic background changes, “converting” epistatic variance into additive, and so prolonging the response to selection.

Note that, if the map between genotype and phenotype were arbitrary, no predictions could be made: The fittest genotype could have any value, and genotypes can be organized in any way over the fitness landscape. We will make a statistical argument, which assumes that epistasis primarily involves low-order interactions, and that these can be treated as, to some degree, independent of each other. In particular, we assume that there is no systematic tendency for alleles with a positive marginal effect to interact positively (or negatively), on average; if there were, the rate of evolution would clearly accelerate (or decelerate).

Although much effort has been spent measuring the sign of epistasis among new mutations (12, 13), or among segregating variants (14–16), it is not clear that this would tell us about the natural pattern over a wider range. A population under selection can reach trait values far beyond the initial trait variance it displays, so the epistasis among variants in the existing population may not predict longer-term evolution. Here, we analyze two extreme limits: the limit of very weak selection, in which drift dominates the dynamics of the genetic variances, and the deterministic limit, where selection is the only force. We investigate the effect of epistasis on the long-term response of a population, and the mechanistic basis for these effects.

## Results

### The Interaction Between Epistasis and Random Drift.

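Robertson (17) showed that, in the weak selection limit ($N_e s \ll 1$, where $N_e$ is the effective population size and $s$ is the selection coefficient at a single locus), the total response of an additive trait in a finite diploid population is simply proportional to the additive variance initially present, provided there is no dominance. He used the fact that the probability of fixation of an allele at initial frequency $p_0$ can be written, to first order in $s$, as $P \approx p_0 + 2N_e s\,p_0(1 - p_0)$, where $s = \beta\alpha$ and $\beta$ is the selection favoring an allele with effect $\alpha$ on the trait. Summing the expected change in allele frequency over loci, the total response is $2N_e\beta V_A(0)$, that is, $2N_e$ times the response in the first generation.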

This calculation is valid in the limit of weak selection at every locus, reflected in the fact that the probability of fixation is approximated by the first-order perturbation to neutrality. It is also only strictly valid when the trait is additive, excluding any form of interaction between genes. Indeed, it seems very hard to generalize to allow for epistasis, because, then, the calculation of the fixation probability of an allele depends on the trajectories of the alleles at all other loci it interacts with.

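An alternative derivation is possible, which focuses on the change in variance components due to drift. Consider an additive trait that is determined by very many loci, each under negligibly weak selection. Drift will disperse allele frequencies, decreasing the additive variance by a factor $1 - 1/(2N_e)$ per generation. Because the response in each generation is $\beta V_A(t)$, summing over generations gives a total response of $\beta V_A(0)\sum_{t \ge 0}\left(1 - 1/(2N_e)\right)^t = 2N_e\beta V_A(0)$, in agreement with Robertson's result.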
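The variance-based argument can be checked numerically. The sketch below is illustrative only: it assumes a diploid additive trait whose additive variance decays by a factor of 1 - 1/(2N_e) per generation and a per-generation response of beta times V_A(t). The function name and arguments are hypothetical and do not come from the paper.

```python
def total_response_additive(v_a0, beta, n_e, generations=None):
    """Total response of an additive diploid trait when drift alone erodes V_A.

    Assumption (not taken verbatim from the paper): V_A(t) = V_A(0) * (1 - 1/(2*n_e))**t
    and the response in generation t is beta * V_A(t).
    """
    if generations is None:
        # Closed form of the infinite geometric series.
        return 2.0 * n_e * beta * v_a0
    decay = 1.0 - 1.0 / (2.0 * n_e)
    total = 0.0
    v_a = v_a0
    for _ in range(generations):
        total += beta * v_a  # response contributed by this generation
        v_a *= decay         # drift reduces the additive variance
    return total


if __name__ == "__main__":
    print(total_response_additive(1.0, 0.01, 100))                     # closed form
    print(total_response_additive(1.0, 0.01, 100, generations=20000))  # explicit sum
```

Both calls should agree closely once the number of generations is much larger than 2N_e, because almost all variance has then been lost.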

What is the effect of epistasis on the total response of the population? Classical quantitative genetics (e.g., ref. 18) uses statistical arguments to derive expressions for the expected components of genetic variance in an inbred population. Barton and Turelli (19) derive these using an explicit population genetic model, assuming two alleles per locus; Hill et al. (20) show how these are related to the classical expressions, and that they apply without any constraint on the distribution of allelic effects at each locus. In the following, we make use of these results to derive expressions for the expected long-term response of populations with arbitrary genetic architectures. We first focus on haploid populations and then extend our results to diploid populations.

### The Long-Term Response for Haploids.

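Suppose that a haploid population initially has additive genetic variance $V_A(0)$, and denote the $k$th order variance component as $V_{A(k)}$, so that the total genotypic variance is $V_G = \sum_k V_{A(k)}$. The response in each generation is $\beta V_A(t)$, where $\beta$ is the selection gradient, and the variance components change as under pure drift provided that the change in allele frequency caused by selection (governed by $\beta$ and the effect of locus $i$ on the trait) is small compared with its variance (*Dynamics of Variance Components Under Genetic Drift*). Assuming a uniform rate of inbreeding, so that $F_t = 1 - (1 - 1/N_e)^t$, the expected additive variance is $E[V_A(t)] = (1 - F_t)\sum_k k F_t^{k-1} V_{A(k)}(0)$, and the expected total response is $\beta\sum_t E[V_A(t)] \approx \beta N_e \sum_k V_{A(k)}(0) = \beta N_e V_G(0)$. Epistasis therefore increases the total response above the additive expectation, $\beta N_e V_A(0)$, through the nonadditive components of variance.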

Our result is based on the dynamics of genetic variances, but Robertson’s (17) result (reproduced at the beginning of the section) suggests an alternative view: As alleles fluctuate in frequency, they change the average effects of alleles at other loci, thereby changing their probability of fixation. In particular, as we show in *The Interaction Between Strong Directional Selection and Epistasis*, alleles that are initially deleterious but become beneficial as the response unfolds should contribute to this increase in long-term response. This view predicts that an increase in long-term response should be correlated with an increase in the number of alleles that are beneficial at the end of the response, which we confirmed through simulations (Fig. S1).

### The Long-Term Response for Diploids.

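For a diploid population, with no dominance (meaning that the phenotype of the heterozygote is simply the mean of the homozygous phenotypes), epistasis has a stronger effect. From equation 45 of Barton and Turelli (19), the expected additive variance is $E[V_A(t)] = (1 - F_t)\sum_k k (2F_t)^{k-1} V_{A(k)}(0)$, with $F_t = 1 - (1 - 1/(2N_e))^t$. The expected total response is then $\beta\sum_t E[V_A(t)] \approx 2N_e\beta\sum_k 2^{k-1} V_{A(k)}(0)$, where $\beta$ is the selection gradient: The contribution of the $k$th order variance is proportional to $2^{k-1}$.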
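The haploid and diploid results can be compared with a short numerical sketch. It rests on assumptions that are standard for variance components under inbreeding but are not copied from the equations of this paper: the expected additive variance is taken to be (1 - F) Σ_k k F^(k-1) V_A(k) for haploids and (1 - F) Σ_k k (2F)^(k-1) V_A(k) for diploids without dominance, with F approaching 1 at rate 1/N_e or 1/(2N_e) per generation. The function name and the example variance components are hypothetical.

```python
def expected_total_response(components, beta, n_e, ploidy=1, generations=None):
    """Sum beta * E[V_A(t)] over generations under drift-dominated dynamics.

    components: dict mapping interaction order k to the initial variance V_A(k).
    ploidy: 1 (haploid) or 2 (diploid, no dominance).
    Assumed forms: E[V_A] = (1 - F) * sum_k k * (c * F)**(k - 1) * V_A(k),
    with c = 1 for haploids and c = 2 for diploids.
    """
    copies = ploidy * n_e                 # number of gene copies sampled per generation
    scale = 1.0 if ploidy == 1 else 2.0
    if generations is None:
        generations = 40 * copies         # long enough for F to be effectively 1
    total = 0.0
    for t in range(generations):
        f = 1.0 - (1.0 - 1.0 / copies) ** t   # inbreeding coefficient after t generations
        v_a = (1.0 - f) * sum(
            k * (scale * f) ** (k - 1) * v for k, v in components.items()
        )
        total += beta * v_a
    return total


if __name__ == "__main__":
    comps = {1: 1.0, 2: 0.5, 3: 0.25}     # hypothetical V_A(1), V_A(2), V_A(3)
    beta, n_e = 0.01, 200
    hap = expected_total_response(comps, beta, n_e, ploidy=1)
    dip = expected_total_response(comps, beta, n_e, ploidy=2)
    print(hap, beta * n_e * sum(comps.values()))
    print(dip, 2 * n_e * beta * sum(2 ** (k - 1) * v for k, v in comps.items()))
```

Under these assumptions the haploid sum approaches beta N_e V_G and the diploid sum approaches 2 N_e beta Σ_k 2^(k-1) V_A(k), with a discrepancy of order 1/N_e from treating generations as discrete.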

We have assumed constant directional selection, *β*, on the trait. However, the result readily extends to more complex forms of selection. Assuming that fitness increases monotonically with the trait value, we can transform to a scale on which log fitness depends linearly on the trait. On this new scale, the variance components will be different, but our general results show how the total response depends on these components, when measured on the appropriate scale.

### The Interaction Between Strong Directional Selection and Epistasis.

We now turn to the deterministic case, where selection is the predominant process. Imagine that, in the initial haploid population, alleles at *n* loci have marginal effects *β*, they have selective advantage

Typically, the initial distribution of allele frequencies will be U-shaped, as expected for a population in a stationary distribution with low mutation rates per site ($N_e\mu < 1$ for haploids). Thus, an enormous range of variation can be released as initially rare alleles increase, and are brought together by recombination. We will assume that the population is so large that alleles that are initially deleterious, but will ultimately be favored, will not be lost from the population. This is not realistic, but it is equivalent to assuming a low rate of recurrent mutation, such that alleles are continually reintroduced, so that, if they are favored, they will become common enough to increase deterministically. We assume that epistasis is not so strong that the marginal effects of alleles fluctuate more rapidly than the timescale needed for them to establish.

If most alleles are very rare or very common, they will spend a relatively short time passing from low to high frequency. Thus, we can caricature the process as a series of separate substitutions; at each substitution, the marginal effects of all of the other alleles change. This will change the time at which favorable alleles substitute, and may cause them to lose their advantage entirely. This caricature is not quite the same as the Strong Selection Weak Mutation approximation (21), in which populations are assumed to be fixed for a single genotype, and to evolve by fixation of new mutations, because rare alleles still follow a complex dynamic. In reality, substitutions may overlap, and their effects change continuously. This limiting scenario is helpful in understanding the following argument, but, because we focus on the final outcome, we do not need to assume that alleles substitute one by one. All we need to know is what is the difference between the final and initial effects of an allele.

Assuming only pairwise interactions, two alleles per locus, and linkage equilibrium throughout (see *Materials and Methods*), the ultimate change in mean is*i* has selective advantage *i* on the trait mean that changes linearly with the frequencies of all of the other alleles (see *Materials and Methods*). Recall that we labeled alleles such that, initially,

This calculation is based on the assumption that additive effects cannot change sign; if interactions are strong enough to turn an initially beneficial allele,

The probability that an allele switches sign as the remaining favorable alleles increase is simply the probability that the sum of all interactions is less than the negative of its main effect: *i*th locus, and the *ε* =
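The reversal condition stated here (the sum of an allele's interactions falling below the negative of its main effect) can be illustrated with a small calculation. The sketch assumes, purely for illustration, that the pairwise interaction effects are independent draws from a normal distribution with mean zero; the distribution, the function names, and the parameter values are assumptions and are not taken from the paper.

```python
import math
import random


def reversal_probability_normal(alpha, sd_eps, n_partners):
    """P(sum of n_partners iid N(0, sd_eps**2) interactions < -alpha).

    The sum is N(0, n_partners * sd_eps**2), so the probability is the normal
    CDF evaluated at -alpha / (sd_eps * sqrt(n_partners)).
    """
    z = -alpha / (sd_eps * math.sqrt(n_partners))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def reversal_probability_mc(alpha, sd_eps, n_partners, n_draws=20000, seed=0):
    """Monte Carlo estimate of the same probability by drawing the interactions."""
    rng = random.Random(seed)
    reversals = 0
    for _ in range(n_draws):
        total = sum(rng.gauss(0.0, sd_eps) for _ in range(n_partners))
        if total < -alpha:
            reversals += 1
    return reversals / n_draws


if __name__ == "__main__":
    # Hypothetical values: main effect 1, interaction SD 0.2, 25 interacting alleles.
    print(reversal_probability_normal(1.0, 0.2, 25))
    print(reversal_probability_mc(1.0, 0.2, 25))
```

Under this assumption the probability depends only on the ratio of the interaction SD to the main effect, multiplied by the square root of the number of interacting alleles.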

We can now calculate the average probability of an allelic reversal by integrating over the distribution of the main effects, **2**). This is because, for each allele that reverses its sign, an interaction is effectively removed, thereby reducing *A*). This expression should converge to the one assuming independence at low values of *A*). Each of these allelic reversals (or “flips”) will translate into some change in trait value compared with an additive population. For each allele, this increase will be **2** and **4**, we see that *B*). Note that, whereas only the *n* initially favored alleles that lose their advantage will be eliminated. The previous calculations are upper bounds for the effect of epistasis on the long-term response of a population, because we considered that

When drift dominates, the ultimate response depends only on initial variance components. When selection dominates, the effect of epistasis on the ultimate response depends on reversals in selection on individual alleles, so that a different genotype is reached; this depends on the ratio of SDs of epistatic versus main effects, multiplied by the square root of the number of interacting alleles that sweep from low to high frequency (*Measuring Strength of Interactions from F2 Crosses* and Fig. S2). This implies that the initial epistatic variance does not predict the long-term response of the population, because the population may contain strong interactions (large

## Discussion

The role of epistasis in evolution has long been controversial. Wright (1) argued that epistasis would cause populations to become trapped at local “adaptive peaks” and proposed that a “shifting balance” between selection and random drift could allow them to explore alternative peaks, so as to move toward the global optimum. This theory motivated much work on the structure of natural populations, yet it remains unclear whether adaptation is significantly slowed by trapping on local peaks (1, 22). Mayr (23) criticized the supposed neglect of epistasis by “bean-bag genetics,” provoking a robust defense by Haldane (24). More recently, it has been proposed that epistatic variance can be “converted” into additive variance following a bottleneck, aiding adaptation (25). The failure of large genome-wide association studies to assign much heritable variance to specific loci (the so-called “missing heritability”) has been attributed to epistasis (26, 27), although this explanation is unnecessary (28). Overall, the practical success of the additive model in quantitative genetics appears hard to reconcile with the strong molecular interactions between genes.

We investigate how epistasis affects the response to selection, by asking a simple and clearly defined question: By how much does epistasis influence the ultimate change in the mean of a selected trait? We compare the effects of directional selection on two populations that initially have the same genetic variance for a trait; in one, inheritance is strictly additive, whereas, in the other, there can be strong gene interactions. We find simple results in the two extreme cases, where either drift or selection dominates.

In the first case, where drift is stronger than selection on individual alleles, the outcome can be predicted from the initial variance components. This seems remarkable but can be understood as a perturbation to neutrality: When selection is spread over very many loci, its effect on any one locus is weak relative to drift, and so the variance components are hardly perturbed by selection. This is an extension of the infinitesimal model to nonadditive inheritance (29). For haploids, the total selection response is proportional to the initial genotypic variance (including both additive and nonadditive components). For diploids, $k$th-order components of variance have their effect multiplied by $2^{k-1}$; however, the variance due to $k$th-order epistasis is proportional to the product of the heterozygosities at $k$ loci, and so cannot be large (10).

In contrast, when selection is strong relative to drift

When drift dominates, our conclusions follow simply from the variance components, without further assumptions. When selection dominates, we assume that epistasis is not systematically biased toward (or against) interactions between favorable alleles. If a systematic bias is allowed, then epistasis can have an arbitrarily large effect. To see this, imagine a trait that is some function *z*. If

Throughout this paper, we assumed that the populations remained at linkage equilibrium. However, linkage disequilibrium (LD) will be generated in a number of ways. Drift alone will produce some LD, although this should be symmetric around no disequilibrium and so should not, on average, have strong effects. In the selection-dominated regime, however, LD should be generated consistently in a directional fashion. Nagylaki’s theorem (31) states that, under weak selection, as we assume here, the population is guaranteed to approach and remain close to linkage equilibrium. Furthermore, the fact that we assume that interactions have no preferred direction should further hinder the buildup of LD. Nevertheless, LD will affect the response: A truly infinite population with no recombination (full linkage) is guaranteed to find the global peak, and linkage can only affect the rate at which this is approached. In this sense, linkage can help because recombining populations can get “trapped” at local peaks.

Do gene interactions and epistasis affect the long-term response? In the drift-dominated regime, which can apply even when selection on the trait is strong for polygenic traits, the long-term response is merely proportional to the initial genetic variance. Epistatic architectures can reach higher trait values compared with additive populations of the same additive variance, because the former necessarily harbor more genetic variance. However, this effect will be small and the response slower because epistatic variance typically represents a small fraction of the total genetic variance. In the selection-dominated regime, it is the specifics of the interaction structure that matter. Substantial increases of the long-term response can be reached when interactions are strong and induce allelic reversals, but the initial epistatic variance of the population is not predictive of this (see *Deterministic Limits as a Function of Allele Frequency Distribution* and Figs. S3 and S4). These results set expectations for the effect of epistasis on the long-term response under directional selection and help reconcile the success of the infinitesimal additive model (11) with the biological fact that genetic interactions are pervasive in nature.

## Materials and Methods

### Simulations.

Except where noted, we assume a haploid population of effective size *β*, and that the population is close to linkage equilibrium. We ignore the environmental component of variance, because this does not affect the response to directional selection. Our results for finite populations under weak selection are independent of the trait architecture. However, for simulations, we assume that the trait of a diallelic haploid genotype of *n* loci

We assume directional selection for increased trait values such that the fitness of a particular genotype

### Finite Population Simulations.

The simulations were performed by keeping track only of allele frequencies, thereby abolishing LD. Every generation, the allele frequency expected after selection is computed for each locus from the deterministic recursion; a binomial sample of *N* copies is then drawn with this probability of success for each locus, and the allele frequencies are updated according to this sample. We repeat this procedure until no genetic variation exists at any locus, i.e., the population is fixed for one genotype.
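A minimal sketch of this kind of allele-frequency simulation follows. It is not the authors' code, and several modeling choices are assumptions made only for illustration: alleles are coded 0/1, the trait has main effects `a` and symmetric pairwise interactions `e`, and selection is weak enough that the expected change at locus i is beta times its marginal effect times p(1 - p). All names are hypothetical.

```python
import random


def marginal_effects(p, a, e):
    """Marginal effect of each locus at linkage equilibrium (assumed pairwise model).

    A_i = a[i] + sum_{j != i} e[i][j] * p[j], with e symmetric and alleles coded 0/1.
    """
    n = len(p)
    return [a[i] + sum(e[i][j] * p[j] for j in range(n) if j != i) for i in range(n)]


def trait_mean(p, a, e):
    """Mean trait value at linkage equilibrium for the assumed pairwise model."""
    n = len(p)
    main = sum(a[i] * p[i] for i in range(n))
    pair = sum(e[i][j] * p[i] * p[j] for i in range(n) for j in range(i + 1, n))
    return main + pair


def simulate_to_fixation(p0, a, e, beta, n_copies, rng, max_generations=100_000):
    """Iterate selection then binomial sampling until every locus is fixed."""
    p = list(p0)
    for generation in range(max_generations):
        if all(x in (0.0, 1.0) for x in p):
            return p, generation
        effects = marginal_effects(p, a, e)
        new_p = []
        for p_i, a_i in zip(p, effects):
            # Deterministic expectation under weak selection (an assumed approximation).
            expected = p_i + beta * a_i * p_i * (1.0 - p_i)
            expected = min(1.0, max(0.0, expected))
            # Genetic drift: binomial sample of n_copies gene copies.
            count = sum(rng.random() < expected for _ in range(n_copies))
            new_p.append(count / n_copies)
        p = new_p
    raise RuntimeError("no fixation within max_generations")


if __name__ == "__main__":
    rng = random.Random(1)
    a = [0.5, 0.3, -0.2]
    e = [[0.0, 0.1, -0.4], [0.1, 0.0, 0.2], [-0.4, 0.2, 0.0]]
    p_final, gens = simulate_to_fixation([0.5] * 3, a, e, beta=0.1, n_copies=50, rng=rng)
    print(p_final, gens, trait_mean(p_final, a, e))
```

Tracking frequencies rather than genotypes is what removes linkage disequilibrium; averaging the final trait mean over many replicate runs would give the expected long-term response for a given architecture.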

### Deterministic Numerical Simulations.

Every iteration, we iterate the deterministic recursion

### Fitness Landscape Analysis.

To find the absolute maximum trait increase, we associate a trait value

## Dynamics of Variance Components Under Genetic Drift

For a haploid population with *n* biallelic loci, any trait can be defined as

If binomial sampling (genetic drift) is the main force determining dynamics at every locus, and the population is random mating and at linkage equilibrium, we can write, for the expectation at the next generation (for multiple realizations of the same process) under the Wright-Fisher model,

Because

Each of the **S1**. With a bit of algebra (cumbersome using this notation, but see ref. 19), we can see that the expectation for *i* contributes to all of the higher-order components of variance (

For example, for a genetic architecture involving only pairwise interactions, the contribution of a locus to additive variance is
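The basic drift step behind these expectations can be verified exactly for a single locus. The sketch below computes the expected value of p(1 - p) after one generation of binomial (Wright-Fisher) sampling in a haploid population; the function name is hypothetical, and the check covers only this single-locus factor, not the full multilocus expressions.

```python
from math import comb


def expected_heterozygosity_after_drift(p, n_copies):
    """Exact E[p'(1 - p')] after binomial sampling of n_copies alleles at frequency p."""
    total = 0.0
    for k in range(n_copies + 1):
        prob = comb(n_copies, k) * p**k * (1.0 - p) ** (n_copies - k)
        freq = k / n_copies
        total += prob * freq * (1.0 - freq)
    return total


if __name__ == "__main__":
    p, n = 0.3, 20
    # The two printed values should match: drift scales p(1 - p) by (1 - 1/N).
    print(expected_heterozygosity_after_drift(p, n), p * (1 - p) * (1 - 1 / n))
```

The factor 1 - 1/N per generation is what makes the additive contribution of each locus decay geometrically under drift.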

## Calculating Expectations for the Initial Components of Variance over the Allele Frequency Distribution

The initial epistatic variance present in a population can be written as

## Deterministic Limits as a Function of Allele Frequency Distribution

The long-term response of a population in the deterministic regime is mostly determined by the fitness landscape the population evolves on (see *The Interaction Between Strong Directional Selection and Epistasis*). When initial allele frequencies are not vanishingly rare, the long-term response will be below the limits we show in *The Interaction Between Strong Directional Selection and Epistasis*, because the population starts at a higher trait value. Fig. S3 shows the long-term response as a function of the allele frequency distribution for strong

## Measuring Strength of Interactions from F2 Crosses

One can measure the critical parameter

## Acknowledgments

The authors thank Jitka Polechová and Michael Turelli for helpful comments. This work was supported by European Research Council Advanced Grant ERC-2009-AdG-250152. This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under Grant Agreement 618091 Speed of Adaptation in Population Genetics and Evolutionary Computation (SAGE).

## Footnotes

^{1}To whom correspondence should be addressed. Email: tiago.paixao{at}ist.ac.at.

Author contributions: T.P. and N.H.B. designed research, performed research, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1518830113/-/DCSupplemental.

Freely available online through the PNAS open access option.

http://www.pnas.org/preview_site/misc/userlicense.xhtml

## References

- [1] Wright S
- [6] Mäki-Tanila A, Hill WG
- [8] Lynch M, Walsh B
- [9] Cheverud JM, Routman EJ
- [11] Weber KE, Diggins LT. [...] *Drosophila melanogaster* at two population sizes. Genetics 125(3):585–597
- [12] Costanzo M, et al.
- [14] Kelly JK
- [15] Visser JAGMD, Hoekstra RF, Ende HVD. [...] *Chlamydomonas*. Proc R Soc Lond B Biol Sci 263(1367):193–200
- [17] Robertson A (1960) Proc R Soc Lond B Biol Sci 153(951):234−249
- [18] Kempthorne O
- [23] Mayr E
- [24] Haldane JBS
- [26] Zuk O, Hechter E, Sunyaev SR, Lander ES
- [29] Fisher RA
- [31] Nagylaki T
