Evolutionary learning of adaptation to varying environments through a transgenerational feedback
Contributed by Stanislas Leibler, August 22, 2016 (sent for review June 1, 2016; reviewed by Terence Hwa and Oliver J. Rando)

Significance
Phenotypic diversification, a form of evolutionary bet hedging, is observed in many biological systems, ranging from isogenic bacteria to cell populations in a tumor. Such phenotypic diversity is adaptive only if it matches the statistics of local environmental variation. Often, the timescale of environmental variation can be much longer than the lifespan of individual organisms. Then how could organisms collect long-term environmental information to modulate their phenotypic diversity? We propose here a general mechanism of “evolutionary learning,” which could overcome the mismatch between the two timescales. The learning mechanism can in principle be realized through known molecular processes of epigenetic inheritance, suggesting experimental directions to probe the evolutionary significance of such processes.
Abstract
Organisms can adapt to a randomly varying environment by creating phenotypic diversity in their population, a phenomenon often referred to as “bet hedging.” The favorable level of phenotypic diversity depends on the statistics of environmental variations over timescales of many generations. Could organisms gather such long-term environmental information to adjust their phenotypic diversity? We show that this process can be achieved through a simple and general learning mechanism based on a transgenerational feedback: The phenotype of the parent is progressively reinforced in the distribution of phenotypes among the offspring. The molecular basis of this learning mechanism could be searched for in model organisms showing epigenetic inheritance.
All biological organisms have to adapt to variations in their environment. This is particularly challenging if these variations are irregular or have strong random components. In this case, adaptation has to rely on information gathered over time, allowing organisms to anticipate such environmental changes.
One class of adaptation mechanisms depends directly on the detection of ongoing changes in the environment, such as a sudden shift in temperature, chemical composition, or ecological structure. Sensing and signal transduction machinery allows the organism to change the course of development or induce a direct phenotypic transformation. Such “phenotypic plasticity” can be implemented at the level of an individual organism, at the cost of maintaining an efficient detection and signal transduction system. Examples of phenotypic plasticity are ubiquitous, e.g., the switching of metabolic pathways in microorganisms upon changes of food source (1) or the control of flowering time in plants according to temperature and photoperiod modulations (2).
Another class of adaptation mechanisms relies on “phenotypic diversification,” which acts primarily on the population level. Instead of transitioning to a new phenotype upon a detected change in the environment, adaptation is achieved by constantly sustaining a distribution of various phenotypes in the population. The latter is implemented by individual organisms having different propensities for expressing different phenotypes, which can be mathematically described by an associated probability distribution. At a given time, some of the phenotypes are better adapted to the prevailing local environment than the others. The simultaneous presence of multiple phenotypes can increase the population’s long-term growth rate in a varying environment. Such adaptive phenotypic diversification is often referred to as “bet hedging” (3, 4). Many observed cases of phenotypic diversity have indeed been suggested as examples of bet hedging (5, 6), including bacterial populations that generate a low frequency of slowly growing persisters that are tolerant to antibiotics (7), annual plants that produce a fraction of seeds that remain dormant underground for multiple years while unaffected by unpredictable weather (8–10), and insects that undergo variable lengths of diapause that halts reproductive development to cope with uncertain ecological conditions (11).
Phenotypic diversification is proposed to be useful when environmental cues are unreliable or phenotypic plastic responses are costly or ineffective (12). (Theoretical studies of population dynamics in fluctuating environments, including bet-hedging strategies, often assume growing populations with density-independent fitness values. For recent work on the effect of environmental fluctuations on growth-limited populations, applicable to phenotypic plasticity, see, e.g., ref. 13.) It has been argued that the optimal distribution of phenotypes in the population is typically determined by the frequency of encountering different environmental conditions (14). Therefore, the bet-hedging organism requires information not only about the present environment, but also about the statistics of the past environment. Such long-term environmental information can be acquired only if the information is gathered, stored, and then transmitted over consecutive generations.
If the knowledge about the past is used to adjust the phenotypic diversity of the population in a favorable direction, then the organism may be said to have “learned” to adapt to the varying environment. Fundamentally, how can such “evolutionary learning” be achieved? Ideally, the mechanism of learning should function for a wide range of environmental variations. In particular, when the environment is stable except for rare shifts, the phenotype distribution should narrow down through this learning mechanism to the most favorable phenotype during each environmental condition. On the other hand, when the environment changes frequently and irregularly, the phenotype distribution should broaden out to allow bet hedging.
We use a theoretical model to show that the evolutionary learning of adaptation to varying environments can be achieved through a simple and very general mechanism. This mechanism acts on the level of individual organisms without relying on environmental signals. It is based on a positive feedback that increases the probability that the offspring express the same phenotype as the parent. Our results do not depend on particular molecular details of this “transgenerational feedback.” Many known examples of molecular processes with wide-ranging timescales (15, 16) can potentially support the learning mechanism postulated by our model. We describe some of them below, but first we outline the main idea of the learning mechanism itself.
Model
For simplicity, consider a population of isogenic asexual organisms that can express alternate phenotypes. Each individual inherits information determining the probability of expressing different phenotypes and randomly expresses a phenotype according to that probability distribution. For concreteness, we imagine that the phenotypic choice is based on a bistable epigenetic switch, which takes variable molecular inputs and generates one of two possible outputs that determines a phenotype
Schematic view of the evolutionary learning mechanism. (A) Each individual (solid circle) randomly expresses a phenotype
The evolutionary learning mechanism, which we propose, is based on the following positive feedback: If an individual exhibits phenotype
Through this learning mechanism, the phenotype probability distribution
SI Appendix
Learning Under Nonextreme Selection.
Here we derive the dynamic changes of the phenotype probability distribution
Recall that there can be multiple alternate phenotypes labeled by
When the environment remains
When the environment changes frequently,
Note that, when the gap between fitness values of different phenotypes vanishes, the adaptation time becomes increasingly longer. This result can be seen by linearizing Eq. S4. Assume that the environment is
Optimal Learning Rate.
Here we look for the optimal learning rate that maximizes the population growth for given patterns of environmental variations. Consider first the case of extreme selection. The asymptotic growth rate
Let us evaluate
First, suppose the environment changes periodically, so that
Asymptotic growth rate
Adaptation timescale
Further, suppose
A special case is when the environment is independent and identically distributed (i.i.d.) with probabilities
For the more general case of nonextreme selection, similar results are found by numerical simulations. Consider the same examples of environmental variation patterns as above. The growths of populations having different learning rates are compared in Fig. S3 A–C. We can see that the optimal learning rate is very close to the one found in the extreme selection case. For example, in Fig. S3A where the environment is i.i.d., we see that the smaller the learning rate η is, the faster the population grows, in agreement with Fig. S1D. Similarly, in Fig. S3 B and C, it can be inferred that the optimal learning rate lies in between
Simulation of population growth with different learning rates, under nonextreme selection with given patterns of environmental variations. The fitness matrix is chosen to be the same as in Fig. 2 of the main text; i.e.,
Finally, we use numerical simulations to study the effect of varying environmental statistics. As an example, Fig. S3D shows the case where the pattern of environmental variations alternates between being i.i.d.
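The dependence of growth on the learning rate under extreme selection can be explored with a short numerical sketch. It relies on two assumptions that are not spelled out in the text as preserved here: the update rule is taken to be a linear shift of the offspring distribution toward the parent phenotype with weight η, and the constant fitness of the matched phenotype is dropped because it only shifts every growth rate by the same amount. Under extreme selection all survivors share the same phenotype history, so a single lineage suffices. The function name and the example environments are illustrative only.

```python
import numpy as np

def lineage_growth_rate(env_seq, eta, n_env=2):
    """Average log growth per generation under extreme selection.

    Assumed update rule (not taken verbatim from the paper):
        pi_next = (1 - eta) * pi + eta * onehot(parent phenotype)
    Survivors always carry the phenotype that matches the environment,
    so the surviving fraction in each generation is pi[env].
    """
    pi = np.full(n_env, 1.0 / n_env)   # start from a uniform distribution
    total = 0.0
    for env in env_seq:
        total += np.log(pi[env])       # log of the surviving fraction
        onehot = np.zeros(n_env)
        onehot[env] = 1.0              # surviving parents match the environment
        pi = (1.0 - eta) * pi + eta * onehot
    return total / len(env_seq)

# Two hypothetical environmental patterns for comparison.
rng = np.random.default_rng(0)
iid_env = rng.choice(2, size=20000, p=[0.7, 0.3])        # i.i.d. environment
periodic_env = np.tile(np.repeat([0, 1], 50), 100)       # blocks of 50 generations

for eta in (0.001, 0.01, 0.2):
    print(eta,
          lineage_growth_rate(iid_env, eta),
          lineage_growth_rate(periodic_env, eta))
```

In this sketch a small η does better in the i.i.d. environment, consistent with the statement above, whereas a larger η does better when each environment persists for many generations.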
Results
To better understand how the transgenerational positive feedback functions, consider first the case where environmental selection is extremely strong. Because only individuals with the phenotype that matches the environment may survive and produce offspring, those offspring will have the same
Now, suppose that the environment switches repeatedly during that timescale
We have thus shown that, in a rapidly fluctuating environment, the proposed learning mechanism enables the population to reach the optimal phenotype distribution for bet hedging. Eq. 3 also implies that the phenotype distribution
The evolutionary learning mechanism effectively allows the organism to gather information about the past environment. This process is possible because the learning mechanism gives every individual an effective memory of its ancestral phenotypes. Starting from an individual in the tth generation and following its lineage backward, let the phenotype of its n-generation ancestor be
Note that in the opposite regime, in which an environment lasts for a time
Our model can be generalized to include nonextreme environmental selection, in which case individuals with a mismatched phenotype can also produce offspring. According to the learning mechanism, those offspring will have a phenotype distribution
Those analytical results are confirmed by numerical simulations. Examples of such simulations for nonextreme selection are shown in Fig. 2. As predicted, the evolutionary learning mechanism allows the population to dynamically adjust the phenotype distribution, depending on the pattern of environmental variations. When the environment changes frequently, i.e.,
Simulation of a learning population in a fluctuating environment. The environment alternates between two conditions
In addition, if the timescale of environmental changes is comparable to the adaptation timescale
The adaptation timescale
Note that the process of evolutionary learning cannot be thought of as adaptation through competition between groups of individuals with different and stably inherited phenotype probability distribution
Discussion
The learning mechanism that we propose here can potentially be realized by means of epigenetic inheritance. The essential feature of the learning mechanism is the progressive reinforcement of the parent phenotype in the distribution of offspring phenotypes. Such parent-dependent changes of phenotype distribution have been characterized, for example, in the Agouti viable yellow (
Many other examples of such transgenerational effects are found in organisms across a wide range of taxa (21). There are well-established molecular processes that enable transgenerational epigenetic inheritance (15, 22), such as DNA methylation, histone modification, small RNA interference, and protein structural templating. Any of those epigenetic processes can in principle serve as the basis of the proposed learning mechanism. In addition, if the environmental statistics remain stable, the adaptation achieved through the learning mechanism may be further stabilized by other molecular processes acting on even longer timescales, including genetic adaptations such as contingency loci, copy number changes, and mutations (15, 16).
In conclusion, we have introduced a general evolutionary learning mechanism for adaptation to varying environments. It is based on a transgenerational feedback that can use well-established molecular processes of epigenetic inheritance. It is our hope that this theoretical work will stimulate experimental searches for such a learning mechanism in organisms exhibiting adaptive phenotypic diversification.
Materials and Methods
Here we derive the dynamic changes of the phenotype probability distribution
Long-Lasting Environment and Plastic Phenotype Distribution.
Under extreme selection, only individuals with a phenotype that matches the environment could survive and produce offspring. When the environment is
The explicit expression of
Therefore, the probability for the favorable phenotype
which increases with time and approaches 1 when
As the environment remains to be
where
with an instantaneous growth rate
Frequently Changing Environment and Optimal Bet-Hedging Solution.
When the environment changes frequently and irregularly, the phenotype probability distribution
For the second equality we used the condition of extreme selection, which means
To see that this steady distribution is in fact the optimal bet-hedging solution, note that the asymptotic growth rate of a population using a constant phenotype distribution
where the second equality holds under extreme selection,
where
For consistency, we have to check that the fluctuation around the steady distribution is small. During the period of an environment that lasts for a typical time
Because the environmental changes are random,
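The claim in this section that the optimal bet-hedging distribution matches the environmental frequencies can be checked numerically. The sketch below assumes extreme selection and an i.i.d. environment, so that the asymptotic growth rate of a constant phenotype distribution is the average of log(f × π) over environments; the function name, the fitness value, and the probabilities 0.7 and 0.3 are hypothetical choices for illustration.

```python
import numpy as np

def growth_rate(pi, p_env, f_match=2.0):
    """Asymptotic growth rate of a constant phenotype distribution pi
    under extreme selection: sum over environments of
    p_env * log(f_match * pi), where only the matched phenotype survives.
    f_match is a hypothetical fitness of the matched phenotype."""
    pi = np.asarray(pi, dtype=float)
    p_env = np.asarray(p_env, dtype=float)
    return float(np.sum(p_env * np.log(f_match * pi)))

# Hypothetical environmental frequencies for two environments.
p_env = np.array([0.7, 0.3])

# Scan the probability of expressing phenotype 0 on a grid.
grid = np.linspace(0.01, 0.99, 99)
rates = [growth_rate([x, 1.0 - x], p_env) for x in grid]
best = grid[int(np.argmax(rates))]
print("best probability of phenotype 0:", best)
```

The grid maximum falls at the environmental frequency of the corresponding condition, which is the proportional (Kelly-type) betting solution.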
Learning as Transgenerational Phenotypic Memory.
Here we derive how an individual’s phenotype probability
For large t, the initial probability
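The memory interpretation can be illustrated with a small sketch. It assumes a linear update rule in which each generation shifts the distribution toward the parent phenotype with weight η; this rule is an assumption consistent with the reinforcement described in the text, not a formula preserved in this copy. Under that assumption, unrolling the update along a lineage gives an exponentially weighted sum over ancestral phenotypes, with the ancestor n generations back weighted by η(1 − η)^(n−1) and the initial distribution weighted by (1 − η)^t. Function names are illustrative.

```python
import numpy as np

def iterate_updates(pi0, lineage, eta):
    """Apply the assumed update rule generation by generation.
    lineage lists ancestral phenotypes, oldest first, parent last."""
    pi = np.array(pi0, dtype=float)
    for phen in lineage:
        onehot = np.zeros_like(pi)
        onehot[phen] = 1.0
        pi = (1.0 - eta) * pi + eta * onehot
    return pi

def memory_kernel(pi0, lineage, eta):
    """Same result written as an explicit sum over ancestors."""
    t = len(lineage)
    pi = (1.0 - eta) ** t * np.array(pi0, dtype=float)  # fading initial condition
    for n in range(1, t + 1):
        # lineage[t - n] is the phenotype of the ancestor n generations back.
        pi[lineage[t - n]] += eta * (1.0 - eta) ** (n - 1)
    return pi

# Hypothetical lineage of 40 ancestral phenotypes.
rng = np.random.default_rng(1)
lineage = rng.integers(0, 2, size=40)
print(iterate_updates([0.5, 0.5], lineage, 0.1))
print(memory_kernel([0.5, 0.5], lineage, 0.1))
```

Both functions return the same distribution, which shows that the weight of the initial probability decays geometrically and becomes negligible for large t.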
Simulation of the Learning Mechanism.
The adaptation of the phenotype distribution through the learning mechanism is simulated as follows: A population of size N is created. Each individual carries a probability distribution
Although the population size is constantly normalized, the instantaneous growth rate of the population can be estimated by
where the average
Those estimates are valid in the limit of a large sample size N.
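The simulation procedure described above can be sketched as follows. Several ingredients are assumptions rather than details recovered from the text: the update rule is a linear shift toward the parent phenotype with weight η, the population is kept at size N by resampling parents in proportion to fitness, the instantaneous growth rate is estimated as the log of the mean fitness, and the fitness matrix in the usage example is hypothetical. Fitness values must be strictly positive (nonextreme selection) for the resampling step to be well defined.

```python
import numpy as np

def simulate_learning(env_seq, fitness, eta=0.05, pop_size=1000, seed=0):
    """Simulate a fixed-size population with an assumed parent-reinforcement rule.

    fitness[i, j] is the offspring number of phenotype i in environment j
    (assumed strictly positive). Returns the population-averaged phenotype
    distribution and an estimated growth rate for each generation.
    """
    rng = np.random.default_rng(seed)
    n_phen = fitness.shape[0]
    # Each row is one individual's phenotype probability distribution.
    pi = np.full((pop_size, n_phen), 1.0 / n_phen)
    mean_pi, growth = [], []
    for env in env_seq:
        # Each individual draws a phenotype from its own distribution.
        cum = np.cumsum(pi, axis=1)
        u = rng.random(pop_size)
        phen = np.minimum((u[:, None] > cum).sum(axis=1), n_phen - 1)
        w = fitness[phen, env]              # fitness of each individual
        growth.append(np.log(w.mean()))     # assumed growth-rate estimator
        # Resample parents in proportion to fitness, keeping the size fixed.
        parents = rng.choice(pop_size, size=pop_size, p=w / w.sum())
        # Assumed update: shift the inherited distribution toward the parent phenotype.
        onehot = np.eye(n_phen)[phen[parents]]
        pi = (1.0 - eta) * pi[parents] + eta * onehot
        mean_pi.append(pi.mean(axis=0))
    return np.array(mean_pi), np.array(growth)

# Hypothetical fitness matrix and a constant environment for 200 generations.
fitness = np.array([[2.0, 0.5],
                    [0.5, 2.0]])
mean_pi, growth = simulate_learning([0] * 200, fitness)
print(mean_pi[-1], growth[-1])
```

In a constant environment the averaged distribution concentrates on the favorable phenotype; replacing the environment sequence with a rapidly alternating one lets the same function be used to examine the bet-hedging regime.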
Acknowledgments
We thank David Jordan, Edo Kussell, Harmit Malik, Eric Miska, Luca Peliti, Oliver Rando, Olivier Rivoire, and Alexander Tarakhovsky for numerous discussions and encouragement. This research has been partly supported by grants from the Simons Foundation (to S.L.) through Rockefeller University (Grant 345430) and the Institute for Advanced Study (Grant 345801). B.X. is funded by the Eric and Wendy Schmidt Membership in Biology at the Institute for Advanced Study.
Footnotes
- 1. To whom correspondence may be addressed. Email: bkxue{at}ias.edu or livingmatter{at}rockefeller.edu.
Author contributions: B.X. and S.L. designed research; B.X. performed research; and B.X. and S.L. wrote the paper.
Reviewers: T.H., University of California, San Diego; and O.J.R., University of Massachusetts Medical Center, Worcester.
The authors declare no conflict of interest.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1608756113/-/DCSupplemental.
References
- 3. Slatkin M
- 5. Simons AM
- 7. Balaban NQ, Merrin J, Chait R, Kowalik L, Leibler S
- 9. Simons AM
- 12. Kussell E, Leibler S
- 17. Hertz J, Palmer RG, Krogh AS
- 19. Rakyan VK, et al.