How adaptive immunity constrains the composition and fate of large bacterial populations
See allHide authors and affiliations
Edited by Luca Peliti, Santa Marinella Research Institute, Rome, Italy, and accepted by Editorial Board Member Curtis G. Callan, Jr. June 15, 2018 (received for review February 28, 2018)

Significance
Complex communities of microorganisms are important ecological forces and phages are integral components of microbial populations. Among the many bacterial defense mechanisms against phages, CRISPR-Cas is unique in its ability to learn from past infections by storing pieces of phage DNA (called spacers) in its own genome to neutralize future infections. Our work shows that the rank abundance distribution of spacers across the whole bacterial population, which is readily accessed using genomic sequencing, may provide a phenomenological observable that reflects important structural aspects of bacterial populations. This study lays out a path toward a phenomenological framework for understanding microbial dynamics and may provide insights into complex and diverse natural populations where microscopic modeling is plagued by overparameterization and overfitting.
Abstract
Features of the CRISPR-Cas system, in which bacteria integrate small segments of phage genome (spacers) into their DNA to neutralize future attacks, suggest that its effect is not limited to individual bacteria but may control the fate and structure of whole populations. Emphasizing the population-level impact of the CRISPR-Cas system, recent experiments show that some bacteria regulate CRISPR-associated genes via the quorum sensing (QS) pathway. Here we present a model that shows that from the highly stochastic dynamics of individual spacers under QS control emerges a rank-abundance distribution of spacers that is time invariant, a surprising prediction that we test with dynamic spacer-tracking data from literature. This distribution depends on the state of the competing phage–bacteria population, which due to QS-based regulation may coexist in multiple stable states that vary significantly in their phage-to-bacterium ratio, a widely used ecological measure to characterize microbial systems.
Complex communities of microorganisms are important ecological forces in almost every environment from hot springs (1) to humans (2⇓⇓⇓–6). Phages, viruses which infect bacteria, are integral components of microbial populations: Phage predation has been shown to strongly influence bacterial evolution, diversity, and numbers (7, 8). To counter phages, bacteria have evolved many and complex immune mechanisms (9). CRISPR-Cas is one such defense mechanism which is both adaptive and heritable; i.e., it not only learns from past infections but also passes this knowledge to future generations. Many models have addressed the effects of CRISPR-Cas on microbial populations, but a conceptual vacuum remains: What experimental features of natural populations should be measured to compare with model predictions?
CRISPR-Cas machinery for adaptive immunity allows bacteria to acquire unique genetic elements (called spacers) from prior phage encounters to specifically target and evade recurrent attacks. The spacers are tens of nucleotides long and at each encounter may be acquired from any of the hundreds of possible locations on the infecting phage genome (called protospacers). Since individual spacers are distinguishable and because they are integrated in the genome, the result is a lineage of cells that can be identified by its spacer(s). The fate of an individual lineage, however, is subject to large fluctuations due to the stochastic dynamics of individual bacteria in a large rapidly evolving population. Experiments show that the abundance of individual spacers in a bacterial population under phage attack is indeed highly dynamic and varies over several orders of magnitude from one spacer to the next (7, 10⇓⇓–13). This leads to a natural question: What controls spacer diversity and abundance? In other words, how does recurrent phage attack alter the structure and composition of interacting spacer-marked lineages in a bacterial population?
Several previous models have addressed the role and dynamics of observed diversity of spacer types (14⇓⇓⇓⇓⇓⇓–21) in a qualitative way: (i) They have shown how system parameters such as phage adsorption rate (21), spacer acquisition rate (16, 21), and phage mutation and recombination (18) affect spacer diversity; (ii) they have shown how increasing diversity promotes population stability (16, 19); and (iii) they have reproduced the observed asymmetry in diversity along the locus in natural populations (14, 15, 18) by modeling biased acquisition at the leader end of the CRISPR locus. Most recently, Bradde et al. (20) showed a connection between spacer acquisition rates and spacer effectiveness to spacer diversity. To make a direct connection with data, we analyzed sequencing data from Paez-Espino et al. (12), a coevolution experiment with phage and bacteria which tracked spacer dynamics. Our analysis shows that despite rapid turnover of individual spacer types, the spacer rank-abundance distribution quickly stabilizes, which is a striking observation that previous models have not addressed.
Recently, similar questions about diversity in the adaptive immune system have gained traction in the context of vertebrates which generate and maintain a large population of specialized immune cells that, as a group, contain an extremely diverse set of binding sites that individually recognize different viruses. Like spacer abundance, the abundance of individual binding sites is highly variable (22⇓–24). This observation has led to the suggestion that a broad abundance distribution of binding sites may strike a balance between generating a rapid response against likely invaders and capturing new invaders (24). Although this is hard to test in vertebrates, laboratory experiments that alter bacterial population composition synthetically show that bacteria are more successful at fending off phages as their population-level spacer diversity increases (25). How the dynamics of individual bacterial lineages shape spacer diversity and how diversity in spacer sequences or types relates to diversity in spacer abundances remain unanswered.
Beyond the role of individual spacer lineages in shaping population structure, recent experiments have shown that bacterial populations exert top–down control on the CRISPR system: Two species of bacteria have been observed to regulate their CRISPR-Cas systems in response to cell density (26, 27). Interestingly, this control acts via the quorum sensing pathway, a pathway which also controls population-level responses such as virulence. This suggests a different paradigm where the effects of CRISPR-Cas need to be considered at the collective population level, rather than at the level of individual cells. Previous population-level models have not addressed this effect (20, 21, 28⇓⇓⇓⇓⇓⇓⇓⇓⇓–38), and modeling efforts addressing CRISPR-Cas regulation have focused on the relevant gene circuits and production of transcribed spacers called CRISPR RNAs (crRNAs), not on the population-level effects of regulation (39⇓–41).
We build a model that addresses the two aforementioned fundamental and unaddressed aspects of the CRISPR-Cas system: (i) Our model shows how stable rank-abundance distributions may arise despite rapid turnover of individual spacer types that are identical in their ability to provide immunity, and (ii) our model shows that density-dependent regulation of CRISPR-Cas admits a bistable state at the population level where the phage–bacterial population can be stable with two different configurations under the same external conditions. We further argue how having the knowledge of spacer diversity along with bistable states may shed light on the fate of natural microbial populations.
Model
Adaptive immunity in bacteria is controlled by a set of Cas proteins, which in a nutshell accomplish two different tasks: (i) When an invading phage inserts its genome into a bacterial cell but is not successful in killing the bacterium, Cas proteins take a small piece of phage genome and insert it into the bacterial genome at a specific site called the CRISPR locus. (ii) During a subsequent phage attack, the bacterium can use the information stored in the CRISPR locus to recognize the invading phage and neutralize it. Multiple spacers can be stored at a CRISPR locus, providing a genetic record of immunization that is inherited during DNA replication. The immunization record in principle can be read via next-generation sequencing and provides a rich presence/absence observable: the binary variable
(A) CRISPR locus. Small (∼30 nt) samples of invasive phage DNA called spacers (colored rectangles) are incorporated into the CRISPR genetic locus. Spacers are separated by short (∼30 nt) sequences called repeats (black diamonds). Multiple spacers can be stored at a CRISPR locus, resulting in a genetic record of immunization (42). In our analysis of the experimental data shown in Fig. 3 A–C, we identify spacers with a type i, a locus position j, and a bacterium k. (B) In our model, bacteria and phages interact in a chemostat (flow cell) with a constant inflow and outflow rate F. Nutrients flow into the chemostat at a fixed concentration
We model the abundance of the
To capture the inherent stochastic nature of spacer dynamics, we model the probability distribution
Our stochastic model has a corresponding mean-field or population-level description for average values of the different random variables, each represented by the same symbol as their corresponding random variable. At the mean-field level, all of the spacer-containing bacteria can be pooled into a single variable
Results
Mean-Field Steady States.
For phages to invade a bacterial population that is stable in a chemostat, their probability of successfully infecting bacteria without the benefits of adaptive immunity,
(A) Bacteria, phage, and nutrients at steady state as a function of the probability of phage success
Much like increasing
In contrast to the total bacterial population, ν first increases as e increases but reaches a maximum at an intermediate value of e (Fig. 2C). This can be understood as ν qualitatively tracking the phage population size, which shows a peak at intermediate spacer effectiveness (SI Appendix, Fig. S14). Qualitatively, this behavior is similar to the total phage population having a nonmonotonic behavior with increasing
Spacer Rank-Abundance Distributions.
Even at steady state with stable populations of phage and bacteria, the individual spacer abundances in the bacterial population are highly dynamic and vary significantly over time. This has been seen most directly in laboratory experiments (12, 13) but has also been observed in natural samples such as a hypersaline lake (43), human saliva (44), and acid mine drainage (10, 15). This continual spacer turnover is influenced by bacterial reproduction and death, spacer acquisition, and spacer loss, all of which have been observed in natural and laboratory populations.
In our stochastic model, we keep track of individual spacer acquisition and loss events. Not surprisingly, we find that spacer abundances fluctuate over time (Fig. 3 D and E and SI Appendix, section 4.1). However, we also find that the spacer rank-abundance distribution reaches a stationary state from an initial state with no spacers, shown in Fig. 3F and SI Appendix, section 2.5. Not only does the spacer distribution in our simple model reach a stationary state while individual spacers turn over rapidly, it also shows 1,000-fold variation in spacer abundances despite the fact that all spacers are functionally identical in our model and provide resistance to the same phage. The exact shape of the distribution depends on various parameters (SI Appendix, section 2.5) and is well approximated by a gamma distribution which has been used to describe species abundance distributions in ecology (45⇓⇓–48) (SI Appendix, section 2.6).
Comparison of spacer distributions between simulations (D–F) and experimental data from ref. 12 (A–C). (A and D) Subset of spacer-type trajectories over time in generations for experimental data (A) and simulated data with
To test predictions with data, we analyzed experimental data reported by Paez-Espino et al. (12) from a bacterial population under constant phage attack. We summarized their raw sequencing data into the presence/absence tensor
In general our analysis highlights that individual spacer identity and abundance may not themselves be important but collectively may provide a time-invariant observable in the form of steady-state rank-abundance distributions. And somewhat counterintuitively, spacers need not be functionally different in their effectiveness or acquisition probability to get large variability in spacer abundances.
Regulation of cas Expression.
Merely having an effective spacer, however, is not enough: To effectively neutralize phage, bacteria need to express cas genes when under attack. Experimental work has shown that bacteria can regulate their CRISPR-Cas systems in response to cell density, controlled under the quorum sensing pathway (26, 27). A cell increases its expression of Cas proteins at high cell density in response to a high concentration of quorum sensing molecules and down-regulates its expression of Cas proteins at low cell density. To understand the role of cell density-dependent regulation of the CRISPR-Cas system, we made spacer effectiveness e to be a function of cell density,
Notably, the dependence of spacer effectiveness on population size changes both the number and value of the steady-state fixed points. We find that the whole bacteria–phage–nutrient system undergoes a saddle-node bifurcation and is bistable for a range of parameters. The bistability results from a positive feedback that is established under quorum sensing (QS) control of the CRISPR-Cas system but is absent otherwise: The total bacterial population size increases with increasing spacer effectiveness, and in turn effectiveness increases as CRISPR-Cas is up-regulated by higher bacterial density (SI Appendix, Fig. S20). There are various parameters that can be used as the bifurcation parameter, but one that can be easily controlled in experimental systems and perhaps plays a role in natural systems is the normalized chemostat flow rate
Bacterial upregulation of cas gene expression at high density can induce bistability (A, yellow shaded area) as a function of the normalized chemostat flow rate
This bistable system may also exhibit hysteresis, which may have important ecological consequences, possibly functioning as a memory of past phage pressure or providing a switch-like behavior between “on” and “off” states of the CRISPR system. Not only can the phage-to-bacterium ratio [called virus-to-prokaryote ratio (VPR)] be significantly different between the two states but also spacer composition and diversity can be quite different (Fig. 4B and SI Appendix, section 2.5).
Our model exhibits bistability quite generically for large parameter ranges but requires choosing an appropriately steep function for effectiveness (discussion in SI Appendix, section 5).
Discussion
CRISPR-Cas is a unique system in that adaptive immunity is both hereditary and acquired. Its impacts on population dynamics are thus unlike any other immune system, and experimental observations must be interpreted with theory specific to the CRISPR-Cas system. Our analysis of experimental data yielded a striking result: Rank-abundance spacer distributions are stable over time, paralleling population-level stability, despite what looks like ongoing turnover in the abundances of individual spacer types. This overall stability suggests a need for a population-level approach in which questions about spacer diversity are addressed alongside questions about CRISPR-Cas regulation. In this framework, communities of bacteria function collectively more like a single organism capable of complex signaling and behavior than like a collection of individual bacteria undergoing selective dynamics.
In this work, we propose and analyze a simplified model of interacting bacteria and phage in which bacteria regulate the CRISPR-Cas system in a density-dependent way, which in turn controls the spacer-marked clonal composition of the bacterial population under phage attack. We find that the bacteria–phage population exhibits bistability with the possibility of coexistence between two ecologically different states. These two stable states may differ by orders of magnitude in the phage-to-bacterium ratio as well as differing in the spacer diversity and composition of the population. Our model also provides a framework where large variability in spacer abundance may arise due to population dynamics rather than due to individual parameters of spacers, since our model is neutral with no selective advantage for particular spacers. And finally, our model shows how a stable spacer rank-abundance distribution may emerge while individual spacer types turn over rapidly.
Sequencing provides an easy way to track spacers, which in turn provide a direct record of past interactions between a bacterium and its phages. Although there has not been much effort toward spacer tracking in individual bacteria, population-level spacer dynamics are becoming readily accessible both from laboratory experiments (12, 13, 38) and from natural populations (7, 10). In the laboratory, both large variability and rapid turnover of individual spacer types have been observed. Understanding these dynamics is certainly interesting but requires a much higher level of sampling and resolution than what is currently available (49). Also, acquiring such data, especially time resolved, for natural systems such as microbial mats and acid mine drainage may not be practical. Here we show that the spacer rank-abundance distribution may provide a more useful time-invariant observable for understanding the underlying dynamics in both natural and laboratory systems; our work predicts that measuring spacer abundances in natural populations may reveal abundance distributions that are stable in time and potentially indicative of the environmental conditions despite differences in the level of spacer sequences between populations and over time.
Even without phage diversity and phage mutations in our model, we reproduce important features of the spacer dynamics observed in recent laboratory experiments (12). In the presence of mutant phages, the net effectiveness of different spacers in providing immunity against phages may vary from one spacer to the next. We expect that a spacer’s effectiveness will depend on the fraction of the phage population with a matching protospacer. This fitness difference between spacers will have consequences for the population dynamics, and some aspects have been addressed in experiments (13, 50, 51) and models (16, 19, 31⇓–33, 38, 52) and reviewed in ref. 53.
Multistability at the level of cellular states, where a fraction of the population switches to an alternate state, has been explored at length with implications from bet hedging to lytic–lysogenic switching to antibiotic resistance and persistence (54⇓⇓⇓–58). Similarly structured populations are now being explored in contexts from healthy regenerating tissues to pathologies such as cancer (59). While recent models for large interacting microbial populations using a statistical mechanical approach (60⇓⇓–63) show that ecological multistability akin to what is seen in a spin glass may be present in such populations, they remain experimentally inaccessible. A notable exception is Gore et al. (54), who observed population-level bistability and coexistence between two cooperating yeast strains in a mixed culture. Here we provide an example of multistable, multispecies ecological states that may be readily accessible in experiments. We show that for a population of bacteria and phages the flow rate of a chemostat or dilution rate of a serially diluted population can serve as a bifurcation parameter. Since both nutrient concentration (which controls population density) and dilution rate are easy to control experimentally, ecological states in our phage–bacteria population should be readily accessible (SI Appendix, section 5.4).
In natural populations where phages and bacteria coexist, the phage-to-bacterium ratio, also called VPR, has been measured and reported for a wide range of conditions. While viruses are generally assumed to outnumber bacteria by a factor of 10 (8, 34, 66, 67), the measured ratio can vary between samples by as much as a factor of
(A) The phage-to-bacterium ratio (virus-to-prokaryote ratio, VPR) can differ by more than 10-fold between the two bistable states in the model (solid black lines). These values reflect the two underlying ecological states: VPR is low when bacteria are at high density and up-regulate CRISPR-Cas expression, and VPR is high at low bacterial density and low CRISPR-Cas expression. (B–D) VPR histogram and fitted log-normal distributions for organisms from eutrophic (high nutrient), mesotrophic (moderate nutrient), and oligotrophic (low nutrient) environments [data from Parikka et al. (64)]. We fit 1D Gaussian mixture models with one and two Gaussian distributions, respectively, to the data and chose the best-fitting model (blue line), using the Akaike information criterion (AIC). The data were fitted better by a single Gaussian distribution for the eutrophic data (Δ AIC
In our model, the normalized chemostat flow rate f is inversely proportional to the inflow nutrient concentration
To connect this qualitative feature of our model to natural populations, we analyzed VPR data from Parikka et al. (64) and found that the distribution of measured VPR values appears bimodal in low and moderate nutrient environments. It may be the case that at high nutrient levels where bacteria live in dense communities and are at high risk of lytic phage predation, most or all bacteria use a highly expressed CRISPR-Cas system and VPR is peaked at a single low value in that environment (Fig. 5B). Conversely, at low to moderate nutrient levels, different bacteria may use different immune strategies and so VPR values may span a wider range (Fig. 5 C and D). Note that at very low f and high nutrient availability, our model predicts monostability in the low-density, low-expression stable state corresponding to high VPR, yet we observe a unimodal low VPR in high-nutrient environments (Fig. 5B). In these conditions when phages are a large threat, bacteria may use another signal besides density to up-regulate the CRISPR-Cas system. In this work we provide an intuitive connection between an observed quantity such as VPR and a nontrivial insight into the ecological state of interacting bacteria and phages.
Materials and Methods
Data Analysis.
We analyzed data from an experiment in which Streptococcus thermophilus bacteria were mixed with phages and sequenced to track the expanding portion of the CRISPR locus over 15 d (12) by labeling spacers with a type i corresponding to a unique spacer sequence, a locus position j, and a bacteria label k. All spacers within an edit distance of 2 from each other were grouped into the same type. See SI Appendix, section 1 for details.
We compared data reported by Parikka et al. (64) with our model. When plotting VPR values, we combined average VPR measurements and individual VPR measurements (“VPR av” and “VPR” columns) to create a combined dataset of VPR values.
Our processed data can be found on GitHub at https://github.com/mbonsma/CRISPR-immunity.
Model Analysis.
The mean-field model was solved exactly at steady state in Mathematica. Steady-state values with regulation added were calculated numerically. See SI Appendix, section 3 for stability analysis.
Simulations.
Simulations were written in
Acknowledgments
We thank David Paez-Espino for discussions surrounding data from ref. 12. We thank Devaki Bhaya and Anton Zilman for helpful discussions. We acknowledge funding from the Natural Sciences and Engineering Research Council of Canada and Vanier Canada Graduate Scholarships.
Footnotes
- ↵1To whom correspondence should be addressed. Email: goyal{at}physics.utoronto.ca.
Author contributions: M.B.-F., D.S., and S.G. designed research; M.B.-F. and D.S. performed research; M.B.-F. analyzed data; M.B.-F. and S.G. wrote the paper; and D.S. wrote simulations.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. L.P. is a guest editor invited by the Editorial Board.
Data deposition: The processed data and simulation code have been deposited in GitHub, https://github.com/mbonsma/CRISPR-immunity.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1802887115/-/DCSupplemental.
Published under the PNAS license.
References
- ↵
- ↵
- ↵
- ↵
- Korem T, et al.
- ↵
- ↵
- O’Toole GA
- ↵
- ↵
- ↵
- Doron S, et al.
- ↵
- Andersson AF,
- Banfield JF
- ↵
- ↵
- ↵
- Paez-Espino D, et al.
- ↵
- ↵
- ↵
- ↵
- Haerter JO,
- Sneppen K
- ↵
- ↵
- ↵
- ↵
- Han P,
- Deem MW
- ↵
- Weinstein JA,
- Jiang N,
- White RA,
- Fisher DS,
- Quake SR
- ↵
- ↵
- Desponds J,
- Mora T,
- Walczak AM
- ↵
- ↵
- Høyland-Kroghsbo NM, et al.
- ↵
- ↵
- Heilmann S,
- Sneppen K,
- Krishna S
- ↵
- ↵
- Haerter JO,
- Trusina A,
- Sneppen K
- ↵
- Weinberger AD,
- Wolf YI,
- Lobkovsky AE,
- Gilmore MS,
- Koonin EV
- ↵
- Iranzo J,
- Lobkovsky AE,
- Wolf YI,
- Koonin EV
- ↵
- ↵
- ↵
- Berezovskaya FS,
- Wolf YI,
- Koonin EV,
- Karev GP
- ↵
- ↵
- Ali Q,
- Wahl LM
- ↵
- Weissman JL, et al.
- ↵
- ↵
- Djordjevic M
- ↵
- Guzina J,
- Rodić A,
- Blagojević B,
- Dordević M
- ↵
- ↵
- ↵
- Pride DT, et al.
- ↵
- ↵
- Engen S,
- Lande R
- ↵
- ↵
- ↵
- ↵
- Deveau H, et al.
- ↵
- ↵
- ↵
- England WE,
- Whitaker RJ
- ↵
- ↵
- ↵
- ↵
- Tarnita CE,
- Washburne A,
- Martinez-Garcia R,
- Sgro AE,
- Levin SA
- ↵
- ↵
- ↵
- Bunin G
- ↵
- ↵
- Tikhonov M,
- Monasson R
- ↵
- Biroli G,
- Bunin G,
- Cammarota C
- ↵
- Parikka KJ,
- Le Romancer M,
- Wauters N,
- Jacquet S
- ↵
- Burnham K,
- Anderson D
- ↵
- ↵
- Barrangou R,
- van der Oost J
- Held NL, et al.
- ↵
- Payet JP,
- Suttle CA
- ↵
- Kasman LM, et al.
- ↵
- ↵
Citation Manager Formats
Article Classifications
- Physical Sciences
- Physics
- Biological Sciences
- Biophysics and Computational Biology