Fluctuating fitness shapes the clone size distribution of immune repertoires

The adaptive immune system relies on the diversity of receptors expressed on the surface of B and T-cells to protect the organism from a vast amount of pathogenic threats. The proliferation and degradation dynamics of different cell types (B cells, T cells, naive, memory) is governed by a variety of antigenic and environmental signals, yet the observed clone sizes follow a universal power law distribution. Guided by this reproducibility we propose effective models of somatic evolution where cell fate depends on an effective fitness. This fitness is determined by growth factors acting either on clones of cells with the same receptor responding to specific antigens, or directly on single cells with no regards for clones. We identify fluctuations in the fitness acting specifically on clones as the essential ingredient leading to the observed distributions. Combining our models with experiments we characterize the scale of fluctuations in antigenic environments and we provide tools to identify the relevant growth signals in different tissues and organisms. Our results generalize to any evolving population in a fluctuating environment.

The adaptive immune system relies on the diversity of receptors expressed on the surface of B-and T cells to protect the organism from a vast amount of pathogenic threats. The proliferation and degradation dynamics of different cell types (B cells, T cells, naive, memory) is governed by a variety of antigenic and environmental signals, yet the observed clone sizes follow a universal power-law distribution. Guided by this reproducibility we propose effective models of somatic evolution where cell fate depends on an effective fitness. This fitness is determined by growth factors acting either on clones of cells with the same receptor responding to specific antigens, or directly on single cells with no regard for clones. We identify fluctuations in the fitness acting specifically on clones as the essential ingredient leading to the observed distributions. Combining our models with experiments, we characterize the scale of fluctuations in antigenic environments and we provide tools to identify the relevant growth signals in different tissues and organisms. Our results generalize to any evolving population in a fluctuating environment.
immune repertoire | population dynamics | fluctuating fitness | lymphocyte receptor | repertoire sequencing A ntigen-specific receptors expressed on the membrane of B-and T cells (B-cell receptors, BCRs and T-cell receptors, TCRs) recognize pathogens and initiate an adaptive immune response (1). An efficient response relies on the large diversity of receptors that is maintained from a source of newly generated cells, each expressing a unique receptor. These progenitor cells later divide or die, and their offspring make up clones of cells that share a common receptor. The sizes of clones vary, as they depend on the particular history of cell divisions and deaths in the clone. The clone-size distribution thus bears signatures of the challenges faced by the adaptive system. Understanding the form of the clone-size distribution in healthy individuals is an important step in characterizing the antigenic recognition process and the functioning of the adaptive immune system. It also presents an important starting point for describing statistical deviations seen in individuals with compromised immune responses.
High-throughput sequencing experiments in different cell types and species (2)(3)(4)(5)(6)(7)(8)(9) have allowed for the quantification of clone sizes and their distributions (2,(9)(10)(11). Previous population dynamics approaches to repertoire evolution have taken great care in precisely modeling these processes for each compartment of the population, through the various mechanisms by which cells grow, die, communicate, and change phenotype (12)(13)(14)(15)(16)(17). However, one of the most striking properties of repertoire statistics revealed by high-throughput sequencing is the observation of power laws in clone-size distributions ( Fig. 1 A and B), which holds true for various species (human, mice, zebrafish), cell type (B-and T cells), and subsets (naive and memory, CD4 and CD8), and seems to be insensitive to these context-dependent details. It remains unclear, however, what universal features of these dynamics lead to the observed power-law distributions. Here we identify the key biological parameters of the repertoire dynamics that govern its behavior.
The wide range and types of interactions that influence a B-or T-cell fate happen in a complex, dynamical environment with inhomogeneous spatial distributions. They are difficult to measure in vivo, making their quantitative characterization elusive.
Motivated by the universality of the observed clone-size distribution, we describe the effective interaction between the immune cells and their environment as a stochastic process governed by only a few relevant parameters. All cells proliferate and die depending on the strength of antigenic and cytokine signals they receive from the environment, which together determine their net growth rate (Fig. 1C). This effective fitness that fluctuates in time is central to our description. We find that its general properties determine the form of the clone-size distribution. We distinguish two broad classes of models, according to whether these fitness fluctuations are clone-specific (mediated by their specific BCR or TCR) or cell-specific (mediated by phenotypic fluctuations such as the number of cytokine receptors). We identify the models that are compatible with the experimentally observed distributions of clone sizes. These distributions do not depend on the detailed mechanisms of cell signaling and growth, but rather emerge as a result of self-organization, with no need for fine-tuned interactions. Performing a series of validated approximations, we find a simple algebraic relationship constraining the different timescales of the problem by the experimentally observed exponent of the clone-size distribution. This result allows for testable predictions and estimates of the rates that govern the diversity of a clonal distribution.

Results
Clone Dynamics in a Fluctuating Antigenic Landscape. The fate of the cells of the adaptive immune system depends on a variety of clonespecific stimulations. The recognition of pathogens triggers large events of fast clone proliferation followed by a relative decay, with some cells being stored as memory cells to fend off future infections. Naive cells, which have not yet recognized an antigen, do not usually undergo such extreme events of proliferation and

Significance
Receptors on the surface of lymphocytes specifically recognize foreign pathogens. The diversity of these receptors sets the range of infections that can be detected and fought off. Recent experiments show that, despite the many differences between these receptors in different cell types and species, their distribution of diversity is a strikingly reproducible power law. By introducing effective models of repertoire dynamics that include environmental and antigenic fluctuations affecting lymphocyte growth or "fitness," we show that a temporally fluctuating fitness is responsible for the observed heavy tail distribution. These models are general and describe the dynamics of various cell types in different species. They allow for the classification of the functionally relevant repertoire dynamics from the features of the experimental distributions. death, but their survival relies on short binding events (called "tickling") to antigens that are natural to the organism (selfproteins) (18,19). Because receptors are conserved throughout the whole clone (with the exception of B-cell hypermutations), clones that are better at recognizing self-antigens and pathogens will on average grow to larger populations than bad binders. By analogy to Darwinian evolution, they are "fitter" in their local, time-varying environment.
We first present a general model for clonal dynamics that accounts for the characteristics common to all cell types, following previous work by de Boer, Perelson, and collaborators (14,20,21). We later explore the effect of specific features such as hypermutations, memory/naive compartmentalization, and thymic output decay on the clone-size distribution.
We denote by a j ðtÞ the overall concentration of an antigen j as a function of time. We assume that after its introduction at a random time t j , this concentration decays exponentially with a characteristic lifetime of antigens λ −1 , a j ðtÞ = a j,0 e −λðt−tjÞ as pathogens are cleared out of the organism, either passively or through the action of the immune response. Lymphocyte receptors are specific to certain antigens, but this specificity is degenerate, a phenomenon referred to as cross-reactivity or polyspecificity. The extent to which a lymphocyte expressing receptor i interacts with antigen j (foreign or self) is encoded in the cross-reactivity function K ij , which is zero if i and j do not interact, or a positive number drawn from a distribution to be specified, if they do. In general, interactions between lymphocytes and antigens effectively promote growth and suppress cell death, but for simplicity we can assume that the effect is restricted to the division rate. In a linear approximation, this influence is proportional to P j K ij a j ðtÞ, i.e., the combined effect of all antigens j for which clone i is specific. This leads to the following dynamics for the evolution of the size C i of clone i (Fig. 1C): where ν and μ are the basal division and death rates, respectively, and where Bξ i ðtÞ is a birth-death noise of intensity B 2 = ðν + P j K ij a j ðtÞ + μÞC i , with ξ i ðtÞ a unit Gaussian white noise (see SI Appendix, section A for details about birth-death noise).
New clones, with a small typical initial size C 0 , are constantly produced and released into the periphery with rate s C (Fig. 1C). For example, a number on the order of s C = 10 8 new T cells is output by the thymus daily in humans (22). Because the total number of T cells is on the order of 10 11 , this means that the net effect of cell death and proliferation results in a negative average growth rate of 10 −3 days −1 in homeostatic conditions (22). Because the probability of rearranging the exact same receptor independently is very low (<10 −10 ) (23), we assume that each new clone is unique and comes with its own set of cross-reactivity coefficients K ij . Assuming a rate s A of new antigens, the average net growth rate in Eq. 1 is f 0 = ν + ha j,0 ihKis A λ −1 − μ < 0, and the stationary number of clones should fluctuate around N C ≈ s C jf 0 j −1 clones. This is just an average, and treating each clone independently may lead to large variations in the total number of cells (i.e., the sum of sizes of all clones). To maintain a constant population size, clones compete with each other for specific resources (pathogens or self-antigens) and homeostatic control can be maintained by a global resource such as Interleukin 7 or Interleukin 2. Here we do not model this homeostatic control explicitly, but instead assume that the division and death rates ν, μ are tuned to achieve a given repertoire size. We verified that adding an explicit homeostatic control did not affect our results (SI Appendix, Fig. S2 and SI Appendix, section B).
We simulated the dynamics of a population of clones interacting with a large population of antigens. Each antigen interacts with each present clone with probability p = 10 −7 , and with strength K ij drawn from a Gaussian distribution of mean 1 and variance 1 (truncated to positive values). Although it has been argued that the breadth of cross-reactivity and affinity to selfantigens are correlated (24,25), here for simplicity we draw them independently, as we do not expect this correlation to qualitatively affect the results. A typical trajectory of the antigenic stimulation undergone by a given clone, P j K ij a j , is shown in Fig. 1E (green curve), and shows how clone growth tracks the variations of the antigenic environment. When the stimulation is particularly strong, the model recapitulates the typical behavior experimentally observed at the population level following a pathogenic invasion (26,27), as illustrated in Fig. 1D: The population of a clone explodes (red curve), driving the growth of the total population (blue curve), while taking over a large fraction of the carrying capacity of the system, and then decays back as the infection is cleared.
On average, the effects of division and death almost balance each other, with a slight bias toward death because of the turnover imposed by thymic or bone marrow output. However, at a given time, a clone that has high affinity for several present antigens will undergo a transient but rapid growth, whereas most other clones will decay slowly toward extinction. In other words, locally in time, the antigenic environment creates a unique "fitness" for each clone. Because growth is exponential in time,  One or a few cells that react to that specific antigen grow up to a macroscopic portion of the total population, and then decrease back to normal sizes after the invasion. A typical clone-size trajectory along with its pathogenic stimulation P j K ij a j ðtÞ shows the coupling between clone growth and variations of the antigenic environment (E). Parameters used: s C = 2,000 day −1 , C 0 = 2, s A = 1.96 · 10 7 day −1 , a j,0 = a 0 = 1, λ = 2 day −1 , p = 10 −7 , ν = 0.98 day −1 , and μ = 1.18 day −1 .
these differential fitnesses can lead to very large differences in clone sizes, even if variability in antigen concentrations or affinities is nominally small. We thus expect to observe large tails in the distribution of clone size. Fig. 2A shows the cumulative probability distribution function (CDF) of clone sizes obtained at steady state (blue curve) showing a clear power-law behavior for large clones, spanning several decades.
The exponent of the power law is independent of the introduction size of clones ( Fig. 2A, Inset) and the specifics of the randomness in the environment (exponential decay, random number of partners, random interaction strength) as long as its first and second moment are kept fixed (SI Appendix, Fig. S3 and SI Appendix, section C).
Simplified Models and the Origin of the Power Law. To understand the power-law behavior observed in the simulations, and its robustness to various parameters and sources of stochasticity, we decompose the overall fitness of a clone at a given time (its instantaneous growth rate) into a constant, clone-independent part equal to its average f 0 < 0, and a clone-specific fluctuating part of zero mean, denoted by f i ðtÞ. This leads to rewriting Eq. 1 as The function f i ðtÞ encodes the fluctuations of the environment as experienced by clone i. Because antigens can be recognized by several receptors, these fluctuations may be correlated between clones. Assuming that these correlations are weak, hf i ðtÞf j ðt′Þi ≈ 0, amounts to treating each clone independently of each other, and thus to reducing the problem to the single clone level. The stochastic process giving rise to f i ðtÞ is a sum of Poisson-distributed exponentially decaying spikes. This process is not easily amenable to analytical treatment, but we can replace it with a simpler stochastic process with the same temporal autocorrelation function. This autocorrelation is given by hf i ðtÞf i ðt′Þi = A 2 e −λjt−t′j , with the antigenic noise strength A 2 = s A pa 2 0 hK 2 iλ −1 , and where we recall that λ −1 is the characteristic lifetime of antigens. The simplest process with the same autocorrelation function is given by an overdamped spring in a thermal bath, or Ornstein-Uhlenbeck process, with η i ðtÞ a Gaussian white noise of intensity 1 and γ = A ffiffi ffi λ p quantifies the strength of variability of the antigenic environment (SI Appendix, section D). This is also the process of maximum entropy or caliber (28) with that autocorrelation function (SI Appendix, section E and ref. 29).
The effect of the birth-death noise Bξ i ðtÞ is negligible compared with the fitness variations for large clones and it has no effect on the tail (SI Appendix, Fig. S5 and SI Appendix, section F). It can thus be ignored when looking at the tail of the distribution and its power-law exponent, but it will play an important role for defining the range over which the power law is satisfied.
The population dynamics described by Eqs. 2 and 3 can be reformulated in terms of a Fokker-Planck equation for the joint abundance ρ of clones of a given log size x = log C and a given fitness f: where the source term sðx, f Þ describes new clones arriving at rate s C with size C 0 and normally distributed fitnesses of variance hf 2 i = γ 2 =λ. This Fokker-Planck equation can be solved numerically with finite element methods with an absorbing boundary condition at x = 0 to account for clone extinction. The solution, represented by the black curve in Fig. 2A, matches closely that of the full simulated population dynamics (in blue). The power-law behavior is apparent above a transition point that depends on the distribution of introduction sizes of new clones and the parameters of the model (see below). Intuitively, the microscopic details of the noise are not expected to matter when considering long timescales, as a consequence of the central limit theorem. However, the long tails of the distribution of clone sizes involve rare events and belong to the regime of large deviations, for which these microscopic details may be important. Therefore, the agreement between the process described by the overdamped spring and the exponentially decaying, Poisson-distributed antigens is not guaranteed, and in fact does not hold in all parameter regimes (SI Appendix, Fig. S8).
We can further simplify the properties of the noise by assuming that its autocorrelation time is small compared with other timescales. This leads to taking the limit γ, λ → ∞ while keeping their ratio σ = γ=λ constant, so that f i ðtÞ is just a Gaussian white noise with hf i ðtÞf i ðt′Þi = 2σ 2 δðt − t′Þ (SI Appendix, section F and SI Appendix, Fig. S4). The corresponding Fokker-Planck equation now reads ∂ t ρðx, tÞ = −f 0 ∂ x ρðx, tÞ + σ 2 ∂ 2 x ρðx, tÞ + sðxÞ, [5] with sðxÞ = s C δðx − logðC 0 ÞÞ. This equation can be solved analytically at steady state, and the resulting clone-size distribution is, for C > C 0 , with α = jf 0 j=σ 2 = λjf 0 j=A 2 (details in SI Appendix, section F). The full solution, represented in Fig. 2A in red, captures well the longtail behavior of the clone-size distribution despite ignoring the temporal correlations of the noise, and approaches the solution of the colored-noise model (Eq. 3) as λ, γ → ∞, as expected ( Fig. 2A). The power-law behavior and its exponent depend on the noise intensity, but are otherwise insensitive to the precise details of the microscopic noise, including its temporal properties. Fat tails (small α) are expected when the average cell lifetime is long (small jf 0 j) and when the antigenic noise is high (large σ or A). The explicit expression for the exponent of the power law 1 + α as a function of the biological parameters can be used to infer the antigenic noise strength A 2 directly from data. The typical net clone decay rate jf 0 j ≈ 10 −3 can be estimated from thymic output and repertoire size, as discussed earlier. The characteristic lifetime of antigens λ −1 is harder to estimate, as it corresponds to the turnover time of the antigens that the body is exposed to, but is probably on the order of days or a few weeks, λ ≈ 0.1 day −1 . We estimated α = 1 ± 0.2 from the zebrafish data of Fig. 1A (2, 10) using canonical methods of power-law exponent extraction (30) (see SI Appendix, section G for details), and also found a similar value in human T cells (31). The resulting estimate, A = 10 −2 day −1 , is rather striking, as it implies that fluctuations in the net clone growth rate, A, are much larger than its average f 0 .
Whereas the distribution always exhibits a power law for large clones, this behavior does not extend to clones of arbitrarily small sizes, where the details of the noise and how new clones are introduced matter. We define a power-law cutoff C * as the smallest clone size for which the cumulative distribution function differs from its best power-law fit by less than 10%. Using numerical solutions to the Fokker-Planck equation associated with the colored-noise model, we can draw a map of C * as a function of the parameters of the system. In Fig. 2 B and C we show how C * varies as a function of the introduction size for different values of the dimensionless parameter related to the effective strength of antigen fluctuations relative to their characteristic lifetime at fixed power-law exponents. In principle, one can use this dependency to infer effective parameters from data. In practice, when dealing with data it is more convenient to consider the value of the cumulative distribution at C * , rather than C * itself. For example, fixing C 0 = 4 and fitting the curve of Fig.  1A with our simplified model using λ as an adjustable parameter, we obtain λ ≈ 0.14 day −1 (SI Appendix, section G), which corresponds to a characteristic lifetime of antigens of around a week. Although this estimate must be taken with care, because of possible PCR amplification biases plaguing the small clone size end of the distribution, the procedure described here can be applied generally to any future repertoire sequencing dataset for which reliable sequence counts are available.
A Model of Fluctuating Phenotypic Fitness. So far, we have assumed that fitness fluctuations are identical for all members of a same clone. However, the division and death of lymphocytes do not only depend on signaling through their TCR or BCR. For example, cytokines are also growth inducers and homeostatic agents (32,33), and the ability to bind to cytokines depends on single-cell properties such as the number of cytokine receptors on the membrane of a given cell, independent of their BCR or TCR. Other stochastic single-cell factors may affect cell division and death. These signals and factors are cell-specific, as opposed to the clone-specific properties related to BCR or TCR binding. Together, they define a global phenotypic state of the cell that determines its time-varying fitness, independent of the clone and its TCR or BCR. This does not mean that these phenotypic fitness fluctuations are independent across the cells belonging to the same clone. Cells within a clone share a common ancestry, and may have inherited some phenotypic properties of their common ancestors, making their fitnesses effectively correlated with each other. However, this phenotypic memory gets lost over time, unlike fitness effects mediated by antigen-specific receptors.
We account for these phenotypic fitness fluctuations by a function f c ðtÞ quantifying how much the fitness of an individual cell c differs from the average fitness f 0 . This fitness difference is assumed to be partially heritable, which we model by where λ −1 c is the heritability, or the typical time over which the fitness-determining trait is inherited, γ c quantifies the variability of the fitness trait, and η c ðtÞ is a cell-specific Gaussian white noise of power 1. Despite its formal equivalence with Eq. 3, it is important to note that here the fitness dynamics occurs at the level of the single cell (and its offspring) instead of the entire clone. The dynamics of the fitness f i ðtÞ of a given clone i can be approximated from Eq. 7 by averaging the fitnesses f c ðtÞ of cells in that clone, yielding where η i ðtÞ and ξ i ðtÞ are clone-specific white noise of intensity 1, and ν and μ are the average birth and death rates, respectively, so that f 0 = ν − μ (details in SI Appendix, section I). The difference with Eq. 3 is the 1= ffiffiffiffiffiffiffiffiffiffi C i ðtÞ p prefactor in the fitness noise η i ðtÞ, which stems from the averaging of that noise over all cells in the clone, by virtue of the law of large numbers. Because of this prefactor, the fitness noise is now of the same order of magnitude as the birth-death noise, which must now be fully taken into account. Taking Eqs. 8 and 9 at the population level gives a Fokker-Planck equation with a source term accounting for the import of new clones. We verify the numerical steady-state Fokker-Planck solution against Gillespie simulations (SI Appendix, Fig. S6; see SI Appendix, section H for details). Fig. 3 A and B shows the distribution of clone sizes for different values of the phenotypic relaxation rate λ c and environment amplitude γ c . These distributions vary from a sharp exponential drop in the case of low heritability (large λ c ) to heavier tails in the case of long conserved cell states (small λ c ). To quantify the extent to which these distributions can be described as heavytailed, we fit them to a power law with exponential cutoff, ρðCÞ ∝ C −1−α e −C=Cm , where C m is the value below which the distribution could be interpreted as an (imperfect) power law. Fig. 3C shows a strong dependency of this cutoff with the phenotypic memory λ −1 c . The longer the phenotypic memory λ −1 c , the more clone-specific the fitness looks, and the more the distribution can be mistaken for a power law in a finite-size experimental distribution. Larger birth-death noise also extends the range of validity of the power law. As a result, and despite the absence of a true power-law behavior, these models of fluctuating phenotypic fitnesses cannot be discarded based on current experimental data.
The model can be solved exactly at the two extremes of the heritability parameter λ c . In the limit of infinite heritability (λ c → 0) the system is governed by selective sweeps. The clone with the largest fitness completely dominates the population, until it is replaced by a better one, giving rise to a trivial clone-size distribution. In the opposite limit, when heritability goes to 0 (λ c → + ∞), the Fokker-Planck equation can be solved analytically (SI Appendix, sections I and J), yielding an exact power law with exponential cutoff, The numerical solution of Fig.  3B is close to this limit. Note that even with a negligible exponential cutoff, the predicted α < 0 contradicts experimental observations.

Discussion
The model introduced in this paper describes the stochastic nature of the immune dynamics with a minimal number of parameters, helping interpret the different regimes. These parameters are effective in the sense that they integrate different levels of signaling, pathways, and mechanisms, focusing on the long timescales of clone dynamics. We assumed that they are general enough that different cell types (B-and T cells) or subsets (naive or memory) can be described by the same dynamical equations despite their differences. How do refined models including these differences affect our results?
Naive and memory cells differ in their turnover rate, i.e., their death rate, memory cells being renewed at a pace 10 times faster than naive ones (34). In our model, this difference is reflected in a higher birth-death noise for memory cells. We have shown that this noise had no effect on the tail of the clone-size distribution for clone-specific fitness (SI Appendix, Fig. S5), whereas it was important for the case of a cell-specific fitness, where birth-death noise contributed to the distribution to the same extent as fitness fluctuations. However, some repertoire datasets mix both naive and memory sets, and one could wonder whether our results hold for such mixtures. To examine this question, we simulated a simple two-compartment model where naive cells get irreversibly converted into memory cells when their stimulation is above a certain threshold (see SI Appendix, section K for details). We found that when fitness was clone-specific, the clone-size distribution of the mixture and that of memory cells alone still follow a power law, whereas that of naive cells only does so when conversion to memory upon stimulation is partial (SI Appendix, Fig. S12). Repeating the same analysis for the cell-specific fitness model, we found that clone-size distributions for each phenotype differed according to their respective birth-death noises, with a longer tail for memory cells as expected from their higher turnover rate.
The main difference between B-and T cells ignored by our model is that BCRs accumulate hypermutations upon proliferation. We studied this effect by allowing proliferating clones to spawn new clones with slightly modified affinities to antigens (SI Appendix, section L). The resulting clone-size distribution still follows a power law (SI Appendix, Fig. S13), although with a slightly smaller exponent due to increased stochasticity.
Another simplifying assumption of our model is that the dynamics reaches a steady state. This may be challenged by the decay of the thymic output s C with age. To estimate the importance of this effect, we simulated the model of a clone-specific fitness with an exponentially decaying source term, combined with a decreasing jf 0 j chosen to keep the population constant on average (SI Appendix, section M). The clone-size distributions at different points in time, shown in SI Appendix, Fig. S14, still follow a power law. Interestingly, the exponent α is predicted to decrease with age, consistent with α ∝ jf 0 j.
We showed that the relevant sources of stochasticity for the shape of the clone-size distributions fall into two main categories, depending on how cell fate is affected by the environment. Either the stochastic elements of clone growth act in a clone-specific way, through their receptor (BCR or TCR), leading to power-law distributions with exponent ≥1, or in a cellspecific way, e.g., through their variable level of sensitivity to cytokines (and more generally through any phenotypic trait affecting cell fitness), leading to exponentially decaying distributions with a power-law prefactor. These two types of signals (clone-specific and cell-specific) are important for the somatic evolution of the immune system (21,32,33,(35)(36)(37) and our analysis shows that the shape of the clone-size distribution is informative of their relative importance to the repertoire dynamics. It provides a first theoretical setting and an initial systematic classification for modeling immune repertoire dynamics. Our method applied to high-throughput sequencing data can be used to quantify how much each type of signal contributes to the overall dynamics, and what is the driving force for the different cell subsets. For example, although it is reasonable to speculate that clone-specific signals should dominate for memory cells (through antigen recognition), and cell-specific selection for naive cells (through cytokine-mediated homeostatic division), the relative importance of these signals for both cell types is yet to be precisely quantified, and may vary across species. A clear power law over several decades would strongly hint at dynamics dominated by interactions with antigens, whereas a faster decaying distribution would favor a scenario where individual cell fitness fluctuations dominate. Applying these methods to data from memory cells can give orders of magnitude for the division and half-life of memory lymphocytes, as well as the typical number of cells C 0 from a clone that are stored as memory following an infection. The application of our method to data from the first immune repertoire survey [BCRs in zebrafish (2)] suggests that clonespecific noise dominates in that case, allowing us to infer a relation between the dynamical parameters of the model from the observed power-law exponent ≈ 2. However, there are a few issues with applying our method directly to data in the current state of the experiments. First, the counts (i.e., how many cells have the same receptor sequence and belong to the same clone) from many high-throughput repertoire sequencing experiments are imperfect because of PCR bias and sampling problems. New methods using single-molecule barcoding have been developed for RNA sequencing (8,38,39), but they do not solve the problem entirely, as the number of expressed mRNA molecules may not faithfully represent the cell numbers because of possible expression bias. In addition, most studies (with the exception of ref. 40) have been sequencing only one of the two chains of lymphocyte receptors, which is insufficient to determine clone identity unambiguously. As methods improve, however, our model can be applied to future data to distinguish different sources of fitness stochasticity and to put reliable constraints on biological parameters. Studying clone-size distributions in healthy individuals allows us to characterize signatures of normally functioning immune systems. By comparing them to the same properties in individuals suffering from immune diseases or cancer, our approach could be used to identify sources of anomalies.
Thanks to its generality, our model is also relevant beyond its immunological context, and follows previous attempts to explain power laws in other fields (41)(42)(43). The dynamics described here corresponds to a generalization of the neutral model of population genetics (44) where thymic or bone marrow outputs are now reinterpreted as new mutations or speciations, and where we have added a genotypic or phenotypic fitness noise (receptor or cell-specific noise, respectively). It was recently shown that such genotypic fitness noise strongly affects the fixation probability and time in a population of two alleles (45,46). Note that, because new thymic or bone marrow clones are unrelated to existing clones, there are no lineage histories, in contrast with previous theoretical work on evolving populations in fluctuating fitness landscapes (47)(48)(49). Our main result (Eq. 6) shows how fitness noise can cause the clone-size distribution (called "frequency spectrum" in the context of population genetics) to follow a power law with an arbitrary exponent >1 in a population of fixed size, whereas the classical neutral model gives a power law of exponent 1 with an exponential cutoff (as shown in our exact solution with γ c = 0). Our results can be used to explain complex allele frequency spectra using fluctuating fitness landscapes.
ACKNOWLEDGMENTS. This work was supported in part by Grant ERCStG 306312.