A tug-of-war between driver and passenger mutations in cancer and other adaptive processes

Cancer progression is an example of a rapid adaptive process where evolving new traits is essential for survival and requires a high mutation rate. Precancerous cells acquire a few key mutations that drive rapid population growth and carcinogenesis. Cancer genomics demonstrates that these few 'driver' mutations occur alongside thousands of random 'passenger' mutations-a natural consequence of cancer's elevated mutation rate. Some passengers can be deleterious to cancer cells, yet have been largely ignored in cancer research. In population genetics, however, the accumulation of mildly deleterious mutations has been shown to cause population meltdown. Here we develop a stochastic population model where beneficial drivers engage in a tug-of-war with frequent mildly deleterious passengers. These passengers present a barrier to cancer progression that is described by a critical population size, below which most lesions fail to progress, and a critical mutation rate, above which cancers meltdown. We find support for the model in cancer age-incidence and cancer genomics data that also allow us to estimate the fitness advantage of drivers and fitness costs of passengers. We identify two regimes of adaptive evolutionary dynamics and use these regimes to rationalize successes and failures of different treatment strategies. We find that a tumor's load of deleterious passengers can explain previously paradoxical treatment outcomes and suggest that it could potentially serve as a biomarker of response to mutagenic therapies. Collective deleterious effect of passengers is currently an unexploited therapeutic target. We discuss how their effects might be exacerbated by both current and future therapies.


Introduction
While many populations evolve new traits via a gradual accumulation of changes, some adapt very rapidly. Examples include viral adaptation during infection (1); the emergence of antibiotic resistance (2); artificial selection in biotechnology (3); and cancer (4). Rapid adaptation is characterize by three key features: (i) the availability of strongly advantageous traits accessible by a few mutations, (ii) an elevated mutation rate (5,6), and (iii) a dynamic population size (7). Traditional theories of gradual adaptation are not applicable under these conditions, and new approaches are needed to address pressing problems in medicine and biotechnology.
Cancer progression is an example of a rapidly adapting population: cancers develop as many as ten new traits (8), often have a high mutation rate (8)(9)(10), and a population size that is rapidly changing over time. This process is driven by a handful of mutations and chromosomal abnormalities in cancer-related genes (oncogenes and tumor suppressors), collectively called drivers. From an evolutionary point of view, drivers are mutations that are beneficial to cancer cells because their phenotypes increase the cell proliferation rate or eliminate brakes on proliferation (8). Drivers, however, arise alongside thousands of other mutations and alterations dispersed through the genome that have no immediate beneficial effect, collectively called passengers.
Passengers have been previously assumed to be neutral and largely ignored in cancer research, yet growing evidence suggests that they may sometimes be deleterious to cancer cells and, thus, play an important role in both neoplastic progression and clinical outcomes. In an earlier study, we showed that deleterious passengers can readily accumulate during tumor progression and found that many passengers present in cancer genomes exhibit signatures of damaging mutations (11). Additionally, chromosomal gains and losses that are pervasive in cancer can be passengers, and have been shown to be highly damaging to cancer cells (12). Lastly, cancers with high levels of chromosomal alterations exhibit better clinical outcomes in breast, ovarian, gastric, and non-small cell lung cancer (13). Passenger mutations and chromosomal abnormalities can be deleterious via a variety of mechanisms: direct loss-of-function (14), proteotoxic cytotoxicity from protein disbalance and aggregation (15), or by inciting an immune response (16).
While the role of deleterious mutations in cancer is largely unknown, their effects on natural populations has been extensively studied in genetics (5,(17)(18)(19). The accumulations of deleterious mutations can cause the extinction of a population via processes known as Muller's ratchet and mutational meltdown (17,20,21) It was recently proposed that inevitable accumulation of deleterious mutations in natural populations should be offset by new beneficial mutations, leading to long-term population stability (19). Here we consider a rapid adaptation of a population with a variable size and subject of a high mutation rate. A rapidly adapting population faces a double bind: it must quickly acquire, often exceeding rare, adaptive mutations and yet avoid mutational meltdown. As a result, adaptive processes frequently fail. Indeed, less than 0.1% of species on earth have adapted fast enough to avoid extinction (22) and, similarly, only about 0.1% of precancerous lesions ever advance to cancer (23). To control cancer or pathogens, we should understand the constraints that evolution imposes on their rapid adaptation.
Here we investigate how asexual populations such as tumors rapidly evolve new traits while avoiding mutational meltdown. Unlike classical theories of gradual adaptation, the evolutionary model we develop has three key features: (i) rare, strongly advantageous driver mutations, (ii) a high mutation rate that makes moderately deleterious passengers relevant, and (iii) a population size that varies with the fitness of individual cells. We found that a tug-of-war between beneficial drivers and deleterious passengers creates two major regimes of population dynamics: an adaptive regime, where the probability of adaptation (cancer) is high; and a non-adaptive regime, where adaptation (cancer) is exceedingly rare.
Adaptive and non-adaptive regimes are separated by a critical population size or barrier to cancer progression that most lesions fail to overcome, and a critical mutation rate that leads to mutational meltdown. We found strong evidence of these phenomena in age-incidence curves and recent cancer genomics data. Agreement of the model with these data allows us to estimate the selective advantages of drivers as 10-50%, a range consistent with recent direct experimental measurements (24). Genomic data also show that deleterious passengers are approximately 100 times weaker. Our model offers a new interpretation of cancer treatment strategies and explains a previously paradoxical relationship between cancer mutation rates and therapeutic outcomes. Most importantly, it suggests that deleterious passengers offer a new, unexploited avenue of cancer therapy. 1

Model
We consider a dynamic population of cells that can divide, mutate (in a general sense, i.e. including alterations, epigenetic changes, etc), and die stochastically. Mutations occur during cell divisions with a per-locus rate µ. The number of driver loci in the genome, i.e. a driver target size, is T d , while the target size for deleterious passengers is T p . Hence, the genome-wide driver and passenger mutation rates are µ d = µT d and µ p = µT p respectively. A driver increases an individual's growth rate by s d ∼ 0.1, while a new passenger decreases the growth rate by s p ∼ 10 −4 − 10 −1 . Here we only consider fixed values of s d and s p because previous work showed that drivers and passengers sampled from various fitness distributions (exponential, normal, and Gamma) exhibit essentially the same dynamics (11). The net effect of multiple mutations on cell fitness w is given by w = (1 + s d ) n d (1 + s p ) np , where n d and n p are the total number of drivers and passengers in a cell.
The birth and death rates of a cell in our model depend not only on fitness, but also on the population size N via a Gompertzian growth function often used to describe cancerous populations (25) (see SI for details). At large N , deaths exceed births and tumors must adapt (or innovate) via new drivers to progress to larger population sizes. Thus, populations in our model expand and shrink in two ways: on a short time-scale due to stochastic cell divisions and deaths, and on a long time-scale due to the accumulation of advantageous and deleterious mutations. Previous models of advantageous and deleterious mutations have not considered a varying population size (26,27).
In cancer and other adapting populations the target size for advantageous mutations (drivers) is much smaller than the target size for deleterious mutations (T d T p ). If driver loci include a few specific sites (∼ 10 per gene) in all cancer-associated genes (approximately 100, (28)), then collectively drivers will constitute less than one one-millionth of the genome. Conversely, as much as 10% of the human genome is well-conserved and likely deleterious when mutated (29,30). In natural populations, T p should still remain much greater than T d simply because natural selection optimizes genomes to their environment, implying that most changes will be neutral or damaging. Indeed, most protein coding mutations and alterations were deleterious or neutral when investigated in fly (31), yeast (32), and bacterial genomes (33). We consider only moderately deleterious loci here (s p ≈ 10 −4 − 10 −1 )-which account for most nonsynonymous mutations (34,35). Deleterious mutations outside of this range either do not fixate or negligibly alter progression (11). Hence, we used a conservative size of T p ≈ 10 5 − 10 7 loci to account for passengers with fitness effects outside of this range that we are neglecting (see SI and Table S1 for details of parameters estimation). This quantity is still much greater than T d . Finally, we explored a variety of driver fitness advantages, as estimates in the literature ranged from 0.0001 (36) to 1 (24). Figure 1A shows the dynamics N (t) of individual populations starting at different initial sizes N 0 , which correspond to different potential hyperplasia sizes (we begin trajectories immediately after a stem cell acquires its first driver, see SI for a discussion of dynamics before this time point). Populations exhibit two ultimate outcomes: growth to macroscopic size (i.e. cancer progression), or extinction. The prevalence of either outcome is determined by a critical population size N * , about which larger populations (N > N * ) generally commit to rapid growth and smaller populations (N < N * ) generally commit to extinction.

A critical population size
To understand the cause of this critical population size N * , we looked at the short-term dynamics of populations. All trajectories show a reversed saw-toothed pattern (Fig. 1B), which result from a tug-of-war between drivers and passengers (11). When a new driver arises and takes over the population, the population size increases to a new stationary value. In between these rare driver events, the population size gradually decreases due to the accumulation of deleterious passengers. The relative rate of these competing processes determines whether a population commits to rapid growth or goes extinct.
We can identify the location of N * by considering the average change in population size over time (< dN/dt >), which is simply the average population growth due to driver accumulation (v d ) minus the population decline due to passenger accumulation (v p ). Fixation of a new driver causes an immediate jump in population size ∆N = N s d . These jumps occur randomly at a nearly constant rate f = µ d N s d , given by the driver occurrence rate µ d N , multiplied by a driver's fixation probability Similarly, passengers' velocity (v p = µ p N s p ) is a product of their rate of occurrence (µ p N ); their effect on population size (N s p ); and their probability of fixation (∼ 1/N , a more accurate measure of this probability is used below and provided in the SI). Thus, we obtain: where N * is the critical population size. Because the population velocity is negative below N * and positive above N * there is an effective barrier for cancer ( Fig. 1C): smaller populations tend to shrink, while larger populations tend to expand. Simulations support our conclusion that the probability of cancer increases with N 0 and sharply transitions at N * (Fig.  1D). Indeed, drastically different probability curves collapse onto a single curve once N 0 is rescaled by N * (computed from equation 2). Since N * captures only the average, or mean-field, dynamics, it misses the variability of outcomes in rapidly adapting populations. Figure 1E illustrates that the variability of outcomes depends upon the strength of drivers s d . Higher values of s d lead to larger stochastic jumps, which leads to larger deviations from mean behavior and more gradual changes in the probability of cancer across N 0 . Thus, we formulated and analytically solved a stochastic generalization of equation 1 that incorporates this variability (SI). Our solution provides an excellent fit to simulations (Fig. 1E) and indicates that N * and s d fully describe population dynamics (SI).
We can understand how N * and s d control cancer progression using a simple random-walk analogy. The population size experiences random jumps, resulting from driver fixation events, which are described by equation 1. These random jumps and declines are effectively a random walk in a one-dimensional effective potential (U ef f = (dN /dt ) dN ), Fig. 1C and SI) with stochastic jumps of frequency f and size ∆N . Similar to chemical reactions activated by thermal energy, cancer progression is a rare event triggered by a quick succession of driver fixations. Below, we show that human tissues operate in a regime where progression is rare and successful lesions are the rare lesions that happen to acquire drivers faster than average. We found that population dynamics depend entirely on two dimensionless parameters: a deterministic mean velocity, dependent only upon N/N * , and a stochastic step-size that is approximately proportional to s d . By reducing the complexity of our evolutionary system to two parameters, we were next able to infer their values for real cancers without over-fitting.

Model validation using cancer incidence and genomic data
Our model of cancer progression predicts the presence of an effective barrier to cancer where small lesions are very unlikely to ever progress to cancer. It also predicts a specific distribution in the number of passenger mutations and a specific relationship between drivers and passengers in individual cancer samples. We looked for evidences of these phenomena in age-incidence data (37) and cancer genomics data (28,(38)(39)(40). These comparisons also allowed us to estimate some critical parameters of the model: N 0 , s d , and s p . Figure 2A presents the incidence rate of breast cancer versus age (37) alongside the predictions from a classic driver-only model (SI) and our model. The incidence rate was calculated by considering a process where precancerous lesions arise with a constant rate r beginning at birth. These lesions then progress to cancer in time τ with a probability P (τ ) that we determined from simulations (Fig. S1). By convoluting this distribution P (τ ) with the lesion initiation rate r, we can predict the age-incidence rate I(t). Because many lesions go extinct in our model and never progress to cancer, the predicted incidence rate saturates at old-age: where P ∞ is the probability that a lesion will ever progresses to cancer, determined above.
Both the observed population incidence rates and our driver-passenger model saturate with age. This is a direct result of the probability of progression from a lesion to cancer being low. We estimate a lower bound for the rate of lesion formation r in breast cancer of at least 10 lesions per year that can be arrived at through two separate considerations: first, by considering the quantity of breast epithelial stem cells and the rate at which they can mutate into lesions (SI), and second, by considering the number of lesions observed  within the breast tissue of normal cadavers (23). By comparing this limit (∼ 10 lesions · year −1 ) to the maximum observed breast cancer incidence rate I max ≈ 10 −2 cancers · year −1 , we find that P ∞ ≈ 10 −3 , or only about 1 in 1,000 lesions ever progress. This finding is consistent with a number of clinical studies that have observed that very few lesions ever progress to cancer, while many more regress to undetectable size (41, 42)-another property seen in our model. Thus, good agreement between age-incidence data and our model is obtained when s d ≈ 0.1 − 0.2 and N 0 /N * is chosen such that P ∞ = 10 −3 . This suggests that cancer begins at a population size far below N * , where drivers are most often overpowered by passengers. Indeed, 21 of the 25 most prevalent cancers plateau at old-age suggesting that progression is inefficient in most tumor types (Fig. S1). In a driver-only model (see SI for details), every lesion progresses to cancer after sufficient time (i.e. P ∞ = 1), therefore a plateau in incidence rate can only result from a very low lesion formation rate (0.01 per year), which is inconsistent with abundant pathology data (23,43). Recent cancer genomics data offer a new opportunity to validate our model. Specifically, we looked at Somatic Nonsynonymous Mutations (SNMs) and Somatic Copy-Number Alterations (SCNAs) derived from over 700 individual cancer-normal sample pairs obtain from the breast (38), colon (28), lung (40), and skin (39) ( Table S2). We found similar results when analyzing SNMs and SCNAs both separately (Fig. S2, Table S3) or in aggregation (Fig. 2BC). Figure 2B shows a wide and asymmetric distribution of the total number of mutations, which is consistent with our model under realistic parameters. A driver-only model yields a narrower and more symmetric distribution that is inconsistent with the data (Fig. 2B). The driver-only model can fit the observed distribution only if it assumes that just 1-2 drivers are needed for cancer (Fig. S3, Table S4)-a value inconsistent with both the extent of recurrent mutations seen in cancer (10) and known biology. This large variance in mutation totals further supports our model and suggests that driver mutations and alterations have a large effect size: s d ≈ 0.4. Our estimate of s d ≈ 0.1 − 0.4 obtained from age-incidence and mutation histograms is in excellent agreement with experimentally measured changes in the growth rate of mouse intestinal stem cells upon induction of p53, APC or k-RAS mutations where measured values ranged from of 0.16 to 0.58 (24).
We then used cancer genomics data to compare the number of drivers and passengers observed in individual cancer samples to our model's predicted relationship. In our model, additional passengers must be counterbalanced by additional drivers for the population to succeed. If a lesion lingers around N * for a long time, then it must have acquired both many passengers and many counterbalancing drivers; while lesions that quickly progress through the barrier at N * acquire fewer of each. As a result, we expect a positive linear relationship between the number of drivers and passengers: n d · s d − n p · s p = constant; this result follows directly from the definition of fitness in our model (SI). Our predicted positive linear relationship between drivers and passengers is indeed observed in all tumor types that we studied (Fig. 2C, Table S3, p < 0.08 − 10 −6 ). The slope of this regression line is predicted to be s p /s d , which ranged from 1/21 to 1/193 (Table S3) for the various subtypes. While there is considerable variation and large margins of error in these numbers, these slopes (s p /s d ≈ 5 · (10 −2 − 10 −3 )) correspond to an s p of 5 · (10 −3 − 10 −4 ) when s d = 0.1. These rough values are similar to germ-line SNMs in humans of European descent, where 64% of all mutations exhibit an s p between 10 −5 and 10 −2 (35).
We considered and refuted several alternative explanations for the observed positive linear relationship between drivers and passengers. First, that the strength of SCNAs may differ from SNMs. Hence, we investigated each alteration-type separately and found positive linear relationships in both cases (Fig. S2,  Table S3). Second, that the number of driver alterations might be explained by variation in the tumor stage, or the rate and/or mechanism of mutagenesis. In Table S3 we show that these factors cannot suppress the correlation between drivers and passengers. Lastly, we considered and refuted the possibility that this relationship between drivers and passengers is non-linear (Fig. 2C insert). Because the data disagrees with all of these alternate hypotheses, we believe that it supports our conclusion that cancer progression is a tug-of-war between drivers and passengers.

A critical mutation rate
We next used simulations to investigate the probability of cancer over a broad range of evolutionary parameters ( Fig. S4) and found that there is a critical mutation rate above which the probability of cancer is exceedingly low (Fig. 3A). To explain this phenomenon and to find the parameters that determine this critical mutation rate µ * , we modified our analytical framework to consider selection against passengers and the effects of unfixed passengers on the accumulation of drivers. The modified framework, described in the SI, explains observed dynamics well (Fig. 3AB, S4). Previous theoretical work has shown that the number of unfixed passengers per cell is Poisson distributed with mean µ p /s p [first described in (44)]. This result assumes an approximate balance between the mutation rate of passengers and the selection against them, otherwise known as mutation-selection balance. The average fitness reduction of a cell due to this mutational load (i.e. the reduction in fitness relative to the fittest cells in the population) is µ p . A new driver arises in one of these cells at random and must carry the load of passengers residing in its cell along with it to fixation (18) (Fig. 3C); this process is often referred to as hitchhiking, so we describe these passengers as 'hitchhikers'. If the reduction in fitness due to the load of passengers (µ p = µT p ) exceeds the benefit of a new driver (s d ), then the driver will not fixate (Fig. 3C). Hence, cancer is extremely rare when µ p > s d . This suggests a critical mutation rate: The critical mutation rate suggests a new mode by which mutational meltdown operates. Prior models of mutational meltdown consider deleterious mutations in isolation (17), whereas our model points at the ability of deleterious mutations to inhibit the accumulation of advantageous mutations as a mechanism of  Figure 2: Signatures of balance between drivers and passengers in incidence and genomics data (A) Predicted and observed breast cancer incidence rates verses age. Notice that in our model, as well as in the data, incidence rates plateau at old age. A traditional driver-only model, where the incidence rate I increases with patient age t according to a power-law (I ∝ t k ), does not saturate. (B) Histogram of the collective number of protein-coding mutations (SNMs) and alterations (SCNAs) in breast cancer alongside predicted distributions. Our model, captures the width and asymmetry of the distribution well for s d = 0.4, while a driver-only model predicts a narrower and symmetric distribution. (C) The total number of driver verses the number of passenger alterations in sequenced tumors for several major subtypes. SC-NAs and SNMs were aggregated. As predicted by the model, all subtypes exhibited a positive linear relationship between the number of drivers and passengers (p < 0.08 − 10 −5 ). A driver-only model with neutral passengers does not predict this linear relationship between drivers and passengers. (Insert) The same genomics data plotted on log axes, with the yintercept from each subtype's linear fit subtracted. A linear relationship on logarithmic axes with a slope of approximately one suggests that the relationship between drivers and passengers is indeed linear. meltdown. While it has been previously shown that deleterious mutations interfere with the fixation of beneficial alleles (18,26,45), this phenomenon has never been studied in the context of population survival. We discuss some important implications of this critical mutation rate for cancer treatment below.
We found support for the critical mutation rate in both cancer age-incidence and cancer genomics data. If we constraint cancer progression to develop within the typical timeframe for cancer progression (i.e. when we begin to see a plateau in incidence: ∼ 60 years or 10,000 generations), the probability of cancer exhibits an optimum across mutation rates (Fig. 3D). Above µ * population meltdown is very common, while at very low mutation rates progression is too slow. The optimal mutation rate (10 −9 − 10 −8 mutations · nucleotide −1 · generation −1 ) is similar to mutation rates observed in cancer cell lines with a mutator phenotype (9) and the inferred mutation rate derived from the median number of mutations observed in a pan-cancer study of >3,000 tumors (10). Because µ * depends only on s d and T p , and is independent of other variables, we believe the maximal mutation rate should be the same across tumor subtypes (SI, Fig. S4). Indeed, a maximum of approximately 100 somatic mutations per Mb (99.4th percentile) was observed in the pan-cancer study mentioned above, which corresponds to our theoretical estimate (if we assume that the most mutagenic cancers still require 1,000 generations to progress).  Figure 3: Effect of mutation rate on cancer dynamics (A) The probability of cancer (adaptation) computed by simulations as a function of the initial population size and mutation rate. Evolutionary parameters roughly partition into a regime where cancer (adaption) is almost certain, and a regime where it is exceedingly rare. Estimates of N * from our theory (magenta, solid) accurately predict the transition observed at low mutation rates. Another transition is observed as mutation rate exceeds the critical mutation rate, also predicted by theory (µ * , blue line). Transition between these two regimes is better described by our theory when we incorporated (i) passenger interference with driver sweeps, and (ii) selection against passengers (black lines; dashed, solid, and dotted-dash predict 10%, 50%, and 90% probability of adaptation). (B) Cancer (adaption) probability (color) obtained by simulations as a function of mutation rate and s p . The theory (black lines) accurately reproduces the complex transition between both regimes. (C) Diagram illustrating how the load of passengers influences the probability of fixation of a driver. The distribution of fitness is due to a distribution of the number of passengers per cell, which follows a Poisson distribution with a mean reduction in fitness of µ p . Hence s p > µ p for a typical driver to outweigh the load of passengers. Segregating passengers not only reduce a driver's probability of fixation, but also its fitness benefit (26). (D) Probability of cancer for various mutation rates constrained to grow within 10,000 generations. We observe an optimum mutation rate.

Two regimes of dynamics
Taken together, our results demonstrate that the tug-of-war between advantageous drivers and deleterious passengers creates two major regimes of population dynamics: an adaptive regime where the probability of progression (cancer) is high (∼ 100 %) and a non-adaptive regime where cancer progression is exceedingly rare (Fig. 3, S4). Evolving populations that fail to adapt and go extinct may do so because they reside in a non-adaptive region of the phase space. Similarly, normal tissues that avoid cancers may present a tumor microenvironment that is in this non-adaptive regime. By keeping N * sufficiently high, a tissue or clinician could keep cancerous populations outside of the adaptive regime. This critical population size N * = T p s p /(T d s 2 d ) depends on the evolutionary parameters of the system. For example, if s p were increased by tuning the response of the immune system to mutation-harboring cells, or if T d were decreased via a drivertargeted therapy, adaptation would become less likely. Below we demonstrate that a successful treatment must push a cancer back to the non-adaptive regime.

The adaptive barrier and critical mutation rate explain cancer treatment outcomes
We simulated cancer growths and treatments and then monitored the long-term dynamics of these populations. Most treatments used today attempt to reduce tumor size, e.g. by specifically inhibiting key drivers (46) or by simply killing rapidly dividing cells (chemotherapy and radiation). Chemotherapy and radiation also elevate the mutation rate, thus affecting evolutionary dynamics. Previous work on the evolution of resistance to therapy has not considered the barriers to adaptation that we observe, so we re-investigated evolutionary outcomes from standard therapies and identified new potential ways to treat cancer. While real cancers have a varied evolutionary history, our analytical formalism predicts that cancer's future dynamics depend only on their current state, not their history (i.e. cancer dynamics are approximately path-independent, Fig. S5). We show below that this assumption can accurately predict outcomes in simulations and the clinic.
In Figure 4, we present the evolutionary paths of cancer-from hyperplasia, to cancer, to treatment, and relapse or remission-on top of the phase diagrams described earlier. Our analysis demonstrates that a treatment is successful if it pushes a cancer into the non-adaptive regime of evolutionary dynamics where the probability of adaptation is low. Conversely, therapies fail, and populations re-adapt and remiss, when the therapy does not move cancer far enough to place it in the non-adaptive regime.
Our model suggests that chemotherapies succeed, in part, because they move cancers across the mutational threshold µ * . Beyond this threshold, the probability that a driver is strong enough to overpower a load of passengers becomes small (see above, Fig. 2C), making it hard for cancer to readapt. Increasing the mutation rate has little effect on the critical population size N * (see above). Thus, our model suggest that cancers with a very high load of mutations/alterations are close to the critical mutation rate and should be more susceptible to mutagenic treatments, such as chemotherapy. Several recent studies (13,47) have noticed that patients survive breast and ovarian cancer most often when their tumors exhibited exceptional high levels of chromosomal alterations. This phenomenon was robust within and between subtypes of breast cancer (47). This finding is paradoxical for all previous models of cancer, where a greater mutation rate always accelerates cancer evolution and adaptation; yet is fully consistent with our model (Fig. 4B).
Treatments exploiting the mutational load of cancers (i.e. their accumulated passengers) remain largely unexplored. We show that increasing the deleterious effect of passengers s p causes tumors to enter remission. Increasing s p is doubly effective because it exacerbates the deleteriousness of accumulated passengers and also slows down future adaptation. When we simulate such treatment by a 3-5 fold increase of s p (Fig. 4C), we observe an immediate decline in the population size followed by a low probability of replace due to an increased N * . The phase diagram shows that a mild increase in s p is sufficient to push a population into an extinction regime and thus induce remission. Below we discuss possible treatment strategies that would increase s p .
Given the large number of treatment options, finding therapies that work synergistically is a very important problem in cancer research (reviewed in (48)). While synergism is often discussed in the context of pharmacology, our phase diagrams identify evolutionarily synergistic treatments. We found that remission was most likely to occur when the mutation rate and the fitness cost of passengers were increased simultaneously, more so than would be expected from simply adding together the effects of the individual therapies (Fig. S6). Hence, combinations of mutagenic chemotherapy along with treatments that elevate the cost of a mutational load may be most effective. According to our model, these therapies should also be compatible and complementary to driver-targeted therapies.

Discussion
We present an evolutionary model of rapid adaptation that incorporates rare, strongly advantageous driver mutations and frequent, mildly deleterious passenger mutations. In this process, a population can either succeed and adapt, or fail and go extinct. We found theoretically, and confirmed by simulations, two regimes of dynamics: one where a population almost always adapts, and another where it almost never adapts. Complex stochastic dynamics, which emerge due to a tug-of-war between drivers and passengers, can be faithfully described as diffusion over a potential barrier that separates these two regimes. The potential barrier is located at a critical population size that a population must overcome to adapt. We also found a critical mutation rate, above which populations quickly meltdown. This general framework for adaptive asexual populations appears to be perfectly suited to characterize the dynamics of cancer progression and responses to therapy. Progression to cancer is an adaptive process, driven by a few mutations in oncogenes and tumor suppressors. During this process, however, cells acquire tens of thousands of random mutations many of which may be deleterious to cancer cells. While strongly deleterious passenger mutations are weeded out by selection, mildly deleterious can fixate and even accumulate in a cancer by hitchhiking on drivers, as we have shown earlier (11). Passengers may be deleterious by inducing loss-of-function in critical proteins (14), gainof-function toxicity via proteotoxic/misfolding stress (15,49), or by triggering an immune response by a mutated epitope (16,50). While we looked at passengers in cancer exomes and in SCNAs, passengers may also constitute epigenetic modifications, or karyotypic imbalances (15). Hence the number of deleterious passengers may be more than currently observed by genome-wide assays.
Our framework suggests that most normal tissues reside in a regime where cancer progression is exceedingly rare; i.e. most lesions fail to grow above the critical population size and, thus, fail to overcome the adaptive barrier. Clinical cancers, on the contrary, reside above the adaptive barrier in a rapidly adapting state. Therapies must push a cancer below this adaptive barrier to succeed. In our framework, this entails moving the population below N * or increasing the mutation rate above µ * . The availability of a broad range of data for cancer allowed us to thoroughly test our framework's applicability.
We tested out model and estimated its parameters using cancer age-incidence curves, cancer exome sequences from almost 1,000 tumors in four cancer subtypes, and data on clinical outcomes. Age-incidence curves support the notion that the vast majority of lesions fail to progress and allow us to estimate the fitness benefit of a driver as s d ∼ 0.1 − 0.4. Genomics data suggests that passengers are indeed deleterious and that their deleterious effect is approximately one hundred times weaker than driver's beneficial effect. Moreover, the fitness benefit of a driver estimated from genomic data is roughly s d ∼ 0.4. Our range of values are consistent with recently measured 16-58% increases in the mouse intestinal stem cell proliferation rate upon mutations in APC, k-RAS or p53 (24). Taken together these data support the notion of a tug-of-war between rare and large-effect drivers and frequent, but mildly deleterious passengers s p ∼ 5 · (10 −4 − 10 −3 ), which nevertheless have a large collective effect.
Results of our analysis have direct clinical implications. Available clinical data (13,47,51) show that cancers with a higher load of chromosomal alterations, i.e. close to µ * , respond better to treatments. Our study suggests two potentially synergistic therapeutic strategies: to increase the mutation rate above µ * , and/or to increase the deleterious effect of accumulated passengers. An increase in the fitness cost of passengers would not only magnify the effects of an already accumulated mutational load, but also reduce future adaptation. This may be accomplished by (i) targeting unfolding protein response (UPR) pathways and/or the proteasome (15), (ii) hyperthermia that may further destabilize mutated proteins or clog UPR pathways (52), or (iii) by eliciting an immune response (16). Intriguingly, all these strategies are in clinical trials, yet they are often believed to work for reasons other than by exacerbating passengers' deleterious effects. In contrast, we predict that these therapies will be most effective in cancers with more passengers and an elevated mutation rate. Thus, characterizing the load of mutations/alterations in tumors may offer a new biomarker for predicting treatment outcomes and identify the best candidates for mutational chemotherapies.
While this study focused on asexual innovative evolution in cancer, our model may be generally applicable to other innovating populations. Consider a population in a new environment. The population is often initially small and fluctuating in size and often goes extinct, yet occasionally it expands to a larger stationary size by rapidly acquiring several new traits that are highly advantageous in the new environment. Both the evolutionary parameters (53) and observed phenomena (54) match our model well. Our mathematical framework may explain why these populations sometimes adapt, yet often fail.

Materials and Methods
All simulations were run using a previously-described first-order Gillespie algorithm (11). Driver genes were identified using MutSig (10) and GISTIC 2.0 (55) for potential NSM and SCNA drivers, respectively; requiring a Bonferroni-corrected enrichment p-value ≤ 5 · 10 −3 for classification of a gene as driver-harboring. All other genes were classified as 'passenger-harboring'.

Mathematical description and analysis of our model of advantageous drivers and deleterious passengers.
In this section, we present an exact mathematical formulation of our model, describe the broad ranges of parameters that we chose to explore, and offer an analytical description of our model. In our analytical description, we estimate (i) the effects of stochasticity on population dynamics, (ii) the rate of accumulation of deleterious passengers, and (iii) the interference of driver accumulation by deleterious passengers.

Detailed formulation of our model
As mentioned in the main text, we model cancer via a first-order Gillespie Algorithm. Each cell within the cancer is represented by a separate "chemical species" or reactant in the Gillespie algorithm. Cells are defined by their state: {n d , n p } 1 . n d denotes the number of drivers in the cell, while n p denotes the number of passengers. Cells can then divide, with and without mutations, and die according to the following half-reactions: The functions B(n d , n p ) and D(N ) represent the birth and death rates of cells, while N represents the total number of cells in the precancerous population. The birth rate assumes multiplicative fitness effects of mutations and no epistasis between mutations: We also define a generation in terms of the mean division time: The death rate is defined such that, in the absence of mutations, the expectation value of the population size will obey a Gompertz curve at large sizes and a logistic curve at small sizes: We used a simpler form of this death function for populations grown to less than 10 6 cells: This second functional form did not significantly alter dynamics at small sizes [28], has been used previously [24], is easier to calculate, and seemed equally justified to us for small sizes because very little is known about the true carrying capacity of a tumor micro-environment in its early stages. Lastly, the number of new drivers δd and passengers δp acquired during cell division are Poisson-distributed random variables with mean U d and U p , respectively. For example, P (δp = k|U p ) = U k p e −Up k! . Many of the particular design choices and properties of our model were altered and then investigated in a previous study [28]. Specifically, we considered (1) the effects of mutations with additive effects [i.e. B(n d , n p ) = 1 + n d s d − n p s p ], (2) the effects of mutations that alter the death rate [i.e. D = D(N, n d , n p )], (3) the effects of driver and passenger mutations selected from various distributions (exponential, normal, and gamma) of fitness effect sizes, and (4) variations on the relation between population size and death rate. For the parameters that we believe are most relevant to cancer (Table ), these permutations did not qualitatively alter our simulations. However, in the analytical analysis presented below we discuss the boundaries where assumptions of our model break-down; this was, in part, why we analyzed the model in such detail.
It should be noted that many of the considerations discussed above are germaine to all models of tumor progression, not simply the in silico model presented here. Consider that recent data on growth rates of human tumors differs from data obtain from mouse models: human tumors grow according to an exponential curve [7], while mouse tumors grow according to a Gompertz curve [14]. Careful mathematical consideration of the differences between a model of progression where growth is exponential, and one where growth is Gompertzian, should allow us to understand when it is necessary to refine the design of simulations and experiments and/or temper our conclusions.
Before describing the entire dynamics of our model, it is useful to consider the difference between our simulations initiated at their stationary size (N 0 cells) and simulations initiated at 1 cell. In the absence of mutations, an initial population of one cell will grow logistically until it reaches the stationary size. Hence, it takes approximately Log 2 [N 0 ] ∼ Log 2 [10 3 ] ∼ 10 generations for the initial cell to approach stationary size. This is far shorter than the average time required for cancer progression (∼ 10, 000 generations) and the time required for a new driver to accumulate (∼ 1/(U d N 0 s d ) ∼ 1, 000 generations). Thus, our choice of initiating a tumor at one cell versus N 0 does not significantly alter the conclusions of our model. This comparison of timescales also suggests that cancers are almost always near their stationary size: We previously tested this conclusion in simulations and found that it is a excellent approximation of tumor size [28]. If we assume B(n d , n p ) ≈ B(n d , n p ), then a relationship between the number of drivers and passengers in a tumor and its size is obtainable: This final equation suggests that there exist a linear relationship between drivers and passengers among tumors with similar s d and s p , which we assume is the case for tumors of the same tissue of origin. The relationship should be relatively robust to tumor size, but sensitive to the fitness effects of drivers and passengers. Moreover, changes in the functional form of D(N ) will alter the y-intercept of this linear relationship, but not the slope of the relationship. Hence, we can draw conclusions about the relative strength of drivers versus passengers (s d /s p ) without knowing the exact constraints on population size. We tested and verified this prediction of a linear relationship between drivers and passengers in the main text. Our computational model has 5 independent parameters: a mutation rate µ, a mutation's relative likelihood of being a driver versus a passenger T d /T p , the fitness benefit of a driver s d , and disadvantage of a passenger s p , and an initial stationary size N 0 . These parameters vary considerably between tumor types (and the mutation rate even varies within tumor types [26]), so we explored a wide range of values centered around literature best-estimates (Table ). More importantly, our analytical analysis reveals that we can describe our system with two dimensionless parameters, which we then estimated from age-incidence and genomics data (Fig. 2).

Overview of our analytical model for dynamics
In the main text, we demonstrate that dynamics are described by two countervailing forces: an upward velocity v d resulting from accumulating beneficial drivers, and a downward velocity v p resulting from accumulating deleterious passengers. The upward velocity v d was further subdivided into a product of the rate at which new drivers fixate in the population f times their effect on population size once fixated ∆N (Fig.  1B) 2 . The velocities v d and v p are balanced at a critical population size N * , at which the population is approximately equally likely to go extinct or progress to cancer.
While we were able to describe the average behavior of our population in the main text, our system, like cancer, is inherently stochastic. Its complete dynamics is best described by a differential equation with stochastic jumps: In this equation, the change in population size dN is the product of a deterministic component −v p , along with a stochastic component describing the random arrival of new drivers (∆N dn d ). Below, we use this equation to estimate the probability of cancer for any population size P cancer (x) and the mean waiting time to cancer t cancer (x). Lastly, we noticed that simulations differed from the formalism we presented in the main text when we varied the mutation rate µ and explored a broader range of passenger deleteriousness s p (Fig. 2, S4). These discrepancies could be resolved by considering two phenomena neglected by our first derivation: selection against passengers, and passenger's effect on both the fixation probability and clone fitness of drivers. Fortuitously, accounting for these phenomena did not alter Eq. 3, nor the overall framework of our analytical model. Instead, they only affect the rates v p , f , and jump size ∆N in our model. Thus, with the refined formalism, we described dynamics across a very broad range of parameters (Fig. 2,  S4). More importantly, we observe drastic reductions in the probability of adaptation at high mutation rates and encompassing moderately deleterious passengers. These findings suggest novel strategies to cancer therapy. Population size is the state variable of our system and, as such, is all that is needed to describe future dynamics (this is evident from Eq. 3, and can be observed in Fig. S5). By converting population size into a dimensionless parameter x = N/N * (and x 0 = N 0 /N * ), the probability of cancer collapse onto a simple curve P cancer (x) (Fig. 1)-further underscoring the importance of the critical population size. Hence, we will use this dimensionless quantity heavily throughout the remainder of our analysis.

Estimating the probability of cancer
Using Eq. 3 we can describe how the probability of extinction changes in an infinitesimal time due to either passenger accumulation or a rare driver jump: Note that f , ∆N , and v p are all functions of x. In this equation, we see that is the probability of cancer at x is the probability of a jump times the probability of cancer after the jump (f (x)dtP cancer [x+∆N (x)]) plus the probability of decline times the probability of cancer after the decline ( . The probability of cancer after a decline can be expanded via a Taylor series: . This reduces the above equation to: Here θ = (x + ∆N )/x ≈ 1 + s d denotes the logarithmic change in population size after a driver jump. Like ∆N , v p = µ p s p N and f = µ d s d N are also linear in x. Thus, by expanding the logarithm of θ via Maclaurin Series: P cancer [x + ∆N (x)] ≈ P cancer (x) + xLog(θ)P cancer (x) + x 2 Log 2 (θ) + P cancer (x), we arrive at a now solvable differential equation: x 2 Log(θ)P cancer (x) + [Log(θ)x + 2x + 2N * ]P cancer (x) = 0 With this equation, and boundary conditions that follow from our definition of cancer and extinction: , we can solve for the probability of cancer after infinite time: Here, γ(s, x) = 1/Γ(s) x 0 e −t t s−1 dt : Γ(s) = ∞ 0 x s−1 e −x dx is the normalized incomplete gamma function. This solution is parameterized by two dimensionless quantities: θ and x, which represent the jump size in population of driver sweeps and our effective population size respectively.

Estimating the mean time to progression
We can also use Eq. 3 to solve for the waiting time to cancer. This can be accomplished in two ways: (1) we can simulate random driver jumps and deterministic passenger decline directly, and (2) we can approximate the mean waiting time to cancer using a Taylor expansion similar to the strategy we employed to solve for the probability of cancer. These two approaches agree with each other (thus, illustrating their accuracy), and offer key insights into the evolutionary parameters that affect age-incidence curves (Fig. 2, S1).
Eq. 3 can be simulated using a "hybrid" Gillespie algorithm: a meta-simulation of driver-and passengeraccumulation events that we, originally, observed arising from our atomistic simulations of birth, death, and mutational events. The advantage of this technique is that it allows us to quickly simulate billions of tumors, which would be computationally impossible via the more detailed simulations. Because we are confident that we are accurately estimating the rate of driver and passenger accumulation events (Fig. S4), this simplification should retain accuracy. To simulate Eq. 3 directly, we must consider that the instantaneous probability of a driver jump f [x(t)] is a function of a constantly declining population size due to passenger accumulation: Here, x n d is the population size after the last driver jump. Thus, the waiting time between drivers ∆t = t n d +1 − t n d is: , where ζ is an exponentially-distributed random number with mean 1. Using our precise calculations of f , v p and ∆N below, we can now simulate Eq. 3 directly.
We can also solve Eq. 3 for t cancer , using the exact same approximations as we did to estimate P cancer (x). To do this, we begin with a Master Equation for the probability of acquiring a cancer after waiting time t when currently at size x: The mean waiting time to cancer is then: By substituting the Master Equation into this definition, and by utilizing the first-order Taylor series expansion: , we find: The first integral in this solution is simply the definition of our mean waiting time (t cancer (x)). The second integral can be integrated by parts by noting that lim t→∞ tP cancer (x, t) = 0 (otherwise, t cancer (x) would be undefined). Lastly, the third integral reduces to t cancer (x). Thus, we eventually find: Here, ρ c (x) = P cancer (x)t cancer (x). This equation has a nearly identical form to Eq. 4. So we used a similar Second-Order Maclaurin series expansion of θ to approximate its solution: This equation can be solved using Simpson's Method and is in good agreement with the hybrid simulations described in the preceding paragraph (Fig. S1). This calculation of the waiting time to cancer is most illustrative when x 1-the regime that we expect to contain most tumors. In this regime, the mean time increases as −Log(x)/v p , which implies two interesting properties of t cancer . First, x has a very weak, sub-linear, effect on the waiting time and does not significantly alter the shape of incidence curves (Fig.  S1). Second, the waiting time to cancer is dictated by v p (the accumulation rate of passengers), thus offering yet another reason to understand the rate of deleterious passenger accumulation.

Accumulation of deleterious passengers
Passenger mutations accumulate and drag populations down by a rate v p . This quantity is a product of passengers arrival rate µ p N , their fixation probability π p , and their effect on population size once fixated N s p (i.e. v p ≈ µ p s p N ). In the main text, we assume that the fixation probability is approximately neutral (π p ≈ 1/N ); however, when selection is stronger that genetic drift, the fixation probability becomes less than the neutral rate. A number of studies have focused entirely on estimating this fixation probability [9,19,20,30]. In general, estimates of this fixation probability begin by considering the distribution of deleterious alleles in a population of infinite size in mutation-selection balance-where allele frequencies are not changing. At equilibrium, such a population exhibits a Poisson distribution in the number of segregating passengers within cells N np , defined by a characteristic parameter λ p = µ p /s p (Fig. 3C): If we then consider a population of finite size, we find that the allele frequencies fluctuate due to genetic drift. If fluctuations in the fittest class (N np=0 = N e −λp ) are large enough to cause this fittest class to go extinct, then it is irrevocably lost from the population. This irrevocable loss is considered a 'click' of Muller's Ratchet.
The new fittest class-individuals harboring one segregating passenger prior to the 'click'-then relaxes to a new equilibrium that fluctuates, and the process repeats. Estimating the time required for a new fittest class to relax to equilibrium size immediately following a 'click' is non-trivial and dependent upon the parameters of the system: N , s p , and µ p , which can vary by orders of magnitude depending upon the evolutionary system in question; hence there are many estimates of the exact rate of Muller's Ratchet. We present and utilize 3 estimates of the rate of Muller's Ratchet: (1) a solution that ignores the time to equilibration and works decently for most values of s p , µ p , and N considered here (Fig. 2, magenta lines); (2) a travelingwave solution that allows the distribution of segregating passengers to be far from equilibrium, but presumes that that the size of neighboring fitness classes are uncorrelated, accurate for large values of λ p [9]; and (3) a path-integral solution that considers correlations between neighboring fitness classes, but requires that the population be in quasi-equilibrium (i.e. near mutation-selection balance), accurate for small values of λ p [19]. These later two estimates were combined with an estimate of the number of hitchhiking passengers and their effects on driver fixation events to form a very precise description of our model's dynamics (Fig.  3, S4; black lines). We felt that offering these two refined models to readers was useful because they trade-off applicability, simplicity, and accuracy. If we simply ignore the equilibration tome of a population into mutation-selection balance, then we can estimate the rate of Muller's Ratchet with a closed form solution that is applicable to all values of s p , µ p , N investigated here. The other two solutions each apply only to their respective halves of the parameter space and are more complex, but also more accurate. To our knowledge, we are the first authors to present this simplified solution. We obtain the solution by assuming that the probability of a 'click' is approximately the probability of a new passenger fixating within the fittest class: N np=0 = N e −λp . In other words, to a first-approximation, deleterious passengers simply reduce the effective population size of our system. The probability of a lone deleterious allele fixating within this fittest class is describe by a Moran Process [31]. Hence, This refined fixation probability π (1) p is then used to correct the downward velocity due to passengers, using the same formula for v p derived in the main text: This equation links v p to the passenger fixation probabilities calculated above, and the other two fixation probabilities calculated below.
The solution for Muller's Ratchet as a traveling wave, which we apply when λ p < 1, was obtained from [9]: [S11] Because this equation is transcendental, we solved for π (2) p using Brent's Method. When λ p ≥ 1, a quasi-stationary analysis of the mutation classes becomes appropriate. This analysis was first done in [19], resulting in a solution of the form: [S12] The fixation probability is then simply the inverse of the 'click' time: π (3) p = 1/T click . Lastly, there is a discontinuity between the above two solutions at their intersection: λ p = 1. We resolved this by interpolating between the two solutions, as follows:

Effects of deleterious passengers on fixation probability and clone fitness of drivers
The occurrence and fixation of driver mutations are rare events, separated by nearly random time intervals, with a frequency of occurrence f = µ d N π d . Here, π d is the fixation probability of a new mutant driver once it arises in the population. In the first-order model presented in the main text, we estimate that However, this result assumes that there are no other non-neutral alleles in the population. In reality, there are many segregating passengers in the population, and potentially other segregating drivers. The presence of other drivers in the population, which interfere with the fixation of our clone of interest, is a phenomena commonly described as Clonal Interference [18]. Clonal Interference becomes significant in the population once the time required for a driver to fixate [∼ Log(N )/s d generations] approaches the fixation rate (f ≈ µ d N s d ). Nascent precancerous population are in a space of evolutionary parameters where Clonal Interference is particularly negligible: population size is small (N ∼ 10 3 ), and drivers are rare (µ d ∼ 10 −5 ), but strong (s d ∼ 10 −1 ). Thus, we do not consider its effects here. However, for a larger tumor population, clonal interference may become very significant. This is especially true in a poorly-mixed population, where beneficial alleles take longer to sweep through the population [25].
Segregating passenger mutations can also interfere with a driver sweep by 'hitchhiking' on the expanding clone [2,23]. We consider two types of hitchhikers: (1) those that reside in the Initial clone before the new driver arises (denoted δp I ), and (2) those that arise and fixate in the new driver clone as it Sweeps through the population (denoted δp S ). It is necessary to distinguish hitchhikers this way because only the initial hitchhikers (δp I ) significantly alter the fixation probability f , while both types alter the effect size ∆N .
The hitchhikers that accumulate during the sweep will generally arise after the clone is of appreciable size; however, once the driver clone is of appreciable size, it is exceedingly likely that it will fixate so long as it remains the fittest clone in the population.
Here, we consider only the average number of hitchhikers in a driver sweep (δp I and δp S ), rather than their entire distribution of quantities; estimates of the average number of hitchhikers appear to explain dynamics reasonably well (first shown in [23] and also evident from our analysis' good agreement with simulations Fig.  S4). Thus the probability that a new clone fixates in the absence of Clonal Interference is (Fig. 3C): , and the jump size ∆N becomes: We can conclude our analysis of hitchhikers once we obtain δp I and δp S . These quantities were first derived in [23]. We use their results (summarized below), along with a minor necessary adjustment for populations when λ p is large, to complete our analytical model of cancer progression. For a new driver clone to take over the population and fixate, it has been shown that its fitness must be greater than the fittest class in the population [23]. This imposes a maximum on the number of initial hitchhikers δp max I that a successful driver clone can have: A clone that does not satisfy this constraint may proliferate for a while in the population, but it will nevertheless be eventually out-competed by fitter clones. When the mean number of hitchhiking passengers (λ p ) approaches this maximum, hitchhikers dramatically reduce both f and ∆N , thus increasing N * to untenable sizes. This occurs when: Hence, our analysis suggests a limit on the maximum mutation rate that an adapting population can tolerate: µ * p ≈ s d . In simulations, we observe extinction slightly above this threshold (Fig. 3A, S2). This mechanism of collapse, where populations go extinct by failing to acquire new advantageous mutations or adaptations, differs from the traditional model of mutational meltdown. In the traditional model, advantageous mutations are generally ignored and meltdown occurs only because deleterious mutations accumulate too quickly. In our model, however, traditional mutational meltdown is difficult because populations also acquire advantageous mutations faster as the mutation rate increases. Moreover, traditional meltdown occurs only when the population size is small, making it impossible to occur in a large population like cancer. Our discovery of a new mechanism of meltdown that is independent of population size suggests that mutational meltdown may be induced via cancer therapeutics.
The number of initial segregating passengers in a clone when a driver arises (δp I ) can be obtained by considering, once again, the population at mutation selection balance, i.e. Eq. 8. The average number of initial hitchhiking passengers is simply the average of the likelihood of a driver arising in each mutational class, conditional on the driver successfully sweeping through the population: Here, N = δp max I δp I =0 π d (δp I ) is a normalization constant. The above solution fails when λ p is large. In this circumstance, the population is far from mutationselection balance. Rectifying the solution in this case is difficult to do precisely, however we find that a simple correction to Eq. 16 can crudely ameliorate the estimate. Because the assumption of mutationselection balance fails only once the expected number of passengers in the fittest class becomes very small (N np=0 = N e −λp ∼ 1), we propose that the actual fittest surviving class in the population is the first class of passengers with an expected population size that is greater than the size of fluctuations in the population. Because the variance in a birth and death process is the sum of the rates (2N in our model), the Fittest Surviving Class k FSC is: The corrected distribution of δp I then becomes: This simple correct yields a final solution for P cancer that agrees with simulations well (Fig. S4).
Lastly, the number of passengers that accumulate during the selective sweep (δp S ) can be calculated using a recursive relationship. that begins with the probability of accumulating the maximum possible passengers during the sweep δp max I [23]:

A traditional model of cancer progression with drivers and neutral passengers.
In the traditional model of cancer progression used to estimate age-incidence curves, it is assumed that a cancerous population transitions through k intermediate states before malignancy: Simply put, these intermediate states and transitions correspond to the many phenotypic changes that occur within a tumor as it progresses [21]. The instantaneous probabilities of each transition from one state to the next r i can vary in the general case. Nevertheless, it has been shown that this predicts similar age-incidence rates to a model where transition rates are all the same [1]. Thus, for parsimony we only consider the case where all transition rates are the same constant r. Moreover, if the transition rates are drastically different from one another, then dynamics will largely be determined by the slowest rate and the faster rates can be neglected. From a mathematical perspective, this model is agnostic towards the underlying event that transitions a precancerous population from one state to the next. However from a genetic perspective, each transition corresponds to the acquisition of a new driver alteration in the population. If rate-limiting events are nonheritable, then the inferences we draw from age-incidence curves about this traditional model may break down; however, we would like to reiterate that a large variety of events can be drivers in the sense that they are inheritable across cell division: SNMs, SCNAs, alterations in DNA and histone moieties, stable changes in cell signaling cascades, etc. Therefore, we believe it is reasonable to assume that each rate-limiting step is the acquisition of a new driver, as has been presumed for many years [29].
We now consider the properties of this model when neutral passengers also accumulate. These neutral passengers do not alter progression. The precancerous population is now defined by the state C n d ,np . We consider the case where drivers accumulate at a fixed rate r d and passengers accumulate at different fixed rate r p : As before, cancer arises once enough drivers accumulate (C n d =k,np ).
To interpret age-incidence data, as well as genomics data, we are interested in both the waiting time until cancer (t cancer ) and the total number of mutations (n p + k). This model can be simplify by noting that there is a freedom in the units for which we measure time. In our simulations, time was measured in generations and then converted to years. Here, we chose to measure time in units of the driver transition probability r d and will then convert this to years afterwards. Hence, r d = 1 without loss of generality. Consider the quantity τ cancer = t cancer r d , as a dimensionless measure of the waiting time to cancer. Because driver and passenger accumulation events are independent processes in this model, the joint probability of observing a cancer at time τ cancer with n p passenger mutations, P (τ cancer , n p |n d = k, r p ), is: P (τ cancer , n p |n d = k, r p ) = P (τ cancer |n d = k) · P (n p |τ cancer , r p ) [S17] This joint probability distribution provides a framework for identifying our quantities of interest.
The waiting times to cancer in this neutral-passenger model, has been previously shown to be a sum of exponentially-distributed waiting times [1], i.e. an Erlang or Gamma distribution, of the form: Traditionally in this model, it is believed that very few precancerous population have enough time to progress, as lesion formation rates are much greater than cancer incidence rates. Hence, it is believed that age-incidence curves should be fit with only the beginning of this distribution: a power-law distribution presented in the last line is appropriate. We find that although this hypothesis explains age-incidence rates well at mid-age, it fails to explain the plateau in age-incidence rates seen at older ages in most cancer subtypes ( Fig. 2A,  S1).
In this model, the total number of passengers accumulated is a Poisson distribution, if the time of progression t cancer is known: P (n p |τ cancer , r p ) = Poisson[n p | < n p >= t cancer r p ] = e −<np> < n p > np /n p ! [S19] Here, < n p >= t cancer r p is the mean number of expected passengers. The distribution takes this form because each passenger accumulation event occurs with an exponentially-distributed waiting time; a Poisson distribution describes the sum of events with exponentially-distributed waiting times in a fixed time interval. Because we do not know when a new lesion arrises, we must convolute this distribution with our expected distribution of t cancer . The available time for cancer progression depends upon the length of a human life: t human . If t human < t cancer , then the precancerous population will be unobserved in age-incidence and genomics data because the person died of an alternate cause prior to malignancy. Although the actual distribution of human lifetimes is complicated, we can still make inferences about the validity of this model by considering its extremes. Consider two opposing extreme cases: (1) when t human t cancer , all lesions eventually progress and are sequenced (i.e. a human lifetime is much greater than the mean time to cancer); and (2) when t human t cancer , only a few exceptional lesions progress (i.e. the mean time to cancer is much shorter than a human lifetime). We find that this first extreme predicts a much broader and more positively skewed distribution in the number of passengers, than the second case (Fig. S3). Nevertheless, it is still not wide enough, nor positively skewed enough, to explain the observed distribution of passengers in cancer under realistic parameters (Fig. 2B, S3, Table ). In contrast, our model predicts a broader and positively skewed distribution that captures observed passenger histograms well (Fig. 2B).
In the case where t human t cancer , accumulation of passengers follows a binomial process. Each accumulation event has probability p = r d /(r d + r p ) of being a driver and probability (1 − p) of being a passenger. Because the population has infinite time to progress to cancer, the binomial process continues until n d = k drivers accumulate. A binomial process that continues until k successes (i.e. drivers), will have a total number of failures (i.e. passengers) that samples a negative binomial distribution: A negative binomial distribution with p 1 (i.e. passengers greatly outnumber drivers-as is the case in observed) reduces to a Poisson distribution.
In the case where t human t cancer , the waiting time to cancer follows a power law distribution (Eq. 18). This, convoluted with the distribution of passengers expected for a particular t cancer (Eq. 19) yields the expected distribution of passengers for a cancer subtype: Where γ(s, x) is the normalized incomplete gamma function defined previously (Eq. 5). In the above derivation, we eliminated a parameter by considering the quantity: < n p > max = τ h r p , which corresponds to the mean number of passengers expected for a person who lives until the maximum allowable time, τ h . This solution is compared with the other extreme case in Figure S3. Obviously, for both predicted passenger distributions (Eqs. 21 and 20) the total number of mutations n d + n p is the expected number of passengers P (n p |k) plus the number of drivers k, which is constant. success rate Figure S1: Comparing waiting times to age-incidence curves A Mean waiting time to cancer t cancer (x) decreases as initial population size x = N 0 /N * increases. We solved for t cancer (x) from Eq. 3 using two methods: (1) via stochastic 'hybrid' Gillespie simulations (Eq. 6), and (2) via an analytical approximation (Eq. 7). Agreement between these two estimates suggests that our solution is accurate (see Estimating the mean time to progression for details). The mean time to cancer t cancer (x) depends heavily on the probability of adaptation P cancer (x), as t cancer is a conditioned on a population successfully progressing to cancer. Because P cancer (x) has an inflection point at x = 1, t cancer (x) behaves very differently when x > 1, than when x < 1. When x > 1, t cancer is approximately the mean expected velocity of the population integrated over population size, as nearly all cancers succeed. However, when x < 1, t cancer (x) is significantly shorter than would be expected from mean behavior. This is because most populations go extinct. Only the exceptional populations that progress to cancer are weighted in the mean of t cancer (x); these exceptional populations happened to grow much faster than the average population. Hence, the increase in waiting time to cancer is sub-linear with respect to x when x < 1. Most importantly, these results suggest that the shape of our predicted age-incidence cures (below) will depend almost entirely on s d and not x, thereby simplifying the interpretation of data. B Incidence rate verses age for 25 most common cancers in the SEERs databases [22]. Nearly all cancers show incidence rates that rise rapidly at mid-life, but then plateau at old-age. Leukemias tend to have flatter curves, which may suggest that they need fewer drivers for carcinogenesis. Colorectal cancer is exceptional in that it does not plateau and, instead, exhibits a power-law relationship for all ages. Incidence curves are flatter at young ages because of patients with a genetic predisposition to cancer that expedites progression; neither our model or the traditional neutral-model of cancer progression attempt to explain these incidences. C The predicted age-incidence curves derived from simulations can explain observed age-incidence curves in most cancer subtypes well when proper parameters are chosen. The slope of predicted age-incidence curves is described by s d : a larger s d causes the slope of age-incidence curves to decrease. In contrast, the location in the plateau of age-incidence curves is described by the success rate of cancer progression P ∞ multiplied by the lesion formation rate r. While we do not know the precise value of these two quantities (r and P ∞ ) that determine the plateau level, we note that both parameters collectively introduce only one free parameter in the description of age-incidence curves. Both parameters only alter the location of the plateau, not the shape of the distribution (as predicted in A). r = 5 in the predicted age-incidence curves plotted, however we expect this value to vary considerably between the cancer types. In the main text, we argue that r is at least 10 lesions · year −1 in breast epithelial. This estimate was based on the assumption that r =(# of mamospheres in breasts)(# of stem cells per mamosphere)(# of initiating mutations per cell per year) = (10 7 )(10)(10 −7 ). Lower-bounds of literature-derived quantities were used in this estimate ["]. Moreover, the number of lesions observed in normal breasts corroborates this estimate [32,35]. Thus, we eliminate the possibility that age-incidence curves can be explained by models which assume that most lesions eventually progress to cancer. D The actual initial population size (N 0 ) necessary to obtain various success rates of cancer progression from various s d utilized in C. Values were obtained by iteratively simulating various initial sizes until converging to the desired success rate. and Somatic Copy Number Alterations (SCNAs). A A positive linear relationship is observed between driver and passenger SNMs in all cancer subtypes studied here. This suggests that additional SNM passengers are being counterbalanced by additional drivers, and is consistent with our conclusions in the main text. Slope and y-intercept of the best-fit lines can be found in Table S3. B This positive linear relationship is also observed in SCNAs. The similar slopes and y-intercepts of SCNAs to SNMs supports our assumption that SCNAs and SNMs can be aggregated in analysis and modeling.  Figure S3: Neutral-passenger model of cancer progression explains distribution of mutation totals in cancers only with unrealistic parameters.
A We estimated the expected distribution of total mutations in sequenced tumors from a traditional model of cancer progression with neutral passengers (see A traditional model of cancer progression with drivers and neutral passengers. for details). The expected distribution depends upon the time allowed for progression. If the time for progression is much shorter than the mean, we expect power-law waiting times to cancer and a distribution of passengers that follows Eq. 21 (Black lines). If the time to progression is much longer than the mean, then we expect a Negative Binomial distribution (Eq. 20, Gray lines). We compared these distributions for various quantities of drivers necessary for carcinogenesis (k = 3, 5) and various quantities of the mean number of passengers (< n max p >= 100, 500). Because these distributions differed and we found no compelling reason to chose one over the other, we selected the distribution that was a priori more likely to fit the observed distributions of total mutations; we wanted to give the neutral-passenger model of cancer progression every opportunity to succeed. The second scenario (gray lines), where cancers have ample time to progress predicts a distribution with more variance and positive skew. Large variance and positive skew are observed in the true distribution of passengers, therefore we used this distribution. B We investigated the mutation totals of the 11 cancer subtypes that TCGA has sequenced in over 100 tumors [26]. The distribution of mutation totals in these subtypes is compared in a Quantile-Quantile plot to the best-fitting Negative Binomial distribution for each subtype. Deviations from the black line indicate regions of the observed distribution that are poorly explained by the theoretical distribution. While this theoretical distribution explains observed distributions well (excluding Colorectal cancers and Gliablastoma Multiforme), the Maximum-Likelihood Estimators (MLEs) used to fit this distribution suggest that only 1-2 drivers are necessary for progression (Table )-quantities that are inconstant with known biology and age-incidence curves. The probability of progression, determined from the outcome of 3,000 simulations propagated until extinction or rapid growth, across the parameter range of our model. In simulations, we observe parameters where progression occurs, fails, and is rare. A sophisticated analytical framework incorporating selection against passengers, hitchhiking of passengers onto driver mutations, and stochasticity predicts phase patterns well (black lines). This sophisticated analytical model uses two solutions for Muller's Ratchet in various parameter regimes (see Selection against passengers.), an estimate of the quantity of hitchhiking passengers (see Effects of passengers on driver fixation), and a stochastic differential equation to estimate probabilities of progression (see Estimating the probability of cancer). A simplified framework, which offers a closed-form solution, is possible and works reasonably well (magenta). This solution differs from the more precise solution in two ways: (1) a novel estimate of Muller's ratchet is used (Eq. 9), and (2) hitchhikers that accumulate after the new driver clone arises δp S are neglected. A-A P cancer increases as the relative target size of drivers T d verses passengers T p increases, for all parameters. B-B , A P cancer exhibits a local minimum with respect to the selection against passengers s p . When the selection against passengers is very weak, passengers are effectively neutral. When the selection against passengers is too strong, natural selection weeds out passengers and they never accumulate. Deleterious passengers are most effective at preventing cancer when at moderate fitness effects. This local minimum suggests that there may be two types of cancers: those that exist in an environment/genetic context where passengers are weak and buffered by, for example, an activated UPR; and those that succeed by exacerbating passengers' deleterious effects, perhaps by minimizing the mutation rate. C, C , A , B The probability of cancer always increases with s d . This parameter has a profound affect on cancer progression, as can be seen by how little the boundary between success and failure appears to be almost independent of the other parameters. Thus, this parameter is easy to estimate from epidemiological patterns. D, A An increasing mutation rate affects the probability of cancer very little at first; however, once it exceeds a critical value (µ * ), the probability of cancer drops precipitously.   Figure S6: Combination treatments that increase mutation rate and selection against passengers work best. A Using the analytical theory describe in Fig. S4, we can plot the critical population size N * across evolutionary parameters as a contour plot. Optimal therapy would be drug combinations that increase N * most dramatically, which would be the gradient of steepest ascent (blue lines). From this 3-Dimensional perspective the interplay between µ and s p is evident. At low mutation rates, only weak passengers with low s p can fixate. Thus, treating cancers with low mutation rates by increasing s p may be ineffective. At high mutation rates, all passengers fixate and increasing s p increases N * . At intermediate mutation rates, the most effective way to increase N * would be to moderately increase both the mutation rate and s p . B Via simulations, we tested our prediction that the gradient of steepest ascent is optimal for the magenta gradient line in A. 50 cancers with µ = 10 −8 , s p = 0.001 grown to 10 6 cells were treated with combinations of mutagenic and s p increasing therapy. Indeed, moderate increases in both parameters were more effective than would be expected from the lone treatments, thus confirming our prediction. These results suggest that combinatorial therapies may be most effective at treating cancer and that evolutionary modeling can guide clinical decisions. We explored our evolutionary model incorporating driver and passenger mutations across a broad range of parameters. The ranges were motivated by literature estimates that we discussed previously [28]. Note that in simulations µ d = µT d and µ p = µT p ), hence we can explore the entire phase space by only altering µ and T d /T p ; altering all three parameters would be redundant. In Figure 2 we compare our model to epidemiological and genomic data and find that the best-fitting parameters agree well with these prior published estimates.  [10], and 121 melanomas [5].