Clonality and intracellular polyploidy in virus evolution and pathogenesis

In the present article we examine clonality in virus evolution. Most viruses retain an active recombination machinery as a potential means to initiate new levels of genetic exploration that go beyond those attainable solely by point mutations. However, despite abundant recombination that may be linked to molecular events essential for genome replication, herein we provide evidence that generation of recombinants with altered biological properties is not essential for the completion of the replication cycles of viruses, and that viral lineages (near-clades) can be defined. We distinguish mechanistically active but inconsequential recombination from evolutionarily relevant recombination, illustrated by episodes in the field and during experimental evolution. In the field, recombination has been at the origin of new viral pathogens, and has conferred fitness advantages to some viruses once the parental viruses have attained a sufficient degree of diversification by point mutations. In the laboratory, recombination mediated a salient genome segmentation of foot-and-mouth disease virus, an important animal pathogen whose genome in nature has always been characterized as unsegmented. We propose a model of continuous mutation and recombination, with punctuated, biologically relevant recombination events for the survival of viruses, both as disease agents and as promoters of cellular evolution. Thus, clonality is the standard evolutionary mode for viruses because recombination is largely inconsequential, since the decisive events for virus replication and survival are not dependent on the exchange of genetic material and formation of recombinant (mosaic) genomes.

evolutionary dynamics | mutation | quasispecies | recombination | genome segmentation

Viruses are the most abundant and ubiquitous genetic elements in our biosphere, with an estimated total number of 10^31 to 10^32, which means that they outnumber the total cells by a factor of 10.
Viruses infect all host phyla in the most diverse environments, with an estimated number of new infections of 10^23 per second, according to metagenomic surveys (1-5). The fact that viruses can infect all types of unicellular and multicellular organisms suggests that viruses have been (and probably are) key players in the evolution of life. Uniquely, viruses exploit a variety of replication strategies of their genetic material that, unlike cells, can be either RNA or DNA, and either single-stranded, double-stranded, linear, circular, a single molecule, or multiple molecules (segmented genomes, termed multipartite when the segments are encapsidated in different viral particles; see ref. 6 for an overview). Despite often being called autonomous genetic elements, viruses need a cell to express their genetic program and to produce progeny. They are endowed with two of the features that characterize life: the capacity to replicate and to evolve. Outside a cell, viruses behave as inert macromolecular aggregates.
By virtue of their limited amount of genetic material compared with cells, viruses have been instrumental in the development of many fundamental concepts in biology. Viruses have contributed to the understanding of genome organization, as well as the programs and regulatory mechanisms that guide genome replication and gene expression. Retroviruses permitted the discovery of a reverse-transcriptase activity, an enzyme that forced the modification of an established dogma of molecular biology in that the flow of genetic information can also go from RNA to DNA (many implications are discussed in ref. 7). Viruses also provided the first evidence of the presence of interrupted genes (introns, exons, and the process of splicing), and they have been instrumental in the establishment of key immunological concepts, such as MHC restriction associated with cytotoxic CD8 cells, among other mechanisms of cellular immunology (see refs. 8-10 for overviews). The presence of endogenous viruses and other virus-like elements in the DNA of differentiated organisms has provided evidence of the long-term relationships between viral elements and the cellular world (11, among many other studies). More recently, viruses are emerging as suitable experimental systems to address problems of biological complexity, a concept that, having its origin in physics, pervades the biological world (12). Thus, viruses are studied because they are disease agents and because they provide genetic entities for basic research.
In the ongoing debate about clonal versus nonclonal evolution in biological systems, particularly cellular parasites (compare, for example, refs. 13-17), it is of obvious interest to examine the extent of clonality of viruses, the most abundant and prolific genetic entities amenable to field analyses and to controlled laboratory experimentation. The debate bears directly on the advantage (historical or present) of sex as a reproductive strategy (18,19).
In the present study, we first define some terms for clarity regarding how we deal with virus genetics, and we review evidence that suggests a predominantly clonal evolution as the standard "way of life" for viruses, despite most viruses keeping the molecular machinery for active recombination. The general availability of recombination leads to the distinction between unproductive or inconsequential recombination and evolutionarily meaningful recombination. That is, recombination is occasionally exploited for relevant transitions that we term "discontinuity points." A similar duality exists for mutation at shorter time scales. We illustrate recombination-driven transitions with some field observations, and also with an example of recombination-based segmentation recently described for the important animal pathogen foot-and-mouth disease virus (FMDV). We propose that, despite sex having the potential to counteract detrimental effects of high mutation rates, high mutability and virus fecundity may have maintained sufficient adaptive potential to allow a predominantly clonal evolution of viruses without continued evolutionarily meaningful recombination.

This article is a PNAS Direct Submission. To whom correspondence should be addressed. Email: edomingo@cbm.csic.es.

Clarifications of Terminology: Recombination Mechanisms
We follow Tibayrenc and Ayala (14), and we use the term "clonality" to refer to the absence or limitation of recombination as a requirement for viruses to survive as genetic elements. We use recombination in its broader sense to mean any type of exchange of genetic material between two parental viruses or viruses and cells, and even viral genome alterations that result in the occurrence of insertions or deletions. With our terminology, formation of defective interfering particles (20) is a consequence of recombination events. Two general types of recombination have been described for RNA and DNA viruses: replicative homologous and nonhomologous recombination (depending on the sequence identity at the recombination sites or cross-over points), and nonreplicative recombination (ligation of viral RNA fragments independent of replication) (21,22). The most extensively documented mode of genetic exchange in viruses is replicative homologous recombination that consists of template switches during genome replication to yield mosaic genomes. Viruses replicated by a polymerase that shows limited processivity (limited capacity to remain copying the same template molecule) tend to show high recombination frequencies. Biochemical evidence suggests that processivity is an evolvable trait (23,24), so that long-term selection may have produced polymerases that balance their capacity to pursue template copying without excessive disturbances derived from limited affinity for template nucleic acids. The picornaviruses, coronaviruses, and retroviruses often display high recombination frequencies, as measured by the proportion of progeny recombinants in double infections with genetically marked viruses. Some viruses use strand transfers as a part of their replicative mechanisms (for example during RNA synthesis in coronaviruses, and reverse-transcriptase copying of the diploid RNA genome in retroviruses).
In contrast, some negative-strand RNA viruses (those whose genome is of opposite polarity to the viral mRNAs in the infected cells) tend to show limited recombination, although it also has been reported in some systems (25). Many negative-strand RNA viruses produce defective interfering particles with high frequency (20). Therefore, recombination mechanisms are available to most if not all viruses characterized to date.
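The template-switching mechanism described above can be illustrated with a minimal sketch. It is not a model taken from the cited studies: the function name, the per-position switching probability `switch_prob` (a crude stand-in for limited polymerase processivity), and the toy parental sequences are all assumptions introduced for illustration.

```python
import random


def copy_with_template_switching(parent_a, parent_b, switch_prob, rng):
    """Copy a genome position by position, switching template with
    probability switch_prob after each copied position (hypothetical
    copy-choice sketch; parents are assumed aligned and of equal length)."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned and of equal length")
    templates = (parent_a, parent_b)
    current = rng.randrange(2)          # template on which copying starts
    progeny = []
    crossovers = []                     # positions at which a new template takes over
    last = len(parent_a) - 1
    for pos in range(len(parent_a)):
        progeny.append(templates[current][pos])
        if pos < last and rng.random() < switch_prob:
            current = 1 - current
            crossovers.append(pos + 1)
    return "".join(progeny), crossovers


# Toy parents that differ at every position, so every switch is visible.
rng = random.Random(0)
mosaic, crossovers = copy_with_template_switching("A" * 30, "G" * 30, 0.1, rng)
print(mosaic)
print("template switches at positions:", crossovers)
```

In this sketch a lower `switch_prob` corresponds to a more processive polymerase and fewer mosaic progeny. If the two parents are identical, the progeny equals the parent no matter how many switches occurred.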
In the present article we use the term "polyploidy" to mean the diverse genomes present in a single replicative unit within an infected cell that will then become a heterogeneous virus population when particles are assembled and exit the cell.

Steps in Virus Diversification: Two Meanings of Clonality
Any virus, be it a DNA or RNA virus, needs a cell to replicate, and a few or many thousand progeny genomes can be produced per cell depending on information encoded by the virus and the resources provided by the cell. Genome replication is the first step in which the virus genetic material can mutate. This first diversification is particularly active for the viruses whose polymerases lack a 3′-5′ exonuclease activity that can excise misincorporated nucleotides and that operates in most replicative cellular DNA polymerases (26). Absence of a proofreading activity is a major factor that determines the error-prone replication of RNA viruses and some DNA viruses. Its consequence is the generation of intracellular polyploidy in the sense that each replicative unit yields a heterogeneous collection of nascent genomes (27). Mutation rates (the frequency of occurrence of mutations during the process of genome copying) and mutation frequencies (the frequency of mutant genomes in a viral population) have been estimated by genetic and biochemical methods to be in the range of 10^-3 to 10^-5 substitutions per nucleotide, which means nearly 1 million-fold higher values than for normal cellular genomes (27-29). The consequence of high mutation rates is that these error-prone replicating viruses form complex and highly dynamic distributions of related but nonidentical genomes, termed "viral quasispecies." In several viral systems, the complexity of viral quasispecies and the total amount of viral particles (viral load) in an infected organism are parameters that have been correlated with pathogenic potential (i.e., invasion of specific organs by subsets of viral variants) and disease progression (27).
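To give a sense of what mutation rates of 10^-3 to 10^-5 substitutions per nucleotide imply for a whole genome, the following sketch computes the expected number of mutations per copied genome and the fraction of copies that carry no mutation. The genome length of 10,000 nucleotides is an assumption chosen as a round figure of the order of many RNA virus genomes, and the calculation assumes independent errors at each site.

```python
def expected_mutations_per_copy(mutation_rate, genome_length):
    """Mean number of substitutions introduced in one round of genome copying."""
    return mutation_rate * genome_length


def fraction_error_free(mutation_rate, genome_length):
    """Probability that a copy carries no substitution, assuming independent sites."""
    return (1.0 - mutation_rate) ** genome_length


GENOME_LENGTH = 10_000  # assumption: illustrative genome size, not a value from the text

for rate in (1e-3, 1e-4, 1e-5):  # range quoted in the text
    mean = expected_mutations_per_copy(rate, GENOME_LENGTH)
    clean = fraction_error_free(rate, GENOME_LENGTH)
    print(f"rate={rate:.0e}  mutations per copy={mean:.2f}  error-free copies={clean:.4f}")
```

Under these assumptions, a rate of 10^-4 gives on average one mutation per copied genome, so roughly a third of the progeny would be identical to their template.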
The mathematical description of quasispecies (30) is one of several interconnected treatments of population dynamics that include the Lotka-Volterra, game dynamical, Price, replicator-mutator, and replicator-mutator-Price equations (31). Mutation is the most prominent feature of the quasispecies description of evolutionary dynamics and, therefore, quasispecies is a suitable theoretical framework for highly variable viruses. Despite the simplification to represent an RNA viral genome as a defined sequence, in reality we are representing a consensus average of many different sequences that are continuously changing, with obvious implications for virus adaptability (27). The formation of mutant spectra (also termed "mutant distributions" or "clouds") is the first step of diversification, which then continues once the virus has been transmitted to recipient hosts, with extended successions of selection and random sampling events. The reiteration of these processes has produced the present day viruses that can be isolated and studied.
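As an illustration of how a mutant spectrum arises from replication with mutation, the sketch below iterates a discrete-generation form of the replicator-mutator (quasispecies) dynamics. It is a toy under stated assumptions, not the formulation of refs. 30 and 31: genomes are binary strings of length 4, errors occur independently per site with probability `mu`, and the single-peak fitness landscape (master fitness 10, all others 1) is hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def mutation_matrix(genotypes, mu):
    """q[i][j]: probability that copying genotype j yields genotype i,
    assuming independent per-site errors with probability mu."""
    n = len(genotypes[0])
    q = []
    for target in genotypes:
        row = []
        for source in genotypes:
            d = hamming(target, source)
            row.append(mu ** d * (1.0 - mu) ** (n - d))
        q.append(row)
    return q


def quasispecies_step(freqs, fitness, q):
    """One generation: replicate in proportion to fitness, mutate, renormalize."""
    weighted = [f * x for f, x in zip(fitness, freqs)]
    size = len(freqs)
    new = [sum(q[i][j] * weighted[j] for j in range(size)) for i in range(size)]
    total = sum(new)
    return [v / total for v in new]


def stationary_distribution(fitness, q, generations=500):
    """Iterate from a uniform population for a fixed number of generations."""
    freqs = [1.0 / len(fitness)] * len(fitness)
    for _ in range(generations):
        freqs = quasispecies_step(freqs, fitness, q)
    return freqs


GENOTYPES = ["".join(bits) for bits in product("01", repeat=4)]
FITNESS = [10.0 if g == "0000" else 1.0 for g in GENOTYPES]  # hypothetical landscape

cloud = stationary_distribution(FITNESS, mutation_matrix(GENOTYPES, 0.05))
for genotype, freq in sorted(zip(GENOTYPES, cloud), key=lambda p: -p[1])[:5]:
    print(genotype, round(freq, 4))
```

With `mu = 0` the population collapses onto the master sequence. With `mu > 0` the master dominates but is permanently surrounded by a cloud of mutants, which is the sense in which a consensus sequence stands for many different genomes.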
At certain points in their life cycle, some viruses may behave as if they were cellular genes. Comparison of standard RNA viruses (sometimes referred to as riboviruses) and retroviruses provides two different meanings of clonality in evolution. The retroviruses include a replication step that consists of the production of a DNA copy of the genomic RNA by the viral reverse transcriptase, and the viral DNA is then inserted in the host cellular DNA where it replicates as if it were a cellular gene. The different rate of evolution between genes in retroviral entities and their cellular counterparts was illustrated by Gojobori and Yokoyama, who measured a rate of evolution for the viral proto-oncogene serine/threonine-protein kinase (v-mos) gene of the retrovirus Moloney murine sarcoma virus that was 10^6-fold higher than the rate of its cellular homolog c-mos (32). During the cellular stage, the genetic material of the virus is under the typical evolutionary stasis of the cells because its error rate is dictated by that of the cell. This is the case of human T-cell lymphotropic virus types 1 and 2 (HTLV-1 and HTLV-2). These types have two routes for spreading: infection as particles that leads to proviral integration, and a mitotic stage in which the viral DNA undergoes duplication as part of the cellular DNA. The two viruses have different host cell preferences: CD4+ T cells in the case of HTLV-1 and CD8+ T cells in the case of HTLV-2. Both undergo clonal expansions with their host cells, which are selected by their proliferation capacity, albeit with distinctive features. According to a recent study (33) the two viruses differ in the number of carrying clones in the blood of infected individuals, with HTLV-2 characterized by a small number of highly expanded clones. Thus, these and other retroviruses display clonal expansion because they follow the duplication fate of their carrier cells.
This meaning of clonality is different from the one we use for viruses that do not integrate their genetic material in the host DNA. What might be regarded as a sexual replicative stage in viruses by virtue of their genomes becoming part of the cellular genetic material cannot be considered a general feature for viruses.

Unproductive or Inconsequential Recombination
The replicative machinery of viruses has retained the capacity to perform intramolecular and intermolecular recombination, and some viruses may undergo continuous recombination because such events are inherent to the replicative mechanism. However, recombination goes unnoticed because of the lack of appropriate markers to distinguish parental from progeny recombinant genomes. Exuberant recombination was suggested to occur in some plant viruses (22,34), and it has been recently suggested by application of new-generation (deep) sequencing to the analysis of recombinant intermediates in poliovirus-infected cells (35). The study revealed multiple hidden, imperfect, and unproductive recombination events (the generation step), followed by a few successful events that give rise to progeny (the resolution step). What these results suggest is that in some viruses recombination may be extremely active, but that most recombinants are selected against immediately following their generation. They are subjected to intracellular negative selection, as many newly arising mutations probably are, reflected also in multiple inconsequential and transient selection events, which are increasingly unveiled by deep-sequencing analyses (36-38). This state of affairs leads to the distinction between "occurrence" and "biological consequences" of recombination, in particular recombination that can either rescue new viable viruses or that may mediate a salient transition in genome structure. The distinction does not modify the two main biological objectives postulated for recombination: the rescuing of viable genomes from unfit parents, and the exploration of distant areas of sequence space for evolutionary innovation (27).
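The point that recombination goes unnoticed without markers can be made concrete with a small calculation. The sketch below is an illustrative assumption, not a method from the cited studies: it considers a single crossover at a uniformly chosen junction between two aligned parents and asks what fraction of junctions yields a progeny sequence distinguishable from both parents. The function name and example sequences are hypothetical.

```python
def detectable_crossover_fraction(parent_a, parent_b):
    """Fraction of single-crossover junctions whose progeny differs from both
    parents. A crossover is visible only if at least one marker (a site where
    the parents differ) lies on each side of the junction."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned and of equal length")
    markers = [i for i, (x, y) in enumerate(zip(parent_a, parent_b)) if x != y]
    junctions = len(parent_a) - 1
    if len(markers) < 2 or junctions < 1:
        return 0.0
    # Visible junctions are those between the first and the last marker.
    return (markers[-1] - markers[0]) / junctions


print(detectable_crossover_fraction("AAAAAAAAAA", "AAAAAAAAAA"))  # identical parents
print(detectable_crossover_fraction("AAAAAAAAAA", "AAAGAAGAAA"))  # two nearby markers
print(detectable_crossover_fraction("AAAAAAAAAA", "GAAAAAAAAG"))  # markers at both ends
```

In this simplified picture, closely related parents make nearly all crossovers invisible even when recombination is frequent, and the visible fraction grows only as the parents diverge.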

Historical and Current Recombination-Based Transitions in Viruses
Some viruses that occupy a well-established niche were generated by recombination. Historically, western equine encephalitis virus, a mosquito-borne alphavirus pathogen, may have arisen by recombination between a Sindbis-like and an Eastern equine encephalitis-like virus (39). Many of the circulating virulent polioviruses that cause disease outbreaks have been generated by recombination between vaccine poliovirus strains and other circulating enterovirus genomes (40-43). Viral multidrug resistance is sometimes attained through recombination between two viruses, each displaying resistance to some of the drugs (44-48). The current epidemiological picture for the ongoing AIDS epidemics is that, in addition to the circulating standard subtypes (each characterized by a range of consensus sequences), there are 53 circulating recombinant forms, meaning that they have acquired epidemiological identity. This figure should be regarded as transient and it is probably growing as this report is being processed, because surveys have identified multitudes of "unique" recombinant forms that have not reached epidemiological significance. It is highly unlikely that HIV-1 recombination only occurred when the virus had diversified into subtypes. Rather, the molecular information (estimates of recombination frequency under laboratory conditions) suggests that recombination was potentially equally efficient at the onset of the epidemics, but that it went unnoticed because of lack of markers to identify it. At present it is difficult to compare reported mutation versus recombination rates for the same virus because of the different procedures involved in the two measurements (49). Mutation rates have been estimated at 10^-3 to 10^-5 substitutions per nucleotide copied for RNA viruses (27). An estimate for HIV-1 in vivo yielded 1.4 × 10^-4 recombination events per site and generation, which is about fivefold greater than the average point mutation rate (50).
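The two figures quoted above can be put side by side with a short calculation. The recombination rate and the fivefold ratio are taken from the text; the HIV-1 genome length of about 9,700 nucleotides is an assumption introduced here to convert the per-site rate into events per genome, and the function names are illustrative.

```python
RECOMBINATION_RATE = 1.4e-4   # events per site per generation (from the text)
FOLD_OVER_MUTATION = 5.0      # "about fivefold" (from the text)
HIV1_GENOME_LENGTH = 9_700    # assumption: approximate genome size in nucleotides


def implied_mutation_rate(recombination_rate, fold):
    """Point mutation rate implied by the recombination rate and the quoted ratio."""
    return recombination_rate / fold


def events_per_genome(rate_per_site, genome_length):
    """Expected number of events per genome per generation."""
    return rate_per_site * genome_length


mutation_rate = implied_mutation_rate(RECOMBINATION_RATE, FOLD_OVER_MUTATION)
print(f"implied mutation rate per site: {mutation_rate:.1e}")
print(f"recombination events per genome: "
      f"{events_per_genome(RECOMBINATION_RATE, HIV1_GENOME_LENGTH):.2f}")
print(f"mutations per genome: "
      f"{events_per_genome(mutation_rate, HIV1_GENOME_LENGTH):.2f}")
```

Under these assumptions the implied mutation rate falls inside the 10^-3 to 10^-5 range quoted for RNA viruses, and more than one recombination event per genome per generation would be expected. The caveat in the text about comparing rates obtained by different procedures applies to this arithmetic as well.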
A remarkable example of recombination-mediated evolutionary transition occurred with FMDV subjected to more than 200 passages in BHK-21 cells at high multiplicity of infection that favored complementation among newly generated defective genomes (51-53). The evolutionary transition was akin to a process of genome segmentation triggered by the accumulation of mutations during virus passage, and favored by the increase of stability of viral particles containing shorter genomes because of an internal deletion (Fig. 1). It has not been possible to compare mutation and recombination rates in the course of FMDV passages that led to genome segmentation. Multiple, low-level recombinants (internal deletions) at the capsid-coding region were identified, as part of a continuous dynamics of mutation and recombination (54). The result was that the monopartite FMDV genome evolved by recombination-mediated events toward two genomes that infected and killed cells by complementation. No such transition has been observed in nature, and it was probably facilitated by the high multiplicity of infection (multiple particles infecting the same cell), and the occurrence of a constellation of mutations for which a segmented form was more fit than the unsegmented version (53). Thus, important recombination events may occasionally occur in viruses despite their existence as replicative entities not necessitating recombination for survival. The fact that near-clades can be distinguished during evolution of most viruses supports predominant clonal evolution in the sense emphasized in the present article. Evidence of clonality has been obtained in viruses as diverse as picornaviruses (55), the avian influenza viruses (56), severe acute respiratory syndrome coronavirus (57), and dengue viruses (58,59).

Factors Favoring and Limiting Consequential Virus Recombination
Recombination acquires biological significance when the two parental genomes that are the substrate for recombination have diverged sufficiently for mosaic genomes to offer new phenotypes to the scrutiny of selection. This process requires coinfection of the same cell by at least two divergent viruses and that the two parental genomes coincide in the same or a proximal intracellular replicating ensemble. Coinfection is limited by several epidemiological, cellular, and molecular mechanisms. Epidemiologically, a host individual must be infected by the two viruses, a situation that is not common but that could be favored by the prior persistence of one of the viruses in the host. At the cellular level, viruses have developed superinfection exclusion mechanisms by which a cell infected by a virus is refractory to a second infection by a related virus (ref. 60 and references therein). However, in favor of the occurrence of recombination, there is evidence that some cell subsets in tissues and organs may be prone to be multiply infected at levels significantly higher than expected from a double virus hit of the same cell (61,62), thus favoring recombination. Therefore, meaningful recombination may be a relatively rare event, thereby permitting viral lineages (or near-clades) to maintain their identity for extended time periods in defined geographical areas. Most engineered chimeric viruses, or viruses whose gene order has been modified, display decreased fitness relative to their corresponding parental genomes (63). Coevolution appears to have produced sets of genes to function coordinately, therefore favoring linkage disequilibrium and maintenance of identifiable near-clades. These interconnected influences support the predominantly clonal mode of viral evolution.
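A baseline for what is "expected from a double virus hit of the same cell" can be sketched by assuming that particles are distributed over cells at random, following a Poisson distribution. That assumption is ours, introduced for illustration, and it ignores superinfection exclusion and any heterogeneity among cells; the function names and example multiplicities are likewise hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability that a cell receives exactly k particles."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def fraction_multiply_infected(moi):
    """Fraction of cells receiving two or more particles of one virus,
    assuming Poisson-distributed hits at the given multiplicity of infection."""
    return 1.0 - poisson_pmf(0, moi) - poisson_pmf(1, moi)


def fraction_coinfected_by_both(moi_a, moi_b):
    """Fraction of cells receiving at least one particle of each of two viruses,
    assuming the two infections are independent."""
    return (1.0 - math.exp(-moi_a)) * (1.0 - math.exp(-moi_b))


for moi in (0.01, 0.1, 1.0, 10.0):
    print(f"MOI={moi:<5} multiply infected={fraction_multiply_infected(moi):.5f}  "
          f"coinfected by two viruses={fraction_coinfected_by_both(moi, moi):.5f}")
```

Under this baseline, coinfection is very rare at low multiplicity and nearly universal at high multiplicity, which is consistent with the role attributed to high multiplicity of infection in the FMDV passages. Observed coinfection levels above this baseline would point to cell subsets that are unusually prone to multiple infection.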

Recapitulation and Model
With the information succinctly presented in this article, a distinction between irrelevant and meaningful recombination in viruses has been made. To summarize our view, recombination is continuously active in many viruses (perhaps most viruses) but biologically relevant recombination is limited by a variety of intervening epidemiological, cellular, and molecular factors. Because biologically relevant recombination is not a requirement for virus replication and evolution (contrary to the case of organisms in which sexual mechanisms are established), we suggest that the virus life cycles can be considered as predominantly clonal in the terms proposed by Tibayrenc and Ayala (14,15) for cellular parasites. The model we propose (Fig. 2) is that viruses produce continuously diversifying subclades with recombination being a constant trait both in the initial clonal lineages and when biologically meaningful divergence has been attained. It is then that a discontinuity point can give rise to meaningful recombination using the same molecular machinery that is continuously available and acting. Application of new-generation sequencing to analysis of replicative intermediates should either support or correct our proposal. The process shown in Fig. 2 can represent either intrahost or interhost (at the epidemiological level) events at widely different time scales, with never-ending successions of blocks, such as those depicted in Fig. 2, with ever expanding subbranches, and frequent extinction events (dead-end tips of branches). Again, the model of predominant clonal evolution of viruses does not imply that recombination cannot play an important evolutionary role. It does at the discontinuity points but it is not part of the "norm" or "way of life" of the majority of viruses. Exceptions are the retroviruses that integrate their genomes in the cell genetic material at some stages of their replication cycles, thereby deviating from clonality as defined herein.

Fig. 2. Schematic depiction of the predominantly clonal evolution of viruses. From an initial infection (origin), multiple sublineages are generated and new branches are continuously arising (not shown and indicated by points at the tip of branches). At any branch, during replication, recombination takes place (small double-headed arrows on all branches). Biologically meaningful divergence is indicated by the generation of red and blue branches. When this has happened, recombination at the discontinuity point (large double-headed arrow) is biologically meaningful because it generates mosaic (red-blue) genomes with new potential phenotypes. Clonal evolution then continues until a new discontinuity point is reached. The scheme does not imply space (intrahost or interhost) or temporal (a single host or multiple host infections) parameters, as justified in the text.
ACKNOWLEDGMENTS. Work in Madrid is supported by Grants BFU-2011-23604 and P2013/ABI-2906 (PLATESA from Comunidad Autónoma de Madrid) and Fundación R. Areces; Centro de Investigación en Red de Enfermedades Hepáticas y Digestivas is funded by Instituto de Salud Carlos III; E.M. is supported by a fellowship from Ministerio de Economía y Competitividad; and C.P. is supported by the Miguel Servet program of the FIS Instituto de Salud Carlos III (CP14/00121).