Molecular evidence for a single evolutionary origin of domesticated rice
Contributed by Barbara A. Schaal, March 25, 2011 (sent for review January 28, 2011)

Abstract
Asian rice, Oryza sativa, is one of the world's oldest and most important crop species. Rice is believed to have been domesticated ∼9,000 y ago, although debate on its origin remains contentious. A single-origin model suggests that two main subspecies of Asian rice, indica and japonica, were domesticated from the wild rice O. rufipogon. In contrast, the multiple independent domestication model proposes that these two major rice types were domesticated separately and in different parts of the species range of wild rice. This latter view has gained much support from the observation of strong genetic differentiation between indica and japonica as well as several phylogenetic studies of rice domestication. We reexamine the evolutionary history of domesticated rice by resequencing 630 gene fragments on chromosomes 8, 10, and 12 from a diverse set of wild and domesticated rice accessions. Using patterns of SNPs, we identify 20 putative selective sweeps on these chromosomes in cultivated rice. Demographic modeling based on these SNP data and a diffusion-based approach provide the strongest support for a single domestication origin of rice. Bayesian phylogenetic analyses implementing the multispecies coalescent and using previously published phylogenetic sequence datasets also point to a single origin of Asian domesticated rice. Finally, we date the origin of domestication at ∼8,200–13,500 y ago, depending on the molecular clock estimate that is used, which is consistent with known archaeological data that suggest rice was first cultivated at around this time in the Yangtze Valley of China.
Domestication is a complex evolutionary process in which human use of plant and animal species leads to genetically based morphological and/or physiological diversification of domesticated taxa from their wild ancestors (1). The process of domestication provides insights into the nature of selection and the rise of species differences (1). Understanding the origins of domesticated species impacts our understanding of the evolutionary mechanisms surrounding domestication (e.g., founder events, selection, and parallel evolution) and the cultural context by which human societies cultivate and become dependent on specific species for food, fiber, and other uses.
Asian rice, Oryza sativa L., is one of the world's oldest and most important crop species, having been domesticated beginning ∼8,000–9,000 y ago (2–4). Asian rice feeds more than one-half of the global population and has become a key model system for plant biology (5). Several genetic studies have shown that O. rufipogon, which remains extant in South and Southeast Asia, is the wild progenitor of domesticated rice (6). Although others have suggested that O. nivara may be the progenitor of rice (7), there is evidence to indicate that this species is an annual ecotype of O. rufipogon (6, 8, 9).
Genetic analysis has established that rice consists of several genetically differentiated variety groups, with the two main groups being indica and japonica (10). Sometimes described as subspecies, indica and japonica have been recognized since ancient China (11) and are the most widely grown rice varieties. Several studies have shown strong genetic differentiation between indica and japonica (8, 12–20), and molecular studies suggest divergence time estimates of 86–440 ky between these two variety groups (14, 15, 17, 21), far older than the ∼9,000-y archaeological estimate for rice cultivation (2, 4).
Despite recent advances in genetics and archaeology, there continues to be debate on the origin(s) of domesticated rice (22, 23). Several models to explain the origin of rice have been advanced over the last half century; these models can be broadly classified as advocating either a single origin or multiple origins for this important crop species. Single-origin models posit that domesticated rice originated from wild rice (Fig. 1), with differentiation of indica and japonica occurring after domestication of the cultivated species (24). Molecular evidence for this model is largely based on recent studies that show that the key domestication gene sh4 that confers nonshattering (25) and the prog1 locus responsible for the erect habit (26) have nearly identical sequences shared by both subspecies of rice. There is also recent molecular evidence for the single-origin model from a Bayesian demographic analysis of multilocus microsatellite data (27).
Fig. 1. Schematics of the single- (A) vs. double- (B) founder models. (A) In the single domestication event, both domesticated subspecies originated from the same O. rufipogon ancestral population. (B) In the double-founder model, indica and tropical japonica were domesticated independently from different O. rufipogon populations. O. rufipogon, O. sativa ssp. indica, and O. sativa ssp. tropical japonica are indicated by the subscripts r, I, and j, respectively. The times τB and τ represent the length of the bottleneck and time thereafter during the two-population epoch. Likewise, τ2B and τ2 represent the length of time of the bottleneck and time thereafter for the three-population epoch. Symmetric migration (μ) between the populations is represented by arrows, and N is the population size.
Multiple-origin proponents, however, attribute this sharing of key domestication genes between indica and japonica as arising from hybridization between the two variety groups some time after their independent domestication (3, 22, 23). They suggest a model in which indica and japonica were domesticated separately from predifferentiated ancestral O. rufipogon populations (Fig. 1). This multiple-origin model can readily explain genetic differentiation observed in domesticated rice and thus has gained support from phylogenetic analyses that show distinct clades in O. sativa for indica and japonica, with different O. rufipogon accessions associated with each clade (9, 15–19). Moreover, archaeological studies show evidence for rice domestication in the Yangtze Valley beginning ∼8,000–9,000 y ago (2–4, 28) as well as early (and putatively separate) cultivation of rice in the Ganges in India beginning ∼4,000 y ago (3).
The recent origins of domesticated crop species (<10,000 y) increase the likelihood that ancestral polymorphisms persist in domesticated taxa through incomplete lineage sorting, which results in sequence similarities that do not necessarily reflect species and population relationships (29, 30). Simulation studies have shown that data concatenation can generate misleading phylogenetic relationships, because it ignores the different evolutionary histories of distinct loci (31). Indeed, incomplete lineage sorting leads to incongruent gene trees when multiple loci are analyzed independently, which has been recently shown in the genus Oryza (32, 33), but phylogenetic studies of rice have still largely relied on data concatenation across multiple loci under the assumption that the predominant phylogenetic information inherent in the data will swamp out any conflicting signal (9, 15, 17–19).
Conflicting gene trees because of coalescent stochasticity have been a problem for species delimitation, but statistical methods that combine population genetics with phylogenetics now allow for a more accurate inference of recent evolutionary history (reviewed in ref. 30). Phylogenetic analyses using the multispecies coalescent (MSC) are able to detect signals of species differentiation even before their gene trees are reciprocally monophyletic (29, 30, 34, 35).
In this study, we provide evidence for a single domestication of rice. We identified SNPs through direct resequencing of >250 kb of sequence from 630 gene fragments across three rice chromosomes in wild and cultivated rice accessions. We then use two methods to infer the evolutionary history of domesticated Asian rice—a diffusion-based approach to demographic modeling on the SNP data (36) and a Bayesian evolutionary approach to phylogenetic analysis, implementing the multispecies coalescent (35) using previously published phylogenetic datasets. Both approaches support a single origin for rice, and we are also able to estimate a date for the domestication of rice consistent with the previously published archaeological studies.
Results
Selective Sweeps in Three Rice Chromosomes.
We resequenced portions of 630 genes on rice chromosomes 8, 10, and 12 at ∼100-kb intervals in multiple accessions (Table S1) of O. sativa indica (n = 20) and tropical japonica (n = 16) as well as O. rufipogon (n = 20) and a single accession of O. nivara, O. barthii, and O. meridionalis (data may be downloaded from http://puruggananlab.bio.nyu.edu/Rice_data/). We obtained 255.9 ± 2.09 kb of sequence data for each accession. A total of 2,800 SNPs in indica, 2,070 SNPs in tropical japonica, and 7,274 SNPs in O. rufipogon were identified. As expected, the mean silent site nucleotide diversity (π) was lower in O. sativa (π = 0.0037 for indica and 0.0028 for tropical japonica) compared with O. rufipogon (π = 0.0079) across all three chromosomes.
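As an illustration of the diversity statistic reported above, the following is a minimal sketch of how pairwise nucleotide diversity (π) can be computed from aligned sequences. The function name and the toy alignment are hypothetical; the values in the text were calculated on silent sites only, as described in SI Text, which this sketch does not attempt to reproduce.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site among aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatched sites for every pair of sequences.
    total_diffs = sum(
        sum(1 for a, b in zip(s1, s2) if a != b) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)

# Hypothetical toy alignment: four sequences, ten sites.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGAACGTTC"]
pi = nucleotide_diversity(toy)
```

A bottlenecked sample carries fewer segregating differences per pair, so π is expected to be lower in the domesticated groups than in O. rufipogon, which is the pattern reported above.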
The classification of accessions was confirmed based on results of a STRUCTURE (37) population stratification analysis (Fig. S1). These results suggest that our sample had four ancestral populations (K = 4), which correspond to O. sativa ssp. indica, O. sativa ssp. tropical japonica, and two O. rufipogon clusters. There was only a marginal difference, however, in the likelihood values between K = 3 and K = 4, which is associated with the splitting of O. rufipogon into two subpopulations.
We used the chromosome scan data to first identify putative selective sweeps, regions of the genome that show evidence of recent positive selection, on all three chromosomes so that we could exclude these in our subsequent demographic analyses of rice origins (see below). We applied two different methods to identify regions with a genetic signature of a selective sweep within and between the domesticated species; these were based on (i) local reduction in nucleotide diversity and (ii) patterns of the multipopulation allele frequency spectrum (AFS; Materials and Methods and SI Text). We identified a total of 20 regions within the three chromosomes that had evidence for selective sweeps from at least one of these tests. We found that regions with reduced diversity were widespread in both indica and, in particular, tropical japonica (Fig. 2, Fig. S2, and Table S2), consistent with the expectation of strong artificial selection during rice domestication. We were also able to find putative selective sweeps that were specific to indica or tropical japonica as well as regions with sweeps shared by both variety groups (Fig. 2 and Fig. S3).
Fig. 2. Summary of selective sweep mapping results. Candidate selective sweep regions identified in each test for selection for (A) chromosome 8, (B) chromosome 10, and (C) chromosome 12. Results for different subspecies are indicated by red for indica and blue for tropical japonica. CLR, composite likelihood ratio test based on the multipopulation allele frequency spectrum; DIV, diversity test based on regions of no variation; TJ, tropical japonica; IND, indica. Vertical bars in the CLR track correspond to single fragments, because it was calculated for each fragment separately. Colors in the track correspond to the respective population, with black bars indicating a sweep in both indica and tropical japonica. The four shared sweep regions are marked with a dot.
Demographic Inference.
We examined several demographic models for the origin and history of domesticated rice (Fig. 1) using a diffusion-based approach implemented in the program ∂a∂i (36), which calculates the likelihood of a demographic model given an observed multipopulation AFS. The multipopulation AFS is the joint distribution of allele frequencies of diallelic variants from a genomic region sequenced in multiple individuals from each population. The program ∂a∂i also computes the expected AFS under various demographic scenarios by numerically solving a multipopulation diffusion equation describing the effects of mutation, drift, and migration. Likelihoods of each model can be computed based on the product of the Poisson likelihoods for each entry of the AFS.
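The two ingredients described above, the joint AFS and the Poisson likelihood over its entries, can be sketched in a few lines. This is not the ∂a∂i implementation or its API: the function names, the 0/1 allele encoding, and the toy counts are all assumptions made for illustration, and the expected spectrum would in practice come from the diffusion solution rather than being supplied by hand.

```python
import math
from collections import Counter

def joint_afs(snps_pop1, snps_pop2):
    """Count SNPs by (derived-allele count in pop1, derived-allele count in pop2).

    Each argument is a list of SNPs; each SNP is a list of 0/1 alleles
    (0 = ancestral, 1 = derived), one per sampled chromosome.
    """
    afs = Counter()
    for snp1, snp2 in zip(snps_pop1, snps_pop2):
        afs[(sum(snp1), sum(snp2))] += 1
    return afs

def poisson_log_likelihood(observed, expected):
    """Sum of Poisson log-probabilities over the cells of `expected`.

    `expected` maps each AFS cell to a model-predicted count (> 0) and is
    assumed to cover every cell that appears in `observed`.
    """
    ll = 0.0
    for cell, mu in expected.items():
        k = observed.get(cell, 0)
        ll += k * math.log(mu) - mu - math.lgamma(k + 1)
    return ll

# Hypothetical data: three SNPs, three chromosomes in pop1 and two in pop2.
afs = joint_afs([[1, 0, 0], [1, 1, 0], [1, 0, 0]], [[0, 0], [1, 0], [0, 0]])

# Two hypothetical model predictions for the same cells.
ll_a = poisson_log_likelihood(afs, {(1, 0): 2.0, (2, 1): 1.0})
ll_b = poisson_log_likelihood(afs, {(1, 0): 0.5, (2, 1): 2.5})
```

In this toy case the first set of expected counts matches the observed spectrum exactly and therefore has the higher log likelihood; comparing demographic models amounts to the same comparison with expected spectra generated under each scenario.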
Based on 2,057 putatively neutral segregating sites (SI Text), we found that the single-origin models outperformed models of separate domestication of indica and tropical japonica (Table 1 and Fig. S4). The single-origin model provided a better fit to the data, even when putative sweep regions were excluded (Table 1). Each model has the same number of parameters, making the 22 log-likelihood unit difference between single- and double-founder models highly significant, even after correcting for linkage. We cannot, however, distinguish whether indica or tropical japonica was founded first under these scenarios. The maximum-likelihood parameter values inferred for both single-founder models (with and without sweeps) are presented in Table S3.
Table 1. Single- and double-founder domestication models and their log likelihoods with and without selective sweep regions
The parameters for population size and migration are very similar across all models, regardless of whether indica or tropical japonica was founded first. Although we present data for symmetric migration, we also tested the effect of four types of migration on the models to determine which one fits the AFS best (Table S4). These four models include symmetric migration, asymmetric migration, no migration from O. rufipogon, and no migration at all. We did not impose a founding bottleneck for indica and tropical japonica to simplify the parameter space when making these comparisons. Models with asymmetric migration have a slightly better fit, but the increase in log likelihood is not enough to justify the three additional migration parameters in the model.
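The trade-off between fit and parameter count can be made concrete with an information criterion. The sketch below uses the Akaike information criterion (AIC) purely as an assumption for illustration; the text does not state which criterion was applied, and because linked SNPs make the AFS likelihood a composite likelihood, any such comparison is approximate. All log-likelihood values and parameter counts below are hypothetical, not values from Table 1 or Table S4.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate a preferred model."""
    return 2 * n_params - 2 * log_likelihood

def extra_params_justified(ll_simple, k_simple, ll_complex, k_complex):
    """True if the richer model has the lower AIC despite its extra parameters."""
    return aic(ll_complex, k_complex) < aic(ll_simple, k_simple)

# Hypothetical case 1: three extra migration parameters buy only 1.5 log-likelihood units.
asymmetric_worth_it = extra_params_justified(-1000.0, 10, -998.5, 13)

# Hypothetical case 2: equal parameter counts, so a 22-unit gap translates directly
# into an AIC difference of 44 in favor of the better-fitting model.
aic_gap = aic(-1022.0, 10) - aic(-1000.0, 10)
```

With equal parameter counts the penalty terms cancel, which is why the difference between single- and double-founder models can be read straight from the log likelihoods, whereas a richer migration model has to improve the fit by more than its added penalty.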
Finally, a previous study suggests that selection could have genome-wide effects on the site frequency spectrum (18). To model the effects of selection, we reran our demographic models with weak positive selection (selection parameter = 1) occurring in all populations during the two- and three-population epochs. We found that, even after running the models with weak positive selection, the single-founder models still outperform the double-founder models (Table S5).
Bayesian Reanalyses of Previously Published Phylogenetic Data.
Because the results of our demographic analysis challenge the currently accepted view that indica and japonica have independent origins, we examined the phylogeny of rice domestication. Previous phylogenetic studies concatenated data from multiple unlinked loci, a common practice but one that has been found to be problematic when dealing with recent speciation events (30, 31, 35). A Bayesian approach that accounts for gene tree heterogeneity while estimating the species tree can circumvent this issue, and this has been implemented in the program *BEAST. Unfortunately, our resequenced gene fragments were too short (∼500 bp) for *BEAST to perform well, and instead, we reanalyzed previously published phylogenetic datasets that have been used to argue for the independent domestications of indica and japonica (15–17, 19, 38) (Table 2). We also used a recently published dataset that examined the evolution of the rice endosperm starch biosynthetic pathway (39).
Table 2. Results from *BEAST reanalyses of previously published phylogenetic datasets
Our *BEAST analyses of two datasets by Zhu and Ge (15) and Londo et al. (16), each composed of fewer than five loci, resulted in several equivocal topologies, with at least 90% of the trees in the posterior set disputing the monophyly of indica and japonica (results not shown). The other four datasets, however, each with more than five loci, showed strong support for a single origin of domesticated rice when they were reanalyzed using *BEAST (Table 2 and Fig. 3). We note that even the inclusion of O. nivara, which was suggested as an alternative ancestral species for indica (7, 9), still revealed a closer relationship between indica and japonica than each to any wild rice species (Fig. 3). Both datasets agree that indica and japonica are derived from one common ancestor, even if all five groups comprising Asian domesticated rice are represented in the Yu et al. (39) dataset and O. rufipogon population structure was considered (Fig. 3).
Fig. 3. Results from *BEAST analyses of phylogenetic datasets. Nodes supported with high posterior probability (≥95%) by all datasets are shown with a dot. I, J, R, and N represent indica, tropical japonica, O. rufipogon, and O. nivara, respectively. The numbers above the bars in the graphs indicate the percentage of trees in the posterior probability distribution with a given topology or that support a single origin of rice. (A) Alternative species trees and their proportions in the posterior distribution resulting from analyses of the Tang et al. (17) (red) and Zhu et al. (38) (yellow) data. (B) Only one well-supported species tree was recovered from the Rakshit et al. (19) data. (C) All trees in the posterior distribution, despite inclusion of the five main cultivar groups [aromatics (Ar), Aus, temperate japonica (TmJ), tropical japonica (TrJ), and indica (I)] in an analysis of the Yu et al. (39) dataset, support a single origin for domesticated taxa. Ri and Rc indicate O. rufipogon from India/Indochina and China, respectively.
Applying a strict molecular clock of 6.5 × 10⁻⁹ substitutions/site per y for the grasses (40) resulted in an estimate for the mean time of the onset of domestication of 8,200 y before present (B.P.) [95% highest posterior density (HPD) = 4,400–12,100 y B.P.]. The estimate for the mean age for the indica–japonica split is 3,900 y B.P. (95% HPD = 1,700–6,600 y B.P.). Age estimates are higher when the mutation rate of 3.8 × 10⁻⁹ substitutions/site per y estimated from the chromosome scan data (SI Text) was applied. Using this molecular clock rate, we estimate the split of O. sativa from O. rufipogon commencing 13,500 y ago (95% HPD = 7,400–20,000 y B.P.) and the two domesticated rice variety groups splitting 6,700 y B.P. (95% HPD = 3,700–10,000 y B.P.).
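The dependence of the date estimates on the clock rate can be approximated with simple arithmetic. Under a strict clock, age equals divergence divided by rate, so switching rates rescales an age by the ratio of the two rates. The sketch below is only a back-of-envelope check with hypothetical function names; the reported values are posterior means from separate Bayesian analyses and need not scale exactly.

```python
def per_myr_to_per_year(rate_per_myr):
    """Convert substitutions/site per Myr to substitutions/site per year."""
    return rate_per_myr / 1e6

def rescale_age(age_years, rate_used, new_rate):
    """Strict clock: age = divergence / rate, so ages scale by rate_used / new_rate."""
    return age_years * rate_used / new_rate

GRASS_RATE = 6.5e-9  # substitutions/site per year, grasses (from the text)
SCAN_RATE = 3.8e-9   # substitutions/site per year, chromosome scan (from the text)

# Rescale the grass-clock means to the slower chromosome-scan rate.
onset_rescaled = rescale_age(8200, GRASS_RATE, SCAN_RATE)  # roughly 14,000 y
split_rescaled = rescale_age(3900, GRASS_RATE, SCAN_RATE)  # roughly 6,700 y
```

The rescaled split date agrees closely with the reported 6,700 y B.P., and the rescaled onset (about 14,000 y) lies near the reported 13,500 y; the slower rate pushes both dates back by the same factor of about 1.7.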
Discussion
The origin of Asian rice has long been a puzzle to biologists (22, 23), and over the last two decades, the multiple-origins domestication model proposing the independent domestication of indica and tropical japonica has gained support largely from molecular data analyzed by traditional phylogenetic methods (9, 15–19). These phylogenetic methods, however, can lead to heterogeneities in inferred gene tree topologies, particularly among recently evolved species (30, 33). This ambiguity has prompted development of alternative phylogenetic inference methods, including those that use the multispecies coalescent (30, 35).
We have reassessed the phylogeny of domesticated rice using previously published datasets, five of which have been used to argue for a separate origin for indica and tropical japonica rice. Our study with the same data, reanalyzed in a multispecies coalescent framework, showed strong support for only a single origin of domesticated rice. Even the inclusion of O. nivara in two of the datasets that we have analyzed still revealed a closer relationship between indica and tropical japonica than each to any wild rice species. Two other datasets (15, 16) resulted in several equivocal topologies because of insufficient phylogenetic signal from too few loci (<5). This is not surprising, because simulations have shown that the probability of obtaining the correct species tree increases to 0.75 despite shallow tree depth when at least five loci are included (29, 41).
Previously detected population structure in O. rufipogon (16) violates one of the assumptions of the multispecies coalescent model. Accounting for this when we analyzed the data by Yu et al. (39), however, still showed indica and japonica as more closely related. Neither was affiliated with any wild rice group (Indian/Indochinese or Chinese), which would be expected if they were independently domesticated. There also seems to be phylogenetic support for the Indian/Indochinese O. rufipogon population as directly ancestral to domesticated rice. A larger sampling, however, will be necessary before a specific population can be identified as the ancestor of O. sativa, and there is also the possibility that such an ancestral population may be extinct.
The finding that domesticated rice has a single origin was also supported by demographic modeling using resequencing data of 630 nuclear loci from rice chromosomes 8, 10, and 12. The presence of selective sweeps shared by indica and tropical japonica may bias our inferences, because these shared sweeps could be related to domestication, and the same or very similar haplotypes may be fixed in the domesticated varieties (either from a single-origin event or parallel evolution or through postdomestication hybridization). There are, in fact, four putative cases of shared sweeps on these three chromosomes (Fig. 2), and these colocalize with known domestication quantitative trait loci (QTL) involved in traits for panicle length, plant height, days to heading, grain weight, and grain number (Fig. S2). Shared alleles between indica and japonica among known domestication genes have also been reported previously, most notably, red pericarp rc (42), nonshattering sh4 (25, 43), and plant architecture prog1 (26) loci.
These and other shared sweeps were initially thought to arise from introgression between variety groups as a result of postdomestication hybridization (3, 22). Aside from the rc, sh4, and prog1 domestication genes, shared alleles have also been observed between indica and japonica at diversification genes that contribute to phenotypic diversity between rice cultivars. These diversification genes include BADH2 fragrance gene (44), the sd1 semidwarfing gene (45), the Pi-ta disease resistance locus (45), the starch biosynthetic gene Wx (45), and the GS3 grain length gene (46). In most of these cases, however, only a minor fraction of varieties carry the introgressed allele; this is in contrast to domestication alleles shared between indica and japonica, which are at or close to fixation in both variety groups (25, 26, 42, 43). Indeed, the alleles at some of these diversification loci seem to have been introgressed more recently as a result of recent breeding efforts (45).
In general, most rice cultivars surveyed in a genome-wide SNP assay show very little evidence of introgression (45). Nevertheless, introgression between japonica and indica may obscure evidence for multiple origins of domesticated rice. Eckert and Carstens (41), however, have shown that coalescent-based methods of phylogenetic inference are still robust, despite moderate amounts of historical gene flow. Moreover, we find strong support for a single origin in our demographic modeling, even when putative selective sweep regions, including those shared between domesticated rice groups, are eliminated. Finally, our modeling took gene flow/migration into account in the demographic inference, and a single-origin model was still favored. Together, our results indicate that many of the shared selective sweeps observed among rice domestication genes (as opposed to diversification loci) arise not from introgression between already domesticated indica and japonica but instead, reflect the single origin of this cultivated crop species.
Previous studies have estimated the divergence time between indica and japonica at ∼86–440 ky ago (14, 15, 17, 21), long predating the domestication of rice. These estimates were derived from application of a molecular clock to divergence estimates between pairs of sequences from O. sativa ssp. japonica and O. sativa ssp. indica, and the inferred divergence time was interpreted as evidence that they were derived independently from diverged source populations of O. rufipogon. This interpretation of divergence time estimates is not warranted, however, and there is no need to invoke the existence of a deeply structured source population to account for the ancient coalescent time. In the case of recent population divergence, the common ancestor of a pair of alleles drawn at random from each population will have existed long before the appearance of the populations themselves. Therefore, the inferred coalescent times of indica and japonica alleles may greatly exceed the time since domestication, even if the alleles were derived from a single panmictic progenitor population. This failure to consider ancestral variation has led to erroneous inferences in a number of contexts (47) and seems to have unduly influenced perceptions about rice origins.
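The argument above rests on a standard coalescent result: for one allele sampled from each of two populations that split τ generations ago from a panmictic diploid ancestor of effective size N, the expected coalescence time is τ + 2N generations when there is no migration after the split. The simulation below is a minimal sketch of that result; the population size, the split time, and the assumption of roughly one generation per year are hypothetical values chosen for illustration, not estimates from this study.

```python
import random

def mean_pairwise_coalescence(split_gens, ancestral_n, reps=20000, seed=1):
    """Mean coalescence time (generations) for one allele from each of two
    populations that split `split_gens` ago from a panmictic diploid ancestor."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # The two lineages cannot coalesce until they reach the ancestor
        # (no migration assumed); there the wait is exponential, mean 2N.
        total += split_gens + rng.expovariate(1.0 / (2 * ancestral_n))
    return total / reps

# Hypothetical values: split 8,000 generations ago, ancestral N = 100,000.
SPLIT_GENS = 8000
ANCESTRAL_N = 100000
mean_tmrca = mean_pairwise_coalescence(SPLIT_GENS, ANCESTRAL_N)
expected_tmrca = SPLIT_GENS + 2 * ANCESTRAL_N  # 208,000 generations
```

With these illustrative numbers the average allelic divergence is more than 20 times older than the population split, which shows how pairwise sequence divergence of hundreds of thousands of years can arise without deeply structured source populations.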
When ancestral variation is taken into account with the multispecies coalescent, the timing of divergence between O. sativa and O. rufipogon and between indica and japonica is found to be much more recent. The exact divergence time estimate depends on which molecular clock rate we use. If we use an estimate for nucleotide substitution rates in the grasses (40), we find a divergence time between O. rufipogon and O. sativa at ∼8,200 y ago and between tropical japonica and indica at ∼3,900 y ago. If we apply the molecular clock rate estimated from the chromosome scan data (SI Text), we obtain an earlier mean date of domestication for rice (13,500 y B.P.). The former molecular estimates are in remarkable agreement with archaeological estimates for the onset of rice domestication in the Yangtze Valley (∼8,000–9,000 y ago) and the expansion of indica rice in South Asia (∼4,000 y ago) (3, 28), and even the latter date still falls within the upper boundary of archaeological dating estimates of rice phytoliths collected from the lower Yangtze (4).
Although our analyses are consistent with a single origin of rice, one possibility is that both indica and japonica originated from highly differentiated O. rufipogon gene pools that were sampled neither by us nor by the other previously published phylogenetic studies. We think this is unlikely, because to obtain our results, both of these gene pools (one for indica and another for japonica) must not be represented in our sampling. Moreover, if these gene pools existed, they would have split from each other at about the time of rice domestication to be consistent with our estimates of the timing of the indica/japonica split.
Archaeological studies have been interpreted as corroborating phylogenetic evidence for multiple origins of rice. Two centers of rice domestication—the Yangtze River Valley of China and the Ganges in India—have been identified based on the discovery of nonshattering rice spikelet bases in archaeological sites from these regions (3). The oldest archaeological evidence for rice domestication comes from the Yangtze Valley, where japonica or a japonica-like domesticated rice seems to have been present as early as ∼8,000–9,000 y ago (2–4, 28). Rice domestication in the Ganges has also been observed, but rice here seems to be a minor crop (or was gathered wild) and only substantially grew in importance about 4,000 y after rice domestication in the Yangtze Valley (3).
It has been suggested that ancient peoples may have brought japonica westward along the Silk Road with other crops such as millet, apricots, and peaches (3). Hybridization of this japonica with local proto-indica cultivars along with active selection could have rapidly led to the rise and expansion of present day O. sativa ssp. indica (3). Other models suggest hybridization of a single domesticate with locally differentiated O. rufipogon populations, leading to the present day indica and tropical japonica (48, 49). Although they differ in details, these models are consistent with a single origin of domesticated rice in the Yangtze Valley followed by spread and hybridization of this original domesticate that eventually led to indica (48).
The question of the origin of domesticated rice (or any domesticated species) is a complex problem, because human activity may have eroded genetic signatures, hampering attempts to reconstruct the evolutionary history of these recent human-associated species. Demographic factors, such as rampant admixture, compounded by the effects of a prolonged bottleneck during the process of domestication may obscure genetic evidence for domestication models, including those that indicate multiple origins for cultivated species (50). It is clear, however, from our study that incomplete lineage sorting during the coalescent process can explain previous phylogenetic conclusions of the multiple origins of rice. As greater amounts of genome-wide data become available, it would be interesting to see if the results of our analysis are supported by these methods of reconstructing evolutionary history and demographic processes associated with the recent speciations characteristic of domestication.
Several other domesticated taxa, because of marked intraspecific phenotypic or genetic differentiation, also seem to have multiple evolutionary origins. Barley (51), grapes (52), and cucurbits (53) as well as livestock species such as sheep (54) and cattle (55) have been shown to have arisen more than one time, indicating that different cultures have reinvented these domesticated species several times rather than obtaining them through diffusion from other farming societies. Rice was also thought to be a clear example of a domesticated species with multiple origins (9, 15–19), suggesting that the Neolithic cultures of China and India separately led to the domestication of this cereal crop species. It now seems that rice, however, may have arisen only in one geographical region of Asia and that, from this single origin, we now find a food species with a wide geographical and cultural reach that has led to its becoming the major food crop for much of the world's population.
Materials and Methods
Resequencing of Gene Fragments on Three Rice Chromosomes.
Our resequencing panel consisted of 20 accessions of O. rufipogon, 36 landrace accessions of O. sativa (20 indica and 16 tropical japonica), and one each of O. nivara, O. meridionalis, and O. barthii obtained from the International Rice Research Institute and the US Department of Agriculture (Table S1). DNA was extracted, and ∼500-bp gene fragments from protein-coding genes spaced at ∼100-kb intervals on chromosomes 8, 10, and 12 were resequenced. Details of sequencing, population structure, and diversity analyses are in SI Text.
Selective Sweep Mapping.
We used two different methods to map selective sweeps related to rice domestication from O. rufipogon based on local reductions in diversity as well as the multipopulation AFS used in the demographic inference. Details of these analyses are in SI Text.
Demographic Inference.
We tested various single-origin and double-origin models (Fig. 1) using ∂a∂i (36) (source code available on request). The single-origin models consisted of either indica from japonica or japonica from indica demographic scenarios (Fig. 1). The double-origin models can be categorized as either indica first or japonica first serial founder models, and they posit that each domesticate originated independently, one preceding the other (Fig. 1). We incorporated bottlenecks in all of our models for the founding of indica and tropical japonica populations. Details of the modeling are found in SI Text.
*BEAST Analyses of Published Rice Sequence Datasets.
Using the program Species Tree Ancestral Reconstruction/Bayesian Evolutionary Analysis by Sampling Trees (*BEAST v1.6.1) (35), we reanalyzed six previously published phylogenetic datasets (15–17, 19, 38, 39). Note that Tang et al. (17) chose atypical, highly divergent loci for their analyses. Moreover, the dataset of Yu et al. (39) included in our analysis contains possibly selected loci, but none of these loci shows a selective sweep shared by both indica and japonica. MrModeltest (56) was run on each locus to determine the best-fit model of nucleotide evolution. A Yule prior, which assumes that lineages split at a constant rate (57), was specified for the species tree. Each dataset was analyzed independently in *BEAST using a strict molecular clock for the reference locus based on the mean neutral substitution rate for grasses (0.0065 substitutions/site per Myr) (46), from which the rates of the other genes were estimated. Other details on the phylogenetic analyses are in SI Text.
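Under a strict clock, converting branch lengths to time is simple arithmetic, which the following Python sketch spells out using the grass rate quoted above. The function names and the example node height are hypothetical and are not values estimated in this study; *BEAST performs the dating jointly with tree inference, so this only shows the underlying unit conversion.

```python
GRASS_NEUTRAL_RATE = 0.0065  # substitutions/site per Myr, as quoted in the text (46)


def node_age_years(node_height, rate_per_myr=GRASS_NEUTRAL_RATE):
    """Convert a node height (substitutions/site along one lineage) to years."""
    if rate_per_myr <= 0:
        raise ValueError("rate must be positive")
    return node_height / rate_per_myr * 1e6


def pairwise_divergence_age_years(pairwise_distance, rate_per_myr=GRASS_NEUTRAL_RATE):
    """Convert a pairwise distance between two tips to a split time in years.

    Substitutions accumulate on both lineages after a split, so the
    distance is divided by twice the per-lineage rate.
    """
    if rate_per_myr <= 0:
        raise ValueError("rate must be positive")
    return pairwise_distance / (2 * rate_per_myr) * 1e6


# Hypothetical node height, chosen only to make the arithmetic easy to follow.
example_age = node_age_years(6.5e-5)
```

With the assumed node height of 6.5 × 10⁻⁵ substitutions/site, the conversion gives an age of about 10,000 y; a faster assumed rate would yield a proportionally younger date, which is why date estimates depend on the molecular clock estimate that is used.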
Acknowledgments
The authors would like to thank Dorian Fuller, Andrew Doust, and Joseph Heled for helpful discussions, Chris Smith for helping to develop the SNP quality control pipeline, and Xianfa Xie for help in choice of some accessions. We would also like to thank Dennis Widjaja, Kelly Clemenza, Naeha Bhambra, Hannah Chaudry, and Silvia Gerard-Martinez for help in processing sequence data. This work was funded in part by the National Science Foundation Plant Genome Research Program.
Footnotes
1 J.M. and M.S. contributed equally to this work.
2 To whom correspondence may be addressed. E-mail: schaal@wustl.edu or mp132@nyu.edu.
Author contributions: J.M., J.M.F., S.J., B.A.S., C.D.B., A.R.B., and M.D.P. designed research; J.M., J.M.F., S.R., and P.H. performed research; A.R., P.H., and B.A.S. contributed new reagents/analytic tools; J.M., M.S., N.G., J.M.F., C.D.B., and A.R.B. analyzed data; and J.M., M.S., N.G., B.A.S., A.R.B., and M.D.P. wrote the paper.
The authors declare no conflict of interest.
Data deposition: Because of the complexity of the data as multiple sequence alignments, there is no public database that can accommodate the format. We are, therefore, making the data available as a zipped file at http://puruggananlab.bio.nyu.edu/Rice_data/ as indicated in Results.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1104686108/-/DCSupplemental.
References
- ↵
- ↵
- Higham C,
- Lu TLD
- ↵
- ↵
- Liu L,
- Lee G-A,
- Jiang L,
- Zhang J
- ↵
- International Rice Genome Sequencing Project 2005
- ↵
- Oka HI
- ↵
- ↵
- Jacobs SW,
- Everett LJ
- Lu BR,
- Naredo MEB,
- Juliano AB,
- Jackson MT
- ↵
- Cheng C,
- et al.
- ↵
- ↵
- Matsuo T,
- Futsuhara Y,
- Kikuchi F,
- Yamaguchi H
- ↵
- ↵
- Nakano MA,
- Yoshimura A,
- Iwata N
- ↵
- ↵
- ↵
- Londo JP,
- Chiang YC,
- Hung KH,
- Chiang TY,
- Schaal BA
- ↵
- ↵
- ↵
- ↵
- ↵
- Ma J,
- Bennetzen JL
- ↵
- ↵
- ↵
- ↵
- Li C,
- Zhou A,
- Sang T
- ↵
- ↵
- ↵
- ↵
- Knowles LL,
- Carstens BC
- ↵
- ↵
- Kubatko LS,
- Degnan JH
- ↵
- ↵
- Cranston KA,
- Hurwitz B,
- Ware D,
- Stein L,
- Wing RA
- ↵
- ↵
- Heled J,
- Drummond AJ
- ↵
- ↵
- ↵
- Zhu Q,
- Zheng X,
- Luo J,
- Gaut BS,
- Ge S
- ↵
- Yu G,
- Olsen KM,
- Schaal BA
- ↵
- Gaut BS,
- Morton BR,
- McCaig BC,
- Clegg MT
- ↵
- ↵
- ↵
- ↵
- Kovach MJ,
- Calingacion MN,
- Fitzgerald MA,
- McCouch SR
- ↵
- ↵
- ↵
- ↵
- Vaughan DA,
- Lu BR,
- Tomooka N
- ↵
- ↵
- Allaby RG,
- Fuller DQ,
- Brown TA
- ↵
- Morrell PL,
- Clegg MT
- ↵
- ↵
- Sanjur OI,
- Piperno DR,
- Andres TC,
- Wessel-Beaver L
- ↵
- Pedrosa S,
- et al.
- ↵
- Loftus RT,
- MacHugh DE,
- Bradley DG,
- Sharp PM,
- Cunningham P
- ↵
- Nylander JAA
- ↵
Article Classifications
- Biological Sciences
- Evolution