Multiple lines of evidence for the origin of domesticated chili pepper, Capsicum annuum, in Mexico

Edited by Dolores R. Piperno, Smithsonian National Museum of Natural History and Smithsonian Tropical Research Institute, Fairfax, Washington, DC, and approved December 4, 2013 (received for review September 6, 2013)
April 21, 2014
111 (17) 6165-6170


The novelty of the information of this manuscript resides in the addition of species distribution modeling and paleobiolinguistics data, combined with genetic and existing archaeobotanical data, to trace back the geographic origin of a crop, namely domesticated pepper, Capsicum annuum. Furthermore, the utilization of a geographic framework of reference for the four types of data has allowed us to combine these independent data types into a single hypothesis about the origin of this crop. Our results suggest that food crops in Mexico had a multiregional origin with chili pepper originating in central-east Mexico, maize in the Balsas River Basin and common bean in the Lerma–Santiago River Basin, resembling similar finds for the Fertile Crescent and China.


The study of crop origins has traditionally involved identifying geographic areas of high morphological diversity, sampling populations of wild progenitor species, and the archaeological retrieval of macroremains. Recent investigations have added identification of plant microremains (phytoliths, pollen, and starch grains), biochemical and molecular genetic approaches, and dating through 14C accelerator mass spectrometry. We investigate the origin of domesticated chili pepper, Capsicum annuum, by combining two approaches, species distribution modeling and paleobiolinguistics, with microsatellite genetic data and archaeobotanical data. The combination of these four lines of evidence yields consensus models indicating that domestication of C. annuum could have occurred in one or both of two areas of Mexico: northeastern Mexico and central-east Mexico. Genetic evidence shows more support for the more northern location, but jointly all four lines of evidence support central-east Mexico, where preceramic macroremains of chili pepper have been recovered in the Valley of Tehuacán. Located just to the east of this valley is the center of phylogenetic diversity of Proto-Otomanguean, a language spoken in mid-Holocene times and the oldest protolanguage for which a word for chili pepper reconstructs based on historical linguistics. For many crops, especially those that do not have a strong archaeobotanical record or phylogeographic pattern, it is difficult to precisely identify the time and place of their origin. Our results for chili pepper show that expressing all data in similar distance terms allows for combining contrasting lines of evidence and locating the region(s) where cultivation and domestication of a crop began.
The analysis of plant macroremains, morphological variation in crop varieties, and identification of wild progenitor species (as determined through their ability to hybridize with the crop) constitute traditional methods for studying crop origins (1, 2). Currently, analysis of microremains such as starch grains, accelerator mass spectrometry (AMS) 14C radiocarbon dating, along with biochemical and molecular genetic analyses of wild and domesticated populations are also used to date and locate geographic areas of domestication (3, 4).
This set of approaches is extended here with two additional methods, species distribution modeling and paleobiolinguistics, integrating these in a comprehensive study of the origin of domesticated chili pepper, Capsicum annuum L., the world’s most widely grown spice. C. annuum is one of five domesticated pepper species, which also include Capsicum baccatum L., Capsicum chinense Jacq., Capsicum frutescens L., and Capsicum pubescens Ruiz & Pav. The ∼30 species of Capsicum are all native to the Americas (5). Comparing karyotypes of wild and domesticated C. annuum (var. glabriusculum and var. annuum, respectively), Pickersgill (6) identified Mexico as the general region of domestication of this pepper. Loaiza-Figueroa et al. (7) used allozyme similarity to identify putative wild ancestral populations for chili pepper in a larger collection of wild and domesticated populations. They narrowed the likely domestication area to the eastern Mexican states of Tamaulipas, Nuevo León, San Luís Potosí, Veracruz, and Hidalgo. Since these investigations, others have sought to determine genetic relationships among wild and domesticated populations of chili pepper (8, 9).
The oldest macroremains unambiguously identified as Capsicum pepper were retrieved from preceramic strata of dry caves in two states of Mexico: Puebla (Tehuacán Valley; refs. 10, 11) and Tamaulipas (Ocampo caves; ref. 12) (Fig. 1A). These were found with macroremains of maize (Zea mays), squash (Cucurbita spp.), and other species used by humans, all of which, at both sites, were indirectly dated through associations in archaeological strata, suggesting a rough date for the chili pepper macroremains of around 9000–7000 B.P. (13). Subsequently, remains of maize from Tehuacán were dated directly by AMS and found to be more recent, 5600 y calibrated B.P. (14). AMS dating applied to bottle gourd and squash from Ocampo also yielded more recent ages, 6400–6000 y calibrated B.P. (15). Whereas no AMS dates have been recorded for the Tehuacán and Ocampo remains of chili pepper, remains from Guilá Naquitz and Silvia’s Caves in the arid eastern valley of Oaxaca state were dated indirectly by AMS to 1400–500 B.P. (16). Rock shelters in the seasonally dry tropical forest of the Central Balsas watershed (state of Guerrero) have produced phytoliths and starch grain residue for domesticated maize and squash (Cucurbita sp.) dated by association to around 9000 B.P. (17). However, no remains of Capsicum pepper have been found at that site.
Fig. 1.
Possible area of Mexico for Capsicum annuum domestication based on (A) archaeological, (B) paleoclimatic, mid-Holocene, (C) linguistic, and (D) genetic data. In addition to the strength of evidence (between 0 and 1), the maps show: (A) Location of the oldest archaeological remains of chili: Romero Cave, Ocampo, Tamaulipas; Coxcatlán Cave, Tehuacán Valley, Puebla. (C) Location of the homeland of Proto-Otomanguean (dotted circle) and of the four subgroups of current Otomanguean languages (see Fig. S5 legend and Table 1; for a more detailed view of the subgroup distribution, see Fig. S5). Open circles represent approximate locations of protolanguages with a reconstructed word for chili. (D) Open circles indicate the location of the wild chili samples used in the genetic distance analysis (29). For explanation of values, see Materials and Methods.
Species distribution modeling (SDM) can be used to predict areas that are environmentally suitable for a species from the sites where it is known to occur (18). In SDM, locations of the known current distribution of a species are compiled; values for climatic predictor variables at these locations and a large set of random (background) locations are extracted from spatial databases; and the climatic values are used to fit a model that estimates the similarity of the climate in any location to climatic conditions at known occurrence locations, using a machine-learning algorithm such as MaxEnt (19). The model is then used to predict the climatic suitability for a species across an area of interest. This prediction can be made using current climate data, but the model can also be “transferred” in time, by using past or future climate data simulated by global climate models (GCMs). This approach has been used for many purposes, including to predict the effect of climate change on the geographic distribution of crop wild relatives (20) and to successfully locate unknown Capsicum populations (21).
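The SDM workflow described above can be sketched in a few lines. This is only an illustration and not MaxEnt itself: a plain logistic regression on two climate variables stands in for the machine-learning step, and every climate value, function name, and parameter below is a hypothetical assumption chosen for the example.

```python
import math
import random

def fit_presence_background(presence, background, epochs=500, lr=0.5):
    """Fit a logistic regression separating presence from background climates.

    Stand-in for MaxEnt (assumption). Each sample is (temperature, rainfall).
    Returns (weights, bias, means, scales) for standardized predictors.
    """
    samples = presence + background
    labels = [1.0] * len(presence) + [0.0] * len(background)
    n_vars = len(samples[0])
    # Standardize each predictor so gradient descent behaves well.
    means = [sum(s[j] for s in samples) / len(samples) for j in range(n_vars)]
    scales = [
        math.sqrt(sum((s[j] - means[j]) ** 2 for s in samples) / len(samples)) or 1.0
        for j in range(n_vars)
    ]
    xs = [[(s[j] - means[j]) / scales[j] for j in range(n_vars)] for s in samples]
    weights, bias = [0.0] * n_vars, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_vars, 0.0
        for x, y in zip(xs, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(n_vars):
                grad_w[j] += (p - y) * x[j]
            grad_b += p - y
        weights = [w - lr * g / len(xs) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(xs)
    return weights, bias, means, scales

def suitability(model, climate):
    """Predicted suitability (0 to 1) for one location's climate values."""
    weights, bias, means, scales = model
    x = [(c - m) / s for c, m, s in zip(climate, means, scales)]
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data: occurrence sites are warm; background sites span the region.
random.seed(0)
presence = [(random.gauss(24, 1.5), random.gauss(900, 200)) for _ in range(30)]
background = [(random.uniform(5, 30), random.uniform(100, 3000)) for _ in range(100)]
model = fit_presence_background(presence, background)

# "Transfer" in time: apply the same fitted model to simulated past climate.
current = suitability(model, (24.0, 900.0))
mid_holocene = suitability(model, (22.5, 1000.0))  # hypothetical GCM values
```

The same fitted model is evaluated on both climates, which is the essential idea behind transferring a distribution model to the mid-Holocene.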
Crop origins can also be studied using paleobiolinguistics (PBL), which employs the comparative method of historical linguistics to reconstruct the biodiversity known to human groups of the remote, unrecorded past (22–24). By comparing words for a species in modern languages, terms for plants and animals in ancestral languages can be retrieved. The presence of words for a species in an ancestral language is an indication of the species’ significance to speakers of that language (25, 26), if not their status as domesticated plants. PBL uses Automated Similarity Judgment Program (ASJP) chronology for estimating the latest date at which a protolanguage was spoken based on lexical similarity (27). Lexical similarity found among related languages is calibrated with historical, epigraphic, and archaeological divergence dates for 52 language groups. In addition, the general area in which an ancestral language was spoken, i.e., the protolanguage homeland, can be approximately determined by locating the area where its modern descendant languages are found to be most diverse (28).
In this paper, we complement existing archaeobotanical data with ecological, paleobiolinguistic, and molecular diversity data to identify the region of initial intensification of human interest in chili pepper that led to crop domestication. The novelty of our approach resides in the addition of SDM and PBL to this type of analysis and the expression of all lines of evidence in comparable spatially explicit units (distance to the area of origin) that allows for their integration into a single prediction.


Archaeological Evidence.

The remains from Tehuacán and Ocampo constitute at present the oldest macrobotanical evidence for preceramic chili pepper in the New World. Although these chili specimens cannot be identified as cultivated or domesticated, their archaeological association with domesticated remains of important crops, such as maize and squash, is strongly suggestive of ancient intensive human interaction with chili in these areas. Based on this evidence, we assumed that the nearer a place may be to either of these sites, the more likely the location was part of the region where the crop was first grown and domesticated (Fig. 1A).

Ecological Evidence.

Wild chili pepper (C. annuum var. glabriusculum), the ancestor of domesticated C. annuum (6), is a perennial shrub that produces dozens of erect, globular, pea-sized fruits. The fruits are consumed and dispersed by frugivorous birds, which pass the seed through their digestive system. Generally found in the northern half of Mexico, the wild chili pepper is associated with a nurse plant—often a hackberry (Celtis pallida Torey), a mesquite (Prosopis sp.), or columnar cacti. As one moves further southwards, wild chili pepper is found more frequently in human-disturbed landscapes—fence rows, home gardens, and roadsides (29). Based on our own collecting localities and those of herbarium specimens and gene bank accessions (29), we estimate that wild chili peppers grow currently in environments with a median annual average temperature of 24 °C and between 20 °C and 26 °C for 90% of the locations. The coldest locations with known wild pepper populations are mostly in the central Mexican highlands, the warmest locations in the southern coastal regions of Mexico and Guatemala. The median annual rainfall of these locations is 907 mm, and between 495 and 2,253 mm for 90% of the locations, with the driest locations in the northwestern part of the distribution (e.g., Baja California and the Sonoran Desert) and the wettest locations in southeastern Mexico.
The MaxEnt species distribution model had an internal (training) fit area under the curve (AUC) of the receiver operating characteristic (ROC) curve of 0.89. The average cAUC (bias corrected) obtained with fivefold cross-validation was 0.80, which suggests that the model has very good predictive power (30). The two most important predictor variables (based on permutation importance) were mean temperature of the coldest quarter (53%), followed by annual precipitation (14%).
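For readers unfamiliar with the AUC statistic, it can be read as the probability that a randomly chosen occurrence site receives a higher model score than a randomly chosen background site. The sketch below computes that quantity directly and shows one simple way to split records into five folds. It does not implement the bias correction behind the cAUC, and the function names and scores are illustrative assumptions.

```python
def auc(presence_scores, background_scores):
    """Probability that a presence site outscores a background site.

    Ties count as half a win, the usual rank-based convention.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

def k_fold_indices(n, k):
    """Split record indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]

# Hypothetical model scores for three occurrence and four background sites.
example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2, 0.1])
folds = k_fold_indices(10, 5)
```

In a full cross-validation, the model would be refit five times, each time withholding one fold and scoring only the withheld records, and the five AUC values would then be averaged.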
Under the climate conditions of the mid-Holocene (about 6000 B.P.), the regions predicted to be most suitable for wild chili pepper include areas along the western and eastern coasts of Mexico, southeast Mexico and northern Guatemala (Fig. 1B). The central highlands were clearly unsuitable for this species during this period. The correlation coefficient between the predicted suitability for the current climate (Fig. S1) and the mid-Holocene climate was 0.92. Despite this overall similarity, there were important differences between these predictions, with areas in the southeast of Mexico more suitable and areas in the northeast less suitable during the mid-Holocene (Fig. S2).

Paleobiolinguistic Evidence.

Brown (22) surveyed the reconstructed vocabularies of 30 protolanguages of Mesoamerica (southern half of Mexico and northern Central America) and abutting areas for terms for 41 different crops, including chili pepper. His survey presented for each protolanguage the estimated date it was spoken at the latest, making it possible to stratify reconstructed words for crops chronologically (Table 1) (27, 28).
Table 1.
Reconstruction of terms for Capsicum in selected protolanguages of Mesoamerica and abutting areas
Years before present | Protolanguage | Reconstructed word for chili | Location of modern descendant languages | Genetic affiliation
--- | --- | --- | --- | ---
5976 | Eastern Otomanguean | *(h)saH3, *ki | Mexico | Otomanguean
4542 | Mixtecan | *(H)yaʔ, Hyah, Hθaʔ2 | Mexico | Otomanguean
4018 | Uto-Aztecan | NR | US Southwest, Mexico, Central America | Uto-Aztecan
3472 | Southern Uto-Aztecan | NR | Mexico, Central America | Uto-Aztecan
3434 | Kiowa-Tanoan | NR | US Southwest | Kiowa-Tanoan
3000 | Lencan | NR | Central America | Lencan
2774 | Misumalpan | kuma | Central America | Misumalpan
2576 | Northern Uto-Aztecan | NR | US Southwest | Uto-Aztecan
2220 | Mayan | *i:hk | Mexico, Central America | Mayan
1865 | Yuman | NR | US Southwest | Yuman
1737 | Numic | NR | US Southwest | Uto-Aztecan
1587 | Takic | NR | US Southwest | Uto-Aztecan
1509 | General Aztec | *či:l | Mexico | Uto-Aztecan
NR, not reconstructable.
Sources for each language are listed in SI Materials and Methods under Paleobiolinguistics.
Explanations for phonetic representation of pepper words are listed in SI Materials and Methods under Paleobiolinguistics.
Proto-Otomanguean is the oldest (∼6500 B.P.) protolanguage of the New World for which a word for chili pepper reconstructs (31). All daughter languages of Proto-Otomanguean, as defined by Kaufman (32), show reconstructed terms for chili pepper (Table 1). Given that estimated dates are to be understood as the latest dates at which ancestral languages were spoken, it is plausible that speakers of Proto-Otomanguean actually had a word for chili pepper hundreds, if not thousands, of years before ∼6500 B.P. The oldest protolanguage of Table 1 not belonging to the Otomanguean family is Proto-Totozoquean (∼4300 B.P.), for which a term for chili pepper does not reconstruct. Non-Otomanguean languages for which a term for chili pepper reconstructs are Proto-Misumalpan (∼2800 B.P.), Proto-Sonoran (∼2400 B.P.), and Proto-Mayan (∼2200 B.P.). Thus, the earliest non-Otomanguean dates for Capsicum in Mesoamerica and abutting regions are over 3,700 y more recent than the oldest date, suggesting that speakers of a prehistoric Otomanguean language or languages may have been among the first cultivators or domesticators of chili pepper. Note that the current word—chili—is derived from the General Aztec language, Nahuatl, which reconstructs to a much more recent date (∼1500 B.P.; Table 1).
The area of maximum diversity of a language family has been viewed traditionally by linguists as suggestive of the location of a family’s ancestral language (e.g., ref. 28). We use this phylogenetic diversity information in locating the Otomanguean homeland by identifying where languages of the four subgroups of the family—Mazatecan-Zapotecan, Amuzgo-Mixtecan, Tlapanecan-Chorotegan, and Otopamean-Chinantecan (32)—are currently spoken in closest proximity (Fig. 1C).

Genetic Evidence.

During the fall of 2006 and 2007, expeditions were conducted in the southern United States and throughout Mexico to sample populations of wild C. annuum (29). This provided the most complete set of wild C. annuum from Mexico available to date. Based largely on this set, 139 wild types distributed over the entire exploration area were chosen as were 49 domesticated types that are endemic landraces (ancho, puya, and guajillo) (33). This collection was screened with 17 simple sequence repeat (SSR) DNA markers (34, 35). These markers were chosen for this study because of their consistency of amplification and polymorphism within our sample. For each wild plant, a distance was calculated to the domesticated group based on the average proportion of shared SSR alleles. These distances were then spatially interpolated to produce in each grid cell an estimated genetic similarity between wild pepper populations (if any occurred in the cell) and the group of domesticated chili peppers (regardless of where they occurred). This molecular-marker–based analysis of genetic similarity between wild and domesticated types revealed a broad area of high similarity in the northeastern quadrant of Mexico (Fig. 1D), including the states of Tamaulipas, Nuevo León, San Luís Potosí, and Veracruz. In contrast, genetic similarity between wild and cultivated types was generally low in southern and northwest Mexico, confirming earlier results (7).
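The two computational steps in this paragraph, scoring each wild plant against the domesticated group and interpolating those scores over space, can be sketched as follows. The allele sizes and coordinates are invented, the shared-allele measure is one common formulation, and inverse-distance weighting is an assumed interpolation method, because the text here does not name the one actually used.

```python
import math
from collections import Counter

def shared_allele_proportion(a, b):
    """Average over loci of the fraction of alleles two diploid genotypes share."""
    total = 0.0
    for locus_a, locus_b in zip(a, b):
        # Multiset intersection counts alleles present in both genotypes.
        shared = sum((Counter(locus_a) & Counter(locus_b)).values())
        total += shared / 2.0
    return total / len(a)

def similarity_to_group(wild, domesticated_group):
    """Mean shared-allele proportion between one wild plant and all domesticates."""
    values = [shared_allele_proportion(wild, d) for d in domesticated_group]
    return sum(values) / len(values)

def idw(points, target, power=2.0):
    """Inverse-distance-weighted estimate at target from (x, y, value) points."""
    num = den = 0.0
    for x, y, value in points:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value  # target coincides with a sampled location
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den

# Hypothetical genotypes: two SSR loci, alleles given as fragment sizes.
wild_plant = [(150, 150), (200, 204)]
domesticates = [[(150, 150), (200, 200)], [(150, 152), (204, 204)]]
similarity = similarity_to_group(wild_plant, domesticates)

# Hypothetical sampled similarities at two locations, interpolated midway.
midpoint = idw([(0.0, 0.0, 0.2), (2.0, 0.0, 0.8)], (1.0, 0.0))
```

A genetic distance in the sense used above would then be one minus the similarity.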

Consensus Model.

The four lines of evidence—archaeological, ecological, paleobiolinguistic, and genetic—were all expressed as a spatial model and they can therefore be combined into a single consensus model represented geographically through mapping. Each type of evidence has its particular strengths and weaknesses, discussed below, which need to be taken into consideration when producing a consensus model. Because these merits and demerits are difficult to quantify (some are simply unknown), assigning differential weights to each line of evidence is problematic. Our solution is to present a number of different consensus maps based on several different weighting combinations (Fig. 2 and Fig. S3).
Fig. 2.
Consensus models of the likelihood that cultivated chili pepper originated in an area. The models were obtained by combining the four lines of evidence for the origin of domesticated chili pepper (Fig. 1). (A) equal weights; (B) genetics 1/2, all others 1/6 weight; and (C) archaeology 1/10, all others 1/3 weight. After combining, the values were scaled between 0 and 1 and then squared to give more weight to the higher values.
The first map, Fig. 2A, was established using equal weighting for each type of evidence (each weighted as making a 1/4 contribution). According to this model, areas in central-east Mexico and northeastern Mexico are the most likely areas of origin of chili pepper. The second model assigned a high weight to genetic evidence (weighted 1/2) and equal but lower weights to the other three lines of evidence (each weighted 1/6). This assumes that genetic data might be superior to one or more of the other lines of evidence used because, for example, it might suffer less from sampling bias. This results in primary support for northeastern Mexico and only secondary support for central-east Mexico (Fig. 2B). The third approach assigned a low weight to archaeology (1/10) and equal higher weights to the other three lines of evidence (each weighted 1/3). This weighting was motivated by the observation that the current archaeological data are assembled from macroremains of only two sites in Mexico. This weighting produces a consensus model resembling the equal weighting of Fig. 2A because both central-east Mexico and northeastern Mexico emerge as equally plausible geographic candidates for chili pepper domestication (Fig. 2C). Additional information, from other sites and microremains, yet to be discovered, would justify a stronger weighting for archaeobotanical data.
Another weighting strategy produces different models based on randomly assigning combinations of weights for the four types of evidence. This approach allows us to explore the universe of possible weight combinations given different interpretations of the individual lines of evidence. Fig. S3 shows the percentile distribution obtained for this approach. The resulting maps suggest again that either central-east Mexico or northeastern Mexico or, conceivably, both areas were locations of the domestication of C. annuum.
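The random-weighting strategy can be illustrated for a single grid cell. In this sketch the weights are drawn uniformly from all non-negative combinations that sum to 1, which is an assumption, since the sampling scheme is not specified here. The evidence values, function names, and number of draws are likewise hypothetical.

```python
import math
import random

def random_weights(n, rng):
    """Draw n non-negative weights summing to 1 (uniform on the simplex)."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]

def consensus_scores(evidence, n_draws=1000, seed=0):
    """Sorted consensus values for one cell under many random weightings."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_draws):
        weights = random_weights(len(evidence), rng)
        scores.append(sum(w * e for w, e in zip(weights, evidence)))
    return sorted(scores)

def percentile(sorted_values, q):
    """Nearest-rank percentile (q from 0 to 100) of pre-sorted values."""
    rank = math.ceil(q / 100 * len(sorted_values)) - 1
    return sorted_values[min(len(sorted_values) - 1, max(0, rank))]

# Hypothetical evidence values (archaeology, ecology, linguistics, genetics).
cell_scores = consensus_scores([0.9, 0.8, 0.7, 0.9])
low, high = percentile(cell_scores, 5), percentile(cell_scores, 95)
```

A cell whose low percentile is still high is supported regardless of how the four lines of evidence are weighted, which is the kind of robustness a percentile map is meant to show.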


We have embraced the template of multidisciplinary approaches to study crop origins proposed first by de Candolle (36) and later by Harlan and de Wet (37). Confidence in a crop-origin hypothesis is increased when supported by multiple, independent lines of evidence, and improved understanding comes from new evidence in each field and concomitant predictions in other fields (38). Our multidisciplinary approach depends on the independence and strength of evidence from the different fields, each of which has its strengths and weaknesses.
Current archaeobotanical data for chili pepper is mainly based on macroremains from only two sites. In addition to the identification of ancient chili remains at additional sites, our understanding could benefit from the investigation of microfossil data such as starch grains (39) in Mesoamerican sites. Availability of microfossils may provide information on the more ancient distribution and importance of chili peppers and potentially also help distinguish domesticated from nondomesticated remains (as in the case of maize) (40).
The quality of species distribution models depends on having a representative sample of the current distribution of the wild species, the quality of climate data, particularly the modeled past climate data, and the algorithm used. Our sample size was large and the species is widespread, suggesting that the SDM approach should work well (41), as confirmed by a high cAUC score (30). Backcasted climate data for the mid-Holocene is, of course, uncertain; furthermore, we did not consider climate variation during that period. Nevertheless, because we use an ensemble of climate models (Fig. S4), our predictions should be relatively robust (42).
Utilization of linguistic data assumes an understanding of language development, including information relating to language origin, dispersal, and diffusion of traits across languages that is still emerging as new computer approaches are increasingly applied in linguistic analysis (43, 44). PBL provides an assessment of when species acquired substantial salience for prehistoric groups, whether they were merely harvested, cultivated, or eventually domesticated. If a word for a biological species reconstructs for a protolanguage, this is evidence that the species was known to and probably of considerable importance to speakers of the language as shown by Berlin et al. (25) for two closely related Mayan languages, Tzeltal and Tzotzil (Tzeltalan) and by Balée and Moore (26) in a study of plant names in five Eastern Amazonian Tupi-Guaraní languages.
Genetic data are generally based on the analysis of contemporary populations of the wild ancestor of the crop. The wild populations included in this study constitute the largest and most widespread sample used in genetic analyses for this species (29). However, we do not know to what extent the distribution and genetic structure of these populations have changed over the past 6,000 y. Hence, we modeled the past distribution of wild chili peppers based on the assumption that their climatic requirements are the same as today’s wild chili pepper population. Correlation between the suitability scores for ancient and current distributions is high (0.92), suggesting that, whereas climate change over the past 6,000 y has likely shifted the species distribution, for the most part, the historical and current ranges of this species overlap. Another potentially confounding factor is gene flow between domesticated and wild chili peppers, which may cause similarities that are not due to ancestor–descendant relationships (45). However, this would not seem very important for chili peppers because they are mostly a self-fertilizing species with minimal outcrossing, which is confirmed by the high levels of homozygosity observed for wild chili pepper populations analyzed here (Dataset S1, Microsat info).
The concept of origin of C. annuum used in this study encompasses wild plant protection, management, cultivation, and domestication. Within this continuum of increasingly close interaction between humans and plants, distinguishing among these four stages for most crops is difficult. However, with respect to chili pepper, the fact that a Proto-Otomanguean word for the crop was retained in daughter languages attests to its high salience for speakers of the protolanguage. Furthermore, Proto-Otomanguean speakers may have been actively engaged in cultivation, as suggested by the reconstruction of words for a range of plants, including staple crops such as maize and squash, but also other crops such as avocado and nopal (22). Plausibly, then, the saliency of chili pepper among Proto-Otomanguean speakers reflects cultivation and perhaps incipient domestication and not merely use of a wild plant species.
When analyzed separately, our four lines of evidence do not all suggest the same geographic area as being the most likely place of chili pepper origin. Nevertheless, we identify central-east Mexico as a likely region of initial cultivation or incipient domestication because that interpretation most parsimoniously reconciles all evidence (Fig. 2A). This area extends from southern Puebla and northern Oaxaca to southern Veracruz and encompasses the valley of Tehuacán (Fig. 2A). The Coxcatlán Cave from which preceramic macroremains of chili pepper have been recovered (13) is situated in this valley. Species distribution modeling shows that many parts of the identified area were suited for the wild progenitor of C. annuum around the time of first cultivation or domestication in the mid-Holocene and there are currently populations of wild chili pepper that are genetically similar to the domesticated species (Fig. 1D). Near to the valley is the likely center of the Otomanguean homeland. Proto-Otomanguean, spoken in mid-Holocene times some 6,500 y ago, is the oldest ancestral language of the New World for which a term for chili pepper reconstructs. Speakers of contemporary Otomanguean languages live in or close to the region. Otomanguean people, then, may have been the first in the New World to transform wild chilies into the domesticated spice and condiment so widely enjoyed today.
By expressing all data as a distance, whether geographical (archaeological and linguistic data), climatic, or genetic, we have developed a method to bring together different lines of evidence about crop origins into a single framework of analysis. This approach has led to the discovery that the origin of domesticated chili peppers may have been located further south than previously thought (7) and in different regions of Mexico than proposed for common bean (46) or maize (47). Thus, our data do not suggest a single, nuclear area for crop domestication in Mesoamerica, but rather a multiregional model as suggested also for the Southwest Asian (48) and Chinese (49) centers of agricultural origins.

Materials and Methods


We used two locations for which there is evidence of the earliest use of chili: Romero's Cave (near Ocampo, Tamaulipas) and Coxcatlán Cave (Tehuacán Valley, Puebla) (Fig. 1A). We connected these locations by their shortest path, and then computed the distance d (in kilometers) to this path for cells on a raster with 1-km² spatial resolution. We truncated the distances at 1,000 km and used an inverse squared distance decay function, scaled between 0 and 1, (1 − (d/1,000))², as a measure of the likelihood that chili was domesticated in a location (grid cell).
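The distance-decay score can be sketched as below. Two simplifications are assumed: coordinates are treated as planar kilometers rather than positions on the globe, and the exponent is exposed as a parameter set to 2, following the "squared" wording of the description. The function names and test coordinates are hypothetical.

```python
import math

def distance_to_segment(p, a, b):
    """Planar distance from point p to the segment joining a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the line, then clamp the projection to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def archaeological_score(d_km, cutoff_km=1000.0, exponent=2):
    """Decay score between 0 and 1; distances beyond the cutoff score 0."""
    d = min(d_km, cutoff_km)
    return (1.0 - d / cutoff_km) ** exponent

# Hypothetical planar coordinates (km) for the two sites and one grid cell.
site_a, site_b = (-400.0, 0.0), (400.0, 0.0)
cell_distance = distance_to_segment((0.0, 500.0), site_a, site_b)
cell_score = archaeological_score(cell_distance)
```

With these values the cell lies 500 km from the path and receives a score of 0.25; a cell on the path scores 1 and any cell 1,000 km or more away scores 0.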

Species Distribution Modeling.

We used SDM to assess spatial variation in suitability for wild C. annuum var. glabriusculum, the ancestor of domesticated C. annuum (6), during climatic conditions of the mid-Holocene (about 6,000 y ago) (Fig. 1B). Locations where wild Capsicum populations currently occur were from collections made in the fall of 2006 and 2007 (Dataset S2, Coordinates_w_SSR info) (29) and from additional records obtained from the Global Biodiversity Information Facility (GBIF) (Dataset S3, Coordinates GBIF). We used the SDM algorithm MaxEnt (19) to predict suitability during the mid-Holocene according to nine global climate models (SI Materials and Methods).


The different languages considered are listed in Table 1 and the respective information sources are compiled in SI Materials and Methods. Protolanguage dates (Table 1) were calculated through use of ASJP chronology (27). The center of phylogenetic diversity of Otomanguean languages was located from the distribution of places where Otomanguean languages are currently spoken (50), by determining the area where languages spoken in close geographic proximity to one another are found to be affiliated with the largest number of major divisions of the family.
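One way to make the idea of a center of phylogenetic diversity concrete is to count, for each candidate location, how many major subgroups have at least one language spoken within a fixed radius, and then take the location with the highest count. The sketch below does this with invented planar coordinates and placeholder subgroup labels. The radius and the candidate locations are assumptions, not values from this study.

```python
import math

def subgroup_count(center, languages, radius):
    """Number of distinct subgroups with a language within radius of center."""
    return len({
        group
        for x, y, group in languages
        if math.hypot(x - center[0], y - center[1]) <= radius
    })

def diversity_center(candidates, languages, radius):
    """Candidate location surrounded by the largest number of subgroups."""
    return max(candidates, key=lambda c: subgroup_count(c, languages, radius))

# Hypothetical language locations as (x, y, subgroup); labels are placeholders.
languages = [
    (0.0, 0.0, "A"), (1.0, 0.0, "B"), (0.0, 1.0, "C"), (1.0, 1.0, "D"),
    (10.0, 10.0, "A"), (11.0, 10.0, "A"),
    (-8.0, 3.0, "B"), (-8.0, 4.0, "C"),
]
candidates = [(0.5, 0.5), (10.0, 10.0), (-8.0, 3.5)]
center = diversity_center(candidates, languages, radius=2.0)
```

Here the first candidate is selected because all four placeholder subgroups are represented within the radius, whereas the other two candidates are surrounded by one and two subgroups.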

Genetic Distance Analysis.

Genetic distances between 139 wild and 49 domesticated pepper accessions (Datasets S2 and S3, Coordinates GBIF) were assessed with data from 17 microsatellite markers (SSRs) developed before this study, as described in Dataset S1 (Microsat info).

Consensus Model.

All four data sources were used to create a spatial model on a common raster. All models had values between 0 and 1, with higher scores indicating that a location is more likely to be the area where domestication occurred. We combined these four sources of data into a single consensus model by assigning weights to each indicator. To get a more pronounced differentiation between sites, we squared the values, after first rescaling them between 0 and 1.
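The combination step can be sketched for a toy raster of three cells. Layer values and names are hypothetical. Weights are normalized inside the function so that they sum to 1, which also makes weighting schemes usable when the stated fractions do not add up exactly to 1.

```python
def rescale(values):
    """Linearly rescale values to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def consensus(layers, weights):
    """Weighted sum of evidence layers, rescaled to 0..1 and then squared.

    layers: dict mapping evidence name to a list of per-cell values (0..1).
    weights: dict mapping the same names to relative weights.
    """
    total = sum(weights[name] for name in layers)
    n_cells = len(next(iter(layers.values())))
    combined = [
        sum(weights[name] / total * layers[name][i] for name in layers)
        for i in range(n_cells)
    ]
    # Squaring after rescaling emphasizes the highest-scoring cells.
    return [v ** 2 for v in rescale(combined)]

# Hypothetical evidence values for three grid cells.
layers = {
    "archaeology": [0.9, 0.6, 0.1],
    "ecology": [0.7, 0.5, 0.2],
    "linguistics": [1.0, 0.3, 0.0],
    "genetics": [0.4, 0.9, 0.1],
}
equal = consensus(layers, {name: 1 / 4 for name in layers})
genetics_heavy = consensus(
    layers,
    {"archaeology": 1 / 6, "ecology": 1 / 6, "linguistics": 1 / 6, "genetics": 1 / 2},
)
```

In this toy example the best-supported cell changes from the first to the second when genetics is given half of the total weight, illustrating why several weighting schemes are worth comparing.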


Suzanne K. Fish, Eric W. Holman, and Søren Wichmann provided help for this project in its early phase. Holman and Wichmann also read and commented on the manuscript as did Gene Anderson, Roger Blench, Eric Campbell, Charles Clement, Norman Hammond, Matt Hufford, Sarah Metcalfe, Barbara Pickersgill, Anthony J. Ranere, Brian Stross, and Eric Votava. K.H.K. thanks Horacio Villalón and Sergio Hernández Verdugo for contributions of wild Capsicum; Heather Zornetzer for assistance during field collection; and Derek van den Abeelen, Raúl Durán, Tiffany Chan, Jonathan Kong, and James Kami for assistance in the laboratory work for the genetic analyses; and the Fulbright program, the University of California Institute for Mexico and the United States (UC MEXUS), and the Department of Plant Sciences (Graduate Student Research assistantship) for funding. We thank the World Climate Research Programme's Working Group on Coupled Modelling (CMIP5) and the climate modeling groups for making their model output available.



B Smith The Emergence of Agriculture (Scientific American Library, New York, 1995).
P Gepts, et al. Biodiversity in Agriculture: Domestication, Evolution, and Sustainability (Cambridge Univ Press, Cambridge, UK, pp 630. (2012).
P Gepts, Domestication as a long-term selection experiment. Plant Breed Rev 24, 1–44 (2004).
JM Burke, JC Burger, MA Chapman, Crop evolution: From genetics to genomics. Curr Opin Genet Dev 17, 525–532 (2007).
PW Bosland, Chiles: History, cultivation and uses. Spices, Herbs and Edible Fungi, ed G Charambous (Elsevier, New York, 1994).
B Pickersgill, Relationships between weedy and cultivated forms in some species of chili peppers (genus Capsicum). Evolution 25, 683–691 (1971).
F Loaiza-Figueroa, K Ritland, JAL Cancino, SD Tanskley, Patterns of genetic variation of the genus Capsicum (Solanaceae) in Mexico. Plant Syst Evol 165, 159–188 (1989).
A Aguilar-Meléndez, PL Morrell, ML Roose, SC Kim, Genetic diversity and structure in semiwild and domesticated chiles (Capsicum annuum; Solanaceae) from Mexico. Am J Bot 96, 1190–1202 (2009).
EJ Votava, GP Nabhan, PW Bosland, Genetic diversity and similarity revealed via molecular analysis among and within an in situ population and ex situ accessions of chiltepin (Capsicum annuum var. glabriusculum). Conserv Genet 3, 123–129 (2002).
CE Smith, Plant remains. The Prehistory of the Tehuacan Valley, ed DS Byers (Univ of Texas Press, Austin, TX), pp. 220–255 (1967).
CE Smith, Current archaeological evidence for the beginning of American agriculture. Studies in the Neolithic and Urban Revolutions, The V. Gordon Childe Colloquium, ed L Manzanilla (British Archaeological Reports, Oxford, UK), pp. 81–101 (1987).
PC Mangelsdorf, RS MacNeish, GR Willey, Origins of Middle American agriculture. Natural Environment and Early Cultures, ed RC West (Univ of Texas Press, Austin, TX), pp. 427–445 (1965).
A McClung de Tapia, The origins of agriculture in Mesoamerica and Central America. The Origins of Agriculture: An International Perspective, eds CW Cowan, PJ Watson (Smithsonian Institution Press, Washington, DC), pp. 143–171 (1992).
A Long, B Benz, D Donahue, A Jull, L Toolin, First direct AMS dates on early maize from Tehuacán, Mexico. Radiocarbon 31, 1035–1040 (1989).
BD Smith, Reconsidering the Ocampo Caves and the era of incipient cultivation in Mesoamerica. Latin American Antiquity 8, 342–383 (1997).
L Perry, KV Flannery, Precolumbian use of chili peppers in the Valley of Oaxaca, Mexico. Proc Natl Acad Sci USA 104, 11905–11909 (2007).
AJ Ranere, DR Piperno, I Holst, R Dickau, J Iriarte, The cultural and chronological context of early Holocene maize and squash domestication in the Central Balsas River Valley, Mexico. Proc Natl Acad Sci USA 106, 5014–5018 (2009).
J Elith, JR Leathwick, Species distribution models: Ecological explanation and prediction across space and time. Annu Rev Ecol Evol Syst 40, 677–697 (2009).
SJ Phillips, RP Anderson, RE Schapire, Maximum entropy modeling of species geographic distributions. Ecol Modell 190, 231–259 (2006).
A Jarvis, A Lane, RJ Hijmans, The effect of climate change on crop wild relatives. Agric Ecosyst Environ 126, 13–23 (2008).
A Jarvis, et al., Use of GIS for optimizing a collecting mission for a rare wild pepper (Capsicum flexuosum Sendtn.) in Paraguay. Genet Resour Crop Evol 52, 671–682 (2005).
CH Brown, Development of agriculture in prehistoric Mesoamerica: The linguistic evidence. Pre-Columbian Foodways: Interdisciplinary Approaches to Food, Culture and Markets in Ancient Mesoamerica, eds JE Staller, MD Carrasco (Springer, Berlin), pp. 71–107 (2010).
CS Fowler, Some ecological clues to Proto-Numic homelands. Desert Research Institute Publications in the Social Sciences 8, 105–117 (1972).
JH Hill, Proto-Uto-Aztecan: A community of cultivators in central Mexico? Am Anthropol 103, 913–934 (2001).
B Berlin, DE Breedlove, RM Laughlin, PH Raven, Cultural significance and lexical retention in Tzeltal-Tzotzil ethnobotany. Meaning in Mayan Languages, ed MA Edmonson (Mouton, The Hague), pp. 143–164 (1973).
WL Balée, D Moore, Similarity and variation in plant names in five Tupi-Guarani languages (Eastern Amazonia). Bulletin of the Florida Museum of Natural History 35, 209–262 (1991).
EW Holman, et al., Automated dating of the world's language families based on lexical similarity. Curr Anthropol 52, 841–875 (2011).
S Wichmann, A Muller, V Velupillai, Homelands of the world's language families: A quantitative approach. Diachronica 27, 247–276 (2010).
KH Kraft, JD Luna-Ruiz, P Gepts, A new collection of wild populations of Capsicum in Mexico and the southern United States. Genet Resour Crop Evol 60, 225–232 (2013).
RJ Hijmans, Cross-validation of species distribution models: Removing spatial sorting bias and calibration with a null model. Ecology 93, 679–688 (2012).
CH Brown, CR Clement, P Epps, E Luedeling, S Wichmann, The paleobiolinguistics of domesticated chili pepper (Capsicum spp.). Ethnobiology Letters 4, 1–11 (2013).
TS Kaufman, Early Otomanguean homeland and cultures: Some premature hypotheses. University of Pittsburgh Working Papers in Linguistics 1, 91–136 (1990).
KH Kraft, J de Jesús Luna-Ruíz, P Gepts, Different seed selection and conservation practices for fresh market and dried chile farmers in Aguascalientes, Mexico. Econ Bot 64, 318–328 (2010).
JM Lee, SH Nahm, YM Kim, BD Kim, Characterization and molecular genetic mapping of microsatellite loci in pepper. Theor Appl Genet 108, 619–627 (2004).
Y Minamiyama, M Tsuro, M Hirai, An SSR-based linkage map of Capsicum annuum. Mol Breed 18, 157–169 (2006).
A de Candolle, L'Origine des Plantes Cultivées [The Origin of Cultivated Plants] (Appleton, New York), pp 468 (1882).
JR Harlan, JMJ de Wet, On the quality of evidence for origin and dispersal of cultivated plants. Curr Anthropol 14, 51–62 (1973).
AJ Ammerman, LL Cavalli-Sforza, The Neolithic Transition and the Genetics of Populations in Europe (Princeton Univ Press, Princeton, NJ), pp 176 (1984).
L Perry, et al., Starch fossils and the domestication and dispersal of chili peppers (Capsicum spp. L.) in the Americas. Science 315, 986–988 (2007).
I Holst, JE Moreno, DR Piperno, Identification of teosinte, maize, and Tripsacum in Mesoamerica by using pollen, starch grains, and phytoliths. Proc Natl Acad Sci USA 104, 17608–17613 (2007).
MS Wisz, et al., Effects of sample size on the performance of species distribution models. Divers Distrib 14, 763–773 (2008).
MB Araújo, M New, Ensemble forecasting of species distributions. Trends Ecol Evol 22, 42–47 (2007).
C Perreault, S Mathew, Dating the origin of language using phonemic diversity. PLoS ONE 7, e35289 (2012).
R Bouckaert, et al., Mapping the origins and expansion of the Indo-European language family. Science 337, 957–960 (2012).
R Papa, P Gepts, Asymmetric gene flow and introgression between wild and domesticated populations. Introgression from Genetically Modified Plants into Wild Relatives and Its Consequences, eds D Den Nijs, D Bartsch, J Sweet (CABI, Oxon, UK), pp. 125–138 (2004).
M Kwak, JA Kami, P Gepts, The putative Mesoamerican domestication center of Phaseolus vulgaris is located in the Lerma-Santiago basin of Mexico. Crop Sci 49, 554–563 (2009).
Y Matsuoka, et al., A single domestication for maize shown by multilocus microsatellite genotyping. Proc Natl Acad Sci USA 99, 6080–6084 (2002).
DQ Fuller, G Willcox, RG Allaby, Cultivation and domestication had multiple origins: Arguments against the core area hypothesis for the origins of agriculture in the Near East. World Archaeol 43, 628–652 (2011).
DJ Cohen, The beginnings of agriculture in China: A multiregional view. Curr Anthropol 52, S273–S293 (2011).
A de Avila-Blomberg, NG Moreno-Díaz, Distribución de las lenguas indígenas de México [Distribution of the indigenous languages of Mexico] (Comisión Nacional para el Conocimiento y Uso de la Biodiversidad, Mexico, DF, 2008).


Published in

Proceedings of the National Academy of Sciences
Vol. 111 | No. 17
April 29, 2014
PubMed: 24753581


Submission history

Published online: April 21, 2014
Published in issue: April 29, 2014



This article is a PNAS Direct Submission.



Kraig H. Kraft
Department of Plant Sciences, Section of Crop and Ecosystem Sciences, University of California, Davis, CA 95616-8780;
Present address: Committee on Sustainability Assessment, Philadelphia, PA 19147.
Cecil H. Brown
Department of Anthropology, Northern Illinois University, Pensacola, FL 32503-6634;
Gary P. Nabhan
Southwest Center, University of Arizona, Tucson, AZ 85721-0185;
Eike Luedeling
World Agroforestry Centre, Nairobi, Kenya;
José de Jesús Luna Ruiz
Centro de Ciencias Agropecuarias, Universidad Autónoma de Aguascalientes, Aguascalientes, Mexico CP 20131;
Geo Coppens d’Eeckenbrugge
Centre de coopération Internationale en Recherche Agronomique pour le Développement, Unité Mixte de Recherche 5175 Centre d'Ecologie Fonctionnelle et Evolutive, Campus Centre National de la Recherche Scientifique, Montpellier Cedex 5, France; and
Robert J. Hijmans
Department of Environmental Science and Policy, University of California, Davis, CA 95616
Paul Gepts
Department of Plant Sciences, Section of Crop and Ecosystem Sciences, University of California, Davis, CA 95616-8780


To whom correspondence should be addressed. E-mail: [email protected].
Author contributions: K.H.K., C.H.B., J.d.J.L.R., R.J.H., and P.G. designed research; K.H.K., C.H.B., J.d.J.L.R., G.C.d., R.J.H., and P.G. performed research; K.H.K., C.H.B., E.L., J.d.J.L.R., R.J.H., and P.G. contributed new reagents/analytic tools; K.H.K., C.H.B., G.P.N., E.L., G.C.d., R.J.H., and P.G. analyzed data; and K.H.K., C.H.B., G.P.N., E.L., J.d.J.L.R., G.C.d., R.J.H., and P.G. wrote the paper.

Competing Interests

The authors declare no conflict of interest.
