Research Article

Resolving the evolution of extant and extinct ruminants with high-throughput phylogenomics

Jared E. Decker, J. Chris Pires, Gavin C. Conant, Stephanie D. McKay, Michael P. Heaton, Kefei Chen, Alan Cooper, Johanna Vilkki, Christopher M. Seabury, Alexandre R. Caetano, Gary S. Johnson, Rick A. Brenneman, Olivier Hanotte, Lori S. Eggert, Pamela Wiener, Jong-Joo Kim, Kwan Suk Kim, Tad S. Sonstegard, Curt P. Van Tassell, Holly L. Neibergs, John C. McEwan, Rudiger Brauning, Luiz L. Coutinho, Masroor E. Babar, Gregory A. Wilson, Matthew C. McClure, Megan M. Rolf, JaeWoo Kim, Robert D. Schnabel, and Jeremy F. Taylor
  1. Divisions of (a) Animal Sciences,
  2. (b) Biological Sciences, and
  3. (h) Department of Veterinary Pathobiology, University of Missouri, Columbia MO 65211;
  4. (c) United States Department of Agriculture, Agricultural Research Service, Meat Animal Research Center, Clay Center NE 68933;
  5. (d) Australian Centre for Ancient DNA, School of Earth and Environmental Sciences, University of Adelaide, Adelaide SA 5005, Australia;
  6. (e) Agrifood Research Finland MTT, FIN-31600, Jokioinen, Finland;
  7. (f) Veterinary Pathobiology, Texas A&M University, College Station TX 77843;
  8. (g) Embrapa Recursos Geneticos e Biotecnologia, Brasilia-DF, C.P. 02372, 70770-900, Brasil;
  9. (i) Henry Doorly Zoo, Omaha NE 68107;
  10. (j) International Livestock Research Institute, P.O. Box 30709, Nairobi 00100, Kenya;
  11. (k) School of Biology, University of Nottingham, Nottingham NG7 2RD, United Kingdom;
  12. (l) The Roslin Institute, and R(D)SVS, University of Edinburgh, Roslin, Midlothian EH25 9PS, United Kingdom;
  13. (m) School of Biotechnology, Yeungnam University, Gyeongsan, Republic of Korea;
  14. (n) Department of Animal Science, Chungbuk National University, Cheongju, Republic of Korea;
  15. (o) United States Department of Agriculture, Agricultural Research Service, Bovine Functional Genomics Laboratory, Beltsville MD 20705;
  16. (p) Department of Animal Sciences, Washington State University, Pullman WA 99164;
  17. (q) Animal Genomics, AgResearch, Invermay, PB 50034, Mosgiel, New Zealand;
  18. (r) Departamento de Zootecnia, ESALQ-USP, Av. Padua Dias, 11, Piracicaba, SP 13418-900, Brasil;
  19. (s) Department of Livestock Production, University of Veterinary and Animal Sciences, Lahore 54000, Pakistan; and
  20. (t) Canadian Wildlife Service, 200 4999 98th Avenue Northwest, Edmonton AB, Canada T6B 2X3

PNAS November 3, 2009 106 (44) 18644-18649; https://doi.org/10.1073/pnas.0904691106
  • For correspondence: taylorjerr@missouri.edu
  1. Edited by James E. Womack, Texas A&M University, College Station, TX, and approved September 14, 2009 (received for review April 29, 2009)

Abstract

The Pecorans (higher ruminants) are believed to have rapidly speciated in the Mid-Eocene, resulting in five distinct extant families: Antilocapridae, Giraffidae, Moschidae, Cervidae, and Bovidae. Due to the rapid radiation, the Pecoran phylogeny has proven difficult to resolve, and 11 of the 15 possible rooted phylogenies describing ancestral relationships among the Antilocapridae, Giraffidae, Cervidae, and Bovidae have each been argued as representations of the true phylogeny. Here we demonstrate that a genome-wide single nucleotide polymorphism (SNP) genotyping platform designed for one species can be used to genotype ancient DNA from an extinct species and DNA from species diverged up to 29 million years ago and that the produced genotypes can be used to resolve the phylogeny for this rapidly radiated infraorder. We used a high-throughput assay with 54,693 SNP loci developed for Bos taurus taurus to rapidly genotype 678 individuals representing 61 Pecoran species. We produced a highly resolved phylogeny for this diverse group based upon 40,843 genome-wide SNP, which is five times as many informative characters as have previously been analyzed. We also establish a method to amplify and screen genomic information from extinct species, and place Bison priscus within the Bovidae. The quality of genotype calls and the placement of samples within a well-supported phylogeny may provide an important test for validating the fidelity and integrity of ancient samples. Finally, we constructed a phylogenomic network to accurately describe the relationships between 48 cattle breeds and facilitate inferences concerning the history of domestication and breed formation.

  • ancient DNA
  • Pecorans
  • domestication

The Pecorans are one of the most diverse groups of mammals, ranging in size from the diminutive duiker (adult weight 9–24 kg, shoulder height 0.45–0.51 m) to the giant giraffe (adult weight 500–1,250 kg, shoulder height 4.5–5.8 m). They are indigenous to all continents except South America and Australia (1) and live in a wide variety of environments. The ruminants are believed to have rapidly radiated in the Mid-Eocene (1), and due to this rapid radiation, the Pecoran phylogeny has proven difficult to resolve, with 11 of the 15 possible rooted phylogenies describing relationships among the Antilocapridae, Giraffidae, Cervidae, and Bovidae having been argued as representations of the true phylogeny (2, 3). A supermatrix analysis of nucleotide sequence data from 16 genes has resolved some of the nodes within the Pecoran “Tree of Life” (3) and has provided the most strongly supported available phylogeny, to which we compare the results of our analyses. However, many of the nodes within this phylogeny either have little support or are completely unresolved (e.g., the subfamily Caprinae), and extinct taxa have yet to be phylogenetically placed with confidence (e.g., aurochs). These weakly supported phylogenies have hampered evolutionary studies and conservation efforts for this intriguingly diverse group.

The number and location of prehistoric domestication events for the extinct aurochs (Bos primigenius) has also been controversial (4–8), and the ancestry of many of the derived modern breeds of cattle is unknown. Genome-wide single nucleotide polymorphism (SNP) data captured using high-throughput assays provide a method to perform rapid genomic surveys and have recently been used to resolve the history of human populations (9, 10). However, these studies were restricted to a single species, and the remarkable power of these analyses (with >500,000 informative sites) was not fully captured because population relationships depicted using neighbor-joining trees fail to identify multiple ancestral relationships for historically admixed populations. We report an inter-generic, large-scale phylogenomic analysis which applied a genome-wide SNP assay developed for one species to many distantly related species. We also report the application of a genome-wide SNP assay to capture data for ancient DNA samples.

Results

Genotype Fidelity.

We have genotyped 16,353 animals representing 61 cattle breeds and 70 species, as divergent from Bos taurus as the Savannah elephant (Table S1), with the Illumina BovineSNP50 BeadChip (11, 12) according to Illumina protocols (13). To examine the quality of genotype calls in these outgroup species, we first sequenced the SNP site and flanking regions for rs17871403 in 14 species, with pronghorn the most divergent of the sequenced species (Table S2). This SNP was chosen because it has been well characterized in cattle and is a member of a SNP panel that is widely used for parentage analysis (14). Of the genotypes produced by the BovineSNP50 assay (Illumina) for this SNP in these species, 99.13% were concordant with the sequence when we allowed for genotype ambiguity (i.e., WW and SS) (see Methods). One of the six genotyped North American mountain goats and one of the eight genotyped caribou had discordant BovineSNP50 and sequence-based genotype calls (Table S2). This analysis of a single SNP across multiple species suggests a genotyping error rate for BovineSNP50 loci of only 0.87%.

We next aligned all 40,843 SNP probe sequences, which are 50 bases in length, to the International Sheep Genomics Consortium (www.sheephapmap.org) genome assembly (available at https://isgcdata.agresearch.co.nz/ and in an annotated form at http://www.livestockgenomics.csiro.au/sheep/oar1.0.php) and found that only 26,098 (63.9%) could be uniquely aligned, primarily due to the incomplete status of the assembly. Of these SNP, 829 had an unknown base (N) identified at the position of the SNP, and for the remaining 25,269 SNP, there were 308,518 genotypes called in 17 sheep. Genotype calls were in agreement with the genotype predicted from the respective sequence base for 298,311 genotypes (96.7%). There were 1,834 heterozygous genotypes and 8,373 genotypes that were homozygous for an allele not predicted by the sequence assembly. This suggests a BovineSNP50 genotyping error rate of between 2.7 and 3.3% in the outgroup species.
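The concordance and error-rate figures above can be reproduced from the reported counts. The short Python check below is only a sketch: the variable names are ours, and the reading of the 2.7 to 3.3% range (lower bound counting only alternate homozygotes as errors, upper bound also counting heterozygotes) is an assumption about how the range was derived, not something the text states explicitly.

```python
# Counts reported in the text for the sheep comparison.
total_genotypes = 308_518   # genotypes called in 17 sheep at 25,269 SNP
concordant = 298_311        # calls matching the base in the sheep assembly
heterozygous = 1_834        # heterozygous calls
alt_homozygous = 8_373      # calls homozygous for the allele not in the assembly


def percent(count, total):
    """Return count / total expressed as a percentage."""
    return 100.0 * count / total


discordant = heterozygous + alt_homozygous
# The three classes should account for every called genotype.
assert concordant + discordant == total_genotypes

concordance_pct = percent(concordant, total_genotypes)
# Assumed lower bound: heterozygotes may be real polymorphisms in sheep,
# so only alternate homozygotes are counted as errors.
lower_error_pct = percent(alt_homozygous, total_genotypes)
# Assumed upper bound: every call that disagrees with the assembly is an error.
upper_error_pct = percent(discordant, total_genotypes)

print(f"concordance: {concordance_pct:.1f}%")
print(f"error rate: {lower_error_pct:.1f}% to {upper_error_pct:.1f}%")
```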

Finally, when minor allele frequencies (MAF) averaged over 40,843 SNPs were plotted against average genotype call rates, samples from outgroup species with the lowest call rates had higher than expected MAF (Fig. S1). This appears to be indicative of DNA quality issues since, for example, DNA for the Capra ibex samples was extracted from irradiated blood samples that had been stored under refrigeration for several years. On removing these samples, there was almost no correlation between MAF and call rate (Fig. S1). Thus, as genetic distance from cattle increases and call rate decreases, spurious heterozygote and alternate homozygote genotype calls rarely arise, which supports the quality of these data.

Resolution of the Pecoran Phylogeny.

Using genotypes for 40,843 SNPs scored with the BovineSNP50 BeadChip (see Methods), we produced a completely bifurcating tree with highly supported nodes for 61 Pecoran species, including species that diverged up to 29 million years ago (Fig. 1) (15). There were 39,695 parsimony-informative characters using all 678 animals and, remarkably, 21,019 with cattle excluded. Within the Bovidae, only nine nodes had support <100%. We propose 17 relationships and increase the support for 16 previously proposed nodes within the infraorder, when compared to the supermatrix phylogeny of Marcot (3). A striking observation from the phylogeny is that taxonomic classifications of families and subfamilies mirror the topology of the cladogram, since higher taxa form monophyletic groups. This is an improvement over earlier phylogenies, as previously questionable groupings are now shown to be monophyletic.

Fig. 1.

Strict consensus cladogram (no branch lengths) of 17 most parsimonious trees based on 40,843 SNP genotypes. *, Denotes paraphyletic group.

Ancient DNA Samples.

Currently, PCR-based and non-PCR-based multiple strand displacement amplification (MDA) approaches are used to perform whole genome amplification (16, 17). MDA requires high-quality DNA over 2 kb in length and was found to be inefficient for the ancient bison DNA. Consequently, we used a universal linker-based PCR amplification performed with the GenomePlex Whole Genome Amplification kit (Sigma-Aldrich) to amplify the minute amounts of damaged DNA preserved in bone samples from two ancient Russian Bison priscus specimens and test whether the Illumina iSelect platform could be used to analyze samples derived from extinct species. The first, sample BS662, was collected from permafrost deposits at Alyoshkina Zaimka, Siberia, and is approximately 20,000 years old (18). The second, ACAD012, was collected from Sur'ya 5 cave in the Ural Mountains and has been accelerator mass spectrometry radiocarbon dated to 34,460 ± 290 years BP. Due to the low amounts of DNA from the ancient specimens and the short DNA fragment lengths produced in the whole genome amplification of degraded ancient samples, the genotype call rates for these samples were much lower than for modern bison (Table S1). However, when these ancient samples were included in the Bovini phylogeny (Fig. 1), BS662 was basal to the modern Bison bison clade as expected, but ACAD012 fell within the modern Hereford cattle clade. When we sequenced several overlapping fragments that had been individually amplified from the hypervariable mitochondrial control region of sample ACAD012, we identified variability within the overlapping regions. This is consistent with the sample having been contaminated with modern DNA or being extremely degraded, as also suggested by our genotype data, and consequently the sample was removed from the study. A replicate whole genome amplification (library identification KCMU02) was produced from the B. priscus sample used to generate BS662, and when this sample was included in the data set, it was sister to BS662, and both remained sister to modern bison within the phylogeny. In preparing this library, however, we omitted the initial DNA fragmentation step of the amplification protocol, which appeared to greatly improve the quality and quantity of the genotypes produced, as KCMU02 produced a higher genotype call rate (54.9 vs. 45.8%) and far lower heterozygosity (11.5 vs. 39.6%) than did BS662 (Table S3). While only 76.1% of the 12,279 genotypes that were called in both samples were identical, 99.7% of the homozygous genotypes, the only genotype class that has the potential to be phylogenetically informative (see Methods), were identical between the replicates.
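The replicate comparison described above amounts to two ratios: agreement over all genotypes called in both libraries, and agreement restricted to homozygous calls. The Python sketch below illustrates one way to compute them. The function name, the "--" missing-call symbol, the toy data, and the choice to restrict the second ratio to SNP that are homozygous in both replicates are all assumptions made for illustration; the text does not specify these details.

```python
def replicate_concordance(calls_a, calls_b, missing="--"):
    """Compare genotype calls (same SNP order) from two replicate libraries.

    Returns (n_both_called, frac_identical,
             n_both_homozygous, frac_homozygous_identical).
    """
    # Keep only SNP that were called in both replicates.
    both = [(a, b) for a, b in zip(calls_a, calls_b)
            if a != missing and b != missing]
    identical = sum(a == b for a, b in both)

    # Assumption: "homozygous genotypes" means homozygous in both replicates.
    homozygous = ("AA", "BB")
    hom = [(a, b) for a, b in both if a in homozygous and b in homozygous]
    hom_identical = sum(a == b for a, b in hom)

    frac = identical / len(both) if both else float("nan")
    hom_frac = hom_identical / len(hom) if hom else float("nan")
    return len(both), frac, len(hom), hom_frac


# Toy data only; these are not the BS662 or KCMU02 genotypes.
replicate_1 = ["AA", "AB", "BB", "--", "AB", "AA"]
replicate_2 = ["AA", "AA", "BB", "BB", "--", "BB"]
summary = replicate_concordance(replicate_1, replicate_2)
print(summary)
```

Heterozygous calls inflate the overall disagreement without affecting the second ratio, which is why the two figures reported in the text can differ so widely.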

Relationships Among Cattle Breeds.

Phylogenetic relationships were also inferred for 48 cattle breeds (n = 372 animals) (Table S1) using parsimony, with most nodes being highly supported (bootstrap values >70%). To accommodate heterozygotes, data were first coded with heterozygotes as polymorphic (noninformative) and then as an independent character state (see Methods). When coded as polymorphic, the topology of the cladogram corresponded to the known geographic origins of breeds (Fig. 2A). Interestingly, however, when heterozygotes were coded as distinct characters, the topology changed and no longer clearly reflected the biogeography of breed origins (Fig. 2B).

Fig. 2.

Consensus of most parsimonious cladograms of 48 cattle breeds. (A) Most parsimonious cladogram of 48 cattle breeds with heterozygotes coded as polymorphic. Geographic origins were retrieved from the literature (21). (B) Most parsimonious cladogram of 48 cattle breeds with heterozygotes coded as a third and separate character state. Values at nodes are percent bootstrap support from 1,000 pseudoreplicates. Dotted lines connect clades of a breed between the two cladograms. B. t. indicus is represented by the Gir, Sahiwal, Nelore, and Guzerat breeds, with all other breeds being B. t. taurus (Table S1). *, Denotes paraphyletic group.

To further resolve the issue of breed origins, we constructed phylogenetic networks which can reveal conflicting signals in the data (Fig. 3 and Fig. S2). In Fig. S2, Bos taurus indicus and Bos taurus taurus are distinct groups with long edges between the subspecies. Within B. t. taurus, using the Reynolds et al. (19) distance metric and parsimony cladograms (Fig. 2), African taurine cattle were inferred to be more divergent from European cattle than are the Asian B. t. taurus breeds, with 100% bootstrap support in cladograms (Fig. 2 and Figs. S2 and S3). Because SNP were almost exclusively discovered from European B. t. taurus samples (12), there is a strong ascertainment bias toward SNP common within European B. t. taurus on the BovineSNP50 BeadChip, leading to severe biases in estimates of genetic distance that have prevented us from accurately dating the nodes separating European, African, and Asian cattle (Figs. S3 and S4). Furthermore, the data were recalcitrant to correction for ascertainment (see Methods). The network with individuals at node tips (Fig. 3) appears to accurately depict the admixed nature of many populations, for example, the relationship of Belgian Blue to Holsteins and Shorthorns, and Jersey to Iberian and British breeds. The network also reveals pedigree relationships, with sire HO020740 being an interior node to son HO020879.
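For readers unfamiliar with the Reynolds et al. distance, the sketch below computes it between two populations from allele frequencies at biallelic SNP. It uses the simplified form commonly given for this distance (the sum over loci and alleles of squared frequency differences, divided by twice the sum over loci of one minus the product of matching allele frequencies). Whether the authors used exactly this estimator, or a variant with sample-size corrections, is not stated in the text, so treat the function as an illustration with hypothetical names rather than their method.

```python
def reynolds_distance(freqs_x, freqs_y):
    """Simplified Reynolds-type distance between two populations.

    freqs_x, freqs_y: frequencies of one allele (say, B) at each biallelic
    SNP in populations X and Y, in the same SNP order.
    """
    numerator = 0.0
    denominator = 0.0
    for p, q in zip(freqs_x, freqs_y):
        # Squared differences summed over both alleles:
        # (p - q)^2 + ((1 - p) - (1 - q))^2 = 2 * (p - q)^2
        numerator += 2.0 * (p - q) ** 2
        # One minus the sum over alleles of the product of frequencies.
        denominator += 1.0 - (p * q + (1.0 - p) * (1.0 - q))
    if denominator == 0.0:
        # Both populations are fixed for the same allele at every locus.
        return 0.0
    return numerator / (2.0 * denominator)


# Toy frequencies: fixed difference at the first SNP, identical at the second.
d = reynolds_distance([1.0, 0.5], [0.0, 0.5])
print(round(d, 4))
```

Because the distance depends on which SNP are polymorphic in each population, a panel ascertained mainly in European taurine cattle will distort it, which is the bias discussed above.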

Fig. 3.

Phylogenetic network depicting common ancestry for 372 animals representing 48 cattle breeds.

Discussion

The genotype validation results suggest that BovineSNP50 genotype errors are uncommon, are randomly distributed, and are independent of call rate in the outgroup species. While Ovis aries and B. taurus are not the most distantly related species surveyed in this study (Fig. 1), their most recent common ancestor was at the base of the Bovidae clade. The use of O. aries as a representative for the other species is supported by its 67.2% genotype call rate (Table S1), which was similar to (±7%), or lower than, that for all species and breeds, with the exceptions of Axis deer, Ibex, and Pronghorns, which had call rates <60%.

Despite large amounts of missing data within outgroup species or for the ancient DNA samples, by constructing a larger initial data matrix, which includes more taxa and data than used in previous analyses (20–23), we have produced a highly resolved phylogeny for a rapidly radiated infraorder, which includes extant and extinct species and in which relationships between and within families have been unresolved. Common ancestry can confound studies of speciation and the evolutionary origins and importance of particular traits; the highly resolved phylogeny presented here can control for this issue by allowing the use of phylogenetically independent contrasts (24). Further, it facilitates informed conservation efforts, as both ancestral relationships and diversity are clearly defined (25), allowing the identification of species and populations within species to target for preservation. With small data sets, the estimated bootstrap support values can be biased due to the presence of a strong correlation between the samples. Large data sets, such as reported here, accurately estimate the support for internal nodes, since nearly independent pseudosamples can be generated for the construction of bootstrap trees.
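Bootstrap pseudosamples of the kind mentioned above are built by resampling the characters (here, SNP columns) with replacement and rebuilding a tree from each resampled matrix. The Python sketch below shows only the resampling step; the function name and the toy matrix are illustrative assumptions, and the tree search itself (done in TNT by the authors) is not reproduced.

```python
import random


def bootstrap_columns(matrix, rng):
    """Return one bootstrap pseudoreplicate of a taxa-by-character matrix.

    matrix: list of rows (one per taxon), each a list of character states.
    rng: a random.Random instance, passed in so runs can be reproduced.
    """
    n_chars = len(matrix[0])
    # Draw as many column indices as there are characters, with replacement.
    columns = [rng.randrange(n_chars) for _ in range(n_chars)]
    # Apply the same column draw to every taxon so characters stay aligned.
    return [[row[c] for c in columns] for row in matrix]


# Toy matrix: 3 taxa by 4 characters ("?" marks a missing genotype).
toy = [
    ["0", "1", "1", "?"],
    ["0", "0", "1", "1"],
    ["1", "0", "?", "1"],
]
pseudoreplicate = bootstrap_columns(toy, random.Random(42))
for row in pseudoreplicate:
    print("".join(row))
```

With tens of thousands of characters, two pseudoreplicates share a smaller fraction of identical column draws than they would with a few hundred characters, which is the intuition behind the statement about nearly independent pseudosamples.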

We demonstrate that reliable genotypes can be produced from ancient DNA samples, but that more work is needed to optimize amplification and genotyping protocols. We suspect that the much higher than expected heterozygosities for these samples are due either to template damage or the nonspecific binding of small, possibly exogenous, DNA fragments to the SNP probes. Despite challenges in library optimization, we placed replicate B. priscus samples as sister to modern bison with strong support and have therefore established the feasibility of high-throughput genotyping of ancient samples. Our results also suggest that the fidelity of the produced genotypes may be assessed by their incorporation into a well-resolved phylogeny and that samples producing unreliable genotypes may be identified and removed from further analysis by this process.

Incongruence between the two breed phylogenies occurred as a result of persistent signatures of admixture, which has been well documented in the histories of several breeds. Thus, the conflicting breed phylogenies oversimplify the complex relationships that exist among populations due to geographic isolation, introgression, migration, and admixture. Networks were effective in revealing both geographic isolation and admixture. There were long branches between B. t. taurus and B. t. indicus, indicating divergence long before domestication. The networks are also consistent with the biogeography of breeds, with European, East Asian, and African taurine cattle forming separate clusters reflecting a predomestication or early postdomestication divergence for these lineages. The West African B. t. taurus N′Dama breed diverges from edges shared with B. t. indicus in Fig. 3, and admixture proportions of 0.2–8.6% with African B. t. indicus have previously been estimated for N′Dama populations (26). Fig. 3 also reveals the biogeographical history of European cattle, which is based upon migrations out of the Fertile Crescent, with domesticated cattle moved sequentially through Turkey, the Balkans, and Italy (27), then radiating through Central Europe and France, and finally into the British Isles (Figs. 2 and 3 and Figs. S2 and S3). These data also support a second route to the Iberian peninsula by sea from Africa or the Fertile Crescent leading to subsequent admixture with European cattle (4), as the Spanish breeds found in the New World are basal to German and French breeds (Figs. 2 and 3). This pattern of geographic dispersal is interrupted only in a few cases in which breed histories document admixture, such as the Belgian Blue, which was formed between 1840 and 1890 by the crossing of local cattle with Friesian and Shorthorn imported from the Netherlands and England, respectively (28) (Fig. 3). Fig. 3 reveals numerous breed relationships, such as the relationship of the Jersey to both Iberian and British breeds (28), indicating that many exportations and crossbreeding experiments were performed by early pastoralists. Importantly, this figure reveals that the history of breed formation in cattle has been complicated and has involved bottlenecks, evolution in isolation, coancestry, migration, and admixture.

In all analyses, African cattle were the earliest diverged taurine cattle. Consequently, our results now confine the domestication debate to two distinct hypotheses: (i) major domestication events in the Fertile Crescent and Indus Valley (7) were followed by minor captures of aurochs in Africa, East Asia, and Europe (4, 6), or (ii) three separate domestication events occurred in the Fertile Crescent, Indus Valley, and Africa, with a fourth independent domestication in East Asia less likely (5, 8).

The largest previous supermatrix analysis of artiodactyls included 3,823 parsimony-informative characters and required several years of data collection (3). We produced 21,019 parsimony-informative characters at a rate of 1,152 samples in 6 days for $100 per sample. Where high-density SNP assays are available for sister species, our approach could affordably be applied to the analysis of other orders and families. Such rapid and inexpensive data generation will transform studies of evolution and domestication through the creation of highly resolved phylogenies, including both extant and extinct species. Genome-wide SNP genotyping assays developed for one species can be used for rapid phylogenomic analysis across a broad taxonomic range and are powerful tools for population and evolutionary studies.

Methods

Whole Genome Amplification of Ancient DNA.

Ancient DNA was extracted from fossil bison bone specimens using the standard phenol/chloroform/Amicon Ultra-4 method (17). DNA extractions, omniplex library preparations, and PCRs were set up and performed in a geographically isolated, dedicated ancient DNA facility at the University of Adelaide, Australia. To generate a library of genomic fragments from limited ancient DNA extract, DNA was amplified using the PCR-based GenomePlex Whole Genome Amplification kit (WGA2; Sigma-Aldrich) according to the following protocol: 10 μL DNA were thoroughly mixed with 2 μL library preparation buffer and 1 μL library stabilization solution, and denatured at 95 °C for 2 min. After denaturation, 1 μL library preparation enzyme was added to generate omniplex libraries, followed by a series of incubations at 16 °C for 20 min, 24 °C for 20 min, 37 °C for 20 min, and 75 °C for 5 min in a thermal cycler (Corbett Life Science). The omniplex libraries were next amplified using a limited number of genomic amplification cycles. PCR amplification was conducted in a 75-μL reaction volume containing 14 μL omniplex library, 7.5 μL amplification master mix, 48.5 μL nuclease-free water, and 5 μL WGA DNA polymerase. The PCR amplification conditions were initial denaturation at 95 °C for 3 min, followed by 15 cycles of 94 °C for 15 s and 65 °C for 5 min. GenomePlex-amplified ancient DNA products were finally purified using the GenElute PCR Clean-Up kit (Sigma-Aldrich). Ancient DNA libraries were verified by PCR amplification and sequencing of the hypervariable mtDNA control region before analysis with the BovineSNP50 BeadChip (Illumina). A second amplification, labeled KCMU02, of the sample that produced BS662 was constructed using the same protocol as above, except that the genomic fragmentation step within the WGA2 protocol was omitted.

Sample Selection.

Table S1 shows the numbers of animals genotyped from each species or cattle breed. In taxa or breeds where <10 animals were genotyped, all animals were sampled. If >10 animals were genotyped, animals with the highest genotype call rates and earliest birth dates were selected. When pedigree information was available, closely related animals were avoided, except in Angus and Holstein where 10 old animals (born in the 1950s, 1960s, and 1970s) and 10 recently born animals (born in the late 1990s and 2000s) were selected. When more than 50 animals within a breed had call rates of at least 98% and no pedigree information was available, 10 animals were sampled at random. Samples belonging to recently formed crossbred breeds were removed from the analysis, as these samples distort parsimony phylogenies. Genotypes for the two ancient Bison samples were included despite their much lower genotype call rates, which were expected due to DNA degradation and fragmentation, and the use of whole genome amplification, which affect the fidelity of the Infinium assay. The provenance of all samples included in the analyses is provided in Table S4.

SNP Selection.

The BovineSNP50 BeadChip (Illumina) consists of SNP primarily discovered by the sequencing of reduced representation libraries (11), the alignment of random shotgun reads from six cattle breeds to the Hereford assembly, or from the draft assembly of the bovine genome (12). To improve genotype quality for B. t. indicus and the outgroup species, we manually adjusted genotype call clusters in Illumina BeadStudio. Where pedigree information was available, such as in O. aries and B. bison, the rate of misinheritances was minimized. A set of 40,843 SNP was selected from the 54,693 loci queried by the assay. Loci selected for analysis were all located on autosomes, had a call rate of at least 80% in 36 (75%) B. t. taurus breeds, and were not monomorphic in all breeds. This strategy was effective in selecting informative SNP with few genotype errors (Table S5). Data are available at http://animalsciences.missouri.edu/animalgenomics/publications/php.
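The three selection criteria above can be expressed as a simple filter. The Python sketch below is an illustration under stated assumptions: the `SnpSummary` record and its fields are hypothetical, "call rate of at least 80% in 36 breeds" is read as at least 36 breeds meeting the threshold, and "not monomorphic in all breeds" is read as polymorphic within at least one breed. The only outside fact used is that cattle have 29 pairs of autosomes.

```python
from dataclasses import dataclass

# Cattle autosomes are numbered 1 to 29; X, Y, and mitochondrial SNP are excluded.
AUTOSOMES = {str(i) for i in range(1, 30)}


@dataclass
class SnpSummary:
    """Hypothetical per-SNP summary across B. t. taurus breeds."""
    name: str
    chromosome: str
    breed_call_rates: dict    # breed name -> call rate in [0, 1]
    breed_polymorphic: dict   # breed name -> True if both alleles were observed


def passes_filters(snp, min_call_rate=0.80, min_breeds=36):
    """Apply the three selection criteria described in the text."""
    if snp.chromosome not in AUTOSOMES:
        return False
    # Count breeds whose call rate reaches the threshold.
    n_ok = sum(rate >= min_call_rate for rate in snp.breed_call_rates.values())
    if n_ok < min_breeds:
        return False
    # Assumed reading: keep the SNP if at least one breed is polymorphic.
    return any(snp.breed_polymorphic.values())


# Toy example with 48 placeholder breeds.
breeds = [f"breed_{i}" for i in range(48)]
example = SnpSummary(
    name="snp_example",
    chromosome="5",
    breed_call_rates={b: 0.95 for b in breeds},
    breed_polymorphic={b: (b == "breed_0") for b in breeds},
)
print(passes_filters(example))
```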

Genotype Calls in Outgroup Species.

Almost 96% of the beads on the BovineSNP50 BeadChip query Infinium II SNP, in which adenine and thymine share a fluorescent probe and guanine and cytosine share a different fluorescent probe. For samples in which all four bases are present at a single locus, AA, AT, and TT genotypes produce indistinguishable fluorescence intensities, as do GG, GC, and CC. Thus, A/T or C/G SNP discovered in B. t. taurus were limited in the assay design (1.8 and 2.2%, respectively, and use Infinium I chemistry). However, in species diverged from B. t. taurus where all four bases could be present, genotypes are WW (W is the IUPAC code for A or T bases) for one homozygote class, SS (S is the IUPAC code for G or C bases) for the alternate homozygote, and NN (ambiguous) for the heterozygote class. This ambiguity was evident when sequences and genotypes for outgroup species were compared (Table S2). The WW and SS genotypes were identified in BeadStudio as AA and BB genotype calls.
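The collapse of true genotypes into the WW, SS, and NN classes can be illustrated with a short Python sketch. The names `CHANNEL` and `observed_class` are made up for this example, and the code models only the two-dye logic described above, not the BeadStudio clustering itself.

```python
# Bases that share a fluorescent probe report in the same channel.
CHANNEL = {"A": "W", "T": "W", "C": "S", "G": "S"}


def observed_class(genotype):
    """Collapse a true two-base genotype (e.g. "AT") to its two-dye class."""
    first, second = (CHANNEL[base] for base in genotype.upper())
    if first == second:
        # Both alleles light the same channel: scored as a homozygote.
        return first * 2
    # One allele in each channel: the ambiguous heterozygote class.
    return "NN"
```

For example, `observed_class("AA")`, `observed_class("AT")`, and `observed_class("TT")` all return `"WW"`, which is why an A/T polymorphism in an outgroup species cannot be resolved by this chemistry.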

Phylogenetic Analysis.

Most parsimonious trees were inferred from the genotypes using TNT version 1.1 (29). In the analyses involving the outgroup species, phylogenetic signal was obtained only from the homozygous genotypes, and AA homozygotes were coded as “0,” BB homozygotes were coded as “1,” heterozygotes were coded as a polymorphic character state (i.e., “[0,1]”), and missing genotypes were coded as “?.” However, in the analyses of the cattle breeds, an additional data set was created in which heterozygotes were identified by a unique character state (i.e., AA = 0, AB = 1, BB = 2). A heuristic search was conducted using the search technology in TNT, and the search level was initially set to 20. Specifically, we used the SPR-TBR algorithm followed by random sectorial searches, constrained sectorial searches, exclusive sectorial searches, and 10 rounds of tree-drifting. The complete search was replicated 20 times, with 10 rounds of tree fusing at the conclusion of these 20 replicates. A subset of the samples from the tribe Bovini was independently analyzed along with the ancient bison samples to validate the quality of the data generated from these ancient samples. A data set with 714 samples from all taxon groups was first used to construct the most parsimonious trees. After excluding samples with low quality DNA, low bootstrap support, and/or nonsensical placement in the cladogram (i.e., elephant and horse as sister to B. taurus), a final data set with 678 samples was used to construct most parsimonious trees. The cladogram was rooted with Antilocapra americana. Using these 678 samples, bootstrap support was calculated using 1,000 pseudoreplicates, and for expediency, the SPR-TBR heuristic search was used.
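The two genotype coding schemes can be sketched in Python as follows. The helper names are hypothetical, the polymorphic state is written exactly as it appears in the text ("[0,1]"), and the exact matrix syntax expected by TNT is not reproduced here.

```python
def encode_genotype(call, heterozygote_as_state=False):
    """Map a genotype call ("AA", "AB", "BB", or anything else) to a character.

    heterozygote_as_state=False: AA=0, BB=1, AB=polymorphic "[0,1]".
    heterozygote_as_state=True:  AA=0, AB=1, BB=2 (cattle-breed data set).
    Any other value is treated as a missing genotype and coded "?".
    """
    if heterozygote_as_state:
        table = {"AA": "0", "AB": "1", "BB": "2"}
    else:
        table = {"AA": "0", "AB": "[0,1]", "BB": "1"}
    return table.get(call, "?")


def encode_sample(calls, heterozygote_as_state=False):
    """Concatenate the coded characters for one sample's genotype calls."""
    return "".join(encode_genotype(c, heterozygote_as_state) for c in calls)
```

For instance, `encode_sample(["AA", "AB", "BB", "--"])` yields `"0[0,1]1?"`, and the same calls with `heterozygote_as_state=True` yield `"012?"`.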

Allele frequencies were estimated for 40,843 SNP in 22 breeds (Table S6), and these frequencies were used to estimate pairwise Reynolds distances (19) among the breeds (Fig. S3). Several attempts were made to correct estimates of genetic distance for SNP ascertainment bias. First, distances were calculated from haplotype frequencies. Haplotypes were inferred for the autosomes of all genotyped animals in our collection within each breed group (Table S6) using fastPhase (30). From these haplotyped samples, haplotypes were extracted for the study animals for 885 nonoverlapping loci, each comprising six SNP for which the intermarker distance was <50 kb for contiguous SNP. Haplotype frequencies were estimated for each of the 885 loci within each breed group and were used to estimate Reynolds distances between breeds. Next, we formed weighted distances by averaging individual SNP distances weighted according to the frequency of unascertained SNP (31) possessing the MAF observed in each of the two populations. Finally, we also subsampled approximately 3,000 or approximately 8,000 SNP such that the resulting MAF distribution conformed to the unascertained distribution of bovine SNP (31) in Angus or Holstein, respectively. The subsample size was determined by the severity of underrepresentation of SNP within the MAF range 0.005–0.015 and indicates that ascertainment bias was more severe for Angus than for Holstein. Reynolds and Nei genetic distances corrected for sample size (Table S6) were estimated for each subsample and were averaged across 1,000 bootstrap replicates. Distances were used to construct neighbor-joining and UPGMA trees with Phylip (32). None of the approaches taken to correct for ascertainment bias were able to establish a tree in which branch lengths were clock-like. Biases in the allele frequency spectrum differ within B. t. taurus breeds (Fig. S4) causing the distances between breeds to not be clock-like.
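As an illustration of the distance calculation, the sketch below computes a Reynolds-type distance from allele frequencies at biallelic SNP. It uses the commonly cited form of the estimator without a sample-size correction (the sum of squared allele-frequency differences over loci, divided by twice the sum over loci of one minus the cross-population allele-frequency product). Whether this matches the exact estimator used for the results above is an assumption, and the function name is hypothetical.

```python
def reynolds_distance(freqs_a, freqs_b):
    """Reynolds-type distance between two populations (no sample-size correction).

    freqs_a, freqs_b: frequency of the same reference allele at each biallelic
    SNP in populations A and B, in matching order.
    """
    numerator = 0.0
    denominator = 0.0
    for p, q in zip(freqs_a, freqs_b):
        # Squared differences summed over both alleles at the locus.
        numerator += (p - q) ** 2 + ((1 - p) - (1 - q)) ** 2
        # One minus the sum over alleles of the frequency products.
        denominator += 1 - (p * q + (1 - p) * (1 - q))
    if denominator == 0:
        raise ValueError("populations are fixed for the same allele at every SNP")
    return numerator / (2 * denominator)
```

Two populations fixed for alternate alleles at every locus have a distance of 1, and populations with identical, non-fixed frequencies have a distance of 0.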

Figures of phylogenies and cladograms were produced in MrEnt3 (33), and phylogenetic networks were constructed using SplitsTree version 4.10 (34). Distances based upon allele frequencies at 40,843 SNP were used to construct a network of 22 breeds. Due to memory limitations in SplitsTree, genotypes at 14,023 SNP were used to construct a network of 372 individuals belonging to 48 breeds. Default settings in SplitsTree were used to construct the networks.

Acknowledgments

This project was supported by National Research Initiative (NRI) grant nos. 2006–35616-16697, 2008–35205-18864, and 2008–35205-04687 from the U.S. Department of Agriculture Cooperative State Research, Education, and Extension Service (CSREES), 13321 from the Missouri Life Science Research Board and DP0773602 from the Australian Research Council. J.J.K. and K.W.K. were supported by the Technology Development Program for Agriculture and Forestry, Ministry of Agriculture, Forestry and Fisheries, Republic of Korea. We acknowledge the contribution of DNA samples from UK breed societies and cattle breeders as well as the Rare Breeds Survival Trust. We appreciate the critical review and useful comments of Alejandro Rooney. We thank Oliva Handt for help constructing ancient bison libraries. Technical assistance was provided by David Morrice and Karen Troup (Ark Genomics, The Roslin Institute, Edinburgh, UK). We gratefully acknowledge access to Bovine HapMap Project genotypes.

Footnotes

  • 1To whom correspondence should be addressed. E-mail: taylorjerr{at}missouri.edu
  • Author contributions: J.E.D. designed research; J.E.D., S.D.M., K.C., A.C., M.C.M., M.M.R., J.W.K., R.D.S., and J.F.T. performed research; M.P.H., K.C., A.C., J.V., C.M.S., A.R.C., G.S.J., R.A.B., O.H., L.S.E., P.W., J.-J.K., K.S.K., T.S.S., C.P.V.T., H.L.N., L.L.C., M.E.B., and G.A.W. contributed new reagents/analytic tools; J.E.D., J.C.P., G.C.C., J.C.M., R.B., R.D.S., and J.F.T. analyzed data; and J.E.D. and J.F.T. wrote the paper.

  • The authors declare no conflict of interest.

  • This article is a PNAS Direct Submission.

  • This article contains supporting information online at www.pnas.org/cgi/content/full/0904691106/DCSupplemental.

  • Freely available online through the PNAS open access option.

References

  1. Foss SE, Prothero DR (2007) Introduction to The Evolution of Artiodactyls, eds Foss SE, Prothero DR (Johns Hopkins University Press, Baltimore, MD), pp 1–3.
  2. Gatesy J, Yelon D, DeSalle R, Vrba ES (1992) Phylogeny of the Bovidae (Artiodactyla, Mammalia), based on mitochondrial ribosomal DNA sequences. Mol Biol Evol 9:433–446.
  3. Marcot JD (2007) Molecular phylogeny of terrestrial artiodactyls: Conflicts and resolution. The Evolution of Artiodactyls, eds Foss SE, Prothero DR (Johns Hopkins University Press, Baltimore, MD), pp 4–18.
  4. Beja-Pereira A, et al. (2006) The origin of European cattle: Evidence from modern and ancient DNA. Proc Natl Acad Sci USA 103:8113–8118.
  5. Bradley DG, MacHugh DE, Cunningham P, Loftus RT (1996) Mitochondrial diversity and the origins of African and European cattle. Proc Natl Acad Sci USA 93:5131–5135.
  6. Gotherstrom A, et al. (2005) Cattle domestication in the Near East was followed by hybridization with aurochs bulls in Europe. Proc Biol Sci 272:2345–2350.
  7. Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P (1994) Evidence for two independent domestications of cattle. Proc Natl Acad Sci USA 91:2757–2761.
  8. Mannen H, et al. (2004) Independent mitochondrial origin and historical genetic differentiation in North Eastern Asian cattle. Mol Phylogenet Evol 32:539–544.
  9. Li JZ, et al. (2008) Worldwide human relationships inferred from genome-wide patterns of variation. Science 319:1100–1104.
  10. Jakobsson M, et al. (2008) Genotype, haplotype and copy-number variation in worldwide human populations. Nature 451:998–1003.
  11. Van Tassell CP, et al. (2008) SNP discovery and allele frequency estimation by deep sequencing of reduced representation libraries. Nat Methods 5:247–252.
  12. Matukumalli LK, et al. (2009) Development and characterization of a high density SNP genotyping assay for cattle. PloS ONE 4:e5350.
  13. Steemers FJ, et al. (2006) Whole-genome genotyping with the single-base extension assay. Nat Methods 3:31–33.
  14. Heaton MP, et al. (2002) Selection and use of SNP markers for animal identification and paternity analysis in U.S. beef cattle. Mamm Genome 13:272–281.
  15. Hassanin A, Douzery EJ (2003) Molecular and morphological phylogenies of Ruminantia and the alternative position of the Moschidae. Syst Biol 52:206–228.
  16. Dean FB, et al. (2002) Comprehensive human genome amplification using multiple displacement amplification. Proc Natl Acad Sci USA 99:5261–5266.
  17. Iwamoto K, et al. (2007) Evaluation of whole genome amplification methods using postmortem brain samples. J Neurosci Methods 165:104–110.
  18. Shapiro B, et al. (2004) Rise and fall of the Beringian steppe bison. Science 306:1561–1565.
  19. Reynolds J, Weir BS, Cockerham CC (1983) Estimation of the coancestry coefficient: Basis for a short-term genetic distance. Genetics 105:767–779.
  20. Rokas A, Carroll SB (2005) More genes or more taxa? The relative contribution of gene number and taxon number to phylogenetic accuracy. Mol Biol Evol 22:1337–1344.
  21. Wiens JJ (1998) Does adding characters with missing data increase or decrease phylogenetic accuracy? Syst Biol 47:625–640.
  22. Wiens JJ (2003) Missing data, incomplete taxa, and phylogenetic accuracy. Syst Biol 52:528–538.
  23. Heath TA, Zwickl DJ, Kim J, Hillis DM (2008) Taxon sampling affects inferences of macroevolutionary processes from phylogenetic trees. Syst Biol 57:160–166.
  24. Felsenstein J (1985) Phylogenies and the comparative method. Am Nat 125:1–15.
  25. Moritz C (1995) Uses of molecular phylogenies for conservation. Phil Trans R Soc Lond 349:113–118.
  26. MacHugh DE, Shriver MD, Loftus RT, Cunningham P, Bradley DG (1997) Microsatellite DNA variation and the evolution, domestication and phylogeography of taurine and zebu cattle (Bos taurus and Bos indicus). Genetics 146:1071–1086.
  27. Pellecchia M, et al. (2007) The mystery of Etruscan origins: Novel clues from Bos taurus mitochondrial DNA. Proc Biol Sci 274:1175–1179.
  28. Porter V (1991) Cattle: A Handbook to the Breeds of the World (Christopher Helm Publishers Ltd, London, UK).
  29. Goloboff PA, Farris JS, Nixon KC (2008) TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.
  30. Scheet P, Stephens M (2006) A fast and flexible statistical model for large-scale population genotype data: Applications to inferring missing genotypes and haplotypic phase. Am J Hum Genet 78:629–644.
  31. The Bovine HapMap Consortium (2009) Genome wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324:528–532.
  32. Felsenstein J (1989) PHYLIP—phylogeny inference package (version 3.2). Cladistics 5:164–166.
  33. Zuccon A, Zuccon D (2008) MrEnt v. 3. Program distributed by the authors. Available at http://www.mrent.org/frame1.htm.
  34. Huson DH, Bryant D (2006) Application of phylogenetic networks in evolutionary studies. Mol Biol Evol 23:254–267.
Resolving the evolution of extant and extinct ruminants with high-throughput phylogenomics
Jared E. Decker, J. Chris Pires, Gavin C. Conant, Stephanie D. McKay, Michael P. Heaton, Kefei Chen, Alan Cooper, Johanna Vilkki, Christopher M. Seabury, Alexandre R. Caetano, Gary S. Johnson, Rick A. Brenneman, Olivier Hanotte, Lori S. Eggert, Pamela Wiener, Jong-Joo Kim, Kwan Suk Kim, Tad S. Sonstegard, Curt P. Van Tassell, Holly L. Neibergs, John C. McEwan, Rudiger Brauning, Luiz L. Coutinho, Masroor E. Babar, Gregory A. Wilson, Matthew C. McClure, Megan M. Rolf, JaeWoo Kim, Robert D. Schnabel, Jeremy F. Taylor
Proceedings of the National Academy of Sciences Nov 2009, 106 (44) 18644-18649; DOI: 10.1073/pnas.0904691106

Article Classifications

  • Biological Sciences
  • Evolution

Copyright © 2021 National Academy of Sciences. Online ISSN 1091-6490