Comparative sequencing provides insights about the structure and conservation of marsupial and monotreme genomes
- Elliott H. Margulies*,
- NISC Comparative Sequencing Program*,†,‡,
- Valerie V. B. Maduro*,
- Pamela J. Thomas†,
- Jeffery P. Tomkins§,
- Chris T. Amemiya¶,
- Meizhong Luo∥, and
- Eric D. Green*,†,**
- *Genome Technology Branch and †NISC, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892; §Clemson University Genomics Institute, Department of Genetics and Biochemistry and Life Science Studies, Clemson University, Clemson, SC 29634; ¶Benaroya Research Institute at Virginia Mason, Seattle, WA 98101; and ∥Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721
Communicated by Francis S. Collins, National Institutes of Health, Bethesda, MD, November 18, 2004 (received for review August 30, 2004)
Abstract
Sequencing and comparative analyses of genomes from multiple vertebrates are providing insights about the genetic basis for biological diversity. To date, these efforts have largely focused on eutherian mammals, chicken, and fish. In this article, we describe the generation and study of genomic sequences from noneutherian mammals, a group of species occupying unusual phylogenetic positions. A large sequence data set (totaling >5 Mb) was generated for the same orthologous region in three marsupial (North American opossum, South American opossum, and Australian tammar wallaby) and one monotreme (platypus) genomes. These ancient mammalian genomes are characterized by unusual architectural features with respect to G + C and repeat content, as well as compression relative to human. Approximately 14% and 34% of the human sequence forms alignments with the orthologous sequence from platypus and the marsupials, respectively; these numbers are distinctly lower than those observed with nonprimate eutherian mammals (45–70%). The alignable sequences between human and each marsupial species are not completely overlapping (only 80% common to all three species), nor are the platypus-alignable sequences completely contained within the marsupial-alignable sequences. Phylogenetic analysis of synonymous coding positions reveals that platypus has a notably long branch length, with the human–platypus substitution rate being on average 55% greater than that seen with human–marsupial pairs. Finally, analyses of the major mammalian lineages reveal distinct patterns with respect to the common presence of evolutionarily conserved vertebrate sequences. Our results confirm that genomic sequence from noneutherian mammals can contribute uniquely to unraveling the functional and evolutionary histories of the mammalian genome.
Footnotes
- ** To whom correspondence should be addressed at: National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Building 50, Room 5222, Bethesda, MD 20892. E-mail: egreen@nhgri.nih.gov.
- ‡ National Institutes of Health Intramural Sequencing Center (NISC) Comparative Sequencing Program: Leadership provided by Robert W. Blakesley, Gerard G. Bouffard, Nancy F. Hansen, Baishali Maskeri, and Jennifer C. McDowell.
- Author contributions: E.H.M. and E.D.G. designed research; E.H.M., N.C.S.P., V.V.B.M., and P.J.T. performed research; E.H.M., J.P.T., C.T.A., M.L., and E.D.G. contributed new reagents/analytic tools; E.H.M. and P.J.T. analyzed data; and E.H.M. and E.D.G. wrote the paper.
- Abbreviations: NISC, National Institutes of Health Intramural Sequencing Center; mya, million years ago; N.A., North American; S.A., South American; BAC, bacterial artificial chromosome; TBA, threaded blockset aligner; MCS, multispecies conserved sequence; 4D, 4-fold degenerate; SINEs, short interspersed nucleotide elements; LINEs, long interspersed nucleotide elements.
- Data deposition: The sequences reported in this paper have been deposited in the GenBank database [accession nos. AC127465, AC129065, AC129066, AC129885, AC142561, AC144364, AC144365, AC144600, AC144690, AC144691, AC144755, and AC144756 (N.A. opossum); AC147869, AC147870, AC147871, AC147872, AC147873, AC147874, AC148151, and AC148214 (S.A. opossum); AC127464, AC129882, AC129883, AC129884, AC130185, AC138553, AC144363, AC144689, AC144753, AC144754, AC144788, AC146535, and AC146754 (platypus); and AC145041, AC145042, AC145183, AC145184, AC145249, AC145250, AC145407, AC145408, AC145409, and AC145841 (wallaby)]. See Table 3, which is published as supporting information on the PNAS web site, for specific versions of all GenBank accession nos. used in this study.
- Freely available online through the PNAS open access option.
- Copyright © 2005, The National Academy of Sciences





