A sea urchin genome project: Sequence scan, virtual map, and additional resources
- R. Andrew Cameron (a,b,c,d),
- Gregory Mahairas (c,e),
- Jonathan P. Rast (b),
- Pedro Martinez (a,f),
- Ted R. Biondi (b),
- Steven Swartzell (e),
- James C. Wallace (e),
- Albert J. Poustka (g),
- Brian T. Livingston (h),
- Gregory A. Wray (i),
- Charles A. Ettensohn (j),
- Hans Lehrach (g),
- Roy J. Britten (a,k),
- Eric H. Davidson (b), and
- Leroy Hood (l)
- (a) Stowers Institute for Medical Research, Kansas City, MO 64110; (b) Division of Biology, California Institute of Technology, Pasadena, CA 91125; (e) Department of Molecular Biotechnology, University of Washington, Seattle, WA 98195; (f) Department of Anatomy and Cell Biology, University of Bergen, 5009-Bergen, Norway; (g) Max-Planck-Institut für Molekulare Genetik, D-14195 Berlin, Germany; (h) School of Biological Sciences, University of Missouri, Kansas City, MO 64110; (i) Department of Biology, Duke University, Durham, NC 27708; (j) Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213; (k) Kerckhoff Marine Laboratory, California Institute of Technology, Corona del Mar, CA 92625; and (l) Institute for Systems Biology, Seattle, WA 98105
Contributed by Eric H. Davidson
Abstract
Results of a first-stage Sea Urchin Genome Project are summarized here. The species chosen was Strongylocentrotus purpuratus, a research model of major importance in developmental and molecular biology. A virtual map of the genome was constructed by sequencing the ends of 76,020 bacterial artificial chromosome (BAC) recombinants (average length, 125 kb). The BAC-end sequence tag connectors (STCs) occur an average of 10 kb apart, and, together with restriction digest patterns recorded for the same BAC clones, they provide immediate access to contigs of several hundred kilobases surrounding any gene of interest. The STCs survey >5% of the genome and provide the estimate that this genome contains ≈27,350 protein-coding genes. The frequency distribution and canonical sequences of all middle and highly repetitive sequence families in the genome were obtained from the STCs as well. The 500-kb Hox gene complex of this species is being sequenced in its entirety. In addition, arrayed cDNA libraries of >10^5 clones each were constructed from every major stage of embryogenesis, several individual cell types, and adult tissues and are available to the community. The accumulated STC data and an expanding expressed sequence tag database (at present including >12,000 sequences) have been reported to GenBank and are accessible on public web sites.
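The figures quoted in the abstract can be cross-checked with some simple arithmetic. The Python sketch below is illustrative only and is not taken from the paper. It rests on two labeled assumptions: that every accession in the range given in the data-deposition footnote (AZ135866–AZ211804) corresponds to one STC, and that the quoted 10-kb average spacing applies uniformly across the genome. The function names are hypothetical, and the resulting span and coverage values are rough estimates, not results reported by the authors.

```python
# Rough consistency check of the virtual-map numbers in the abstract.
# Assumptions (not stated in the source): one STC per accession in the
# deposited range, and uniform STC spacing across the genome.

N_BACS = 76_020         # BAC recombinants end-sequenced (from the abstract)
MEAN_INSERT_KB = 125.0  # average BAC insert length in kb (from the abstract)
STC_SPACING_KB = 10.0   # average distance between STCs in kb (from the abstract)


def accession_count(first: str, last: str, prefix_len: int = 2) -> int:
    """Number of accessions in an inclusive range such as AZ135866-AZ211804."""
    if first[:prefix_len] != last[:prefix_len]:
        raise ValueError("accessions must share the same letter prefix")
    return int(last[prefix_len:]) - int(first[prefix_len:]) + 1


def implied_span_mb(n_tags: int, spacing_kb: float) -> float:
    """Genome span in Mb implied by n_tags tags spaced spacing_kb apart."""
    return n_tags * spacing_kb / 1000.0


def clone_coverage(n_clones: int, insert_kb: float, genome_mb: float) -> float:
    """Fold coverage of the genome by clone inserts (total insert kb / genome kb)."""
    return n_clones * insert_kb / (genome_mb * 1000.0)


n_stcs = accession_count("AZ135866", "AZ211804")
span_mb = implied_span_mb(n_stcs, STC_SPACING_KB)
coverage = clone_coverage(N_BACS, MEAN_INSERT_KB, span_mb)

if __name__ == "__main__":
    print(f"deposited accessions:   {n_stcs}")
    print(f"implied genome span:    {span_mb:.0f} Mb")
    print(f"implied clone coverage: {coverage:.1f}x")
```

Under these assumptions the library would cover the genome many times over in clone inserts, which fits the abstract's statement that contigs of several hundred kilobases are accessible around any gene of interest.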
Footnotes
- (c) R.A.C. and G.M. contributed equally to this work.
- (d) To whom reprint requests should be addressed at: Division of Biology, 156-29, California Institute of Technology, Pasadena, CA 91125. E-mail: acameron{at}caltech.edu.
- Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. AZ135866–AZ211804).
- Article published online before print: Proc. Natl. Acad. Sci. USA, 10.1073/pnas.160261897.
- Article and publication date are at www.pnas.org/cgi/doi/10.1073/pnas.160261897
- Abbreviations: STCs, sequence tag connectors; BAC, bacterial artificial chromosome; EST, expressed sequence tag
- Copyright © The National Academy of Sciences








