Proceedings of the National Academy of Sciences

Published online on November 26, 2007, 10.1073/pnas.0709013104
PNAS | December 4, 2007 | vol. 104 | no. 49 | 19428-19433



From the Cover
BIOLOGICAL SCIENCES / GENETICS
Distinguishing protein-coding and noncoding genes in the human genome

Michele Clamp*,†, Ben Fry*, Mike Kamal*, Xiaohui Xie*, James Cuff*, Michael F. Lin‡, Manolis Kellis*,‡, Kerstin Lindblad-Toh*, and Eric S. Lander*,§,||,†

*Broad Institute of Massachusetts Institute of Technology and Harvard, 7 Cambridge Center, Cambridge, MA 02142; Department of Biology and ‡Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139; §Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge, MA 02142; and ||Department of Systems Biology, Harvard Medical School, Boston, MA 02115

Contributed by Eric S. Lander, October 3, 2007 (received for review August 1, 2007)

Although the Human Genome Project was completed 4 years ago, the catalog of human protein-coding genes remains a matter of controversy. Current catalogs list a total of ≈24,500 putative protein-coding genes. It is broadly suspected that a large fraction of these entries are functionally meaningless ORFs present by chance in RNA transcripts, because they show no evidence of evolutionary conservation with mouse or dog. However, there is currently no scientific justification for excluding ORFs simply because they fail to show evolutionary conservation: the alternative hypothesis is that most of these ORFs are actually valid human genes that reflect gene innovation in the primate lineage or gene loss in the other lineages. Here, we reject this hypothesis by carefully analyzing the nonconserved ORFs, specifically their properties in other primates. We show that the vast majority of these ORFs are random occurrences. The analysis yields, as a by-product, a major revision of the current human catalogs, cutting the number of protein-coding genes to ≈20,500. Specifically, it suggests that nonconserved ORFs should be added to the human gene catalog only if there is clear evidence of an encoded protein. It also provides a principled methodology for evaluating future proposed additions to the human gene catalog. Finally, the results indicate that there has been relatively little true innovation in mammalian protein-coding genes.

comparative genomics


Author contributions: M.C. and E.S.L. designed research; M.C., B.F., M. Kamal, X.X., J.C., M.F.L., M. Kellis, K.L.-T., and E.S.L. performed research; M.C., B.F., M. Kamal, X.X., J.C., M.F.L., M. Kellis, K.L.-T., and E.S.L. analyzed data; and M.C. and E.S.L. wrote the paper.

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/cgi/content/full/0709013104/DC1.

†To whom correspondence may be addressed. E-mail: mclamp@broad.mit.edu or lander@broad.mit.edu

© 2007 by The National Academy of Sciences of the USA

