GENETICS
A computational and experimental approach to validating annotations and gene predictions in the Drosophila melanogaster genome
*Howard Hughes Medical Institute and Department of Molecular and Cell Biology, University of California, Life Sciences Addition, Berkeley, CA 94720-3200; and ¶Department of Genome Sciences, Lawrence Berkeley National Laboratory, One Cyclotron Road, Mailstop 64-121, Berkeley, CA 94720
Contributed by Gerald M. Rubin, December 17, 2004
Five years after the completion of the sequence of the Drosophila melanogaster genome, the number of protein-coding genes it contains remains a matter of debate; the number of computational gene predictions greatly exceeds the number of validated gene annotations. We have assembled a collection of >10,000 gene predictions that do not overlap existing gene annotations and have developed a process for their validation that allows us to efficiently prioritize and experimentally validate predictions from various sources by sequencing RT-PCR products to confirm gene structures. Our data provide experimental evidence for 122 protein-coding genes. Our analyses suggest that the entire collection of predictions contains only ≈700 additional protein-coding genes. Although we cannot rule out the discovery of genes with unusual features that make them refractory to existing methods, our results suggest that the D. melanogaster genome contains ≈14,000 protein-coding genes.
gene number | validation | genome annotation
Abbreviations: Mb, megabases; oligo, oligonucleotide; sjc, splice-junction conserved.
Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. CX309415–CX309654).
M.Y., A.M.B., and S.M. contributed equally to this work.
To whom correspondence should be addressed at: Department of Molecular and Cell Biology, University of California, Life Sciences Addition, Room 539, Berkeley, CA 94720-3200. E-mail: myandell{at}fruitfly.org.
© 2005 by The National Academy of Sciences of the USA