A computational and experimental approach to validating annotations and gene predictions in the Drosophila melanogaster genome
- Mark Yandell*,†,‡,
- Adina M. Bailey†,§,
- Sima Misra†,§,
- ShengQiang Shu§,
- Colin Wiel§,
- Martha Evans-Holm§,
- Susan E. Celniker¶, and
- Gerald M. Rubin*,§,¶
- *Howard Hughes Medical Institute and §Department of Molecular and Cell Biology, University of California, Life Sciences Addition, Berkeley, CA 94720-3200; and ¶Department of Genome Sciences, Lawrence Berkeley National Laboratory, One Cyclotron Road, Mailstop 64-121, Berkeley, CA 94720
Contributed by Gerald M. Rubin, December 17, 2004
Fig. 1.
Priority scores have predictive value. (A) The priority score distribution for all genscan predictions in our collection (gray bars) vs. the priority score distribution of the control set (black bars). (B) All genscan predictions (gray bars) vs. all validated genscan predictions (black bars). (C) All genscan predictions (gray bars) vs. the Heidelberg set (black bars).
Fig. 2.
Priority scores can be used to prioritize gene predictions for validation in a cost-effective manner. Fraction of all validated gene predictions obtained (y axis) vs. the number of PCRs performed (x axis) if choosing randomly (black line) or choosing on the basis of priority score (black diamonds) from the collection of 10,136 genscan and fgenesh predictions.
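The comparison in Fig. 2 can be illustrated with a minimal sketch. It assumes that a higher priority score means a prediction is tested earlier and that each prediction costs one PCR; neither detail is stated in the caption. The function names `cumulative_yield` and `random_expectation` are hypothetical and the data are invented for illustration.

```python
def cumulative_yield(scores, validated):
    """Fraction of all validated predictions recovered after each PCR,
    testing predictions in descending order of priority score.

    Assumption: a higher score means higher priority.
    `validated` holds 1 (validated) or 0 (not validated) per prediction.
    """
    total = sum(validated)
    if total == 0:
        raise ValueError("no validated predictions in the collection")
    # Indices of predictions, best score first (ties keep input order).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    found = 0
    curve = []
    for i in order:
        found += validated[i]
        curve.append(found / total)
    return curve


def random_expectation(n_predictions):
    """Expected fraction recovered after k randomly chosen PCRs.

    When predictions are drawn at random without replacement, the
    expected fraction of validated predictions found after k draws is k / n.
    """
    return [k / n_predictions for k in range(1, n_predictions + 1)]


# Toy data, not taken from the paper.
scores = [0.9, 0.1, 0.8, 0.3]
validated = [1, 0, 1, 0]
by_score = cumulative_yield(scores, validated)
at_random = random_expectation(len(scores))
```

In this toy case, score-ordered testing recovers every validated prediction after two PCRs, whereas random choice is expected to have recovered half of them at that point.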
Fig. 3.
The distribution of genscan exon probabilities for predicted genes within D. melanogaster intergenic regions closely approximates that of random sequence. Distribution of exon probabilities assigned by genscan to predictions lying within 2 Mb of random sequence with equal base frequencies (G, A, T, and C at 0.25 each; light gray bars), 62 Mb of D. melanogaster release 3.1 intergenic regions (gray bars), and the first 2 Mb of D. melanogaster chromosome arm 2L (black bars).
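The random-sequence control in Fig. 3 can be sketched as follows. This is an illustration, not the authors' procedure. It assumes that each base is drawn independently with probability 0.25 and that exon probabilities lie in [0, 1] and are summarized in equal-width bins. The names `random_sequence` and `probability_histogram` are hypothetical, and running genscan itself is outside the sketch.

```python
import random


def random_sequence(length, seed=0):
    """Return a DNA string in which G, A, T and C are each drawn
    independently with probability 0.25 (seeded for reproducibility)."""
    rng = random.Random(seed)
    return "".join(rng.choices("GATC", k=length))


def probability_histogram(probs, n_bins=10):
    """Fraction of exon probabilities falling in each of n_bins
    equal-width bins on [0, 1]; a value of 1.0 goes in the last bin."""
    if not probs:
        raise ValueError("no probabilities supplied")
    counts = [0] * n_bins
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return [c / len(probs) for c in counts]


# Small demonstration; the caption's control uses 2 Mb of sequence.
seq = random_sequence(1000)
# Invented exon probabilities, standing in for gene-finder output.
hist = probability_histogram([0.05, 0.95, 1.0, 0.5])
```

Histograms built this way for random, intergenic, and genic sequence can then be compared bin by bin, which is the comparison the caption describes.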
Footnotes
- Copyright © 2005, The National Academy of Sciences








