Research Article

Reliable prediction of transcription factor binding sites by phylogenetic verification

Xiaoman Li, Sheng Zhong, and Wing H. Wong
PNAS November 22, 2005 102 (47) 16945-16950; https://doi.org/10.1073/pnas.0504201102
  1. Edited by Michael S. Waterman, University of Southern California, Los Angeles, CA (received for review May 20, 2005)


Abstract

We present a statistical methodology that substantially improves the accuracy of computational predictions of transcription factor (TF) binding sites in eukaryote genomes. The method models the cross-species conservation of binding sites without relying on accurate sequence alignment. It can be coupled with any motif-finding algorithm that searches for overrepresented sequence motifs in individual species, and it can increase the accuracy of that algorithm. Because the method detects TF binding sites accurately, it also enhances our ability to predict cis-regulatory modules. We applied this method to the published chromatin immunoprecipitation (ChIP)-chip data in Saccharomyces cerevisiae and found that its sensitivity and specificity are 9% and 14% higher, respectively, than those of two recent methods. We also recovered almost all of the previously verified TF binding sites and made predictions on the cis-regulatory elements that govern the tight regulation of ribosomal protein genes in 13 eukaryote species (2 plants, 4 yeasts, 2 worms, 2 insects, and 3 mammals). These results offer insight into transcriptional regulation in eukaryotic organisms.

  • cross-species conservation
  • motif finding
  • ribosomal protein genes

Footnotes

  • b To whom correspondence should be sent at the present address: Division of Biostatistics, Department of Medicine, Indiana University, Indianapolis, IN 46202. E-mail: shawnli{at}iupui.edu.

  • Author contributions: X.L. and W.H.W. designed research; X.L. performed research, contributed new reagents/analytic tools, and analyzed data; and X.L., S.Z., and W.H.W. wrote the paper.

  • Conflict of interest statement: No conflicts declared.

  • This paper was submitted directly (Track II) to the PNAS office.

  • Abbreviations: ChIP, chromatin immunoprecipitation; CSC, cross-species conservation; MSM, marginally significant motif; TF, transcription factor; TSS, transcription start site; RPG, ribosomal protein gene; Sc, Saccharomyces cerevisiae; NLC, network-level conservation.

  • d Coregulated genes are those that are regulated by the same TF or TF modules.

  • e Appendix 1, which is published as supporting information on the PNAS web site, gives an example showing that motif instances may not always align correctly.

  • f The anchor species is the species in which the motif-finding problem arises; i.e., if we are interested in finding the motifs in a certain species, then that species is called the anchor species. The name differentiates it from all the other species that are used to help find the motifs (the genes from the anchor species are called anchor genes).

  • g A grouping of MSMs is a collection of similar MSMs, where each MSM in the group belongs to a different species. See Appendix 2, which is published as supporting information on the PNAS web site, for how to obtain groupings of MSMs.

  • h Although we use “upstream sequence” in describing the method, in practice, the method should be applied to any regions that may contain cis-regulatory elements.

  • i For the yeast species, we downloaded alignments of upstream orthologs from ref. 25. For the two plant species, we performed local alignment of orthologous upstream sequences and used the best-aligned regions of 100-bp length for every orthologous pair. The 100-bp cutoff is arbitrarily chosen, but, to our knowledge, our method is not very sensitive to the background-substitution matrices. For the three mammalian species, two insect species, and two worm species, we downloaded the available alignments of the RPG upstream sequences from the University of California, Santa Cruz genome browser web site.

  • j This cutoff is arbitrary. From our experience, this cutoff works well for all the data sets from different species we used. In the text, we have another empirical P value cutoff, 1 × 10⁻¹⁹, which is used to report motifs.

  • k To avoid overfitting, we exclude the motif instances on the current group of orthologous genes (the group of genes to be scanned by the ancient motif) from constructing the ancient motif.

  • l PhyloCon outputs 60 predictions on average, with some redundancies. CompareProspector outputs rank-ordered motifs, for which we performed the manual search within the top three motifs.

  • m The criterion for matching with TRANSFAC motifs is that there should be at most one mismatch when we compare the putative motifs with the TRANSFAC ones (see the legend of Table 5 and the supplementary files of ref. 18).

  • Freely available online through the PNAS open access option.

  • Copyright © 2005, The National Academy of Sciences
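
The matching criterion in footnote m (at most one mismatch between a putative motif and a known one) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that motifs are represented as equal-length consensus strings over A, C, G, and T, and that a match on the reverse strand also counts; neither assumption is stated in the footnote. The function names and the `max_mismatches` parameter are hypothetical.

```python
# Illustrative only: consensus-string comparison with a mismatch budget.
# Assumptions (not from the paper): equal-length ACGT strings, and the
# reverse complement of the putative motif is also tried.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an ACGT sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def matches_known_motif(putative: str, known: str, max_mismatches: int = 1) -> bool:
    """True if either strand of `putative` is within the mismatch budget of `known`."""
    putative, known = putative.upper(), known.upper()
    if len(putative) != len(known):
        return False  # this sketch does not try shifted or partial overlaps
    best = min(
        hamming(putative, known),
        hamming(reverse_complement(putative), known),
    )
    return best <= max_mismatches
```

For example, `matches_known_motif("TGACTCA", "TGACTAA")` counts a single mismatch and so reports a match under this criterion. A fuller comparison would also need to handle degenerate IUPAC symbols and motifs of different lengths, which this sketch leaves out.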