Proceedings of the National Academy of Sciences

Published online on June 27, 2007, 10.1073/pnas.0701393104


Biophysics
The network of sequence flow between protein structures

( protein designability | sequence capacity | structure stability | transitional sequences )

Leonid Meyerguz, Jon Kleinberg, and Ron Elber†

Department of Computer Science, Cornell University, Ithaca, NY 14853

Edited by Harold A. Scheraga, Cornell University, Ithaca, NY, and approved May 25, 2007 (received for review February 14, 2007)

Sequence-structure relationships in proteins are highly asymmetric because many sequences fold into relatively few structures. What is the number of sequences that fold into a particular protein structure? Is it possible to switch between stable protein folds by point mutations? To address these questions, we compute a directed graph of sequences and structures of proteins, which is based on 2,060 experimentally determined protein shapes from the Protein Data Bank. The directed graph is highly connected at native energies with "sinks" that attract many sequences from other folds. The sinks are rich in β-sheets. The number of sequences that transition between folds is significantly smaller than the number of sequences retained by their fold. The sequence flow into a particular protein shape from other proteins correlates with the number of sequences that matches this shape in empirically determined genomes. Properties of strongly connected components of the graph are correlated with protein length and secondary structure.
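As an illustration of the graph terms used in the abstract, the following minimal Python sketch finds the strongly connected components of a small directed graph and then picks out the components that no edge leaves. It is not the authors' code or data. The fold names and edges are hypothetical, the edge direction (from the fold a sequence leaves to the fold it enters) is an assumption, and "sink" is taken here in the plain graph-theoretic sense of a component with no outgoing edges, which may be narrower than the usage in the paper. Kosaraju's algorithm is used only because it is short; any standard method would do.

```python
from collections import defaultdict


def strongly_connected_components(graph):
    """Kosaraju's algorithm. `graph` maps each node to a list of successors."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    # Pass 1: iterative depth-first search, recording nodes as they finish.
    visited, order = set(), []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph.get(start, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                # All successors explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: walk the reversed graph in decreasing finish order.
    reverse = defaultdict(list)
    for src, targets in graph.items():
        for dst in targets:
            reverse[dst].append(src)

    assigned, components = set(), []
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component, todo = set(), [root]
        while todo:
            node = todo.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(frozenset(component))
    return components


def sink_components(graph, components):
    """Return the components that have no edge leading to another component."""
    owner = {node: comp for comp in components for node in comp}
    return [
        comp for comp in components
        if all(owner[dst] == comp for src in comp for dst in graph.get(src, ()))
    ]


# Hypothetical toy network: an edge X -> Y is assumed to mean that
# sequences can flow from fold X into fold Y.
flow = {
    "fold_A": ["fold_B"],
    "fold_B": ["fold_A", "fold_C"],
    "fold_C": ["fold_D"],
    "fold_D": ["fold_C"],
    "fold_E": ["fold_C"],
}

components = strongly_connected_components(flow)
sinks = sink_components(flow, components)
```

In this toy network, fold_A and fold_B can reach each other and so form one component, fold_C and fold_D form another, and fold_E stands alone. Only the fold_C/fold_D component has no outgoing edges, so it is the single sink under the definition assumed here.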


Author contributions: L.M., J.K., and R.E. designed research; L.M. performed research; J.K. and R.E. contributed new reagents/analytic tools; L.M., J.K., and R.E. analyzed data; and R.E. wrote the paper.

The authors declare no conflict of interest.

†To whom correspondence should be addressed.

Ron Elber, E-mail: ron@cs.cornell.edu

www.pnas.org/cgi/doi/10.1073/pnas.0701393104