Protein ranking: From local to global structure in the protein similarity network
Jason Weston†,‡, Andre Elisseeff‡, Dengyong Zhou‡, Christina S. Leslie§, and William Stafford Noble¶,∥
†NEC Laboratories America, 4 Independence Way, Princeton, NJ 08540; ‡Max Planck Institute for Biological Cybernetics, Spemannstrasse 38, 72076 Tübingen, Germany; §Department of Computer Science, Columbia University, 1214 Amsterdam Avenue, MC 0401, New York, NY 10027; and ¶Department of Genome Sciences, University of Washington, Health Sciences Center, P.O. Box 357730, Seattle, WA 98195
Edited by Michael S. Waterman, University of Southern California, Los Angeles, CA (received for review December 4, 2003)
Abstract
Biologists regularly search databases of DNA or protein sequences for evolutionary or functional relationships to a given query sequence. We describe a ranking algorithm that exploits the entire network structure of similarity relationships among proteins in a sequence database by performing a diffusion operation on a precomputed, weighted network. The resulting ranking algorithm, evaluated by using a human-curated database of protein structures, is efficient and provides significantly better rankings than a local network search algorithm such as PSI-BLAST.
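To make the idea of diffusion on a weighted network concrete, the following is a minimal sketch of a generic damped diffusion over a small similarity matrix. It is not taken from the paper: the abstract does not state the update rule, so the function name `diffusion_rank`, the damping parameter `alpha`, the iteration count, and the row normalization are all illustrative assumptions.

```python
def diffusion_rank(weights, query, alpha=0.5, iterations=20):
    """Rank nodes by diffusing activation from `query` over a weighted network.

    weights[i][j] is a hypothetical non-negative similarity weight for the
    edge i -> j. alpha and iterations are assumed values, not from the paper.
    """
    n = len(weights)

    # Normalize each row so outgoing weights sum to 1 (empty rows stay zero).
    norm = []
    for row in weights:
        total = sum(row)
        norm.append([w / total if total > 0 else 0.0 for w in row])

    # Initial activation: all mass on the query node.
    y0 = [1.0 if i == query else 0.0 for i in range(n)]
    y = y0[:]
    for _ in range(iterations):
        # Each node collects damped activation from its in-neighbours,
        # and the query activation is re-injected at every step.
        y = [
            y0[j] + alpha * sum(y[i] * norm[i][j] for i in range(n))
            for j in range(n)
        ]

    # Return the non-query nodes ordered by decreasing activation.
    return sorted((j for j in range(n) if j != query), key=lambda j: -y[j])


# Toy chain 0 - 1 - 2 - 3: nodes 2 and 3 have no direct edge to the query 0.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
ranking = diffusion_rank(chain, query=0)
```

In this toy chain, nodes 2 and 3 have no direct edge to the query, yet they still receive activation through node 1. This is the sense in which a global diffusion can rank sequences that a purely local neighbour search would not reach.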
Footnotes
∥ To whom correspondence should be addressed. E-mail: noble{at}gs.washington.edu.
This paper was submitted directly (Track II) to the PNAS office.
Abbreviation: ROC, receiver operating characteristic.
Copyright © 2004, The National Academy of Sciences





