Protein quantification across hundreds of experimental conditions

Zia Khan (a,b,1), Joshua S. Bloom (b,c), Benjamin A. Garcia (c), Mona Singh (a,b), and Leonid Kruglyak (a,b,d,e)

(a) Department of Computer Science, (b) Lewis–Sigler Institute for Integrative Genomics, Departments of (c) Molecular Biology and (d) Ecology and Evolutionary Biology, and (e) Howard Hughes Medical Institute, Princeton University, Princeton, NJ 08540

Edited by Adam Godzik, Burnham Institute for Medical Research, La Jolla, CA, and accepted by the Editorial Board July 23, 2009 (received for review April 14, 2009)

Abstract

Quantitative studies of protein abundance rarely span more than a small number of experimental conditions and replicates. In contrast, quantitative studies of transcript abundance often span hundreds of experimental conditions and replicates. This situation exists, in part, because extracting quantitative data from large proteomics datasets is significantly more difficult than reading quantitative data from a gene expression microarray. To address this problem, we introduce two algorithmic advances in the processing of quantitative proteomics data. First, we use space-partitioning data structures to handle the large size of these datasets. Second, we introduce techniques that combine graph-theoretic algorithms with space-partitioning data structures to collect relative protein abundance data across hundreds of experimental conditions and replicates. We validate these algorithmic techniques by analyzing several datasets and computing both internal and external measures of quantification accuracy. We demonstrate the scalability of these techniques by applying them to a large dataset that comprises a total of 472 experimental conditions and replicates.
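To make the two ideas above concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say which space-partitioning structure or graph algorithm is used, so the sketch assumes a 2-D k-d tree over hypothetical peak coordinates (m/z, retention time) and uses connected components, computed with union-find, to group peaks that fall within a tolerance of each other across runs. The names `build_kdtree`, `range_query`, `group_peaks`, and the tolerance parameters are all illustrative.

```python
from collections import defaultdict


def build_kdtree(points, depth=0):
    """Build a 2-D k-d tree over (mz, rt, run_id) tuples (assumed layout)."""
    if not points:
        return None
    axis = depth % 2  # alternate between m/z (0) and retention time (1)
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def range_query(node, lo, hi, depth=0, out=None):
    """Collect every point whose (mz, rt) lies inside the box [lo, hi]."""
    if out is None:
        out = []
    if node is None:
        return out
    point, left, right = node
    axis = depth % 2
    if all(lo[i] <= point[i] <= hi[i] for i in range(2)):
        out.append(point)
    if lo[axis] <= point[axis]:   # box may overlap the left subtree
        range_query(left, lo, hi, depth + 1, out)
    if point[axis] <= hi[axis]:   # box may overlap the right subtree
        range_query(right, lo, hi, depth + 1, out)
    return out


def group_peaks(peaks, mz_tol, rt_tol):
    """Group peaks from different runs via connected components.

    Assumes every peak tuple is unique. Two peaks are linked when they come
    from different runs and lie within the given tolerances of each other.
    """
    tree = build_kdtree(peaks)
    parent = {p: p for p in peaks}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for p in peaks:
        lo = (p[0] - mz_tol, p[1] - rt_tol)
        hi = (p[0] + mz_tol, p[1] + rt_tol)
        for q in range_query(tree, lo, hi):
            if q[2] != p[2]:
                parent[find(p)] = find(q)

    groups = defaultdict(list)
    for p in peaks:
        groups[find(p)].append(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical peaks: (m/z, retention time in minutes, run label)
peaks = [(500.00, 10.0, "A"), (500.01, 10.2, "B"),
         (500.02, 10.1, "C"), (600.00, 30.0, "A")]
groups = group_peaks(peaks, mz_tol=0.05, rt_tol=0.5)
```

Each range query touches only the part of the tree that overlaps the tolerance box, which is what lets this kind of grouping avoid comparing every peak against every other peak as the number of runs grows.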

Footnotes

  • (1) To whom correspondence should be addressed. E-mail: zkhan@princeton.edu
  • Author contributions: Z.K., M.S., and L.K. designed research; Z.K. performed research; Z.K. and B.A.G. contributed new reagents/analytic tools; Z.K. and J.S.B. analyzed data; and Z.K., B.A.G., M.S., and L.K. wrote the paper.

  • Conflict of interest statement: A patent for related technology has been filed with the U.S. Patent and Trademark Office by Princeton University. Z.K., M.S., and L.K. are coinventors.

  • This article is a PNAS Direct Submission. A.G. is a guest editor invited by the Editorial Board.

  • This article contains supporting information online at www.pnas.org/cgi/content/full/0904100106/DCSupplemental.
