Proceedings of the National Academy of Sciences

Published online on June 5, 2007, 10.1073/pnas.0703834104
PNAS | June 12, 2007 | vol. 104 | no. 24 | 10110-10115



BIOLOGICAL SCIENCES / GENETICS
Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome

Jan O. Korbel*,†,‡, Alexander Eckehart Urban§, Fabian Grubert§, Jiang Du||, Thomas E. Royce*, Peter Starr*, Guoneng Zhong*, Beverly S. Emanuel**, Sherman M. Weissman§, Michael Snyder¶,‡, and Mark B. Gerstein*,||,‡

Departments of *Molecular Biophysics and Biochemistry and §Genetics, Yale University School of Medicine, New Haven, CT 06520; †European Molecular Biology Laboratory, 69117 Heidelberg, Germany; Departments of ¶Molecular, Cellular, and Developmental Biology and ||Computer Science, Yale University, New Haven, CT 06520; and **Department of Pediatrics, University of Pennsylvania School of Medicine, Philadelphia, PA 19104

Communicated by Francis H. Ruddle, Yale University, New Haven, CT, April 30, 2007 (received for review January 11, 2007)

Copy-number variants (CNVs) are an abundant form of genetic variation in humans. However, approaches for determining exact CNV breakpoint sequences (physical deletion or duplication boundaries) across individuals, crucial for associating genotype to phenotype, have been lacking so far, and the vast majority of CNVs have been reported with approximate genomic coordinates only. Here, we report an approach, called BreakPtr, for fine-mapping CNVs (available from http://breakptr.gersteinlab.org). We statistically integrate both sequence characteristics and data from high-resolution comparative genome hybridization experiments in a discrete-valued, bivariate hidden Markov model. Incorporation of nucleotide-sequence information allows us to take into account the fact that recently duplicated sequences (e.g., segmental duplications) often coincide with breakpoints. In anticipation of an upcoming increase in CNV data, we developed an iterative, "active" approach to initially scoring with a preliminary model, performing targeted validations, retraining the model, and then rescoring, and a flexible parameterization system that intuitively collapses from a full model of 2,503 parameters to a core one of only 10. Using our approach, we accurately mapped >400 breakpoints on chromosome 22 and a region of chromosome 11, refining the boundaries of many previously approximately mapped CNVs. Four predicted breakpoints flanked known disease-associated deletions. We validated an additional four predicted CNV breakpoints by sequencing. Overall, our results suggest a predictive resolution of ≈300 bp. This level of resolution enables more precise correlations between CNVs and across individuals than previously possible, allowing the study of CNV population frequencies. Further, it enabled us to demonstrate a clear Mendelian pattern of inheritance for one of the CNVs.
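
The idea of a discrete-valued, bivariate hidden Markov model can be illustrated with a small sketch. The code below is not the BreakPtr implementation. It is a toy Viterbi decoder in which each probe position emits two discrete observations: a binned hybridization log-ratio and a flag for overlap with a segmental duplication. The state names, the bin labels, every probability, and the assumption that the two observations are conditionally independent given the state are all hypothetical choices made for illustration only.

```python
import math

STATES = ["normal", "deletion", "duplication"]

# All probabilities below are illustrative assumptions, not published parameters.
START = {"normal": 0.90, "deletion": 0.05, "duplication": 0.05}
TRANS = {
    "normal": {"normal": 0.90, "deletion": 0.05, "duplication": 0.05},
    "deletion": {"normal": 0.10, "deletion": 0.85, "duplication": 0.05},
    "duplication": {"normal": 0.10, "deletion": 0.05, "duplication": 0.85},
}
# First emitted variable: binned hybridization log-ratio.
RATIO = {
    "normal": {"low": 0.10, "mid": 0.80, "high": 0.10},
    "deletion": {"low": 0.80, "mid": 0.15, "high": 0.05},
    "duplication": {"low": 0.05, "mid": 0.15, "high": 0.80},
}
# Second emitted variable: does the probe overlap a segmental duplication?
SEGDUP = {
    "normal": {False: 0.90, True: 0.10},
    "deletion": {False: 0.60, True: 0.40},
    "duplication": {False: 0.60, True: 0.40},
}


def log_emission(state, obs):
    """Log-probability of one (ratio_bin, in_segdup) pair given a state.

    Assumes the two observed variables are independent given the state.
    """
    ratio_bin, in_segdup = obs
    return math.log(RATIO[state][ratio_bin]) + math.log(SEGDUP[state][in_segdup])


def viterbi(observations):
    """Return the most probable state path for a non-empty observation list."""
    scores = {s: math.log(START[s]) + log_emission(s, observations[0]) for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[best_prev] + math.log(TRANS[best_prev][s]) + log_emission(s, obs)
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def breakpoints(path):
    """Indices at which the decoded state differs from the previous position."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


# Toy input: five unremarkable probes, five low-ratio probes inside a
# segmental duplication, then five unremarkable probes again.
observations = [("mid", False)] * 5 + [("low", True)] * 5 + [("mid", False)] * 5
path = viterbi(observations)
print(path)
print(breakpoints(path))
```

In this toy setting a predicted breakpoint is simply a position where the decoded state changes, so the resolution is bounded by the probe spacing. Letting the sequence-derived variable influence the emission score is one simple way to make boundaries near duplicated sequence more likely. How the published model uses sequence information may differ.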

copy number polymorphism | human genome variation | structural variants


Author contributions: J.O.K. and A.E.U. contributed equally to this work; J.O.K., A.E.U., S.M.W., M.S., and M.B.G. designed research; J.O.K., A.E.U., and F.G. performed research; A.E.U., F.G., J.D., P.S., G.Z., and B.S.E. contributed new reagents/analytic tools; J.O.K., A.E.U., F.G., J.D., T.E.R., S.M.W., M.S., and M.B.G. analyzed data; and J.O.K. and M.B.G. wrote the paper.

The authors declare no conflict of interest.

Data deposition: Microarray data have been deposited in the Gene Expression Omnibus repository (accession no. GSE6010).

This article contains supporting information online at www.pnas.org/cgi/content/full/0703834104/DC1.

‡To whom correspondence may be addressed. E-mail: jan.korbel@yale.edu, michael.snyder@yale.edu, or mark.gerstein@yale.edu

© 2007 by The National Academy of Sciences of the USA


