Minimal haplotype tagging

  1. Paola Sebastiani,
  2. Ross Lazarus§,
  3. Scott T. Weiss§,
  4. Louis M. Kunkel††,
  5. Isaac S. Kohane‡‡, and
  6. Marco F. Ramoni‡‡,§§
Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118; Harvard Medical School, Boston, MA 02115; §Channing Laboratory, Brigham and Women's Hospital, Boston, MA 02115; Harvard Partners Center for Genetics and Genomics, Boston, MA 02115; and Division of Genetics, ††Howard Hughes Medical Institute, and ‡‡Informatics Program, Children's Hospital, Boston, MA 02115

Contributed by Louis M. Kunkel, June 16, 2003

Abstract

The high frequency of single-nucleotide polymorphisms (SNPs) in the human genome presents an unparalleled opportunity to track down the genetic basis of common diseases. At the same time, the sheer number of SNPs makes genomewide disease association studies unfeasible. The haplotypic nature of the human genome, however, lends itself to the selection of a parsimonious set of SNPs, called haplotype tagging SNPs (htSNPs), able to distinguish the haplotypic variations in a population. Current approaches rely on statistical analysis of transmission rates to identify htSNPs. In contrast to these approximate methods, this contribution describes an exact, analytical, and lossless method, called BEST (Best Enumeration of SNP Tags), able to identify the minimum set of SNPs tagging an arbitrary set of haplotypes from either pedigree or independent samples. Our results confirm that a small proportion of SNPs is sufficient to capture the haplotypic variations in a population and that this proportion decreases exponentially as the haplotype length increases. We used BEST to tag the haplotypes of 105 genes in an African-American and a European-American sample. An interesting finding of this analysis is that the vast majority (95%) of the htSNPs in the European-American sample are also htSNPs in the African-American sample. This result seems to provide further evidence that a severe bottleneck occurred during the founding of Europe and the conjectured “Out of Africa” event.
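
To make the notion of a minimum tagging set concrete, the following is a minimal sketch and not the BEST procedure itself, whose details are not given in this abstract. It assumes haplotypes are encoded as equal-length strings of biallelic SNP values ("0"/"1") and uses a plain exhaustive search over SNP subsets of increasing size; the function names `distinguishes` and `minimal_tag_set` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def distinguishes(haplotypes, columns):
    """Return True if the chosen SNP positions give every haplotype a unique signature."""
    signatures = {tuple(h[c] for c in columns) for h in haplotypes}
    return len(signatures) == len(haplotypes)


def minimal_tag_set(haplotypes):
    """Brute-force search for a smallest set of SNP positions that tags the haplotypes.

    Illustrative assumption: exhaustive enumeration by subset size, which is
    exact but exponential in the number of SNPs. It is not the BEST algorithm.
    """
    unique = sorted(set(haplotypes))   # duplicates carry no extra information
    n_snps = len(unique[0])
    for size in range(n_snps + 1):     # try the smallest subsets first
        for columns in combinations(range(n_snps), size):
            if distinguishes(unique, columns):
                return columns
    return tuple(range(n_snps))        # unreachable for distinct haplotypes


# Four haplotypes over four SNPs: positions 0 and 2 suffice to tell them apart.
tags = minimal_tag_set(["0000", "0011", "1100", "1111"])
print(tags)
```

Because the chosen positions give each haplotype a distinct signature, the allele at every untyped SNP can be looked up from the tagged ones, which is the sense in which such a tagging is lossless. The exhaustive search is only practical for short haplotypes.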

Footnotes

  • §§ To whom correspondence should be addressed at: Informatics Program, Children's Hospital, Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115. E-mail: marco_ramoni{at}harvard.edu.

  • Abbreviations: BEST, Best Enumeration of SNP Tags; SNP, single-nucleotide polymorphism; htSNP, haplotype tagging SNP.
