Samuelson et al. 10.1073/pnas.0701687104.
Fig. 6. No differences in mammary gland Mcs5a region gene expression. Mcs5a region transcript levels were determined in total RNA from whole-rat mammary gland by using real-time QPCR. Similar transcript levels from genes within and flanking (±500 kb) the Mcs5a1 and Mcs5a2 loci were measured in Mcs5a resistant (WKy/WKy) and susceptible (WF/WF) 12-week-old virgin-female rats (A) and 4 weeks after DMBA administration (B). The map of rat chromosome 5 in A shows the 1-Mb genomic region containing the Mcs5a1 (32 kb) and Mcs5a2 (84 kb) loci and the physical location of the genes analyzed in the mammary gland using QPCR. (C) Summary of the mammary gland expression levels of the Mcs5a1 and Mcs5a2 respective candidate genes, Fbxo10 and Frmpd1, after DMBA administration (Time = weeks after carcinogen). No differences were detected in mammary gland transcript levels of Fbxo10 or Frmpd1 at any time following DMBA exposure (1, 4, 6, and 9 weeks) with the use of two to three different TaqMan probes designed for each gene. Expression levels are presented as the log2 ± SD of the gene transcript level determined by real-time QPCR adjusted for Gapdh. QPCR expression levels were compared using the Mann-Whitney test.
Fig. 7. Mcs5a candidate genes, Fbxo10 and Frmpd1, are differentially expressed between mammary carcinoma-susceptible and -resistant rats in thymus and spleen, but not in ovary or brain tissues. Transcripts originating in Mcs5a were quantified in rat tissues that are known to influence the mammary gland. Fbxo10, Frmpd1, C9orf105, and DQ901406 transcript levels were measured in total RNA from Mcs5a resistant (WKy/WKy) and susceptible (WF/WF) 12-week-old virgin-female rats (C) and at 4, 6, or 9 weeks after DMBA administration. Three independent TaqMan probes for Fbxo10 and two for Frmpd1 were used. (A) Fbxo10 was expressed at higher levels in the thymus tissues of susceptible females at each time point, while the levels of Frmpd1, C9orf105, and DQ901406 transcripts were similar between the genotypes at all but one time point, which was at 9 weeks post-DMBA for Frmpd1 p1. At 9 weeks after DMBA administration, higher levels of Frmpd1 were detected in the thymus tissues of resistant rats with Frmpd1 p1, but not Frmpd1 p2. (B) Frmpd1 was expressed at higher levels in the spleens of resistant females at each time point, while the levels of Fbxo10, C9orf105, and DQ901406 transcripts were similar between the genotypes. Frmpd1 was significantly higher (P < 0.05) at seven of the eight Frmpd1 probe (n = 2) × time (n = 4) combinations. Fbxo10 was significantly higher (P < 0.05) at 10 of the 12 Fbxo10 probe (n = 3) × time (n = 4) combinations. (C and D) No differences in expression between susceptible and resistant rats were consistently detected in the ovary or brain. Expression levels are presented as the log2 ± SD of the gene transcript level determined by real-time QPCR adjusted for Gapdh. QPCR expression levels were compared using the Mann-Whitney test. Comparisons with P values <0.05 are marked with asterisks.
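The normalization and test named in the two legends above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: it assumes that the log2 transcript level is taken as Ct(Gapdh) minus Ct(target), i.e., roughly 100% amplification efficiency, and the function names and example values are hypothetical.

```python
from itertools import combinations

def log2_relative_level(ct_target, ct_gapdh):
    """log2 of target transcript relative to Gapdh, assuming one cycle = 2-fold."""
    return ct_gapdh - ct_target

def mann_whitney_exact(x, y):
    """Two-sided exact Mann-Whitney test for small samples.

    Returns (U statistic for x, p-value). Ties between groups count as 0.5.
    The p-value is obtained by enumerating every split of the pooled values.
    """
    def u_stat(a, b):
        return sum(1.0 if ai > bi else 0.5 if ai == bi else 0.0
                   for ai in a for bi in b)

    n, m = len(x), len(y)
    pooled = list(x) + list(y)
    observed = u_stat(x, y)
    center = n * m / 2.0
    extreme = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(n + m) if i not in chosen]
        total += 1
        # Count splits at least as far from the null expectation as observed.
        if abs(u_stat(a, b) - center) >= abs(observed - center) - 1e-12:
            extreme += 1
    return observed, extreme / total

# Hypothetical log2 levels for two genotypes (four animals each).
susceptible = [log2_relative_level(ct, 18.0) for ct in (24.0, 24.5, 23.5, 24.25)]
resistant = [log2_relative_level(ct, 18.0) for ct in (25.0, 25.5, 25.25, 26.0)]
u, p = mann_whitney_exact(susceptible, resistant)
print(f"U = {u}, two-sided exact P = {p:.4f}")
```

With, for example, four animals per group, the smallest attainable two-sided exact P value is 2/70 (about 0.029), which is worth keeping in mind when reading the P < 0.05 asterisks.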
Fig. 8. Human haplotype block map for the 1-Mb region homologous to rat Mcs5a. (A) Distances (in kilobases) between haplotype blocks are indicated by the vertical black lines. SNPs shown in light blue are located between the block structures. (B) Common haplotype block variants. Each SNP is represented in a column within the block. SNPs from left to right are as follows. Block 1: rs1999142, rs7034763, rs2279556, rs7039784, rs1492713, rs10973239, rs13290794, rs2029646, rs12000309, rs495304, rs2790063, rs308492, rs3780334, rs186299, rs3780335; block 2: rs2235096, rs309458, C__2013184, rs309444; block 3: rs1887455, rs3739576, rs2296775, rs10758435, rs17413120, rs7044153; block 4: rs4878697, rs13285217, rs10758440, rs999988, rs2182317 (SNP-3), rs2381718; block 5: rs1886909, rs1325916; block 6: rs13298495, rs3747539, rs7025444, rs10814604, rs2296553; block 7: rs1952125, rs10124071, rs2013458, rs3789019; block 8: rs2296552, rs12551499, rs7873508, rs1409145, rs1059059, rs7158, rs10973556, rs10511954, rs2005084, rs2025440, rs776018, rs1976936, rs12553058, rs1138374, rs1105773, rs943940, rs10973637, rs2183130, rs2890783, rs11790106, rs1928246, rs1928249; block 9: rs1928233, rs885431; block 10: rs731841, rs2585668; block 11: rs763936, rs1885491, rs645259; block 12: rs716933, rs1033790; block 13: rs2073478, rs3043; block 14: rs4878806, rs11361, rs2038589. Tag SNPs are in bold. (C) Size of haplotype blocks. (D) Percentage of haplotypes that are represented by common variants. (E) Corresponding percentages for the individual common haplotype variants shown in B.
The following SNP genotyping assays were available from Applied Biosystems: rs6476611, rs1999142, rs7034763, rs2279556, rs7039784, rs1492713, rs10973239, rs13290794, rs2029646, rs12000309, rs2790063, rs3780334, rs186299, rs2087358, rs11793053, rs2235096, C__2013184_10, rs309444, rs1571234, rs1887455, rs3739576, rs2296775, rs10758435, rs17413120, rs7044153, rs4878697, rs13285217, rs10758440, rs2381718, rs1886909, rs12554736, rs13298495, rs7025444, rs10814604, rs2296553, rs1952125, rs10124071, rs2296552, rs12551499, rs7873508, rs1409145, rs1059059, rs7158, rs10973556, rs2005084, rs2025440, rs776018, rs1976936, rs12553058, rs1105773, rs943940, rs10973637, rs2183130, rs2890783, rs11790106, rs1928246, rs1928249, rs731841, rs2585668, rs1867178, rs645259, rs716933, rs3043, rs4878806, rs11361, rs2038589, and rs3849928. Twenty-four additional SNPs were chosen from the National Center for Biotechnology Information SNP database to narrow the distance between the SNPs from ABI. Twenty-one of these SNPs had minor alleles represented in the Wisconsin population; allelic differentiation primer and TaqMan (Applied Biosystems) probe sequences are in SI Table 7.
Fig. 9. Human Haplotype Blocks 3-5. (A) Distances between original haplotype blocks in 1 Mb MCS5A region haplotype block map. (B) Bases in columns represent SNPs that were genotyped in the haplotype block map determination. SNPs from left to right are as follows. Block 3: rs1887455, rs3739576, rs2296775, rs10758435, rs17413120, rs7044153; block 4.1: rs4878697, rs13285217; block 4.2: rs10758440, rs999988, rs2182317 (SNP-3), rs2381718; block 5: rs1886909, rs1325916. The vertical line in block 4 represents the division between blocks 4.1 and 4.2. Block 4 was split to facilitate testing fewer haplotypes. SNP rs2182317 in the blue box associates with a reduced risk of breast cancer. Tag SNPs are in bold. (C) FBXO10 and FRMPD1 genes; vertical bars represent exons. (D) Distances represent block length. Percentages represent the proportion of variation represented by the haplotype alleles within a block.
Fig. 10. Haplotypes determined from additional polymorphisms genotyped in haplotype block 4. Polymorphisms that appear in these haplotypes are as follows from left to right: rs4878697, rs13285217, rs10117312, rs6476640, 24-131, rs10973418, 128-249, rs10758440, 138-9899 (four-base indel), rs6476643 (SNP-A1), rs7021977, rs999988, rs2182317 (SNP-3), and rs2381718. Haplotypes were determined by using PHASE. Haplotype frequencies are based on genotypes of a 1,135-member subset of the Wisconsin case-control population. These polymorphisms were genotyped to determine which haplotypes contained the minor alleles of MCS5A1 and MCS5A2.
Fig. 11. Haplotype blocks within MCS5A after resequencing. (A) Haplotypes based on correlation patterns. Blocks A, B, and C are made up of polymorphisms with no proxies in other blocks. SNPs are shown in columns and are as follows from left to right. Block A: rs4878697, rs13285217, rs10117312, rs6476640, 24-131, rs10973418, 128-249; block B: rs10758440, 138-9899 (four-base indel), rs6476643 (SNP-A1), rs7021977; block C: rs999988, rs2182317 (SNP-3), rs2381718. Tag SNPs are in bold. (B) Percentages represent the total frequencies of the common haplotypes represented (above) in a 1,135-member subset of our Wisconsin case-control population. (C) Corresponding percentages in the population for each individual haplotype shown in A. (D) The vertical line separates MCS5A1 and MCS5A2. Block B is present in MCS5A1 and MCS5A2. (E) Original block 4 and subparts, 4.1 and 4.2, tested in the initial analysis.
Table 2. Haplotype frequencies determined by COCAPHASE in the Wisconsin sample set after the first phase of genotyping
| Haplotype | Case no. (%) | Control no. (%) | Odds ratio | 95% CI |
|---|---|---|---|---|
| Haplotype block 4.1 haplotypes | | | | |
| TC | 1,669 (56) | 1,538 (55) | 1 | Ref. |
| CC | 858 (29) | 795 (28) | 0.99 | 0.88-1.12 |
| CT | 451 (15) | 454 (16) | 0.92 | 0.79-1.06 |
| TT | 22 (0.7) | 23 (0.8) | 0.88 | 0.49-1.58 |
| Homozygous individuals only | | | | |
| TC | 457 (30) | 425 (30) | 1 | Ref. |
| CC | 123 (8) | 116 (8) | 0.98 | 0.74-1.31 |
| CT | 35 (2) | 42 (3) | 0.77 | 0.48-1.24 |
| Haplotype block 4.2 haplotypes | | | | |
| ACGC* | 722 (24) | 691 (25) | 1 | Ref. |
| CCGC | 669 (22) | 665 (24) | 0.96 | 0.82-1.11 |
| ACGG† | 673 (23) | 578 (21) | 1.12 | 0.96-1.31 |
| ACTG‡ | 312 (10) | 329 (12) | 0.91 | 0.75-1.09 |
| ATGG | 502 (17) | 458 (16) | 1.02 | 0.89-1.25 |
| ATGC | 76 (3) | 52 (2) | 1.43 | 0.99-2.08 |
| CCGG | 32 (1) | 30 (1) | 1.05 | 0.63-1.74 |
| Homozygous individuals only | | | | |
| ACGC* | 77 (5) | 84 (6) | 0.97 | 0.66-1.42 |
| CCGC | 61 (4) | 84 (6) | 0.76 | 0.51-1.13 |
| ACGG† | 81 (5) | 61 (4) | 1.41 | 0.95-2.10 |
| ACTG‡ | 12 (0.8) | 26 (2) | 0.48 | 0.23-0.98 |
| ATGG | 37 (2) | 35 (2) | 1.12 | 0.67-1.86 |
SNPs in block 4.1 from left to right are rs4878697 and rs13285217. SNPs in block 4.2 from left to right are rs10758440, rs999988, rs2182317, and rs2381718.
*The minor allele of SNP-A1 (rs6476643) is found in 27% of women who carry this haplotype.
†The minor allele of the MCS5A1 breast cancer risk associated SNP-A1 (rs6476643) is found in 84% of women who carry this haplotype.
‡This haplotype is the only one that contains the minor allele of rs2182317 (third SNP listed in haplotype 4.2).
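As a rough cross-check of Table 2, an unadjusted odds ratio with a Woolf (log-based) 95% confidence interval can be computed directly from the haplotype counts. This is only a sketch: COCAPHASE estimates these quantities by likelihood methods, so agreement should be expected to be approximate, and the function name is illustrative rather than part of any package.

```python
import math

def odds_ratio_woolf(case, control, ref_case, ref_control, z=1.96):
    """Odds ratio of a haplotype versus the reference haplotype, with Woolf CI."""
    odds_ratio = (case * ref_control) / (control * ref_case)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / case + 1 / control + 1 / ref_case + 1 / ref_control)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Block 4.1 counts from Table 2: reference TC (1,669 cases, 1,538 controls),
# haplotype CT (451 cases, 454 controls).
or_ct, lo_ct, hi_ct = odds_ratio_woolf(451, 454, 1669, 1538)
print(f"CT vs TC: OR = {or_ct:.2f}, 95% CI {lo_ct:.2f}-{hi_ct:.2f}")
# prints "CT vs TC: OR = 0.92, 95% CI 0.79-1.06"
```

For this row the simple calculation reproduces the tabulated values; other rows may differ slightly in the last digit because the published estimates come from the likelihood-based program.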
Table 8. Rat primer and probe sequences used to determine transcript levels of Mcs5a region genes
| Gene | Forward Primer | TaqMan Probe | Reverse Primer |
|---|---|---|---|
| Zcchc7 | TCAAAAAACTGCCCCTTACCA | AAAAAGTCCGGCCCTGCTGTCTCTG | GCCATACTGGAGGTGTCCTCTT |
| Grhpr | CCCCTGCTGACCCTCAAGA | CGTGATCCTGCCCCACATCGG | TGGTGTTGCGAGTTTTGTAGGT |
| Zbtb5 | TAGTGGGAAACAGGCACTTCAA | CCGCTCTGTGCTGGCCGCAT | CAGGGCCCGGAAATGC |
| Polr1e | GGCGCCCTCAAATGCA | CACTCTTTGCAGACACTTTGTGGGCATTC | CCATTTGGCCAGAGGTCTTATT |
| Fbxo10 p1 | GCTGGCATAGCAGTGAACGA | CTCATCACAGAAAACGT | CTACACCTCCCCACTGGTTCTC |
| Fbxo10 p2 | CGTGTGTGCAATCTTGTCTTCAT | CATGTATAAGACCACATCAGG | TCCATTCTCAAAGTTGCAGTTGTC |
| Fbxo10 p3 | CCAGCCTCTATGATCGAATCG | CTTTTCCCAGACCACATC | CATTCTCAAAGTTGCAGTTGTCAA |
| C9orf105 | AAGTGCAAATATCCACCTCTGTGA | TGTTGCCGGAATTG | CCCCTCAGAGTCTGGCATTATC |
| DQ901406 | CTGGCCGGGCATGCTA | CACACGCCTTTAATC | CCGCCTGTCTCTGCCTTCT |
| Frmpd1 p1 | CCCTGTGGCCTTTGAGTATCTC | AGAGTTGCAGTGATGTC | CCGGAGGGCAGAATTGC |
| Frmpd1 p2 | CGCCTACACCTTTTGCATGAG | CCTTTCCACCACCTGCTG | GCGGGAGTCCTGTGATTCTTC |
| Rg9mtd3 | GGAAAATCAAGCGTGCAGAAGAC | CAGGTGCCGTTACCC | AAACGTTTGCTGTGCTGTGG |
| Exosc2 | GATGGGTGTGATTGGACAGGAT | CTTAATTAGAAAGCTATTGGCTC | ATAGAGTTTTCCCAGCTCTTGCA |
| Wdr32 | CATCAGGATTTGATGGAAATGTCA | CTAACAGGTGCACAGAAG | TCTGGTGTTAATCTCATTCGCATAAG |
| Mcart1 | GCACTTATCAAAATTGACTTCAGAAGA | ATGTCACAGGCTAAGTC | TGAGCTTCTGAGCCCATCATG |
| Shb | CTGCGGAGGCAAAAAATTG | AAGACAAGGTGACCATAGC | CCTTTCCTGCTTTGCTCTTGA |
SI Text
Notes. SNP-3 (rs2182317) and other polymorphisms listed in the bin are candidates for the causative SNP(s) in MCS5A2. These candidates include many highly correlated SNPs that are heterozygous in all individuals carrying, exclusively, allele 4 of haplotype block 4.2 (SI Figs. 9 and 10). Any SNP-3 correlated SNP, or SNP combination, found in individuals carrying haplotype block 4.2 allele 4 (SI Fig. 9) could be the causative SNP. We hypothesize that the causative SNP is in or near a conserved region; thus, the genetic variation in SI Fig. 10 may effectively narrow the region containing the causative SNP to chr9:37,610,247-37,655,573. Considering all SNPs that are rarer or more common than SNP-3 and found in carriers of haplotype block 4.2 allele 4, the possible candidate SNPs include: 114-117 (MAF ~0.02-0.04), d3-169, rs12378421, r3-116, rs17505776, rs4878708, rs4878709, rs4878710, rs10973450, l4-70, m4-218, rs4490927, x4-77, z4-66, f5-152, rs4878713, y5-43, i6-31, and i6-103 (SI Fig. 10). Testing these SNP-3 correlated SNPs individually will not, by itself, distinguish one of these SNPs as the causative SNP in MCS5A2. This is because all of these SNPs except 114-117 have a minor allele frequency and distribution identical or very similar to those of rs2182317. There were only two SNPs, 128-249 and a correlated SNP 105-3 (SI Fig. 10), in MCS5A1 in which the minor alleles were found only in individuals with haplotype block 4.2 allele 4. The SNP 128-249 (MAF 0.02) was not associated with a reduction of breast cancer risk (SI Table 4); thus, the only causative SNP-3 correlated SNP candidates reside in MCS5A2.
The SNP bin that includes rs6476643 (SNP-A1) contains four polymorphisms with similar minor allele distributions but slightly different minor allele frequencies. The minor allele of SNP-A1 rs6476643 was found in 84% of women with haplotype block 4.2 allele 3, and in 27% of women with haplotype block 4.2 allele 1 (the haplotypes including the SNPs used to define the original block 4 haplotypes and those tested subsequently are shown in SI Figs. 8 and 10). The SNP bin including SNP-A1 rs6476643 is close to an area of recombination and spans ~6 Kb (Fig. 4 and SI Fig. 10). No SNPs correlated to this bin were found outside of this 6 Kb MCS5A1 region. This result is based on the distribution of the minor alleles found in the resequencing effort. Only one other SNP bin had minor alleles found mostly in haplotype block 4.2, allele 3 (SI Table 2). This SNP bin, marked by SNP 24-131, has a minor allele frequency of 0.03 and was found in 16% of women who have haplotype block 4.2, allele 3. However, this SNP did not show an association with breast cancer risk (SI Table 4). The indel 138-9899 is correlated to SNP-A1 rs6476643 (r² = 0.9) (SI Fig. 10). This correlation was not observed in the resequencing data because heterozygous and homozygous individuals for the deletion could not be distinguished (i.e., both appear as homozygous for the deletion). When tested using allelic discrimination, this polymorphism showed a trend toward resistance in the Wisconsin population with P = 0.22.
SI Methods. Primer information of the genetic markers and the SNP bp (rat genome June 2003 assembly available at the UCSC Genome Browser, www.genome.ucsc.edu) defining the ends of the WKy allele carried by each congenic line are: lines O, XX, and B3 proximal microsatellite marker gUwm40-18 FWP: 5'-GACTTAATGTGGGGAGTGAA, RVP: 5'-AGCACATATGGAGGTTTGAC; lines O, LL, and WW distal microsatellite marker gUwm45-5 FWP: 5'-CTAGAAAGGTGCTTTGGTTG, RVP: 5'-TCAGCTTCTCCTCCTTCC; line WW proximal SNP at chr5:61634906C>T; line LL proximal SNP at chr5:61667232A>G; line XX distal microsatellite marker gUwm23-29 FWP: 5'-CCAGTCTGATGACCTGAGTT, RVP: 5'-CTTGCATGTGTGTAAGTGCT; and line B3 distal SNP at chr5:6166666918A>G.
Transcribed elements were verified by using RT-PCR of total RNA from WF and WKy mammary gland and brain tissues. Complementary DNA was synthesized from 2 μg of total RNA for 2 h at 42°C in a 20-μl final reaction volume consisting of 0.5× RNA Secure (Ambion, Austin, TX), 0.05 μg/μl oligo(dT)18, 125 μM dNTP mix, 1× first strand buffer (Invitrogen, Carlsbad, CA), 10 mM DTT, and 200 units of Superscript II reverse transcriptase (RT) (Invitrogen). For PCRs, RT reactions were diluted 1:2 or 1:4 and 1 μl of this dilution (~25-50 ng RNA equivalent cDNA) was used in a 20-μl PCR. The reaction components were 1× Herculase Buffer (Stratagene, La Jolla, CA), 200 μM each dNTP, 500 nM each primer, and 1 unit of Herculase DNA Pol (Stratagene). PCR cycling conditions in a Biometra T3 thermocycler were 95°C for 2 min, followed by 35 cycles of 92°C for 1 min, 59°C for 45 sec, and 72°C for 2 min. A 5-min extension at 72°C was added at the end.
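The "~25-50 ng RNA equivalent" figure follows from simple dilution arithmetic. The sketch below assumes that the reverse-transcription reaction contained 2 μg of RNA in 20 μl, that 1 μl of the diluted reaction went into each PCR, and that "1:2" and "1:4" denote twofold and fourfold dilutions; the function name is illustrative.

```python
def rna_equivalent_ng(rna_ug, rt_volume_ul, fold_dilution, volume_used_ul):
    """ng of input RNA represented by the cDNA added to one PCR."""
    ng_per_ul = rna_ug * 1000.0 / rt_volume_ul  # undiluted RT reaction
    return ng_per_ul / fold_dilution * volume_used_ul

for fold in (2, 4):
    amount = rna_equivalent_ng(2, 20, fold, 1)
    print(f"1:{fold} dilution -> {amount:.0f} ng RNA equivalent")
```

Under these assumptions the two dilutions give 50 ng and 25 ng, matching the range quoted in the text.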
To resequence rat gDNA regions, exons, 5'-UTR, and ORFs, primers spanning these elements were designed by using Primer 3 (1). Spleen gDNA (~50 ng) or mammary gland cDNA (~25-50 ng) from WF and WKy rats was used as PCR templates. Spleen gDNA was extracted by using previously published conditions (2). Complementary DNA was synthesized as described above. The target gDNA or cDNA was amplified by the PCR conditions used for the verification of transcripts described above and gel-purified by using Qiagen's (Valencia, CA) gel purification kit according to the manufacturer's protocol. One microliter of the gel-purified suspension was used in each forward and reverse sequencing reaction (15 μl) that included 0.5 μl of BigDye v3.1 (Applied Biosystems), 1× reaction buffer (Applied Biosystems), and 0.67 μM primer. Conditions of the sequencing reaction were 95°C for 3 min followed by 32 cycles of 95°C for 30 sec and 58°C for 2.5 min. A 7-min extension at 72°C was added at the end. The sequence reactions were processed by using CleanSEQ magnetic beads (Agencourt, Beverly, MA) according to the manufacturer's directions. The final volume after cleaning was 50 μl. The sequence was determined by the University of Wisconsin Biotechnology Center (Madison, WI) sequencing facility from a 1:1 dilution of the sequencing reaction. Sequence data were analyzed by using Sequencher v 4.2 software (Gene Codes, Ann Arbor, MI).
The human gDNA sequence spanning MCS5A1 (chr9: 37544050-37582460) and the human/rat CNS in MCS5A2 (chr9: 37586100-37658620) were resequenced in 24 women representative of the case-control population frequency of block 4.2 haplotype variants. DNA was submitted to Polymorphic DNA Technologies (Alameda, CA) for resequencing.
Human DNA extraction and genotyping. Only willing participants were asked to provide a mouthwash rinse in 30 ml Scope (Procter and Gamble, Cincinnati, OH). For the Wisconsin sample population we collected 1,737 case and 1,790 control samples. Human mouthwash rinse samples were centrifuged in a Beckman JS-4.2 rotor at 3,000 rpm (2,000 × g) at room temperature for 15 min. DNA was extracted from the pellet by using the PUREGENE cell and tissue kit from Gentra Systems (Minneapolis, MN) according to the manufacturer's protocol. DNA was resuspended in double-distilled H2O and DNA concentrations were determined by using the PicoGreen dsDNA Assay (Invitrogen). Samples contained an average yield of 29.3 μg of DNA. Case and control DNA samples were genotyped by using either 5 ng of genomic DNA from Wisconsin samples or 10 ng of primer extension preamplification (PEP) DNA from U.K. samples in 5-μl SNP allelic discrimination assays according to the manufacturer's protocol (Applied Biosystems). Samples were amplified on MJ Research/Bio-Rad Laboratories (Hercules, CA) thermocyclers with conditions recommended by Applied Biosystems. Fluorescence levels were determined by using an Applied Biosystems 7900 instrument.
Construction of haplotype block map of MCS5A. A 1-Mb haplotype block map of the human genomic region containing MCS5A was constructed. Ninety-one SNPs that spanned a 1-Mb region on chromosome 9 orthologous to the rat Mcs5a region were used to make a haplotype block map (SI Fig. 8 and SI Table 9).
To determine the linkage in the region, 39 CEPH (Centre d'Étude du Polymorphisme Humain) family grandparents (unrelated individuals with DNA available from the Coriell Institute, http://ccr.coriell.org/nigms) and a selection of 100 case-control high-DNA-yield samples from the Wisconsin population were used to estimate the haplotype phase of 88 SNPs. Ten CEPH families (each including four grandparents along with two parents and two children) were genotyped and then haplotyped by inference from pedigree information. CEPH grandparents and 100 case-control samples were assigned haplotypes by using the PHASE software package (3). The PHASE results were compared with the haplotypes inferred from pedigree information for all of the CEPH family members to determine the accuracy of the haplotype estimates. Of the 79 CEPH family members, only six had haplotypes with discrepancies between haplotypes inferred from pedigrees and the PHASE output with the highest posterior probability. Of 11 total discrepancies, 10 were "switch errors" in areas of low linkage disequilibrium and one "switch error" occurred in an area of high linkage disequilibrium.
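A "switch error" is a point at which an inferred haplotype changes from matching one true chromosome to matching the other. The sketch below counts such points for a single individual. It is an illustration only: the haplotype strings are hypothetical, the function name is not from PHASE, and the code assumes that the pedigree-derived and inferred phasings carry the same genotypes.

```python
def count_switch_errors(true_pair, inferred_pair):
    """Count phase switches between a pedigree-derived and an inferred haplotype pair.

    Each pair holds two equal-length allele strings. Only heterozygous
    sites carry phase information, so homozygous sites are skipped.
    """
    true_1, true_2 = true_pair
    inferred_1, _ = inferred_pair
    switches = 0
    previous = None  # which true haplotype the inferred one currently follows
    for a, b, c in zip(true_1, true_2, inferred_1):
        if a == b:
            continue  # homozygous site: uninformative for phase
        current = 1 if c == a else 2
        if previous is not None and current != previous:
            switches += 1
        previous = current
    return switches

# Hypothetical six-SNP example: the inferred phase flips once after the third SNP.
truth = ("ACGTAC", "GTGCAT")
inferred = ("ACGCAT", "GTGTAC")
print(count_switch_errors(truth, inferred))  # prints 1
```

Counting switches rather than mismatched sites keeps a single recombination-like flip from being scored as many separate errors, which is why discrepancies concentrated in regions of low linkage disequilibrium are the expected pattern.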
Block boundaries were defined by areas of low linkage between adjacent SNPs (SI Fig. 8). Linkage disequilibrium between each SNP pair was determined by calculating the Lewontin D′ statistic in the CEPH grandparents and Wisconsin case-control samples. Adjacent pairs of SNPs with a D′ ≥ 0.89 were considered to be in the same block, whereas those with a D′ < 0.89 were not included in the same haplotype block.
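The block rule above can be sketched in code. This minimal example computes Lewontin's D′ for two biallelic SNPs from phased haplotypes and then chains adjacent SNPs whose D′ is at least 0.89 into blocks. The haplotype strings and function names are hypothetical, and the sketch is not the software used in the study.

```python
def d_prime(haps, i, j):
    """Lewontin's |D'| between biallelic SNPs i and j from phased haplotype strings."""
    n = len(haps)
    a, b = haps[0][i], haps[0][j]  # arbitrary reference alleles
    p_a = sum(h[i] == a for h in haps) / n
    p_b = sum(h[j] == b for h in haps) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haps) / n
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # A monomorphic SNP gives d_max == 0; report no disequilibrium in that case.
    return abs(d) / d_max if d_max > 0 else 0.0

def blocks(haps, threshold=0.89):
    """Group adjacent SNP indices into blocks while adjacent-pair D' >= threshold."""
    result, current = [], [0]
    for k in range(1, len(haps[0])):
        if d_prime(haps, k - 1, k) >= threshold:
            current.append(k)
        else:
            result.append(current)
            current = [k]
    result.append(current)
    return result

# Hypothetical phased haplotypes over four SNPs.
toy_haps = ["AACC", "AAGG", "GGCC", "GGGG"]
print(blocks(toy_haps))
```

In the toy data, SNPs 0 and 1 and SNPs 2 and 3 are in complete disequilibrium while SNPs 1 and 2 are independent, so the rule yields two blocks, `[[0, 1], [2, 3]]`.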
Analysis of polymorphisms in haplotype block 4. The initial 1-Mb interval containing MCS5A was subsequently narrowed, using additional phenotyping data from congenic rats, to an orthologous human region of ~94 Kb that was completely contained in block 4 (SI Fig. 8). Tag SNPs listed below in haplotype block 4 were genotyped in our Wisconsin case-control population (~1,500 cases and ~1,405 controls). Twelve common haplotype alleles were observed in block 4. To reduce the number of haplotypes for testing, block 4 was divided into part 4.1, which had 4 major variants, and part 4.2, which had 7 major variants (SI Fig. 9). The division between the second and third SNPs of block 4 resulted in the fewest haplotype variants.
All common haplotypes in block 4 (four in 4.1, seven in 4.2) were screened by using COCAPHASE v 2.403 (4), which calculates odds ratios and an overall P value based on a likelihood ratio test. The haplotype block labeled 4.2 merited further investigation as a likely candidate, based on the odds ratio and 95% confidence interval (SI Table 2). It is important to note that the minor allele of rs2182317 (risk reduction-associated allele SNP-3 in MCS5A2) was found only in allele 4 in block 4.2 (SI Fig. 9).
Polymorphisms observed in resequencing data are listed in SI Data Set 1. Polymorphisms in SI Tables 5 (MCS5A1) and 6 (MCS5A2) had minor alleles that were observed in more than one individual, and were not listed on NCBI.
Quality control for human sample genotyping. Quality control of the Wisconsin samples was conducted with DNA from 85 subjects who had submitted two independent samples. These duplicate samples were genotyped for all polymorphisms tested in the entire Wisconsin population; 95% (1,648 of 1,729) of the genotypes were identical for the two samples, and 5% (81 of 1,729) had a call for one sample but a no call for the other. There were no sample sets that resulted in mismatched calls. Wisconsin samples that were genotyped twice (initially when the original haplotype block map was made and again when a subset of these SNPs was genotyped in the entire population) yielded the same genotype 98.4% (1,794 of 1,823) of the time. Only 1.5% (27 of 1,823) had a no call in one of the replicates. Mismatches occurred in 0.1% (2 of 1,823) of the replicates. For the U.K. samples, each 384-well plate had 12 samples that were duplicated on a separate plate. All genotyping calls for the first four SNPs tested in the U.K. population were in agreement for each duplicate, except for one genotype of an individual with no call. The call rate for SNP rs6476643 was 0.981 in the U.K. study set.
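The concordance percentages can be re-derived from the stated counts. The short check below uses only numbers quoted in the paragraph above; the dictionary labels are illustrative.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole

# Genotype comparisons for the 85 subjects with two independent samples.
duplicates = {"identical": 1648, "one no call": 81}
# Comparisons for samples genotyped in both rounds of the study.
replicates = {"identical": 1794, "one no call": 27, "mismatch": 2}

for label, counts in (("duplicate samples", duplicates),
                      ("repeat genotyping", replicates)):
    total = sum(counts.values())
    for outcome, n in counts.items():
        print(f"{label}: {outcome} {n}/{total} = {percent(n, total):.1f}%")
```

The category counts sum to the stated totals (1,729 and 1,823), and the computed values round to the percentages reported in the text.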
1. Rozen S, Skaletsky H (2000) Methods Mol Biol 132:365-386.
2. Samuelson DJ, Haag JD, Lan H, Monson DM, Shultz MA, Kolman BD, Gould MN (2003) Carcinogenesis 24:1455-1460.
3. Stephens M, Smith NJ, Donnelly P (2001) Am J Hum Genet 68:978-989.
4. Dudbridge F (2003) Genet Epidemiol 25:115-121.