Ehrich et al. 10.1073/pnas.0712251105.

Fig. 5. Scatterplot showing the relationship between replicated measurements of 96 samples over four different genes. Both replicates yielded very similar results and show a good correlation.

Fig. 6. Box plot showing the number of sequence motifs known to be enriched in unmethylated CGIs, per amplicon, for the groups of amplicons with low average methylation (<20%) and high average methylation (>80%) in normal samples.

Fig. 7. Shown is an analysis of clinical colon cancer tissue samples. (a) Dendrogram of a hierarchical cluster analysis of 48 colon cancer tissue samples. Their clinical phenotypes are annotated as color-coded bars. Each column annotates a separate feature; from left to right: gender (white, male; black, female); age at diagnosis (a continuous color scale from white, 46 years, to blue, 83 years); tumor stage (white, Stage I; light gray, Stage II; dark gray, Stage III; black, Stage IV); local or distant recurrence (white, no recurrence; black, recurrent disease); location of the primary tumor (annotated from white to black according to anatomical location from caecum to rectum). (b) Projection of the methylation values into three dimensions by classical multidimensional scaling. The results for tissue and cell-line samples are shown within the three axes. Sample origins are annotated by different colors (dark blue, colon cancer cell line; dark orange, normal control DNA; light blue, colon cancer tissue; light orange, normal colon tissue). The differences between normal and colon cancer tissue are smaller than those between normal DNA samples and colon cancer cell lines.

Fig. 8. Shown is the relationship between methylation levels of normal samples and the methylation difference to tumor samples. For each amplicon, we calculated the mean methylation value across all six normal DNA samples (plotted on the x axis). We also calculated the mean methylation values for each gene across all cancer cell-line samples. We then used these values to calculate the gene-specific methylation difference between normal and cancer cell-line samples (plotted on the y axis). For better visualization, we plotted horizontal and vertical lines at a 20% methylation cutoff. Genes that are between 20% and 80% methylated in normal samples are annotated in black. Genes that are outside of this range and show a methylation difference <20% are shown in red. Genes that are methylated <20% or >80% and have a methylation difference >20% in cancer cell lines are shown in blue. We calculated the fraction of genes that were hypermethylated (methylation difference above 20%) in the group of genes with low methylation values and the fraction of genes that were hypomethylated (methylation difference below -20%) in the group of genes that are highly methylated in normal samples. We used these values to perform a Fisher's exact test to evaluate whether the two groups differ. The results are given in the lower left corner. Melanoma and CNS neoplasms are the most likely to show a significant difference in the two fractions.

Fig. 9. The group of genes that are significantly differentially methylated in cancer is divided into subsets according to their number of PRC2 marks. A box-whisker plot reveals that genes with a higher number of PRC2 marks tend to be altered in more tumor types, whereas genes with no PRC2 marks are mostly connected to only one or two tumor types.

Fig. 10. Shown is the methylation-based cluster dendrogram for all samples. NCI-60 cell-line names are annotated along the final branches of the dendrogram.
SI Text
Hypo- Versus Hypermethylation. When analyzing the distribution of methylation values, we find that the majority of promoter regions are not methylated. We also find that the most frequent methylation change in the cancer cell lines was hypermethylation. Hence, we asked whether hypermethylation is the most frequent event only because methylation levels for most genes are low in normal samples, so that any change is possible only in the hypermethylated direction. We investigated whether it is more likely for a nonmethylated gene to become hypermethylated or for a methylated gene to become hypomethylated in cancer. When analyzing the methylation differences for the group of amplicons that are methylated at low (<20%) or high (>80%) levels in normal samples, we find that it is as likely to observe hypermethylation of low-methylated amplicons as it is to find hypomethylation of highly methylated amplicons (P = 1, Fisher's exact test). Given that amplicon-specific methylation changes might differ significantly between tumor types, we also analyzed each type of tumor cell line individually. The results differ slightly for each tumor type, but, in general, the analysis confirms the findings above for individual tumor types (SI Fig. 8). These findings should be interpreted with caution and might not be applicable to genome-wide studies. This study takes a candidate-gene approach covering only 430 genes. A large subset of these genes was chosen from the literature because they were known to be imprinted. We find that the genes that show hypomethylation in these cancer cell lines all come from this pool of imprinted genes. Consequently, this comparison of methylation changes might be biased by the relative overrepresentation of imprinted genes.
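The comparison described above can be illustrated with a minimal sketch. It assumes per-amplicon mean methylation in normal samples and a cancer-minus-normal difference, both in percent. The function names (`count_changes`, `fisher_exact_two_sided`) and the two-sided definition used here (summing all tables no more likely than the observed one) are illustrative assumptions, not the authors' implementation.

```python
from math import comb


def count_changes(normal, delta, cutoff=20.0):
    """Build a 2x2 table from per-amplicon values given in percent.

    normal: mean methylation in normal samples
    delta:  cancer minus normal methylation difference
    Returns (low_changed, low_unchanged, high_changed, high_unchanged).
    """
    low_changed = low_same = high_changed = high_same = 0
    for m, d in zip(normal, delta):
        if m < cutoff:  # low methylation in normal samples
            if d > cutoff:  # hypermethylated in cancer
                low_changed += 1
            else:
                low_same += 1
        elif m > 100.0 - cutoff:  # high methylation in normal samples
            if d < -cutoff:  # hypomethylated in cancer
                high_changed += 1
            else:
                high_same += 1
        # amplicons between the two cutoffs are not part of the comparison
    return low_changed, low_same, high_changed, high_same


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum every table that is no more likely than the observed one.
    p = sum(q for q in probs if q <= p_obs * (1 + 1e-7))
    return min(1.0, p)
```

Passing the four counts returned by `count_changes` to `fisher_exact_two_sided` gives the P value for the hypothesis that the fraction of changed amplicons is the same in the low and high groups.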
Sequence-Motif Confirmation and Motif Detection. A set of CG-rich sequence motifs has been reported to be enriched in nonmethylated CGIs (1). We divided our amplicons into groups with low (<20%, n = 300) and high (>80%, n = 16) average methylation. The nonmethylated amplicons contained significantly more of these sequence motifs (P < 0.001, Wilcoxon test; mean in the low-methylation set = 22, mean in the high-methylation set = 8) (SI Fig. 6).
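A rank-based comparison of motif counts between the two groups might look like the sketch below. It implements the Wilcoxon rank-sum (Mann-Whitney U) test with a normal approximation and tie correction. The text does not state which variant of the test was used, so this choice, the function name `rank_sum_test`, and the omission of a continuity correction are assumptions.

```python
from math import erfc, sqrt


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using a normal approximation.

    Returns (U statistic for x, p-value). Tied values receive average ranks
    and the variance is tie-corrected; no continuity correction is applied.
    """
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        in_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += avg_rank * in_x
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / sqrt(var_u)
    return u, erfc(abs(z) / sqrt(2.0))
```

Here `x` and `y` would hold the per-amplicon motif counts of the low- and high-methylation groups. With only 16 amplicons in the high-methylation group, an exact test could give a somewhat different P value than this approximation.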
Next, we split the amplicons into two groups based on the observed differences in DNA methylation (DM) between normal and cancer cell-line samples. The group with high average methylation differences (DM >20%; P < 0.001, two-sided t test) comprised 89 amplicon sequences, and the group with low differences (DM <5%; P >0.05, two-sided t test) comprised 121 amplicon sequences. During amplicon design, target sequences were preferentially selected in CG-rich regions. Hence, we cannot expect an even distribution of sequence motifs, so a permutation method was used to compare the distribution of 6-mer oligonucleotides and identify sequence motifs that are overrepresented in one of the two sets. Five sequence motifs were overrepresented in the set of nondifferentially methylated sequences (ATACCG, ATACTA, ATAGAT, TATACT, TCATGG). A smaller set of four motifs was enriched in the differentially methylated genes (GACCTG, GCCAAG, GTCCCA, TTGAAG). Finally, we used a public database to annotate transcription factor binding sites in each of the two sequence-motif sets (www.cbil.upenn.edu/cgi-bin/tess/tess). Binding sites for Sp1, a zinc finger transcription factor, were enriched in the set of nondifferentially methylated sequences (P = 0.001, Fisher's exact test).
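The text does not specify the permutation scheme, so the following is only one plausible sketch of such a comparison. It assumes the statistic is the difference in the fraction of sequences containing a given 6-mer and that the null distribution is built by shuffling group labels. The function names, the presence/absence scoring, and the add-one correction are all assumptions made for illustration.

```python
import random


def kmer_presence(seq, k=6):
    """Return the set of distinct k-mers occurring in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def permutation_enrichment(set_a, set_b, k=6, n_perm=1000, seed=0):
    """Permutation p-values for k-mers overrepresented in set_a versus set_b.

    Statistic: fraction of set_a sequences containing the k-mer minus the
    same fraction in set_b. Group labels are shuffled n_perm times.
    """
    rng = random.Random(seed)
    profiles = [kmer_presence(s, k) for s in list(set_a) + list(set_b)]
    n_a, n_b = len(set_a), len(set_b)
    kmers = sorted(set().union(*profiles))

    def stat(order):
        counts_a = dict.fromkeys(kmers, 0)
        counts_b = dict.fromkeys(kmers, 0)
        for pos, idx in enumerate(order):
            target = counts_a if pos < n_a else counts_b
            for km in profiles[idx]:
                target[km] += 1
        return {km: counts_a[km] / n_a - counts_b[km] / n_b for km in kmers}

    order = list(range(n_a + n_b))
    observed = stat(order)
    exceed = dict.fromkeys(kmers, 0)
    for _ in range(n_perm):
        rng.shuffle(order)
        permuted = stat(order)
        for km in kmers:
            if permuted[km] >= observed[km]:
                exceed[km] += 1
    # The add-one correction keeps every p-value strictly positive.
    return {km: (exceed[km] + 1) / (n_perm + 1) for km in kmers}
```

Because labels are shuffled across the actual amplicon sequences, the CG-rich composition introduced by amplicon design is present in both the observed and the permuted comparisons. Swapping the two arguments tests for overrepresentation in the other set. Since many 6-mers are tested at once, some correction for multiple testing would normally be needed when interpreting the resulting P values.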
Methods
The PCRs were carried out in a 5-µl format with ≈10 ng/µl bisulfite-treated DNA, 0.2 units of HotStart TaqDNA polymerase (Qiagen), 1x supplied HotStart buffer, and 200 mM PCR primers. Amplification for the PCR was as follows: preactivation at 95°C for 15 min; 45 cycles of 95°C denaturation for 20 s, 56°C annealing for 30 s, and 72°C extension for 30 s; and a final 72°C incubation for 4 min. Dephosphorylation of unincorporated dNTPs was performed by adding 1.7 µl of H2O and 0.3 units of shrimp alkaline phosphatase (Sequenom), incubating at 37°C for 20 min, and then heating at 85°C for 10 min to inactivate the enzyme. Next, in vitro transcription and RNA cleavage were achieved by adding 2 µl of PCR product to 5 µl of transcription/cleavage reaction and incubating at 37°C for 3 h. The transcription/cleavage reaction contained 27 units of T7 R&DNA polymerase (EpiCentre), 0.64x T7 R&DNA polymerase buffer, 0.22 µl of T Cleavage Mix (Sequenom), 3.14 mM DTT, 3.21 µl of H2O, and 0.09 mg/ml RNaseA (Sequenom). The reactions were additionally diluted with 27 µl of H2O and conditioned with 6 mg of CLEAN Resin (Sequenom) for optimal mass-spectra analysis. Samples were then dispensed with the MassARRAY nanodispenser (Samsung) on a 384-well SpectroChip (Sequenom).
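As a quick check on the cycling program, the hold times stated above can be written down as data and summed. This is a bookkeeping sketch only: the variable and function names are illustrative, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
# (label, temperature in degrees C, hold time in seconds), as given in the text
PREACTIVATION = ("preactivation", 95, 15 * 60)
CYCLE = [
    ("denaturation", 95, 20),
    ("annealing", 56, 30),
    ("extension", 72, 30),
]
N_CYCLES = 45
FINAL_INCUBATION = ("final incubation", 72, 4 * 60)


def total_hold_seconds():
    """Sum of all programmed hold times, excluding temperature ramps."""
    per_cycle = sum(duration for _, _, duration in CYCLE)
    return PREACTIVATION[2] + N_CYCLES * per_cycle + FINAL_INCUBATION[2]


if __name__ == "__main__":
    print(f"Programmed hold time: {total_hold_seconds() / 60:.0f} min")
```

The programmed holds add up to 4,740 s, or 79 min, so the actual run time is somewhat longer once ramping is included.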
1. Das R, et al. (2006) Computational prediction of methylation status in human genomic sequences. Proc Natl Acad Sci USA 103:10713-10716.