Significance analysis of microarrays applied to the ionizing radiation response

Virginia Goss Tusher, Robert Tibshirani, and Gilbert Chu
PNAS April 24, 2001 98 (9) 5116-5121; https://doi.org/10.1073/pnas.091062498
Communicated by Bradley Efron, Stanford University, Stanford, CA (received for review December 1, 2000)


Abstract

Microarrays can measure the expression of thousands of genes to identify changes in expression between different biological states. Methods are needed to determine the significance of these changes while accounting for the enormous number of genes. We describe a method, Significance Analysis of Microarrays (SAM), that assigns a score to each gene on the basis of change in gene expression relative to the standard deviation of repeated measurements. For genes with scores greater than an adjustable threshold, SAM uses permutations of the repeated measurements to estimate the percentage of genes identified by chance, the false discovery rate (FDR). When the transcriptional response of human cells to ionizing radiation was measured by microarrays, SAM identified 34 genes that changed at least 1.5-fold with an estimated FDR of 12%, compared with FDRs of 60 and 84% by using conventional methods of analysis. Of the 34 genes, 19 were involved in cell cycle regulation and 3 in apoptosis. Surprisingly, four nucleotide excision repair genes were induced, suggesting that this repair pathway for UV-damaged DNA might play a previously unrecognized role in repairing DNA damaged by ionizing radiation.

DNA microarrays contain oligonucleotide or cDNA probes for measuring the expression of thousands of genes in a single hybridization experiment. Although massive amounts of data are generated, methods are needed to determine whether changes in gene expression are experimentally significant. Cluster analysis of microarray data can find coherent patterns of gene expression (1) but provides little information about statistical significance. Methods based on conventional t tests provide the probability (P) that a difference in gene expression occurred by chance (2, 3). Although P = 0.01 is significant in the context of experiments designed to evaluate small numbers of genes, a microarray experiment for 10,000 genes would identify 100 genes by chance. This problem led us to develop a statistical method adapted specifically for microarrays, Significance Analysis of Microarrays (SAM).

SAM identifies genes with statistically significant changes in expression by assimilating a set of gene-specific t tests. Each gene is assigned a score on the basis of its change in gene expression relative to the standard deviation of repeated measurements for that gene. Genes with scores greater than a threshold are deemed potentially significant. The percentage of such genes identified by chance is the false discovery rate (FDR). To estimate the FDR, nonsense genes are identified by analyzing permutations of the measurements. The threshold can be adjusted to identify smaller or larger sets of genes, and FDRs are calculated for each set. To demonstrate its utility, SAM was used to analyze a biologically important problem: the transcriptional response of lymphoblastoid cells to ionizing radiation (IR).

Materials and Methods

Preparation of RNA.

Human lymphoblastoid cell lines GM14660 and GM08925 (Coriell Cell Repositories, Camden, NJ) were seeded at 2.5 × 10^5 cells/ml and exposed to IR 24 h later. RNA was isolated, labeled, and hybridized to the HuGeneFL GeneChip microarray according to manufacturer's protocols (Affymetrix, Santa Clara, CA).

Microarray Hybridization.

Each gene in the microarray was represented by 20 oligonucleotide pairs, each pair consisting of an oligonucleotide perfectly matched to the cDNA sequence, and a second oligonucleotide containing a single base mismatch. Because gene expression was computed from differences in hybridization to the matched and mismatched probes, expression levels were sometimes reported by the GeneChip analysis suite software as negative numbers.

Northern Blot Hybridization.

Total RNA (15 μg) was resolved by agarose gel electrophoresis, transferred to a nylon membrane, and hybridized to specific radiolabeled DNA probes, which were prepared by PCR amplification.

Results

RNA was harvested from wild-type human lymphoblastoid cell lines, designated 1 and 2, growing in an unirradiated state (U) or in an irradiated state (I) 4 h after exposure to a modest dose of 5 Gy of IR. RNA samples were labeled and divided into two identical aliquots for independent hybridizations, A and B. Thus, data for 6,800 genes on the microarray were generated from eight hybridizations (U1A, U1B, U2A, U2B, I1A, I1B, I2A, and I2B).

We scaled the data from different hybridizations as follows. A reference data set was generated by averaging the expression of each gene over all eight hybridizations. The data for each hybridization were compared with the reference data set in a cube root scatter plot. We chose the cube root scatter plot because it resolved the vast majority of genes that are expressed at low levels and permitted the inclusion of negative levels of expression that are sometimes generated by the GeneChip software. A linear least-squares fit to the cube root scatter plot was then used to calibrate each hybridization.
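The scaling step can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the text does not say which variable was regressed on which or how the fit was applied, so the sketch assumes that the cube root of the reference is fitted as a linear function of the cube root of each hybridization and that the fitted line is cubed to return to the original scale. The function name `scale_hybridizations` is hypothetical.

```python
import numpy as np

def scale_hybridizations(expr):
    """Calibrate each hybridization against the all-hybridization average.

    expr: genes x hybridizations array of raw expression (negative values allowed).
    Assumption: fit cbrt(reference) ~ slope * cbrt(x) + intercept per hybridization,
    then cube the fitted values to return to the original scale.
    """
    expr = np.asarray(expr, dtype=float)
    reference = expr.mean(axis=1)      # reference data set: average over hybridizations
    ref_cbrt = np.cbrt(reference)      # np.cbrt is defined for negative inputs
    scaled = np.empty_like(expr)
    for j in range(expr.shape[1]):
        x_cbrt = np.cbrt(expr[:, j])
        slope, intercept = np.polyfit(x_cbrt, ref_cbrt, 1)   # linear least-squares fit
        scaled[:, j] = (slope * x_cbrt + intercept) ** 3
    return scaled
```

The cube root is used here for the same reason given in the text: unlike a logarithm, it is defined for the negative expression values that the GeneChip software sometimes reports.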

After scaling, a linear scatter plot was generated for average gene expression in the four A aliquots (U1A, I1A, U2A, and I2A) vs. the average in the four B aliquots (U1B, I1B, U2B, and I2B), a partitioning of the data that eliminates biological changes in gene expression (Fig. 1A). The linear scatter plot confirmed that the data were generally reproducible but failed to resolve genes expressed at low levels. Better resolution of these genes was achieved by the cube root scatter plot (Fig. 1B), which revealed three salient features: the large percentage of genes (24%) assigned negative levels of expression, the large percentage of genes with low levels of expression, and the low signal-to-noise ratio at low levels of expression.

Figure 1

Gene expression measured by microarrays. (A) Linear scatter plot of gene expression. Each gene (i) in the microarray is represented by a point with coordinates consisting of average gene expression measured from the four A hybridizations (avg xA) and the average gene expression in the four B hybridizations (avg xB). (B) Cube root scatter plot of gene expression. The average gene expression from the A and B hybridizations have been plotted on a cube root scale to resolve genes expressed at low levels. (C) Cube root scatter plot of average gene expression from the four hybridizations with uninduced cells (avg xU) and induced cells 4 h after exposure to 5 Gy of IR (avg xI). Some of the genes that responded to IR are indicated by arrows.

To assess the biological effect of IR, a scatter plot was generated for average gene expression in the four irradiated states vs. the four unirradiated states (compare Fig. 1 B and C). A few of the potentially significant changes in gene expression are indicated by arrows in Fig. 1C, but the effect was not easily quantified, and a method was needed to identify changes with statistical confidence.

Our approach was based on analysis of random fluctuations in the data. In general, the signal-to-noise ratio decreased with decreasing gene expression (Fig. 1). However, even for a given level of expression, we found that fluctuations were gene specific. To account for gene-specific fluctuations, we defined a statistic based on the ratio of change in gene expression to standard deviation in the data for that gene. The “relative difference” d(i) in gene expression is:

$$d(i) = \frac{\bar{x}_I(i) - \bar{x}_U(i)}{s(i) + s_0} \tag{1}$$

where x̄I(i) and x̄U(i) are defined as the average levels of expression for gene (i) in states I and U, respectively. The “gene-specific scatter” s(i) is the standard deviation of repeated expression measurements:

$$s(i) = \sqrt{a \left\{ \sum_m \left[x_m(i) - \bar{x}_I(i)\right]^2 + \sum_n \left[x_n(i) - \bar{x}_U(i)\right]^2 \right\}} \tag{2}$$

where Σm and Σn are summations of the expression measurements in states I and U, respectively, a = (1/n1 + 1/n2)/(n1 + n2 − 2), and n1 and n2 are the numbers of measurements in states I and U (four in this experiment).
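As a hedged illustration, the two statistics can be computed per gene as below. The sketch assumes that Eq. 1 is the difference of state means divided by s(i) + s0 (the form implied by the later references to the denominator of Eq. 1 and to d(i) = r(i)/[s(i) + s0]) and that Eq. 2 is the pooled form implied by the definition of a. The function name and array layout are illustrative choices, not part of the paper.

```python
import numpy as np

def relative_difference(x_i, x_u, s0=0.0):
    """Return (d, s) per gene for a two-state comparison.

    x_i, x_u: genes x replicates arrays for states I and U.
    s0: small positive constant added to the denominator (0 reproduces a plain t-like score).
    """
    x_i = np.asarray(x_i, dtype=float)
    x_u = np.asarray(x_u, dtype=float)
    n1, n2 = x_i.shape[1], x_u.shape[1]
    mean_i = x_i.mean(axis=1)
    mean_u = x_u.mean(axis=1)
    a = (1.0 / n1 + 1.0 / n2) / (n1 + n2 - 2)
    # sum of squared deviations from each state's own mean
    ss = ((x_i - mean_i[:, None]) ** 2).sum(axis=1) \
        + ((x_u - mean_u[:, None]) ** 2).sum(axis=1)
    s = np.sqrt(a * ss)                      # gene-specific scatter
    d = (mean_i - mean_u) / (s + s0)         # relative difference
    return d, s
```

With s0 = 0 this reduces to the usual pooled-variance two-sample t statistic, which is why the text describes SAM as assimilating a set of gene-specific t tests.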

To compare values of d(i) across all genes, the distribution of d(i) should be independent of the level of gene expression. At low expression levels, variance in d(i) can be high because of small values of s(i). To ensure that the variance of d(i) is independent of gene expression, we added a small positive constant s0 to the denominator of Eq. 1. The coefficient of variation of d(i) was computed as a function of s(i) in moving windows across the data. The value for s0 was chosen to minimize the coefficient of variation. For the data in this paper, this computation yielded s0 = 3.3.
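One possible reading of this selection procedure is sketched below. Several details are assumptions rather than statements from the text: the moving windows are taken as non-overlapping bins of genes ordered by s(i), the spread of d(i) within each window is its standard deviation, the candidate values of s0 are percentiles of s(i), and the quantity minimized is the coefficient of variation of the per-window spreads. The name `choose_s0` is hypothetical.

```python
import numpy as np

def choose_s0(r, s, n_windows=20, candidates=None):
    """Pick s0 so that the spread of d = r / (s + s0) is as uniform as possible across s.

    r: numerator of Eq. 1 per gene; s: gene-specific scatter per gene.
    Returns (best_s0, best_cv).
    """
    r = np.asarray(r, dtype=float)
    s = np.asarray(s, dtype=float)
    if candidates is None:
        candidates = np.percentile(s, np.arange(0, 101, 5))   # assumed candidate grid
    order = np.argsort(s)
    windows = np.array_split(order, n_windows)   # bins of genes with similar s(i)
    best_s0, best_cv = None, np.inf
    for s0 in candidates:
        d = r / (s + s0)
        spread = np.array([d[w].std() for w in windows])
        cv = spread.std() / spread.mean()        # coefficient of variation across windows
        if cv < best_cv:
            best_s0, best_cv = float(s0), float(cv)
    return best_s0, best_cv
```

The intent is that a well-chosen s0 keeps genes with very small s(i) from dominating the ranking simply because their denominators are close to zero.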

Scatter plots of d(i) vs. s(i) are shown in Fig. 2. The scatter plot for relative difference between states I and U is shown in Fig. 2A. By contrast, the scatter plot for relative difference between cell lines 1 and 2 shows more marked changes in Fig. 2B. These relative differences exceeded random fluctuations in the data, as measured by the relative difference between hybridizations A and B in Fig. 2C.

Figure 2

Scatter plots of relative difference in gene expression d(i) vs. gene-specific scatter s(i). The data were partitioned to calculate d(i), as indicated by the bar codes. The shaded and unshaded entries were used for the first and second terms in the numerator of d(i) in Eq. 1. (A) Relative difference between irradiated and unirradiated states. The statistic d(i) was computed from expression measurements partitioned between irradiated and unirradiated cells. (B) Relative difference between cell lines 1 and 2. The statistic d(i) was computed from expression measurements partitioned between cell lines 1 and 2. (C) Relative difference between hybridizations A and B. The statistic d(i) was computed from the permutation in which the expression measurements were partitioned between the equivalent hybridizations A and B. (D) Relative difference for a permutation of the data that was balanced between cell lines 1 and 2.

Although the relative difference computed from hybridizations A and B provided a control for random fluctuations, additional controls were needed to assign statistical significance to the biological effect of IR. Instead of performing more experiments, which are expensive and labor intensive, we generated a large number of controls by computing relative differences from permutations of the hybridizations for the four irradiated and four unirradiated states. To minimize potentially confounding effects from differences between the two cell lines, we analyzed the data by using the 36 permutations that were balanced for cell lines 1 and 2. Permutations were defined as balanced when each group of four experiments contained two experiments from cell line 1 and two experiments from cell line 2. Fig. 2 C and D are examples of balanced permutations.
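The count of 36 balanced permutations can be checked by enumeration: each group of four must take two of the four hybridizations from cell line 1 and two of the four from cell line 2, giving C(4,2) × C(4,2) = 36 choices. The sketch below assumes the labelling used in the text (state, cell line, aliquot, e.g. U1A); the function name is illustrative.

```python
from itertools import combinations

hybridizations = ["U1A", "U1B", "U2A", "U2B", "I1A", "I1B", "I2A", "I2B"]

def balanced_permutations(names):
    """List every split of the hybridizations into two groups of four,
    where each group holds two samples from cell line 1 and two from cell line 2."""
    line1 = [h for h in names if h[1] == "1"]   # second character encodes the cell line
    line2 = [h for h in names if h[1] == "2"]
    perms = []
    for pick1 in combinations(line1, 2):
        for pick2 in combinations(line2, 2):
            group_a = set(pick1) | set(pick2)
            group_b = [h for h in names if h not in group_a]
            perms.append((sorted(group_a), group_b))
    return perms

perms = balanced_permutations(hybridizations)
print(len(perms))  # 36
```

Under this enumeration the true irradiated vs. unirradiated labelling is itself one of the 36 splits, as is the A vs. B split of Fig. 2C.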

To find significant changes in gene expression, genes were ranked by magnitude of their d(i) values, so that d(1) was the largest relative difference, d(2) was the second largest relative difference, and d(i) was the ith largest relative difference. For each of the 36 balanced permutations, relative differences dp(i) were also calculated, and the genes were again ranked such that dp(i) was the ith largest relative difference for permutation p. The expected relative difference, dE(i), was defined as the average over the 36 balanced permutations, dE(i) = Σp dp(i)/36.
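A short sketch of the expected relative difference follows. It assumes the ranking is by signed value (largest positive first, most negative last), which is what allows repressed genes to appear at the other end of the ordering; the function name is illustrative.

```python
import numpy as np

def expected_relative_difference(d_perm):
    """d_perm: permutations x genes array of relative differences.

    Each row is sorted in descending order independently, then the rows are
    averaged rank by rank, so entry i is the mean of the ith largest value.
    """
    d_perm = np.asarray(d_perm, dtype=float)
    d_sorted = np.sort(d_perm, axis=1)[:, ::-1]   # descending within each permutation
    return d_sorted.mean(axis=0)
```

Note that the average is taken over ranks, not over genes: the gene occupying rank i generally differs from one permutation to the next.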

To identify potentially significant changes in expression, we used a scatter plot of the observed relative difference d(i) vs. the expected relative difference dE(i) (Fig. 3A). For the vast majority of genes, d(i) ≅ dE(i), but some genes are represented by points displaced from the d(i) = dE(i) line by a distance greater than a threshold Δ. For example, the threshold Δ = 1.2 illustrated by the broken lines in Fig. 3A yielded 46 genes that were “called significant.” These 46 genes are shown in the context of the scatter plot for d(i) vs. s(i) (Fig. 3B) and in the scatter plot for the cube root of gene expression x̄I(i) vs. x̄U(i) (Fig. 3C). Genes identified by d(i) do not necessarily have the largest changes in gene expression.
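The thresholding step can be sketched as below, under the assumption that displacement from the d(i) = dE(i) line is measured as the plain difference d(i) − dE(i) at each rank. The function name is hypothetical, and `d_expected` is assumed to be already sorted in descending order.

```python
import numpy as np

def call_significant(d, d_expected, delta):
    """Return (induced, repressed) gene indices for threshold delta.

    d: observed relative difference per gene (unsorted).
    d_expected: expected relative difference per rank, largest first.
    """
    d = np.asarray(d, dtype=float)
    order = np.argsort(d)[::-1]                  # gene indices, largest d first
    diff = d[order] - np.asarray(d_expected, dtype=float)
    induced = order[diff > delta]                # displaced above the line
    repressed = order[diff < -delta]             # displaced below the line
    return induced, repressed
```

Because the comparison is made against dE(i) rather than against zero, the cutoffs for induced and repressed genes need not be symmetric.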

Figure 3

Identification of genes with significant changes in expression. (A) Scatter plot of the observed relative difference d(i) versus the expected relative difference dE(i). The solid line indicates the line for d(i) = dE(i), where the observed relative difference is identical to the expected relative difference. The dotted lines are drawn at a distance Δ = 1.2 from the solid line. (B) Scatter plot of d(i) vs. s(i). (C) Cube root scatter plot of average gene expression in induced and uninduced cells. The cutoffs for 2-fold induction and repression are indicated by the dashed lines. In A–C, the 46 potentially significant genes for Δ = 1.2 are indicated by the squares.

To determine the number of falsely significant genes generated by SAM, horizontal cutoffs were defined as the smallest d(i) among the genes called significantly induced and the least negative d(i) among the genes called significantly repressed. The number of falsely significant genes corresponding to each permutation was computed by counting the number of genes that exceeded the horizontal cutoffs for induced and repressed genes. The estimated number of falsely significant genes was the average of the number of genes called significant from all 36 permutations. For Δ = 1.2, the permuted data sets generated an average of 8.4 falsely significant genes, compared with 46 genes called significant, yielding an estimated FDR of 18% (Table 1). As Δ decreased, the number of genes called significant by SAM increased but at the cost of an increasing FDR. (Omitting s0 from Eq. 1 produced higher FDRs of 45, 35, and 28% for Δ = 0.6, 0.9, and 1.2.)
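The counting procedure can be sketched as follows. It is an illustration under stated assumptions: a permuted value counts as falsely significant when it reaches or passes a cutoff (the choice of ≥ rather than > is an assumption), and the index lists of called genes are supplied by the caller. The function name is hypothetical.

```python
import numpy as np

def estimate_fdr(d, d_perm, induced, repressed):
    """Estimate the number of falsely significant genes and the FDR.

    d: observed relative differences per gene.
    d_perm: permutations x genes array of permuted relative differences.
    induced, repressed: indices of genes called significant.
    """
    d = np.asarray(d, dtype=float)
    d_perm = np.asarray(d_perm, dtype=float)
    # horizontal cutoffs: smallest d among induced, least negative d among repressed
    upper = d[induced].min() if len(induced) else np.inf
    lower = d[repressed].max() if len(repressed) else -np.inf
    false_counts = ((d_perm >= upper) | (d_perm <= lower)).sum(axis=1)
    n_false = false_counts.mean()                # average over permutations
    n_called = len(induced) + len(repressed)
    fdr = n_false / n_called if n_called else 0.0
    return n_false, fdr
```

For the values reported at Δ = 1.2, the same ratio gives 8.4 / 46 ≈ 0.18, i.e., the 18% quoted in the text.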

Table 1

Comparison of methods for identifying changes in gene expression

Our method for setting thresholds provides asymmetric cutoffs for induced and repressed genes. The alternative is the standard t test, which imposes a symmetric horizontal cutoff, with d(i) > c for induced genes and d(i) < −c for repressed genes. However, the asymmetric cutoff is preferred because it allows for the possibility that d(i) for induced and repressed genes may behave differently in some biological experiments.

SAM proved to be superior to conventional methods for analyzing microarrays (Table 1 and Fig. 4A). First, SAM was compared with the approach of identifying genes as significantly changed if an R-fold change was observed. In this “fold change” method, r(i) = x̄I(i)/x̄U(i), and gene (i) was called significantly changed if r(i) > R or r(i) < 1/R. To permit computation of r(i) from negative values for gene expression, x̄I(i) and x̄U(i) were converted to 10 when their values were negative or less than 10. The results of this procedure yielded unacceptably high FDRs of 73–84%.
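For comparison, the fold change rule described above is simple enough to state in a few lines. The sketch follows the text's description (averages below 10, including negative values, are set to 10 before taking the ratio); the function and parameter names are illustrative.

```python
def fold_change_calls(mean_i, mean_u, R, floor=10.0):
    """Return one boolean per gene: True if the fold change exceeds R in either direction.

    mean_i, mean_u: average expression per gene in states I and U.
    Values that are negative or below `floor` are replaced by `floor`.
    """
    calls = []
    for xi, xu in zip(mean_i, mean_u):
        r = max(xi, floor) / max(xu, floor)
        calls.append(r > R or r < 1.0 / R)
    return calls

print(fold_change_calls([-5, 40, 100, 15], [3, 10, 90, 60], R=2))
# [False, True, False, True]
```

The rule uses only the two averages, so it cannot distinguish a reproducible change from one driven by noisy replicates.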

Figure 4

Comparison of SAM to conventional methods for analyzing microarrays. (A) Falsely significant genes plotted against number of genes called significant. Of the 57 genes most highly ranked by the fold change method, 5 were included among the 46 genes most highly ranked by SAM. Of the 38 genes most highly ranked by the pairwise fold change method, 11 were included among the 46 genes most highly ranked by SAM. These results were consistent with the FDR of SAM compared to the FDRs of the fold change and pairwise fold change methods. (B) Northern blot validation for genes identified by the fold change method. Values of r(i) are plotted for genes chosen at random from the 57 genes most highly ranked by the fold change method. (C) Validation for genes identified by SAM. Results are plotted for genes chosen at random from the 46 genes most highly ranked by SAM. Genes analyzed by Northern blot are represented by circles. TNF-α was validated by using a PreDeveloped TaqMan assay (PE Biosystems) and is represented by a square. The straight lines in B and C indicate the position of exact agreement between Northern blot and microarray results.

Another approach attempts to account for uncertainty in the data by identifying genes as significantly changed if an R-fold change is observed consistently between paired samples (4). To apply this “pairwise fold change” method to our four data sets before IR and four data sets after IR, changes in gene expression were declared significant if 12 of 16 pairings satisfied the criteria r(i) > R or r(i) < 1/R. Despite the demand for consistent changes between paired samples, this method yielded FDRs of 60–71%.
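A sketch of the pairwise rule for a single gene is given below. Two points are assumptions not stated in the text: the 12 qualifying pairings are required to change in the same direction, and the floor of 10 from the fold change method is reused to avoid ratios of negative values. The function name is hypothetical.

```python
from itertools import product

def pairwise_fold_change_call(x_i, x_u, R, min_pairs=12, floor=10.0):
    """Call a gene changed if enough of the I-vs-U pairings show an R-fold change.

    x_i, x_u: the gene's measurements in states I and U (four each here,
    giving 4 x 4 = 16 pairings).
    """
    up = down = 0
    for xi, xu in product(x_i, x_u):
        r = max(xi, floor) / max(xu, floor)
        if r > R:
            up += 1
        elif r < 1.0 / R:
            down += 1
    # assumption: the qualifying pairings must agree in direction
    return up >= min_pairs or down >= min_pairs
```

With four measurements per state, 12 of 16 pairings corresponds, for example, to three of the four irradiated samples exceeding every unirradiated sample by the factor R.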

To understand why fold-change methods fail, note that the vast majority of genes are expressed at low levels where the signal-to-noise ratio is very low (Fig. 3C). Thus, 2-fold changes in gene expression occur at random for a large number of genes. Conversely, for higher levels of expression, smaller changes in gene expression may be real, but these changes are rejected by fold-change methods. The pairwise fold-change method provides modest improvement but remains inferior to SAM.

Of the 46 genes most highly ranked by SAM (Δ = 1.2), 36 increased or decreased at least 1.5-fold (R = 1.5). The number of falsely significant genes that met these two criteria was 4.5, corresponding to a FDR of 12% (Table 1). Fas was identified three times as alternately spliced forms, leaving 34 independent genes (Table 2). As an indication of biological validity, 10 of the 34 genes have been reported in the literature as part of the transcriptional response to IR. TNF-α was reported to be induced by other investigators (5) but was repressed here. Quantitative reverse transcription–PCR confirmed this result.

Table 2

Genes with changes in expression called significant by SAM

To test the validity of SAM directly, we performed Northern blots for genes that were randomly selected from the 46 and 57 genes most highly ranked by SAM (Δ = 1.2) and the fold-change method (at least 3.6-fold change), respectively. Northern blots showed little correlation with the genes identified by the fold change method (Fig. 4B), but strong correlation with the genes identified by SAM (Fig. 4C). Indeed, Northern blots contradicted only 1 (maxiK) of 11 genes identified by SAM, consistent with our estimated FDR.

Nineteen of the 34 genes most highly ranked by SAM appear to be involved in the cell cycle. Three are known to be induced in a p53-dependent manner: p21, cyclin G1, and mdm2 (6–8). Six cell cycle genes were repressed: E2-EPF, p55cdc, cyclin B, ckshs2, cdc25, and wee1 (9, 10). Five genes encoding the mitotic machinery were also repressed: PLK-1, MKLP-1, MCAK, C-TAK1, CENP-E (11–13). Three genes involved in cell proliferation were induced or repressed: PTP(CAAX1), LPAP, and c-myc (14–18). Some responses appeared paradoxical. For example, cdc25 phosphatase and wee1 kinase have antagonistic effects on the phosphorylation state of cdc2, but both genes were repressed. Repression of these genes together with the mitotic genes may represent a damage response that dismantles the cell cycle machinery until the cell has repaired the damaged DNA.

Four of the 34 genes play roles in DNA repair, but none are involved in the repair of IR-induced double-strand breaks. Instead, the genes (p48, XPC, gadd45, PCNA) have roles in nucleotide excision repair, a pathway conventionally associated with UV-induced damage (19–22). We confirmed the induction of these genes by Northern blot (23–25). Fornace et al. reported defective removal of base damage induced by IR in xeroderma pigmentosum cells (26). Leadon et al. reported that a novel DNA repair pathway involving long excision repair patches of at least 150 nucleotides is activated by IR but not UV (27). Our results suggest that this novel pathway might include p48, XPC, gadd45, and PCNA.

Four of the 34 genes play roles in apoptosis (Fas, bbc3, TNF-α, OX40 ligand). The remaining genes may have previously unsuspected roles in the DNA damage response or may be among the estimated set of four falsely detected genes.

The 34 genes most highly ranked by SAM are only a subset of all of the genes that change 1.5-fold with IR. Indeed, we calculated the difference between the number of genes called significant and the number of falsely significant genes for decreasing Δ = 0.3, 0.2, and 0.1, and found the differences to be 92, 170, and 184, respectively. Thus, SAM suggests that approximately 180 of the 6,800 genes on the microarray were induced or repressed by 5 Gy IR.

Discussion

SAM is a method for identifying genes on a microarray with statistically significant changes in expression, developed in the context of an actual biological experiment. SAM was successful in analyzing this experiment as well as several other experiments with oligonucleotide and cDNA microarrays (data not shown).

In the statistics of multiple testing (28–30), the family-wise error rate (FWER) is the probability of at least one false positive over the collection of tests. The Bonferroni method, the most basic method for bounding the FWER, assumes independence of the different tests. An acceptable FWER could be achieved for our microarray data only if the corresponding threshold was set so high that no genes were identified. The step-down correction method of Westfall and Young (29), adapted for microarrays by Dudoit et al. (http://www.stat.berkeley.edu/users/terry/zarray/Html/matt.html), allows for dependent tests but still remains too stringent, yielding no genes from our data.

Westfall and Young (29) define “weak control” to be control of the FWER when all of the null hypotheses are true (i.e., when there are no changes in gene expression). “Strong control” is control of the FWER when any subset of the null hypotheses is true. Under certain conditions, weak control implies strong control. In fact, the step-down correction method exerts both weak and strong control.

The method of Benjamini and Hochberg (31) assumes independent tests and guarantees an upper bound for the FDR (with both weak and strong control) by a step-up or step-down procedure applied to the individual P values. For our data, the P value for each gene is calculated from permutations of the eight experiments. Because of the limited number of permutations, the FDR is too “granular,” and we identified either zero or 300 significant genes, depending on how the P value was defined. A similar granular result was obtained for the adaptation to dependent tests by Benjamini et al. [The Control of the False Discovery Rate in Multiple Testing Under Dependency (Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv). http://www.math.tau.ac.il/~ybenja/].
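For readers unfamiliar with the step-up procedure, a minimal sketch is given below; it is the textbook Benjamini-Hochberg rule (reject the k smallest P values, where k is the largest rank with p(k) ≤ kq/m), not code from the paper, and the toy P values are invented for illustration.

```python
def benjamini_hochberg(p_values, q):
    """Return the sorted indices of hypotheses rejected at FDR level q (step-up rule)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank                      # keep the largest rank that passes
    return sorted(order[:k_max])

# Tied permutation P values are accepted or rejected together (toy numbers):
print(benjamini_hochberg([0.0] * 3 + [0.5] * 7, q=0.05))      # [0, 1, 2]
print(benjamini_hochberg([1 / 36] * 3 + [0.5] * 7, q=0.05))   # []
```

The two calls differ only in whether the smallest attainable P value is taken as 0 or as 1/36, which is one way an all-or-nothing outcome of the kind described above can arise when few permutations are available.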

SAM does not have strong or weak control of the FWER. Instead, SAM provides an estimate of the FDR for each value of the tuning parameter Δ. The estimated FDR is computed from permutations of the data and hence assumes that all null hypotheses are true, allowing for the possibility of dependent tests. It seems plausible that this estimated FDR approximates the strongly controlled FDR when any subset of null hypotheses is true. However, we have not proven this in general. It is possible for SAM to give an estimate of the FDR that is greater than 1. However, this has not occurred in our experience. Indeed, SAM provides a reasonably accurate estimate for the true FDR. To confirm this, we constructed artificial data sets in which a subset of genes was induced over a background of noise. SAM successfully identified the induced genes and estimated the FDR with reasonable accuracy.

Although this paper analyzes a simple two-state experiment, SAM can be generalized to other types of experiments by defining d(i) in a different way. Suppose the data includes gene expression xj(i) and a response parameter yj, in which i = 1, 2, … , m genes, j = 1, 2, … , n states. The generalized statistical parameter still takes the form d(i) = r(i)/[s(i) + s0], except that the definitions of r(i) and s(i) change.

To identify genes with changes in expression in an experiment with three or more states, the parameter d(i) is defined in terms of the Fisher's linear discriminant. One goal might be to identify genes whose expression in one type of tumor is different from its expression in other types of tumors. Suppose that a set of n samples consists of K nonoverlapping subsets, such that the response parameter yj ∈ {1, …, K}. Define C(k) = {j : yj = k}. Let nk = number of observations in C(k). The average gene expression in each subset is x̄k(i) = Σj∈C(k) xj(i)/nk and the average gene expression for all n samples is x̄(i) = Σj xj(i)/n. Then define:

$$r(i) = \left\{ \frac{\sum_k n_k}{\prod_k n_k} \sum_{k=1}^{K} n_k \left[\bar{x}_k(i) - \bar{x}(i)\right]^2 \right\}^{1/2} \tag{3}$$

$$s(i) = \left\{ \frac{1}{\sum_k (n_k - 1)} \left( \sum_k \frac{1}{n_k} \right) \sum_{k=1}^{K} \sum_{j \in C(k)} \left[x_j(i) - \bar{x}_k(i)\right]^2 \right\}^{1/2} \tag{4}$$

SAM can be adapted for still other types of experimental data. For example, to identify genes whose expression correlates with survival time, d(i) is defined in terms of Cox's proportional hazards function, in which some of the patients remain alive or are lost to follow-up at the time of the study. To identify genes whose expression correlates with a quantitative parameter, such as tumor stage, d(i) can be defined in terms of the Pearson correlation coefficient. Another example includes the definition of d(i) for paired data, such as gene expression in tumors before and after chemotherapy. In each case, the FDR is estimated by random permutation of the data for gene expression among the different experimental arms, i.e., permutations among the n arms of yj. Thus, SAM is a robust and straightforward method that can be adapted to a broad range of experimental situations. SAM and the adaptations discussed above are available for use at http://www-stat-class.stanford.edu/SAM/SAMServlet.

Acknowledgments

We thank Peter Jackson, Ron Davis, James Ferrell, Dean Felsher, Lisa DeFazio, Joe Budman, Jean Tang, Tom Tan, and Kerri Rieger for helpful discussions. This work was supported by the Burroughs Wellcome Clinical Scientist Award and by National Institutes of Health (NIH) Grant CA77302 to G.C., by NIH Small Business Technology Transfer grant CA75675 to G.C. and Affymetrix, and by the Stanford Genome Training Grant to V.T.

Footnotes

‡ To whom reprint requests should be addressed. E-mail: chu{at}cmgm.stanford.edu.

Abbreviations: SAM, significance analysis of microarrays; FDR, false discovery rate; IR, ionizing radiation; FWER, family-wise error rate.

Received December 1, 2000. Accepted February 6, 2001.

Copyright © 2001, The National Academy of Sciences

References

    1. ↵
      1. Eisen M,
      2. Spellman P,
      3. Brown P,
      4. Botstein D
      (1998) Proc Natl Acad Sci USA 95:14863–14868, pmid:9843981.
      OpenUrlAbstract/FREE Full Text
    2. ↵
      1. Roberts C,
      2. Nelson B,
      3. Marton M,
      4. Stoughton R,
      5. Meyer M,
      6. Bennett H,
      7. He Y,
      8. Dai H,
      9. Walker W,
      10. Hughes T,
      11. Tyers M,
      12. Boone C,
      13. Friend S
      (2000) Science 287:873–880, pmid:10657304.
      OpenUrlAbstract/FREE Full Text
    3. ↵
      1. Galitski T,
      2. Saldanha A,
      3. Styles C,
      4. Lander E,
      5. Fink G
      (1999) Science 285:251–254, pmid:10398601.
      OpenUrlAbstract/FREE Full Text
    4. ↵
      1. Ly D,
      2. Lockhart D,
      3. Lerner R,
      4. Schultz P
      (2000) Science 287:2486–2492, pmid:10741968.
      OpenUrlAbstract/FREE Full Text
    5. ↵
      1. Weill D,
      2. Gay F,
      3. Tovey M,
      4. Chouaib S
      (1996) J Interferon Cytokine Res 16:395–402, pmid:8727080.
      OpenUrlCrossRefPubMed
    6. ↵
      1. Harper J W,
      2. Adami G R,
      3. Wei N,
      4. Keyomarsi K,
      5. Elledge S J
      (1993) Cell 75:805–816, pmid:8242751.
      OpenUrlCrossRefPubMed
      1. Okamoto K,
      2. Beach D
      (1994) EMBO J 13:4816–4822, pmid:7957050.
      OpenUrlPubMed
    7. ↵
      1. Prives C
      (1998) Cell 95:5–8, pmid:9778240.
      OpenUrlCrossRefPubMed
    8. ↵
      1. Furnari B,
      2. Rhind N,
      3. Russell P
      (1997) Science 277:1495–1497, pmid:9278510.
      OpenUrlAbstract/FREE Full Text
    9. ↵
      1. Liu Z,
      2. Diaz L,
      3. Haas A,
      4. Giudice G
      (1992) J Biol Chem 267:15829–15835, pmid:1379239.
      OpenUrlAbstract/FREE Full Text
    10. ↵
      1. Lee K,
      2. Yuan Y,
      3. Kuriyama R,
      4. Erikson R
      (1995) Mol Cell Biol 15:7143–7151, pmid:8524282.
      OpenUrlAbstract/FREE Full Text
      1. Maney T,
      2. Hunter A,
      3. Wagenbach M,
      4. Wordeman L
      (1998) J Cell Biol 142:787–801, pmid:9700166.
    11.
      1. Wood K,
      2. Sakowicz R,
      3. Goldstein L,
      4. Cleveland D
      (1997) Cell 91:357–366, pmid:9363944.
    12.
      1. Ding I,
      2. Bruyns E,
      3. Li P,
      4. Magada D,
      5. Paskind M,
      6. Rodman L,
      7. Seshadri T,
      8. Alexander D,
      9. Giese T,
      10. Schraven B
      (1999) Eur J Immunol 29:3956–3961, pmid:10602004.
      1. Cates C,
      2. Michael R,
      3. Stayrook K,
      4. Harvey K,
      5. Burke Y,
      6. Randall S,
      7. Crowell P,
      8. Crowell D
      (1996) Cancer Lett 110:49–55, pmid:9018080.
      1. Godfrey W,
      2. Fagnoni R,
      3. Harara M,
      4. Buck D,
      5. Engleman E
      (1994) J Exp Med 180:757–762, pmid:7913952.
      1. Lord J,
      2. McIntosh B,
      3. Greenberg P,
      4. Nelson B
      (2000) J Immunol 164:2533–2541, pmid:10679091.
    13.
      1. Prevot D,
      2. Voeltzel T,
      3. Birot A,
      4. Morel A,
      5. Rostan M,
      6. Magaud J,
      7. Corbo L
      (2000) J Biol Chem 275:147–153, pmid:10617598.
    14.
      1. Aboussekhra A,
      2. Biggerstaff M,
      3. Shivji M,
      4. Vilpo J,
      5. Moncollin V,
      6. Podust V,
      7. Protic M,
      8. Hubscher U,
      9. Egly J,
      10. Wood R
      (1995) Cell 80:859–868, pmid:7697716.
      1. Smith M,
      2. Ford J,
      3. Hollander M,
      4. Bortnick R,
      5. Amundson S,
      6. Seo Y,
      7. Deng C,
      8. Hanawalt P,
      9. Fornace A J
      (2000) Mol Cell Biol 20:3705–3714, pmid:10779360.
      1. Sugasawa K,
      2. Ng J,
      3. Masutani C,
      4. Iwai S,
      5. van der Spek P,
      6. Eker A,
      7. Hanaoka F,
      8. Bootsma D,
      9. Hoeijmakers J
      (1998) Mol Cell 2:223–232, pmid:9734359.
    15.
      1. Tang J,
      2. Hwang B,
      3. Ford J,
      4. Hanawalt P,
      5. Chu G
      (2000) Mol Cell 5:737–744, pmid:10882109.
    16.
      1. Kastan M,
      2. Zhan Q,
      3. El-Deiry F,
      4. Jacks T,
      5. Walsh W,
      6. Plunkett B,
      7. Vogelstein B,
      8. Fornace A
      (1992) Cell 71:587–597, pmid:1423616.
      1. Hwang B J,
      2. Ford J,
      3. Hanawalt P C,
      4. Chu G
      (1999) Proc Natl Acad Sci USA 96:424–428, pmid:9892649.
    17.
      1. Xu J,
      2. Morris G
      (1999) Mol Cell Biol 19:12–20, pmid:9858527.
    18.
      1. Fornace A,
      2. Dobson P,
      3. Kinsella T
      (1986) Radiat Res 106:73–77, pmid:3961106.
    19.
      1. Leadon S,
      2. Dunn A,
      3. Ross C
      (1996) Radiat Res 146:123–130, pmid:8693061.
    20.
      1. Hochberg Y,
      2. Tamhane A
      (1987) Multiple Comparison Procedures (Wiley, New York).
    21.
      1. Westfall P,
      2. Young S
      (1993) Resampling-Based Multiple Testing (Wiley, New York).
    22.
      1. Hsu J
      (1996) Multiple Comparisons: Theory and Methods (Chapman & Hall, London).
    23.
      1. Benjamini Y,
      2. Hochberg Y
      (1995) J R Stat Soc B 57:289–300.
    Significance analysis of microarrays applied to the ionizing radiation response
    Virginia Goss Tusher, Robert Tibshirani, Gilbert Chu
    Proceedings of the National Academy of Sciences Apr 2001, 98 (9) 5116-5121; DOI: 10.1073/pnas.091062498
