From the Cover
GENETICS
Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles
a Broad Institute of Massachusetts Institute of Technology and Harvard, 320 Charles Street, Cambridge, MA 02141; c Department of Systems Biology, Alpert 536, Harvard Medical School, 200 Longwood Avenue, Boston, MA 02446; d Institute for Genome Sciences and Policy, Center for Interdisciplinary Engineering, Medicine, and Applied Sciences, Duke University, 101 Science Drive, Durham, NC 27708; e Department of Medical Oncology, Dana-Farber Cancer Institute, 44 Binney Street, Boston, MA 02115; f Division of Pulmonary and Critical Care Medicine, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114; g Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, C2-023, P.O. Box 19024, Seattle, WA 98109-1024; h Department of Neurology, Enders 260, Children's Hospital, Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115; i Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142; and j Whitehead Institute for Biomedical Research, Massachusetts Institute of Technology, Cambridge, MA 02142
Contributed by Eric S. Lander, August 2, 2005
Although genome-wide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.
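The abstract does not define the enrichment score (ES) listed among the abbreviations. As a rough illustration of what scoring a gene set rather than a single gene can look like, the Python sketch below assumes the weighted Kolmogorov-Smirnov-like running-sum statistic commonly associated with GSEA: walk down a list of genes ranked by their correlation with the phenotype, step up at each gene in the set (in proportion to its score raised to a power p), step down at each gene outside it, and report the largest deviation from zero. The function name, argument names, and toy data are hypothetical, and the sketch omits permutation-based significance testing, normalization to an NES, and FDR estimation. It is not the distributed GSEA software.

```python
from typing import Sequence, Set


def enrichment_score(
    ranked_genes: Sequence[str],
    scores: Sequence[float],
    gene_set: Set[str],
    p: float = 1.0,
) -> float:
    """Return a running-sum enrichment score for one gene set.

    Assumptions (not taken from the abstract): `ranked_genes` is already
    sorted by `scores` in decreasing order, and `p` is the exponent that
    weights each hit by the magnitude of its score.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_weight_total = sum(abs(s) ** p for s, h in zip(scores, hits) if h)
    if n_miss == 0 or hit_weight_total == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running = 0.0  # current value of the running sum
    best = 0.0     # signed deviation from zero with the largest magnitude
    for s, is_hit in zip(scores, hits):
        if is_hit:
            running += abs(s) ** p / hit_weight_total  # step up on a hit
        else:
            running -= 1.0 / n_miss                    # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy data: six genes ranked by a made-up correlation score.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
corr = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

es_top = enrichment_score(genes, corr, {"g1", "g2"})     # set at the top
es_bottom = enrichment_score(genes, corr, {"g5", "g6"})  # set at the bottom
```

In this toy example a set concentrated at the top of the ranking scores close to +1 and a set concentrated at the bottom scores close to -1, whereas a set scattered through the list would stay nearer zero.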
microarray
Freely available online through the PNAS open access option.
Abbreviations: ALL, acute lymphoid leukemia; AML, acute myeloid leukemia; ES, enrichment score; FDR, false discovery rate; GSEA, Gene Set Enrichment Analysis; MAPK, mitogen-activated protein kinase; MSigDB, Molecular Signature Database; NES, normalized enrichment score.
See Commentary on page 15278.
b A.S. and P.T. contributed equally to this work.
k To whom correspondence may be addressed. E-mail: lander@broad.mit.edu or mesirov@broad.mit.edu.
© 2005 by The National Academy of Sciences of the USA