Statistical analysis of MPSS measurements: Application to the study of LPS-activated macrophage gene expression

Edited by Charles R. Cantor, Sequenom, Inc., San Diego, CA (received for review September 4, 2004)
Stolovitzky et al. 10.1073/pnas.0406555102.
Supporting Information
Files in this Data Supplement:
Supporting Text
Supporting Figure 5
Supporting Figure 6
Supporting Figure 7
Supporting Figure 8
Supporting Figure 9
Supporting Figure 10
Supporting Figure 11
Supporting Table 1
Supporting Table 2
Supporting Table 3
Supporting Table 4
Supporting Figure 5
Fig. 5. Overview of the MPSS steps.
Table 1. Classification of signatures in terms of position with respect to the closest gene and genomic signals
| Virtual signature class | mRNA orientation | Polyadenylation features | Position |
|---|---|---|---|
| 0 | Either, repeat warning | Not applicable | Not applicable |
| 1 | Forward strand | Poly(A) signal, Poly(A) tail | 3' most |
| 2 | Forward strand | Poly(A) signal | 3' most |
| 3 | Forward strand | Poly(A) tail | 3' most |
| 4 | Forward strand | None | 3' most |
| 5 | Forward strand | None | Not 3' most |
| 11 | Reverse strand | Poly(A) signal, Poly(A) tail | 5' most |
| 12 | Reverse strand | Poly(A) signal | 5' most |
| 13 | Reverse strand | Poly(A) tail | 5' most |
| 14 | Reverse strand | None | 5' most |
| 15 | Reverse strand | None | Not 5' most |
| 22 | Unknown | Poly(A) signal | Last before signal |
| 23 | Unknown | Poly(A) tail | Last before tail |
| 24 | Unknown | None | Last in sequence |
| 25 | Unknown | None | Not last |
| 1,000 | Unknown, derived from genomic sequence | Not applicable | Not applicable |
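The classification in Table 1 can be read as a lookup from three properties of a virtual signature to its class number. The sketch below is only an illustration of that reading, not the authors' annotation pipeline: the function name, the key layout, and the lowercase labels are hypothetical, and it assumes that each orientation label applies to every class in its group (1 to 5 forward, 11 to 15 reverse, 22 to 25 unknown). Combinations that the table does not list return `None` rather than a guessed class.

```python
from typing import Optional

# Keys: (orientation, has_polyA_signal, has_polyA_tail, position).
# Values: virtual signature class as listed in Table 1.
# Classes 0 (repeat warning) and 1,000 (derived from genomic sequence)
# carry no polyadenylation or position information and are not keyed here.
TABLE_1_CLASSES = {
    ("forward", True, True, "3' most"): 1,
    ("forward", True, False, "3' most"): 2,
    ("forward", False, True, "3' most"): 3,
    ("forward", False, False, "3' most"): 4,
    ("forward", False, False, "not 3' most"): 5,
    ("reverse", True, True, "5' most"): 11,
    ("reverse", True, False, "5' most"): 12,
    ("reverse", False, True, "5' most"): 13,
    ("reverse", False, False, "5' most"): 14,
    ("reverse", False, False, "not 5' most"): 15,
    ("unknown", True, False, "last before signal"): 22,
    ("unknown", False, True, "last before tail"): 23,
    ("unknown", False, False, "last in sequence"): 24,
    ("unknown", False, False, "not last"): 25,
}


def classify_signature(orientation: str, has_signal: bool, has_tail: bool,
                       position: str) -> Optional[int]:
    """Return the Table 1 class, or None if the combination is not listed."""
    return TABLE_1_CLASSES.get((orientation, has_signal, has_tail, position))


# Example: a forward-strand signature that is 3' most and has both features.
best_class = classify_signature("forward", True, True, "3' most")
```

Returning `None` for unlisted combinations keeps the sketch from assigning a class that the table does not define.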
Supporting Figure 6
Fig. 6. Construction to determine the statistical significance of the differential expression of a pair of measurements. Points outside the orange curve are statistically significant, with a P value given by the gray area under the distribution.
Supporting Figure 7
Fig. 7. Expression of CD14 in human macrophages over 24 h after LPS stimulation. Only the pairwise comparisons between times 2 h and 24 h and between times 4 h and 24 h are significant at a P value level < 0.01.
Table 2. Summary of signature library characteristics
| Library | No. of signatures | Cumulative | Reliable and significant | Cumulative |
|---|---|---|---|---|
| LPS at t = 0 replica 1 | 24,449 | 24,449 | 17,701 | 17,701 |
| LPS at t = 0 replica 2 | 39,361 | 49,541 | 24,440 | 28,412 |
| LPS at t = 2 h | 46,240 | 75,041 | 27,314 | 36,586 |
| LPS at t = 4 h replica 1 | 41,383 | 92,743 | 25,841 | 41,162 |
| LPS at t = 4 h replica 2 | 29,525 | 102,220 | 20,811 | 43,605 |
| LPS at t = 8 h | 42,377 | 118,314 | 24,641 | 45,721 |
| LPS at t = 24 h | 37,952 | 130,150 | 24,772 | 47,841 |
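The cumulative columns of Table 2 grow more slowly than a running sum of library sizes (49,541 after two libraries rather than 24,449 + 39,361 = 63,810), which is consistent with counting each distinct signature once across libraries. The sketch below illustrates that reading only; the interpretation is an assumption, and the function name and toy signature strings are hypothetical rather than taken from the data.

```python
from typing import Iterable, List


def cumulative_distinct(libraries: Iterable[Iterable[str]]) -> List[int]:
    """Number of distinct signatures seen after each successive library."""
    seen = set()
    totals = []
    for signatures in libraries:
        seen.update(signatures)   # a signature already seen adds nothing
        totals.append(len(seen))
    return totals


# Toy libraries (hypothetical sequences): the second shares one signature
# with the first, and the third contributes nothing new.
toy_libraries = [
    ["GATCAAAA", "GATCCCCC"],
    ["GATCCCCC", "GATCGGGG"],
    ["GATCAAAA"],
]
toy_totals = cumulative_distinct(toy_libraries)
```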
Supporting Figure 8
Fig. 8. (a) Same-vs.-same comparison of different MPSS runs of unperturbed macrophages shows that the statistical behavior of the zero counts is quite special, exhibiting a discontinuity with respect to the counts of 1. The axes are in units of bead counts. (b) Probability density functions of the counts in B1 and B2 for the ensembles of signatures shown in the legend. Only the nonzero counts were considered for the probability calculation.
Supporting Figure 9
Fig. 9. (A) The nonzero null hypothesis pertains to signatures for which none of the MPSS sequencing replicates involved yielded a zero count. (a) Scatter plot of signature log(tpm) pairs (θ_{i,j},θ_{i,j'}), where replicates j and j' are biological replicates taken at either t = 0 or t = 4 h (each θ is the log of an aggregate tpm) for all signatures i for which the nonzero null hypothesis applies. (b) Standard deviation of measurement noise σ as a function of signal level μ for the data shown in a. The solid line is the best fit of the calculated values of σ to an exponential decay function. (c) Illustration of the significance region (the region outside the solid lines) for P value 0.05. The points shown in the figure are false positives at this P value. (d) P value as a function of the fraction of false positives. The line with diamonds closely follows the theoretically expected straight line, deviating from it only at the data points shown by the arrows. This deviation is due to the two outliers shown by the arrows in c.
(B) The one-zero null hypothesis pertains to signatures for which one of the two sequencing replicates in at least one of the two biological replicates yielded a zero count. (a) Scatter plot of signature log(tpm) pairs (θ_{i,j},θ_{i,j'}), where replicates j and j' are biological replicates taken at either t = 0 or t = 4 h (each θ is the log of an aggregate tpm) for all signatures i for which the one-zero null hypothesis applies. (b) Standard deviation of measurement noise σ as a function of signal level μ for the data shown in a. The solid line is the best fit of the calculated values of σ to an exponential decay function. (c) Illustration of the significance region (the region outside the solid lines) for P value 0.05. The points shown in the figure are false positives at this P value. (d) P value as a function of the fraction of false positives. The line with diamonds deviates from the theoretically expected straight line, possibly because of the sparseness of the data and the consequent difficulty in obtaining a good estimate of the σ(μ) curve.
(C) The all-zeros null hypothesis pertains to signatures for which both sequencing replicates in one of the two biological replicates yielded a zero count. (a) Scatter plot of signature log(tpm) pairs (θ_{i,j},θ_{i,j'}), where replicates j and j' are biological replicates taken at either t = 0 or t = 4 h (each θ is the log of an aggregate tpm) for all signatures i for which the all-zeros null hypothesis applies. Note that because one of the tpms in the pair is zero, its log(tpm) was arbitrarily set to zero. (b) Normalized histogram of the nonzero θ of the pair. The black shaded area adds up to 0.05 and illustrates the significance region at this P value. (c) Significance region for P value 0.05. The points shown are false positives at this P value. (d) P value as a function of the fraction of false positives. The line with diamonds very closely follows the theoretically expected straight line, indicating that the distributions of points along the x and y axes in a are very similar.
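The construction described for the nonzero and one-zero cases (estimate the noise σ at each signal level μ from replicate pairs, fit σ(μ) to an exponential decay, then ask how unlikely a given pair is under that noise) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes equal and independent noise in the two replicates (so the variance of the difference is 2σ²), a pure decay σ(μ) = a·exp(−bμ) with no offset fitted by log-linear least squares, equal-count binning, and Gaussian noise for the P value. All function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Pair = Tuple[float, float]  # (theta_j, theta_j') for one signature


def noise_by_level(pairs: Sequence[Pair], n_bins: int) -> Tuple[List[float], List[float]]:
    """Bin replicate pairs by mean level; return per-bin (mu, sigma).

    Assumes equal, independent noise in both replicates, so that
    Var(theta_j - theta_j') = 2 * sigma**2.
    """
    ordered = sorted(pairs, key=lambda p: p[0] + p[1])
    size = len(ordered) // n_bins
    mus, sigmas = [], []
    for k in range(n_bins):
        end = (k + 1) * size if k < n_bins - 1 else len(ordered)
        chunk = ordered[k * size:end]
        mus.append(sum((x + y) / 2 for x, y in chunk) / len(chunk))
        var = sum((x - y) ** 2 for x, y in chunk) / (2 * len(chunk))
        sigmas.append(math.sqrt(var))
    return mus, sigmas


def fit_exp_decay(mus: Sequence[float], sigmas: Sequence[float]) -> Tuple[float, float]:
    """Fit sigma = a * exp(-b * mu) by least squares on log(sigma)."""
    n = len(mus)
    logs = [math.log(s) for s in sigmas]
    mean_mu = sum(mus) / n
    mean_log = sum(logs) / n
    sxx = sum((m - mean_mu) ** 2 for m in mus)
    sxy = sum((m - mean_mu) * (l - mean_log) for m, l in zip(mus, logs))
    slope = sxy / sxx
    a = math.exp(mean_log - slope * mean_mu)
    return a, -slope


def p_value(x: float, y: float, a: float, b: float) -> float:
    """Two-sided P value for the pair (x, y), assuming Gaussian noise."""
    sigma = a * math.exp(-b * (x + y) / 2)
    z = abs(x - y) / (math.sqrt(2) * sigma)
    return math.erfc(z / math.sqrt(2))


# Toy replicate pairs (hypothetical): noisier at low signal than at high.
toy_pairs = [(1.0, 2.0), (1.5, 0.9), (2.0, 2.8), (1.2, 1.9),
             (5.0, 5.3), (5.5, 5.4), (6.0, 6.2), (5.8, 5.6)]
toy_mus, toy_sigmas = noise_by_level(toy_pairs, n_bins=2)
toy_a, toy_b = fit_exp_decay(toy_mus, toy_sigmas)
toy_p = p_value(2.0, 6.0, toy_a, toy_b)  # a pair far from the diagonal
```

Fitting on log(σ) keeps the sketch dependency-free. A fit with an additive offset, or one done directly in σ, would need a nonlinear optimizer and could give a different curve.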
Supporting Figure 10
Fig. 10. (a) Histogram of the number of signatures with a given SI among the 12,567 signatures with an associated UniGene ID no. If there was more than one signature for a given UniGene ID no., the one with the smallest SI was used. (b) Same as in a, but only among the 127 genes (identified by Nau et al. (1) and mapped to signatures measured in our MPSS experiments) induced in bacteria-exposed macrophages.
1. Nau, G. J., Richmond, J. F., Schlesinger, A., Jennings, E. G., Lander, E. S. & Young, R. A. (2002) Proc. Natl. Acad. Sci. USA 99, 1503–1508.
Supporting Figure 11
Fig. 11. Hierarchical cluster analysis of LPS-activated macrophage expression data. Rows are individual genes, and columns are MPSS measurements taken at 2, 4, 8, and 24 h after activation as well as previously published (1) Affymetrix GeneChip measurements taken at 1, 2, 6, 12, and 24 h after activation. See text for details.
1. Nau, G. J., Richmond, J. F., Schlesinger, A., Jennings, E. G., Lander, E. S. & Young, R. A. (2002) Proc. Natl. Acad. Sci. USA 99, 1503–1508.
Table 3. Distribution of putative early and late responders among the functional categories that were activated by LPS stimulation
| Functional category | Early responders (%) | Late responders (%) |
|---|---|---|
| Antiapoptotic | 5 (100) | 0 (0) |
| Adhesion | 4 (80) | 1 (20) |
| Cytokines + chemokines | 10 (77) | 3 (23) |
| Transcription | 8 (73) | 3 (27) |
| Signaling | 11 (65) | 6 (35) |
| Enzyme | 0 (0) | 8 (100) |
| Receptors | 5 (36) | 9 (64) |
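The percentages in Table 3 are each category's share of early or late responders among the genes of that category, rounded to the nearest integer. A short check of that arithmetic, using the counts from the table (the variable and function names are illustrative):

```python
from typing import Dict, Tuple

# (early responders, late responders) per functional category, from Table 3.
TABLE_3_COUNTS: Dict[str, Tuple[int, int]] = {
    "Antiapoptotic": (5, 0),
    "Adhesion": (4, 1),
    "Cytokines + chemokines": (10, 3),
    "Transcription": (8, 3),
    "Signaling": (11, 6),
    "Enzyme": (0, 8),
    "Receptors": (5, 9),
}


def responder_percentages(early: int, late: int) -> Tuple[int, int]:
    """Percent of a category's genes that respond early and late."""
    total = early + late
    return round(100 * early / total), round(100 * late / total)


table_3_percentages = {
    category: responder_percentages(early, late)
    for category, (early, late) in TABLE_3_COUNTS.items()
}
```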
Table 4. The 101 significant genes that participate in Fig. 4
Genes (in the same left-to-right order as in Fig. 4)
Early responders
INHBA, IL8, P2RX4, JUNB, PLAUR, ANKRD15, JAG1, CXCL3, CXCL2, CXCL1, GADD45A, ZFP36, DUSP1, NFKBIA, GEM, TNFAIP3, EBI2, DUSP2, IER3, PTGS2, TNF, IL1B, CSF3, SERPINB2, NR4A3, DUSP5, PTX3, HM74, CCL4, CCL3, IL6, CCL20, CD83, IRF1, CCRL2, XBP1, IRLB, BIRC2, TNFAIP2, FSCN1, TRAF1, BIRC3, NFKB1, TRIP10, STAT5A, HSPA1A, MLP, ICAM1, PLK3, ATP2B1, NFKB2, CFLAR, PHLDA2, SDC4, SERPINB8.
Late responders
ADM, GCLM, PDE4B, PTPN1, TXN, IL7R, SOD2, TNFAIP6, NINJ1, G1P2, MX1, IL15RA, INDO, HCK, IFITM1, BF, KYNU, HIST2H2AA, EBI3, HSD11B1, WTAP, LIMK2, SLAMF1, TNFRSF5, CKB, P2RX7, ELF4, IL1RN, ARID5A, CXCL10, CD44, CCR7, ADORA2A, PBEF1, G0S2, GCH1, GBP1, ADA, ISG20, ARHH, B4GALT1, TNIP1, BTG1, MMP14, DSCR1, PNRC1.
The early and late responder genes correspond, respectively, to the genes in the left and right coarser clusters of Fig. 4.