Roy and Gilbert. 10.1073/pnas.0408274102.
Supporting Table 3
Supporting Table 4
Supporting Figure 6
Supporting Figure 7
Supporting Figure 8
Supporting Figure 9
Fig. 6. Alternative "coelomata" relationship between the analyzed species, grouping arthropods with deuterostomes. The deepest node is not resolved, with Arabidopsis thaliana clustering either with Plasmodium falciparum or with the other species.
Fig. 7. 3' intron loss bias. (A) The relationship between length of coding sequence from the 3' end of the gene and the fraction of introns retained. (B) The relationship between the intron position and the fraction of introns retained. For each lineage, each bar represents a quintile of gene length: the rightmost bar gives the fraction retained for introns known to be ancestral to the lineage (KAL introns) in the C-terminal 20% of the gene, the second rightmost the fraction for introns lying 20–40% of the gene length from the 3' end, etc. (C) Detection of 3' loss bias per gene. For each lineage, the numbers of genes showing a positive correlation between distance from the 3' end and intron retention (3' loss bias, light bars) and a negative correlation (5' loss bias, dark bars) are given. This figure is exactly analogous to Fig. 2, but calculated assuming the alternative coelomata phylogeny.
Fig. 8. Adjacent introns are lost in concert. (A) The probability distributions for total numbers of pairs, trios, and quartets of adjacent KAL introns lost for all analyzed genes for each lineage, with the observed values given below. (B) The probability distribution for the total numbers of clusters of one or more lost adjacent introns for all analyzed genes, with the observed values given below. This figure is exactly analogous to Fig. 3, but calculated assuming the alternative coelomata phylogeny.
Fig. 9. Phase-biased intron loss. (A) The fraction of phase zero and of phase one and two KAL introns retained along each lineage. (B) The distribution of the number of fungi/animal taxa (Diptera, Caenorhabditis elegans, Homo sapiens, and Schizosaccharomyces pombe) in which an intron is retained, for introns shared between fungi/animals and Arabidopsis/Plasmodium, shown separately for phase zero and for phase one and two introns. This figure is exactly analogous to Fig. 4, but calculated assuming the alternative coelomata phylogeny.
Table 3. Summary of the data
| Lineage | Sister taxa | Total introns | Shared with sister | Shared with non-sister | KAL introns (retained + lost) |
|---|---|---|---|---|---|
| D. melanogaster | A. gambiae | 725 | 382 | 489 | 451 (295 + 156) |
| A. gambiae | D. melanogaster | 675 | 382 | 451 | 489 (295 + 194) |
| Diptera | H. sapiens | 1,016 | 568 | 401 | 1,257 (324 + 933) |
| H. sapiens | Diptera | 3,345 | 568 | 1,257 | 401 (324 + 77) |
| C. elegans | Diptera, H. sapiens | 1,468 | 599 | 284 | 948 (213 + 735) |
| S. pombe | Animals | 450 | 223 | 158 | 927 (131 + 796) |
| A. thaliana | Animals, S. pombe | 2,933 | 908 | 97 | 119 (73 + 46) |
| P. falciparum | Animals, S. pombe, A. thaliana | 450 | 143 | - | - |
Sister taxa, the data set species diverging at the base of the given lineage; introns, total number of introns in the descendant(s) of the lineage; introns shared with sister, number of introns shared between species in the first two columns; introns shared with non-sister, number of introns shared between species in the first column and species not in the second column; KAL introns, number of KAL introns retained plus number lost in lineage, as defined in the text; last column, number of genes that have lost two or more KAL introns as well as retained one or more KAL introns along the lineage. This table is exactly analogous to Table 1, but calculated assuming the alternative coelomata phylogeny.
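As a reading aid, the KAL column can be turned into a per-lineage retained fraction. The sketch below is illustrative only: the counts are copied from Table 3, while the names `KAL_COUNTS`, `KAL_TOTALS`, and `retained_fraction` are hypothetical and not part of the original analysis.

```python
# (retained, lost) KAL intron counts per lineage, copied from Table 3.
KAL_COUNTS = {
    "D. melanogaster": (295, 156),
    "A. gambiae": (295, 194),
    "Diptera": (324, 933),
    "H. sapiens": (324, 77),
    "C. elegans": (213, 735),
    "S. pombe": (131, 796),
    "A. thaliana": (73, 46),
}

# KAL totals as printed in Table 3, used only as a consistency check.
KAL_TOTALS = {
    "D. melanogaster": 451,
    "A. gambiae": 489,
    "Diptera": 1257,
    "H. sapiens": 401,
    "C. elegans": 948,
    "S. pombe": 927,
    "A. thaliana": 119,
}


def retained_fraction(lineage):
    """Fraction of KAL introns retained along a lineage (hypothetical helper)."""
    retained, lost = KAL_COUNTS[lineage]
    return retained / (retained + lost)


if __name__ == "__main__":
    for name, (retained, lost) in KAL_COUNTS.items():
        # Each printed total should equal retained + lost.
        assert retained + lost == KAL_TOTALS[name]
        print(f"{name}: {retained_fraction(name):.3f} of KAL introns retained")
```

P. falciparum is omitted because Table 3 reports no KAL introns for it.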
Table 4. The effect of adjacent intron phase on intron loss
| Lineage | Intron phase | Adjacent intron | Same phase: % retained (n) | Different phase: % retained (n) | P |
|---|---|---|---|---|---|
| Diptera | Ph0 | 5' | 21 (73) | 30 (188) | 0.93 |
| Diptera | Ph0 | 3' | 30 (73) | 21 (159) | 0.18 |
| Diptera | Ph1 | 5' | 42 (12) | 22 (112) | 0.16 |
| Diptera | Ph1 | 3' | 42 (12) | 39 (128) | 0.54 |
| Diptera | Ph2 | 5' | 31 (13) | 42 (123) | 0.85 |
| Diptera | Ph2 | 3' | 31 (13) | 50 (136) | 0.94 |
| Diptera | All | 5' | 24 (98) | 31 (423) | 0.90 |
| Diptera | All | 3' | 32 (98) | 35 (423) | 0.74 |
| C. elegans | Ph0 | 5' | 26 (53) | 37 (127) | 0.92 |
| C. elegans | Ph0 | 3' | 25 (53) | 40 (106) | 0.97 |
| C. elegans | Ph1 | 5' | 29 (14) | 36 (69) | 0.79 |
| C. elegans | Ph1 | 3' | 36 (14) | 40 (81) | 0.73 |
| C. elegans | Ph2 | 5' | 29 (17) | 27 (83) | 0.54 |
| C. elegans | Ph2 | 3' | 29 (17) | 33 (92) | 0.72 |
| C. elegans | All | 5' | 27 (84) | 33 (279) | 0.86 |
| C. elegans | All | 3' | 27 (84) | 38 (279) | 0.97 |
| H. sapiens | Ph0 | 5' | 63 (32) | 71 (35) | 0.85 |
| H. sapiens | Ph0 | 3' | 75 (32) | 77 (21) | 0.68 |
| H. sapiens | Ph1 | 5' | 91 (11) | 81 (9) | 0.46 |
| H. sapiens | Ph1 | 3' | 100 (11) | 83 (23) | 0.18 |
| H. sapiens | Ph2 | 5' | 83 (6) | 92 (20) | 0.91 |
| H. sapiens | Ph2 | 3' | 100 (6) | 88 (20) | 0.54 |
| H. sapiens | All | 5' | 71 (49) | 80 (64) | 0.90 |
| H. sapiens | All | 3' | 84 (49) | 82 (64) | 0.52 |
| S. pombe | Ph0 | 5' | 4 (137) | 9 (122) | 0.96 |
| S. pombe | Ph0 | 3' | 9 (137) | 10 (104) | 0.60 |
| S. pombe | Ph1 | 5' | 9 (23) | 11 (65) | 0.74 |
| S. pombe | Ph1 | 3' | 4 (23) | 21 (81) | 0.99 |
| S. pombe | Ph2 | 5' | 5 (21) | 12 (84) | 0.93 |
| S. pombe | Ph2 | 3' | 5 (21) | 23 (86) | 0.99 |
| S. pombe | All | 5' | 5 (181) | 10 (271) | 0.99 |
| S. pombe | All | 3' | 8 (181) | 17 (271) | 1.00 |
5'/3', whether the effect of the upstream (5') or downstream (3') adjacent intron is being tested; same, adjacent intron is of the same phase; different, adjacent intron is of a different phase. Sample sizes are given in parentheses. P values were calculated with a one-tailed Fisher's exact test. This table is exactly analogous to Table 2, but calculated assuming the alternative coelomata phylogeny.
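For readers who want to see how a one-tailed Fisher's exact test operates on one row of Table 4, here is a minimal sketch using only the Python standard library. Several things are assumptions rather than details given in the source: the function names are hypothetical, integer counts are recovered by rounding the tabulated percentages, and the tail tested (retention at least as high when the adjacent intron has the same phase) is inferred, not stated. Because the percentages are rounded, the result need not match the tabulated P value exactly.

```python
from math import comb


def counts_from_percent(percent, n):
    """Approximate (retained, lost) counts from a rounded percentage (assumption)."""
    retained = round(percent * n / 100)
    return retained, n - retained


def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    a, b = retained, lost introns whose adjacent intron has the same phase
    c, d = retained, lost introns whose adjacent intron has a different phase
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b           # introns with a same-phase neighbour
    col1 = a + c           # all retained introns
    total = a + b + c + d
    denom = comb(total, col1)
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Diptera, phase 0, 5' neighbour: 21% of 73 (same) vs. 30% of 188 (different).
    a, b = counts_from_percent(21, 73)
    c, d = counts_from_percent(30, 188)
    print(f"approximate one-tailed P = {fisher_one_tailed(a, b, c, d):.2f}")
```

If SciPy is available, `scipy.stats.fisher_exact([[a, b], [c, d]], alternative="greater")` computes the same tail and may be preferable for larger tables.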