Rates of intron loss and gain: Implications for early eukaryotic evolution

Scott William Roy and Walter Gilbert
PNAS April 19, 2005 102 (16) 5773-5778; https://doi.org/10.1073/pnas.0500383102
Contributed by Walter Gilbert, February 16, 2005


Abstract

We study the intron–exon structures of 684 groups of orthologs from seven diverse eukaryotic genomes and provide maximum likelihood estimates for rates and numbers of intron losses and gains in these same genes for a variety of lineages. Rates of intron loss vary from ≈2 × 10^-9 to 2 × 10^-10 per year. Rates of gain vary from 6 × 10^-13 to 4 × 10^-12 per possible intron insertion site per year. There is an inverse correspondence between rates of intron loss and gain, leading to a 20-fold variation among lineages in the ratio of the rates of the two processes. The observed rates of intron gain are insufficient to explain the large number of introns estimated to have been present in the plant–animal ancestor, suggesting that introns present in early eukaryotes may have been created by a fundamentally different process than more recently gained introns.

  • genome evolution

The debate over the relative importance of intron loss and gain in shaping the pattern of imperfect conservation of intron–exon structures between homologues has been long and hard fought (1–51). For the first 25 years, the debate was waged in the context of the introns-early/introns-late question. Proponents of introns-late believe that spliceosomal introns are relatively recent arrivals whose modern restriction to eukaryotes reflects their absence in the common ancestors of prokaryotes and eukaryotes and subsequent origin within eukaryotes (1, 12, 24, 37, 52–56). Their task was thus to demonstrate that modern intron–exon structures could be explained primarily or solely by intron gain in eukaryotes, without a necessarily major role for intron loss. Introns-early adherents believe that introns are primordial structures whose presence in eukaryote–prokaryote ancestors facilitated the construction of early genes (2–11, 15, 22, 23, 25, 33, 35, 40). The presence of large numbers of introns in modern genomes thus does not necessarily require active intron insertion in eukaryotes, although the lack of spliceosomal introns in prokaryotes, as well as the existence of some introns with spotty phylogenetic distributions within eukaryotes, require significant intron loss. In this way, the more fundamental debate about the timing of origin of the first spliceosomal introns, with all its implications for the origins of early genes, the emergence of complex genomes, and the divergence of the three kingdoms of life, was projected onto the issue of intron loss and gain. Since that time, both perspectives have been softened, introns-early by discoveries of introns whose very limited phylogenetic distributions suggest their recent gain (3, 7–12, 20, 22, 33, 42–44) and introns-late by the discovery of introns and spliceosomal components in very deep-branching eukaryotes (57–60). However, the basic disagreements over when spliceosomal introns first appeared in significant numbers and whether eukaryotic evolution has been characterized by generally decreasing, stable, or increasing intron density persist (30–50).

In the first characterized case of intron discordance, Perler et al. (1) found that an intron specific to one of a pair of recent insulin duplicates in rat is shared with chicken, demonstrating intron loss. The next two decades brought cases of intron loss and gain in steadily increasing streams (3, 4, 7–14, 17–22, 28, 29, 33), although the largely anecdotal nature of these early studies prevented firm general conclusions about the relative importance of the two processes. With the genomic age came more comprehensive surveys. Studies of large numbers of ortholog pairs described extents and patterns of intron conservation and showed much variation across groups, with >90% intron conservation between humans and fish (20, 61), whereas more recently diverged pairs of dipteran, Caenorhabditis, and Plasmodium species showed far more differences (62, 63). At a deeper level, as many as 25–33% of intron positions are shared between multiple eukaryotic kingdoms (34, 64). Other studies have analyzed the causes of such differences (intron loss or gain). One uncovered a mere five losses and no gains in 1,576 human–mouse–pufferfish ortholog trios (36), and another found roughly balanced intron losses and gains in four species of euascomyces fungi (49). Studies of intron–exon structures in large paralogous gene families from one or a few organisms (3, 14, 16–18, 21, 22, 28, 31, 47) or from large amounts of available sequence (6, 10, 39, 40) tended to conclude that intron evolution was a mixed bag, with some lineages dominated by intron loss, others by gain.

In the most extensive study of intron–exon structure in orthologs, Rogozin et al. (34) used Dollo parsimony to reconstruct the history of 684 sets of orthologs from eight eukaryotic species. They presented a varied picture of intron evolution, with some lineages experiencing sharp reductions in intron number, and others seeing sharp increases. In a previous publication, we argued that parsimony is not appropriate for such reconstructions because it fails to recover ancestral introns in cases of loss (50). A maximum likelihood analysis of their data showed a different picture, with generally intron-rich eukaryotic ancestors and net intron losses in most branches: 6 of 10 studied branches experienced a significant net loss, and 2 showed a net gain (50).

We extend that analysis here. Previously, we estimated only net changes in intron number in the studied regions (e.g., there are 3,345 introns in these regions in humans versus an estimated 3,321 in the bilateran ancestor, thus a change of +24). We here provide individual estimates for numbers of losses and gains for each branch (976 gains and 952 losses along the same branch). We find significant variations in rates of intron loss and gain between branches. Rates of intron loss range from ≈2 × 10^-10 to 2 × 10^-9 per year, whereas rates of intron gain per possible insertion site are orders of magnitude smaller, from ≈6 × 10^-13 to 4 × 10^-12. There is an inverse relationship between rates of the two processes. The estimated rates suggest that early eukaryotic ancestors predating even the plant–animal split were very intron-rich and that introns in early eukaryotes may have arisen by qualitatively different processes than more recent insertions.

Methods

Data Set and Programs. We downloaded amino acid level alignments and corresponding intron positions for 684 clusters of orthologous genes from eight eukaryotic species, compiled and previously studied by Rogozin et al. (34). Only intron positions in conserved regions were considered (see ref. 34 for details). Introns at the exact same position (between the exact corresponding pair of nucleotides in the alignment) in different species were assumed homologous and not due to independent multiple insertions, an assumption supported by independent evidence (ref. 46; discussed in ref. 50). Only introns present at the exact same position were considered homologous (34). Saccharomyces cerevisiae was excluded because of its dearth of introns. We wrote Perl programs to perform the analyses described.

Estimates of Numbers of Intron Losses and Gains. External branches. Fig. 1A depicts a scenario in which species 1 diverges from (a group of) species 2 at node X with (a group of) outgroups 3. We call the probabilities that an intron present in ancestor X is present in species 1, in some species from group 2, and in some species from group 3: o_1, o_2, and o_3, respectively. An intron present in ancestor X will have one of the following six modern phylogenetic distributions with respect to species 1 and groups 2 and 3, with probabilities: (i) Pr{present in 1, 2, and 3|present in X} = o_1 o_2 o_3; (ii) Pr{present in 1 and 2; absent in 3|X} = o_1 o_2 (1 − o_3); (iii) Pr{present in 1 and 3; absent in 2|X} = o_1 (1 − o_2) o_3; (iv) Pr{present in 2 and 3; absent in 1|X} = (1 − o_1) o_2 o_3; (v) Pr{present in 1; absent in 2 and 3|X} = o_1 (1 − o_2)(1 − o_3); and (vi) Pr{absent in 1; present in 2 or 3 or neither|X} = (1 − o_1)(1 − o_2 o_3). The total number of introns present in X but lost along the branch leading from X to 1, which we call l, is equal to the number in categories iv and vi. Introns found only in species 1 are either gained since X or present in X but absent in 2 and 3 (category v); the number gained along the branch from X to 1, which we call g, is thus the total number found only in species 1 minus the number in category v. The conditional probability of seeing the data is:

$$
\begin{aligned}
\Pr\{\text{data} \mid o_1, o_2, o_3, l, g\} ={} & \frac{(m_1 - g + l)!}{n_{123}!\; n_{12}!\; n_{13}!\; n_{23}!\; (n_1 - g)!\; (l - n_{23})!} \\
& \times (o_1 o_2 o_3)^{n_{123}}\, [o_1 o_2 (1 - o_3)]^{n_{12}}\, [o_1 (1 - o_2) o_3]^{n_{13}} \\
& \times [(1 - o_1) o_2 o_3]^{n_{23}}\, [o_1 (1 - o_2)(1 - o_3)]^{n_1 - g}\, [(1 - o_1)(1 - o_2 o_3)]^{l - n_{23}}
\end{aligned}
$$

where n values give numbers of introns present in exactly the indicated groups [e.g., n_123 is the number present in species 1 and (some species from) each of groups 2 and 3, n_12 the number present in 1 and 2 but not 3]; and m values give numbers of introns present in at least the indicated groups (e.g., m_1 is the number of all introns present in group 1, regardless of presence elsewhere: m_1 = n_123 + n_12 + n_13 + n_1). The likelihood of a set of parameters is then L{o_1, o_2, o_3, l, g} = Pr{data|o_1, o_2, o_3, l, g} with maximum likelihood estimates (MLE) at

$$
\hat{o}_1 = \frac{n_{123}}{m_{23}}, \quad
\hat{o}_2 = \frac{n_{123}}{m_{13}}, \quad
\hat{o}_3 = \frac{n_{123}}{m_{12}}, \quad
\hat{l} = \frac{m_{12}\, m_{13}\, n_{23}}{n_{123}^{2}}, \quad
\hat{g} = m_1 - \frac{m_{12}\, m_{13}}{n_{123}}
$$

Confidence intervals for l, g, and o_1 were derived by using the profile likelihood method, which treats all parameters except one as nuisance parameters and maximizes over them (67, 68).
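The closed-form estimates implied by the six categories above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Perl code: the class name, function name, and example counts are hypothetical, and the formulas assume that at the likelihood maximum the unobserved categories v and vi take their expected values, so that the observed counts n_123, n_12, n_13, and n_23 alone fix o_1, o_2, and o_3.

```python
from dataclasses import dataclass


@dataclass
class ExternalCounts:
    """Observed intron counts for the external-branch scenario (hypothetical container)."""
    n123: int  # present in species 1 and in groups 2 and 3
    n12: int   # present in 1 and 2, absent in 3
    n13: int   # present in 1 and 3, absent in 2
    n23: int   # present in 2 and 3, absent in 1
    n1: int    # present only in species 1


def external_branch_mle(c: ExternalCounts) -> dict:
    """Point estimates for retention probabilities, ancestral number, losses and gains."""
    if c.n123 == 0:
        raise ValueError("n123 must be positive to estimate retention probabilities")
    # m values: introns present in at least the indicated groups
    m12 = c.n123 + c.n12
    m13 = c.n123 + c.n13
    m23 = c.n123 + c.n23
    m1 = c.n123 + c.n12 + c.n13 + c.n1
    # Retention probability in each group, judged from introns that are
    # known to be ancestral because the other two groups share them
    o1 = c.n123 / m23
    o2 = c.n123 / m13
    o3 = c.n123 / m12
    # Estimated number of introns in ancestor X
    n_x = c.n123 / (o1 * o2 * o3)
    losses = n_x * (1.0 - o1)   # categories iv and vi
    gains = m1 - n_x * o1       # introns in species 1 not inherited from X
    return {"o1": o1, "o2": o2, "o3": o3, "n_x": n_x, "losses": losses, "gains": gains}


if __name__ == "__main__":
    example = ExternalCounts(n123=600, n12=200, n13=150, n23=300, n1=400)
    for name, value in external_branch_mle(example).items():
        print(f"{name}: {value:.3f}")
```

With these made-up counts the sketch gives about 1,500 introns in X, 500 losses, and 350 gains along the branch to species 1. Confidence intervals would still require the profile likelihood step described in the text, which is not shown here.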

Fig. 1.

General phylogenies for demonstrating the method. (A) External branch. (B) Internal branch. Arrows indicate the branches analyzed.

Internal branches. Fig. 1B depicts the scenario for an internal branch. Here we define o_1 = Pr{1|X}, o_2 = Pr{2|X}, o_3 = Pr{3|Y}, o_4 = Pr{4|Y}, and r = Pr{X|Y}, where, for instance, Pr{1|X} is the probability that an intron is present in some species in group 1 given that it is present in ancestor X, and Pr{X|Y} is the probability that an intron present in ancestor Y is retained in ancestor X.

Table 1 gives all possible histories of an intron present in Y. To estimate numbers of Y-X branch losses, alternative histories leading to identical modern phylogenetic patterns are treated separately: for instance, both x and xi lead to presence in 3 and 4 and absence in 1 and 2, but xi includes loss along the Y-X branch, whereas x includes retention along the Y-X branch, followed by independent loss in 1 and 2.

Table 1. Possible histories of introns

n and m values (except n_X) are known values similar to those above (e.g., n_124 is the number of introns present in 1, 2, and 4, but not 3; m_124 = n_124 + n_1234). l, g, l_34, and n_X are unknown quantities to be estimated: l and g are the total number of introns lost and gained along the Y-X branch, respectively, l_34 is the number present in both 3 and 4 but lost along the Y-X branch, and n_X is the number present in X but absent in 3 and 4. g is thus n_X minus the number of introns present in Y, retained in X, and absent in 3 and 4 (xii–xv). Introns with histories xvi and xvii are neither gained nor lost along the Y-X branch nor directly observable as ancestrally present and are thus uninformative and ignored. For each of the n_X introns present in X but absent in 3 and 4, Table 2 gives the possible subsequent histories.

Table 2.

Possible histories and modern phylogenetic distributions for an intron present in ancestor X but absent in groups 3 and 4

The conditional probability of seeing the data simplifies to: [equation not reproduced] where K is the product of the factorials of several known n values (see Supporting Text, which is published as supporting information on the PNAS web site). The likelihood of a set of parameters is then L{o_1, o_2, o_3, o_4, r, l, g, l_34, n_X} = Pr{data|o_1, o_2, o_3, o_4, r, l, g, l_34, n_X}, which has its MLE at [equation not reproduced] where parentheses indicate that an intron must be present in at least one of the parenthesized groups to be counted (e.g., n_(12)3 = n_13 + n_23 + n_123; m_(12)3 = n_13 + n_134 + n_23 + n_234 + n_123 + n_1234).

Introns Gained and Subsequently Lost Along the Same Branch. For each branch, the estimated g includes only gained introns that survive to the end of the branch; the estimated l includes only introns present at the beginning of the branch and then lost. Both values exclude introns that are gained and then lost along the same branch. As such, g and l underestimate the real numbers of introns gained and lost along the branch (call them g′ and l′, respectively).

If introns are lost at a constant rate along the branch and a total fraction 1 − x of introns present at the beginning of a branch are lost before the end of that branch, an intron gained at a fraction f of the way along the length of the branch has an x^(1−f) chance of being retained until the end of the branch. If intron gains also occur at a constant rate along the branch, then

$$
g = g' \int_0^1 x^{1-f}\, df = g'\, \frac{x - 1}{\ln x},
\qquad \text{so} \qquad
g' = g\, \frac{\ln x}{x - 1}.
$$

For each external and internal branch, we used the MLE of g and x (equals ô_1 for external branches and r̂ for internal branches) to give estimates for g′, and then simply l′ = l + g′ – g.
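The correction for introns gained and then lost on the same branch can be illustrated as follows. Under the constant-rate assumptions above, a gain at fraction f of the branch survives with probability x^(1−f), so the expected surviving fraction of all gains is the integral of x^(1−f) over f from 0 to 1, which evaluates to (x − 1)/ln x. The function name and the example inputs below are hypothetical; this is a sketch of that arithmetic, not the authors' program.

```python
import math


def correct_for_transient_introns(g: float, l: float, x: float) -> tuple:
    """Return (g_prime, l_prime): gains and losses including introns that were
    gained and then lost along the same branch.

    g, l : estimated gains and losses that exclude such transient introns
    x    : fraction of introns present at the start of the branch that are
           retained at its end (0 < x <= 1)
    """
    if not 0.0 < x <= 1.0:
        raise ValueError("x must lie in (0, 1]")
    if x == 1.0:
        surviving_fraction = 1.0  # no loss at all, so every gain survives
    else:
        # Average of x**(1 - f) for f uniform on [0, 1]
        surviving_fraction = (x - 1.0) / math.log(x)
    g_prime = g / surviving_fraction
    l_prime = l + g_prime - g  # every transient intron adds one gain and one loss
    return g_prime, l_prime


if __name__ == "__main__":
    g_prime, l_prime = correct_for_transient_introns(g=350, l=500, x=0.5)
    print(f"g' = {g_prime:.1f}, l' = {l_prime:.1f}")
```

Because the surviving fraction always lies between x and 1, the corrected g′ is never smaller than g, and the same number of introns is added to both gains and losses.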

Rates of Intron Gain and Loss. For each branch, the estimated number of intron gains in the studied regions per year is simply the estimated g′ divided by T, the estimated branch length in years. The rate per site is then simply the rate for the whole region divided by the total number of possible insertion sites (488,157) in the region. The estimated yearly rate of loss is d = 1 − x^(1/T).
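A short sketch of these two rate calculations follows. The function names and the example branch length are hypothetical. The loss rate is computed with `math.expm1`, which gives the same quantity as 1 − x^(1/T) but avoids rounding problems when T is on the order of a billion years and d is correspondingly tiny.

```python
import math

N_SITES = 488_157  # possible insertion sites in the studied regions (from the text)


def gain_rate_per_site_per_year(g_prime: float, branch_years: float,
                                n_sites: int = N_SITES) -> float:
    """Gains per possible insertion site per year along one branch."""
    return g_prime / branch_years / n_sites


def loss_rate_per_year(x: float, branch_years: float) -> float:
    """Yearly loss rate d solving (1 - d)**T = x, i.e. d = 1 - x**(1/T)."""
    if not 0.0 < x <= 1.0:
        raise ValueError("x must lie in (0, 1]")
    return -math.expm1(math.log(x) / branch_years)


if __name__ == "__main__":
    T = 1.0e9  # illustrative branch length in years, not a value from the paper
    print(f"gain rate per site per year: {gain_rate_per_site_per_year(300, T):.2e}")
    print(f"loss rate per year:          {loss_rate_per_year(0.5, T):.2e}")
```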

Intron Number in the Plasmodium-Crown Ancestor. The MLE for the probability that an intron present in the animal–plant (crown) ancestor is retained in Arabidopsis thaliana is 0.61, and the MLE for the probability it is retained in Schizosaccharomyces pombe and/or an animal is 0.75 (because 73 of 97 introns present in Arabidopsis thaliana and Plasmodium falciparum are present in S. pombe and/or an animal), so the chance that an intron present in the crown ancestor is found in some modern descendent is 1 – (1–0.75) × (1–0.61) = 0.90.

Postulating a rate of intron loss per year d for the deepest branches of the tree and divergence times of t_1 and t_2 years for the plant–animal and crown–apicomplexan divergences, respectively, the probability that an intron present in the ancestor is retained in P. falciparum and some modern crown descendent is 0.9(1 − d)^(2t_2 − t_1). The observed 143 introns shared between P. falciparum and crown group taxa thus suggest some 143/[0.9(1 − d)^(2t_2 − t_1)] total introns present in the common ancestor.
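The estimate for the Plasmodium-crown ancestor can be written out as a small calculation. The exponent 2t_2 − t_1 is the sum of the two deep branch lengths: t_2 years from the deep ancestor to modern Plasmodium plus t_2 − t_1 years from the deep ancestor to the crown ancestor. The function names below are hypothetical, and the inputs in the usage example are the rounded values quoted in the text, so the output is illustrative and is not expected to reproduce the published estimates exactly.

```python
import math


def prob_found_in_crown_descendant(p_plant: float = 0.61, p_other: float = 0.75) -> float:
    """Chance that an intron in the crown ancestor survives in Arabidopsis
    or in S. pombe and/or an animal: 1 - (1 - p_other) * (1 - p_plant)."""
    return 1.0 - (1.0 - p_other) * (1.0 - p_plant)


def deep_ancestor_introns(shared: int, d: float, t2: float, t1: float,
                          p_crown: float) -> float:
    """Estimated introns in the Plasmodium-crown ancestor.

    shared  : introns shared between P. falciparum and crown taxa (143 in the text)
    d       : assumed yearly loss rate on the deepest branches
    t2, t1  : crown-apicomplexan and plant-animal divergence times in years
    p_crown : chance an intron in the crown ancestor is seen in a modern descendant
    """
    total_years = 2.0 * t2 - t1
    # (1 - d)**total_years, evaluated via log1p for numerical stability
    retained = p_crown * math.exp(total_years * math.log1p(-d))
    return shared / retained


if __name__ == "__main__":
    p = prob_found_in_crown_descendant()
    for d in (3.5e-10, 1.24e-9, 1.5e-9):
        n = deep_ancestor_introns(143, d, t2=1.75e9, t1=1.5e9, p_crown=p)
        print(f"d = {d:.2e} per year -> about {n:.0f} introns")
```

The estimate grows exponentially with the assumed loss rate, which is why the choice between a low and a high rate changes the answer by an order of magnitude.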

Results

Data Set. We studied the intron–exon structures of conserved regions of 684 sets of orthologs from seven eukaryotic species. For each set of orthologs, intron positions are mapped onto the protein sequence, and the protein sequences aligned, giving numbers of intron positions shared between any group (pair, trio, etc.) of species. The data are summarized in Tables 3 and 4. An earlier study used Dollo parsimony to reconstruct the history of intron gains and losses in this data set (34). Such a reconstruction is provided in Fig. 2 for comparison.

Fig. 2.

A Dollo parsimony reconstruction of the data for comparison with our results.

Table 3. Summary of the data used in estimating intron losses and gains along external branches
Table 4. Summary of the data used in estimating the numbers of intron losses and gains along internal branches

Numbers of Intron Losses and Gains. We used the pattern of intron conservation in the conserved regions of 684 groups of orthologs from seven eukaryotic species to calculate MLE for numbers of intron gains and losses in these genes along each branch of the tree. The estimates are shown in Fig. 3A with confidence intervals given in Fig. 4, which is published as supporting information on the PNAS web site. These estimates exclude introns gained and then lost along the same branch. Assuming constant rates of intron loss and gain along the length of each branch, we corrected our estimates for such introns (Fig. 3B).

Fig. 3.

MLE of the numbers of intron losses and gains for 684 groups of orthologs. Number of introns present in modern species or previously estimated present in ancestors (50) are given in black. MLE of the number of intron losses and gains along each branch are given in blue and red, respectively. Blue branches are inferred to have experienced >1.5 losses per gain; red branches >1.5 gains per loss. (A) Initial estimates, excluding introns that are gained and subsequently lost along the same branch. (B) Final results, correcting for introns that are gained and subsequently lost along the same branch. The estimate of the number of introns present in the studied regions in the Plasmodium crown ancestor is derived assuming that the deepest branches had an average rate of loss (see Discussion).

Overall Probabilities of Loss and Rates of Intron Loss and Gain. We calculated MLE for the fraction of introns lost along each external branch (Table 5; confidence intervals in Fig. 5, which is published as supporting information on the PNAS web site). Given estimates of the length of each external branch, we then estimated yearly rates of loss of existent introns and of insertion of new introns per possible insertion site (adjacent pair of nucleotides) (Table 5).

Table 5. Estimates of intron loss and gain for external branches

Discussion

We provide MLE for the numbers and rates of intron gain and loss in 684 sets of orthologs over a variety of eukaryotic lineages. These results show significant variations between lineages in rates of both intron loss and gain and in relative rates of the two processes. Rates of intron loss along external branches vary ≈10-fold, from ≈2 × 10^-10 to 2 × 10^-9. Rates of gain are orders of magnitude smaller, at 6 × 10^-13 to 4 × 10^-12 per year per possible insertion site (pair of adjacent nucleotides).

Ratios of the rates of the two processes also vary considerably between lineages. Among external branches, ratios of the rates of intron loss and gain vary 20-fold from 113 up to 2,380; ratios of the total numbers of intron losses to gains vary 10-fold from one-half to five. There is an inverse correspondence across branches between rates of intron gain and intron loss: some branches have high rates of loss and low rates of gain (Drosophila melanogaster, Anopheles gambiae, and S. pombe); others have high rates of gain and low rates of loss (H. sapiens and Arabidopsis thaliana). This pattern is not predicted by a purely neutral model of intron evolution but is instead suggestive of differential intensity or efficiency (because of differences in population size) of selection across lineages. The sole exception to this pattern is the branch running from the ecdysozoan ancestor to C. elegans, which shows high rates of both intron loss and gain. This observation joins a host of other differences in intron evolution between nematodes and other groups, with nematodes showing unusually strict splice junction consensus sequences (discussed in ref. 43) as well as a lack of a suite of otherwise general biases in the pattern of intron loss (51).

In contrast to the pattern seen among external branches, two of the internal branches show extremely skewed patterns, with the bilateran–ecdysozoan branch showing 1,005 intron losses but no intron gains and the opisthokont–bilateran branch showing 1,466 gains and only 48 losses. These aberrant patterns are most likely due to unaccounted for differences in loss rates between introns along the same branch. Such differences cause systematic underestimation of intron losses and overestimation of intron gains on external branches (thus our finding of an excess of intron loss is conservative); the pattern is not so predictable for internal branches (unpublished data). Future work should explore the effects of such interintron rate differences on intron loss and gain estimates for internal branches.

Comparison with Parsimony. Our results stand in contrast to those of Rogozin et al. (34), who used Dollo parsimony to reconstruct the history of intron gain and loss in this same data set. Such a reconstruction is given in Fig. 2. Comparison of Figs. 2 and 3 shows consistent differences in the estimates of numbers of intron losses and gains between the two methods, with parsimony generally favoring intron gain over intron loss. Whereas parsimony infers that 5 studied branches of 10 experienced at least 50% more intron gains than losses, maximum likelihood shows only 2 such branches. On the other hand, parsimony infers that only three branches have experienced 50% more losses than gains, whereas maximum likelihood shows six. Parsimony suggests that the bilateran–human branch has experienced 17 gains for each loss, whereas maximum likelihood shows equal numbers of gains and losses; parsimony shows two gains per loss in the ecdysozoan–C. elegans branch versus maximum likelihood's two losses per gain. These differences are caused by the failure of parsimony to estimate intron losses that are not directly observed, leading to an overemphasis of intron gain.
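The direction of this bias can be seen in the simplified three-group setting of Fig. 1A. The sketch below is an illustration only: the function name and counts are hypothetical, and it assumes that Dollo parsimony on the tree (3, (1, 2)) treats every intron unique to species 1 as a gain on the branch from X to 1 and counts only introns shared by groups 2 and 3 as losses on that branch. The likelihood-style estimate additionally allows for ancestral introns that went unobserved: those retained in species 1 but lost in both 2 and 3 (category v), and those lost in species 1 and not shared by 2 and 3 (category vi).

```python
def dollo_vs_ml_external(n123: int, n12: int, n13: int, n23: int, n1: int) -> dict:
    """Compare (gains, losses) on the X-to-1 branch under the two approaches."""
    # Parsimony-style count: only directly observed events
    dollo_gains, dollo_losses = n1, n23

    # Retention probabilities estimated from introns known to be ancestral
    o1 = n123 / (n123 + n23)
    o2 = n123 / (n123 + n13)
    o3 = n123 / (n123 + n12)
    n_x = n123 / (o1 * o2 * o3)  # estimated introns in ancestor X

    # Expected ancestral introns that parsimony cannot see
    hidden_retained = n_x * o1 * (1 - o2) * (1 - o3)  # category v
    hidden_lost = n_x * (1 - o1) * (1 - o2 * o3)      # category vi

    ml_gains = n1 - hidden_retained
    ml_losses = n23 + hidden_lost
    return {"dollo": (dollo_gains, dollo_losses), "ml": (ml_gains, ml_losses)}


if __name__ == "__main__":
    result = dollo_vs_ml_external(n123=600, n12=200, n13=150, n23=300, n1=400)
    print("parsimony (gains, losses):", result["dollo"])
    print("likelihood (gains, losses):", tuple(round(v, 1) for v in result["ml"]))
```

With these made-up counts, the parsimony-style tally gives 400 gains and 300 losses, whereas the likelihood-style estimate gives about 350 gains and 500 losses, so the same data move from an apparent excess of gains to an excess of losses.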

Early Eukaryotic Evolution. The external branches studied show 0.3 to 2 intron gains in the studied regions per million years. We previously estimated that the common ancestor of animals and plants harbored some 2,000 introns in these regions (50). If even earlier ancestors had accumulated 0.3 introns per million years, nearly 7 billion years of constant gain would be required to reach this density. Even at the highest rates observed (2 introns per million years), this intron density requires 1 billion years of steady intron accretion, still presumably predating the prokaryote–eukaryote splits.
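The arithmetic behind this argument is short enough to check directly. The sketch below takes the gain rates as introns per million years in the studied regions, as in the first sentence of the paragraph, and also shows how those figures follow from the per-site rates and the 488,157 possible insertion sites. The function names are hypothetical.

```python
N_SITES = 488_157  # possible insertion sites in the studied regions (from the text)


def gains_per_million_years(rate_per_site_per_year: float, n_sites: int = N_SITES) -> float:
    """Convert a per-site yearly gain rate into region-wide gains per million years."""
    return rate_per_site_per_year * n_sites * 1.0e6


def years_to_accumulate(n_introns: float, gains_per_myr: float) -> float:
    """Years of steady gain needed to reach n_introns starting from none."""
    if gains_per_myr <= 0:
        raise ValueError("gain rate must be positive")
    return n_introns / gains_per_myr * 1.0e6


if __name__ == "__main__":
    for site_rate in (6e-13, 4e-12):
        print(f"{site_rate:.0e} per site per year -> "
              f"{gains_per_million_years(site_rate):.2f} gains per million years")
    for rate in (0.3, 2.0):
        print(f"{rate} gains per million years -> "
              f"{years_to_accumulate(2000, rate) / 1e9:.1f} billion years for 2,000 introns")
```

At 0.3 gains per million years the 2,000 introns would take about 6.7 billion years to accumulate, and at 2 gains per million years they would take 1 billion years.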

This apparent paradox suggests that recent intron creation may be very different from the process that created the first spliceosomal introns. Two such two-tiered systems have been proposed. First, the introns present in the plant–animal ancestor could have been largely due not to insertion, but to retention of introns present at the time of formation of their resident genes, as envisioned by the introns-early hypothesis. The number of such introns would thus be unrelated to more recent intron insertion rates.

Alternatively, introns in early eukaryotes could also have been gained, but by different processes than more recent gains. Indeed, proposed models for insertion of new introns (intron transposition, ref. 2; transposon insertion, ref. 69; tandem genomic duplication, ref. 70; and transfer of introns from paralogs through gene conversion, ref. 71) assume a preexistent spliceosome and cannot explain the initial emergence of the spliceosomal system. Yet the only proposed model that offers such an explanation, transfer of type II introns from bacterial endosymbionts (5), cannot explain observed recent intron gains in species whose endosymbionts lack such introns (e.g., ref. 44). Thus, if any proposed models of intron gain are correct, new intron gains in at least some species must result from processes different from those that created the first spliceosomal introns.

The first introns could have arisen by a major event in early eukaryotic evolution coincident with the creation of the splicing machinery, most plausibly a massive invasion of the eukaryotic nucleus by type II introns from early endosymbionts (5). At this time, the self-splicing type II intron apparatus would have transformed into the nascent eukaryotic spliceosome (72). More recent introns could then arise either by further type II intron insertions or completely unrelated processes. In this case, early arising introns would be well defined, truly homologous, type II-related elements; more recent introns would not necessarily be homologous either to earlier-arising introns or to each other. Instead, new introns would arise from any mutation causing an insertion of sequence into a coding region that is then efficiently removed from transcripts by the spliceosome.

An additional corollary of our results, depending on the phylogeny (e.g., ref. 73), concerns the intron density of the common ancestor of apicomplexans with plants, animals, and fungi. Although the lack of an outgroup in the data set prohibits direct estimation of the number of introns present in that ancestor, we can make inferences assuming that intron loss rates in those deepest branches were similar to rates observed in other branches. Assuming the branches from the Plasmodium crown divergence [assumed to be 1.75 billion years ago (Bya)] to the crown ancestor (assumed 1.5 Bya) and to modern Plasmodium experienced a low rate of intron loss similar to rates in chordates and plants (say 3.5 × 10^-10 per year) gives an estimate of 305 introns in the deep ancestor in the studied regions. Assuming a high rate of loss such as that found along other branches (say 1.5 × 10^-9 per year) yields an estimate of 4,099 introns, more than in any studied species. The presumably unicellular character of such an ancestor suggests evolutionary pressures more similar to those in yeast than in vertebrates, favoring the latter estimate and suggesting extraordinarily deep intron-rich eukaryotic ancestors. Using the average estimated rate of loss over all external branches (1.24 × 10^-9) gives an estimate of 1,948 introns in this ancestor, close to the estimates for the crown group and opisthokont ancestors (Fig. 3). Intron-dense gene structures thus may be even older than previously appreciated.

Using similar methods, Nielsen et al. (49) recently found that intron losses and gains in four species of euascomyces fungi had been roughly balanced for the past ≈330 million years. This finding contrasts with the general excess of losses over gains found here and is particularly surprising in view of the observed correlation of intron number with organismal complexity, which might predict high loss rates in euascomyces. One possible explanation is that, having shed most of their unnecessary introns in early fungal evolution, euascomyces have experienced a more recent equilibration in intron number. A large fraction of the few remaining ancestral introns could be retained because of some selective advantage, whereas other introns are gained and lost in roughly equal numbers.

Conclusion

These results illuminate the relative importance of intron loss and gain in eukaryotic evolution. Rates of loss of existent introns are slightly lower than nucleotide substitution rates. The rate at which introns are gained per site is orders of magnitude smaller. Over a range of external branches, the ratio of total intron losses to gains varies from one-half to five. Studied lineages appear to be gaining introns at a rate that cannot explain the apparently high intron density of very early eukaryotic genomes, suggesting that the processes of intron birth in early eukaryotes could be fundamentally different from the processes in more recent evolution.

Footnotes

  • * To whom correspondence should be addressed. E-mail: scottroy@fas.harvard.edu.

  • Abbreviation: MLE, maximum likelihood estimates.

  • Copyright © 2005, The National Academy of Sciences

Rates of intron loss and gain: Implications for early eukaryotic evolution
Scott William Roy, Walter Gilbert
Proceedings of the National Academy of Sciences Apr 2005, 102 (16) 5773-5778; DOI: 10.1073/pnas.0500383102
