Evaluation of methods for detecting recombination from DNA sequences: Computer simulations

David Posada and Keith A. Crandall
PNAS November 20, 2001 98 (24) 13757-13762; https://doi.org/10.1073/pnas.241370698
Edited by John C. Avise, University of Georgia, Athens, GA, and approved September 25, 2001 (received for review July 18, 2001)


Abstract

Recombination is a key evolutionary process that shapes the architecture of genomes and the genetic structure of populations. Although many statistical methods are available for the detection of recombination from DNA sequences, their absolute and relative performance is still unknown. Here we evaluated the performance of 14 different recombination detection algorithms. We used the coalescent with recombination to simulate DNA sequences with different levels of recombination, genetic diversity, and rate variation among sites. Recombination detection methods were applied to these data sets, and whether or not they detected recombination was recorded. Different recombination methods showed distinct performance depending on the amount of recombination, genetic diversity, and rate variation among sites. The model of nucleotide substitution under which the data were generated did not seem to have a significant effect. Most methods gained power with increasing sequence divergence. In general, recombination detection methods seem to capture the presence of recombination, but they are not very powerful. Methods that use substitution patterns or incompatibility among sites were more powerful than methods based on phylogenetic incongruence. Most methods do not seem to infer more false positives than expected by chance. Depending especially on the amount of diversity in the data, different methods could be used to attain maximum power while minimizing false positives. Results shown here will provide some guidance in the selection of the most appropriate method or methods for the analysis of the particular data at hand.

Recombination, defined here as the exchange of genetic information between two nucleotide sequences, is an important process that influences biological evolution at many different levels. Recombination explains a considerable amount of genetic diversity in natural populations and, in general, genes located in regions of the genome with low levels of recombination have low levels of polymorphism. Recombination reshuffles existing variation and even creates new variants at the amino acid level. Indeed, recombination shapes the genetic structure of natural populations (1, 2) and the action of natural selection (3). Characterization of the role of recombination across genomes is of major interest. The study of recombination events will allow us to better understand the dynamics of genomes (4, 5). Recombination breaks down linkage disequilibrium and, consequently, the characterization of recombination is essential for gene mapping, quantitative trait loci, and association studies (6). In addition, recombination has a significant impact on the evolution of several human pathogens (7–9) and consequently on their clinical treatment and prevention. Moreover, many applications in biology today are based on the estimation of phylogenetic trees. One main assumption of most phylogenetic methods is that there is only one phylogeny underlying the evolution of the sequences under study. Recombination violates this assumption by generating mosaic genes, where different regions have different phylogenetic histories. By ignoring the presence of recombination, phylogenetic analysis may be severely compromised (10–13).

For all these reasons, the accurate detection of recombination from DNA sequences becomes very relevant, and indeed a number of methods have been developed for that purpose (D. L. Robertson, http://grinch.zoo.ox.ac.uk/RAP_links.html). Surprisingly, only a few studies have attempted to examine the relative performance of these methods (14–17). Although useful, these studies have been limited in the number of methods compared and the set of conditions evaluated. In practice, researchers are unable to make an objective selection of the most suitable method to detect recombination in their data. Here we perform a comprehensive analysis of 14 different methods for detecting recombination to determine relative performance and associated conditions of performance. We simulated DNA sequences with different rates of recombination, diversity, and rate heterogeneity to investigate the statistical power and the rate of false positives of the 14 different recombination detection algorithms.

Methods

A glossary of terms is given in Table 1. To study the statistical power (1 minus the Type II error rate, i.e., the probability of rejecting the null hypothesis of no recombination when it is false) and the rate of false positives (Type I error, or the probability of rejecting the null hypothesis when it is true) of the methods evaluated, two sets of simulations were carried out. In the first set (Simulations I, power analysis; Table 2), an increasing recombination rate was simulated for different levels of variation. In the second set (Simulations II, false positives; Table 3), increasing rate variation among sites was simulated for different levels of variation and without recombination. This design allows us to examine the confounding effect of rate variation with recombination (10). For each set of conditions, 100 replicates were simulated. The range of parameter values used in the simulations is commonly observed in real data sets. Software to perform these simulations is available on request. The general simulation strategy was:

Table 1. Glossary of terms

Table 2. Parameter values in Simulations I (power analysis)

Table 3. Parameter values used in Simulations II (false positive analysis)

(i) Simulate recombinant genealogies by using the coalescent with recombination.

(ii) Evolve nucleotide sequences on the simulated genealogy to obtain a sequence alignment.

(iii) Apply 14 different methods to the simulated data and record how many times, of 100 replicates, a method infers the presence of recombination.

The Coalescent with Recombination.

Multiple genealogies for samples of n = 10 sequences were simulated under the coalescent with recombination (18–26). In the recombination model implemented here, there are n sequences with l sites, and the population consists of N diploid individuals. A continuous time approximation is used, and the time is scaled in units of 2N generations. The recombination rate (ρ) is defined as ρ = 4Nrl, where r is the rate of recombination per site per generation.

The coalescent is built backwards in time. It is constructed by waiting for recombination or coalescent events until all ancestral sites in the n sequences have found a most recent common ancestor (not necessarily the same ancestor for all sites). The waiting times to a coalescent event are exponentially distributed with rate k(k − 1)/2, where k is the number of sequences at a given time. The waiting times to a recombination event are also exponentially distributed, but with rate 2NrG, where G is the number of potential locations where recombination can happen (20). This quantity is a number between 0 and (l − 1)k, and it depends on the outcome of the previous events. G can be written as $G = \sum_{i=1}^{k} g_i$, where g_i is the number of ways a sequence can be a recombinant descendant of two sequences both ancestral to the sample. As an example, the values of g_i associated with the 4-tuples (a sequence with four sites) (1, 0, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0), and (1, 0, 0, 0) are 3, 2, 1, and 0, respectively (20) (where 1 denotes that a site is ancestral and 0 that it is nonancestral). A site between two segments of ancestral material is called a "trapped site"; in the tuple (1, 0, 1, 0), the first 0 is trapped, whereas the second is not (21).
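The four example values above are consistent with reading g_i as the distance between the first and the last ancestral site of a sequence, so that trapped nonancestral sites count as possible breakpoints. The following minimal Python sketch works under that reading, which is an assumption and not a statement of the authors' implementation; the function names are hypothetical.

```python
def recombination_ways(ancestral):
    """Return g_i for one sequence, given its 0/1 ancestral flags per site.

    Assumption: g_i is the number of breakpoints lying between the first
    and the last ancestral site of the sequence.
    """
    positions = [i for i, flag in enumerate(ancestral) if flag == 1]
    if not positions:
        return 0
    return positions[-1] - positions[0]


def total_recombination_locations(sequences):
    """Return G, the sum of g_i over all sequences."""
    return sum(recombination_ways(seq) for seq in sequences)


# The four 4-tuples used as examples in the text.
examples = [(1, 0, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0), (1, 0, 0, 0)]
g_values = [recombination_ways(t) for t in examples]
G = total_recombination_locations(examples)
print(g_values, G)
```

For k fully ancestral sequences of l sites, the same function gives (l − 1)k, which is the upper bound stated above.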

Because coalescent and recombination events are independent, the time to the first of these events is exponentially distributed with rate k(k − 1)/2 + 2NrG. The probabilities that a given event is either a coalescence or a recombination are $P(\text{coalescence}) = \frac{k(k-1)/2}{k(k-1)/2 + 2NrG}$ and $P(\text{recombination}) = \frac{2NrG}{k(k-1)/2 + 2NrG}$. Simulation of a single realization of this process is performed by starting with k = n sequences at time 0, and determining when the first event (coalescence or recombination) happens by drawing a random number u from the uniform distribution: $t = -\ln(u) / [k(k-1)/2 + 2NrG]$. A decision is made whether this event is a coalescence or a recombination based on their relative probabilities, and then the number of sequences k is updated. If the event is a coalescence, two sequences are chosen uniformly, and their material is coalesced. The number of sequences decreases by one (k = k − 1). In the case of a recombination event, one sequence is chosen at random by assigning probabilities to the sequences based on their relative g_i values. A recombination breakpoint is then chosen uniformly over the ancestral material and the nonancestral material that is trapped between two blocks of ancestral material. When a recombination event happens, the number of sequences increases by one (k = k + 1). After each event, the value of G is updated [initially, G = (l − 1)k]. The process is continued until each site in the extant sequences has found a most recent common ancestor (MRCA). With recombination, different parts of the alignment are likely to have different coalescent trees and different times to the MRCA.
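A single step of the process described above might be sketched in Python as follows. This is only an illustration of how the waiting time and the event type can be drawn: it does not track ancestral material or update G, and the function names and the example values (n = 10, l = 1,000, ρ = 4) are chosen here for illustration.

```python
import math
import random


def event_rates(k, G, two_N_r):
    """Return (coalescence rate, recombination rate) for k sequences."""
    coalescence = k * (k - 1) / 2.0
    recombination = two_N_r * G
    return coalescence, recombination


def draw_next_event(k, G, two_N_r, rng):
    """Draw the waiting time, the event type, and the updated k."""
    coalescence, recombination = event_rates(k, G, two_N_r)
    total = coalescence + recombination
    u = 1.0 - rng.random()              # uniform on (0, 1], avoids log(0)
    waiting_time = -math.log(u) / total
    if rng.random() < coalescence / total:
        return waiting_time, "coalescence", k - 1
    return waiting_time, "recombination", k + 1


rng = random.Random(1)
n, l, rho = 10, 1000, 4.0
two_N_r = rho / (2.0 * l)               # follows from rho = 4Nrl
G0 = (l - 1) * n                        # initial value of G
coal, rec = event_rates(n, G0, two_N_r)
p_coal = coal / (coal + rec)
t, kind, k_after = draw_next_event(n, G0, two_N_r, rng)
print(round(p_coal, 4), kind, k_after)
```

A full simulator would repeat this step, choosing the affected sequences and the breakpoint as described in the text and recomputing G after every event.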

The number of recombinational events in the history of a sample of size n, R(n), has the expectation $E[R(n)] = \rho \sum_{i=1}^{n-1} 1/i$ (19). Not all recombination events, R(n) in total, are detectable. Regarding their effect on the genealogy, there are three types of recombination events: events that do not change the branch lengths, events that do change the branch lengths but do not change the topology, and events that change the topology. Wiuf et al. (15) provide the expectations for these three types of events.
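As a quick numerical check of this expectation, the short Python sketch below evaluates it for samples of n = 10 sequences at the recombination rates mentioned in the text (ρ = 1, 4, 16, and 64). The function name is hypothetical.

```python
def expected_recombination_events(n, rho):
    """Expected number of recombination events: rho times the sum of 1/i, i = 1..n-1."""
    return rho * sum(1.0 / i for i in range(1, n))


expected = {rho: expected_recombination_events(10, rho) for rho in (1, 4, 16, 64)}
for rho, value in expected.items():
    print(rho, round(value, 2))
```

For n = 10 the sum is about 2.83, so ρ = 1 corresponds to roughly 3 expected events and ρ = 4 to roughly 11.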

Sequence Evolution.

Sequences were evolved on the simulated (potentially) recombinant genealogies. Several models of nucleotide substitution were used (Table 4) to study the effect of base frequency and transition/transversion ratio on the detection of recombination. Different mutation rates were used to obtain alignments with different levels of divergence (Tables 2 and 3). The expected average pairwise sequence divergence (p) depends on the parameters of the model of nucleotide substitution. A rough approximation for models with no rate variation would be $p \approx \frac{\theta/l}{1 + \theta/l}$, where θ = 4Nμl, and μ is the mutation rate per site per generation. For example, a value of θ = 100 would indicate that a randomly chosen pair of sequences is expected to differ in 0.1/(1 + 0.1) ≈ 9% of their sites (given a sequence length of l = 1,000).
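The rough approximation above is easy to tabulate. The Python sketch below assumes the per-site form implied by the worked example (θ/l divided by 1 + θ/l) and evaluates it for the three diversity levels discussed in the Results (θ = 10, 100, and 200) with l = 1,000. The function name is hypothetical.

```python
def expected_pairwise_divergence(theta, length):
    """Rough expected pairwise divergence for models without rate variation."""
    per_site = theta / length
    return per_site / (1.0 + per_site)


divergence = {theta: expected_pairwise_divergence(theta, 1000) for theta in (10, 100, 200)}
for theta, p in divergence.items():
    print(theta, f"{100 * p:.1f}%")
```

Under this approximation the three levels correspond to about 1%, 9%, and 17% divergence.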

Table 4. Models of evolution and parameter values used in the simulations

Performance Evaluation.

The recombination detection algorithms were applied to the simulated data sets, and the number of times, out of the 100 replicates, that a method inferred the presence of recombination was recorded. This number approximates the probability of detecting recombination for each method and therefore is a convenient indicator of performance. Although some methods provide a qualitative answer for the presence of recombination (yes or no), most methods calculate a P value. In the latter case, recombination was inferred when the provided P value was smaller than 0.05.
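The scoring rule described above can be sketched in a few lines of Python. The P values in the example are made up for illustration and are not results from the study; the function name is hypothetical.

```python
def detection_rate(p_values, alpha=0.05):
    """Fraction of replicates in which recombination is inferred (P < alpha)."""
    detected = sum(1 for p in p_values if p < alpha)
    return detected / len(p_values)


# Hypothetical P values for five replicates (illustration only).
example_p_values = [0.001, 0.20, 0.04, 0.05, 0.60]
rate = detection_rate(example_p_values)
print(rate)
```

Note that a P value exactly equal to 0.05 does not count as a detection, because the criterion is "smaller than 0.05".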

Methods for Detecting Recombination.

We evaluated 14 methods for the detection of recombination (Table 5). A detailed description of these methods is published as supporting information on the PNAS web site, www.pnas.org. In general, we can tentatively classify these methods as:

Table 5. Methods for detecting recombination evaluated for performance

(i) Distance Methods.

Distance methods look for inversions of distance patterns among the sequences (27). In general, they use a sliding window approach and the estimation of some statistic based on genetic distances among the sequences. Because the phylogeny does not need to be known, these are normally fast methods.

(ii) Phylogenetic Methods.

Several methods infer recombination when phylogenies from different parts of the genome result in discordant topologies or when orthologous genes from different species are clustered. When comparisons of adjacent sequences yield different branching patterns, there is reason to suspect the involvement of recombinational events. If the consequence of such changes results in reconciling different sequence phylogenies to a single phylogeny, then the existence of such events becomes a reasonable hypothesis (28–33). These are the methods most extensively used in the literature.

(iii) Compatibility Methods.

Compatibility methods test for phylogenetic incongruence on a site-by-site basis and do not require the phylogeny of the sequences analyzed to be known (34, 35).

(iv) Substitution Distribution.

Nucleotide substitution distribution methods examine the sequences for a significant clustering of substitutions or fit to an expected statistical distribution (S. A. Sawyer, http://www.math.wustl.edu/∼sawyer/geneconv/index.html; refs. 36–43).

Implementation of Methods for Detecting Recombination.

Unless otherwise noted, the number of permutations was 1,000, and the family significance level used was 0.05. Permuted alignments were obtained by randomizing the position of the columns in the alignment.
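The column-permutation scheme can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the toy alignment, the statistic (longest run of identical sites between the first two sequences), and the convention of counting permuted values greater than or equal to the observed one. It is not the code used in the study.

```python
import random


def permute_columns(alignment, rng):
    """Return a copy of the alignment with its columns in random order."""
    order = list(range(len(alignment[0])))
    rng.shuffle(order)
    return ["".join(seq[j] for j in order) for seq in alignment]


def longest_shared_run(alignment):
    """Toy statistic: longest run of identical sites in the first two sequences."""
    best = run = 0
    for a, b in zip(alignment[0], alignment[1]):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best


def permutation_p_value(alignment, statistic, n_perm=1000, seed=0):
    """Share of permuted alignments whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = statistic(alignment)
    hits = sum(
        1 for _ in range(n_perm)
        if statistic(permute_columns(alignment, rng)) >= observed
    )
    return hits / n_perm


alignment = ["AAAAAAAACCCC", "AAAAAAAAGGGG"]
permuted = permute_columns(alignment, random.Random(42))
p_value = permutation_p_value(alignment, longest_shared_run)
print(longest_shared_run(alignment), p_value)
```

Shuffling whole columns keeps the content of every site intact and destroys only the spatial arrangement of sites, which is the signal that most of the methods evaluated here rely on.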

The Windows program simplot (33) was generously modified by Stuart Ray (simplot's author) to implement the bootscanning (44) of every sequence in the alignment against the rest. We used a sliding window size of 200 base pairs and a step size of 10 nucleotides. Neighbor-joining trees were estimated by using F84 distances (45, 46), and bootstrap values were obtained from 100 replicates. Several bootstrapping thresholds for assignment of parenthood were explored (70, 90, and 95%), but only the 95% threshold provided reasonable false positive rates. To implement the method of Sawyer (http://www.math.wustl.edu/∼sawyer), we used a modified version of geneconv 1.81. Global permutation P values smaller than 0.05, based on BLAST-like global scores and obtained from 10,000 replicates, were considered evidence of recombination. A multiple comparison correction is already built into these P values, so there was no need for further correction. The parameter gscale, which scales the mismatch penalty, was set to 0. To implement the homoplasy test (42), two QBasic programs written by Maynard Smith were translated into a single C program, which was benchmarked against the original implementation. Because an outgroup was not used, and to be conservative, the number of effective sites, Se, was taken to be 0.6 × the total number of sites.

The program pist (A. Rambaut and M. Worobey, http://evolve.zoo.ox.ac.uk/) was modified for simulations to implement the informative sites test (43). pist takes as input an alignment, a tree, and the parameter values of a model of evolution. For each data set, a maximum likelihood (ML) tree was estimated under the Hasegawa–Kishino–Yano (HKY) + Γ model. At the same time, we obtained ML estimates of the parameters in the HKY + Γ model (π, κ, and α). The trees and model parameter estimates for each data set were used in the pist analysis. For the parametric simulation of the null distribution of the statistic (see supporting information, www.pnas.org), 100 replicates were used. A computer program was written in C implementing a modification of Maynard Smith's maximum χ² method (15, 41) by using only variable sites. The statistic is the maximum χ² in the original alignment. The P value equals the number of times the original statistic is smaller than the statistic from permuted alignments divided by the number of permutations. For all calculations, a sliding window was used, with the width of the windows set to the number of polymorphic sites divided by 1.5. This window moved in steps of one nucleotide at a time. A previous implementation calculated the P values by calculating the value of the statistic in the permuted data sets exactly at the same position (breakpoint and sequences) where the original maximum was found. This strategy resulted in many false positives and was discarded. The computer program chimaera was written in C, implementing the maximum mismatch χ² method (D.P., unpublished work) (see supporting information). The rest of the implementation is the same as in the maximum χ² method. A computer program was written in C implementing an extension of the Phylogenetic Profiles (phypro) method (27), which in its original form does not provide statistical significance. Only variable sites were used. The statistic is the minimum distance vector correlation in the original alignment. The rest of the implementation was the same as in the maximum χ² and chimaera methods.
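To make the maximum χ² statistic mentioned above more concrete, here is a minimal Python sketch of the basic pairwise idea: for every candidate breakpoint, a 2 × 2 table of differing and identical variable sites to the left and right of the breakpoint is scored with a χ² value, and the largest value is kept. This is a simplified illustration only. It omits the sliding window and the permutation P value described in the text, and it should not be read as the program used in the study; the function name and the toy input are hypothetical.

```python
def max_chi2(diffs):
    """Return (maximum chi-square, breakpoint) for one pair of sequences.

    diffs holds one 0/1 flag per variable site: 1 if the two sequences
    differ at that site, 0 if they are identical.
    """
    n = len(diffs)
    total_diff = sum(diffs)
    total_same = n - total_diff
    best, best_break = 0.0, None
    left_diff = 0
    for b in range(1, n):                    # breakpoint after site b
        left_diff += diffs[b - 1]
        left_same = b - left_diff
        right_diff = total_diff - left_diff
        right_same = (n - b) - right_diff
        denom = b * (n - b) * total_diff * total_same
        if denom == 0:                       # a margin is empty, no test possible
            continue
        chi2 = n * (left_diff * right_same - left_same * right_diff) ** 2 / denom
        if chi2 > best:
            best, best_break = chi2, b
    return best, best_break


# Toy pair: all differences fall in the first half of the variable sites.
statistic, breakpoint = max_chi2([1, 1, 1, 1, 0, 0, 0, 0])
print(statistic, breakpoint)
```

In the toy example the differences are perfectly clustered, so the maximum is reached at the boundary between the two halves.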

The program plato (30) was also modified for simulations. For each data set, a maximum likelihood tree was estimated, with parameter estimates then obtained under the Hasegawa–Kishino–Yano + Γ model of evolution. Those values were used in the calculation of the likelihoods in plato. A null distribution was simulated by 100 Monte Carlo replicates. The default window settings were used (minimum size, 5; step, 1). The Windows program rdp (32) was generously modified by Darren Martin for the simulations. After exploring different conditions, the best settings were using internal and external reference sequences and a window size of 10 nucleotides. Turning off the multiple significance correction improved performance. The C program recpars (K. Fisker, ftp://ftp.daimi.aau.dk/pub/empl/kfisker/programs/RecPars) was modified for the simulations to implement the recombination parsimony method (28, 29). A recombination cost of three times the maximum substitution cost (d = 3 × s) was found to perform the best with the statistic below. The substitution costs were the true values of the parameters of the model. The statistic used was the number of histories recovered. If more than one history was recovered, recombination was inferred.

The C program reticulate (34) was modified for the simulations. The statistic used was the neighbor similarity score. A C program was written implementing the runs test (40). The program was benchmarked against results in Takahata's paper (40). Sneath's method (38, 47) implemented in QBasic was translated into C and benchmarked against the original implementation. Because P values are calculated for each pairwise comparison, P values were Bonferroni-corrected. To evaluate Stephens' method (36), an improved implementation written in Fortran by Mary Kuhner (48) was translated into C and benchmarked against the original implementation. Because many tests are made, the Bonferroni correction was applied with a family α level of 0.01 (a 0.05 level gave an excess of false positives).
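The Bonferroni correction mentioned above can be sketched as follows. The sketch assumes that recombination is inferred for a data set when the smallest of the m P values falls below the family level divided by m; the P values shown are invented, and the 45 comparisons correspond to all pairs among 10 sequences.

```python
def bonferroni_detects(p_values, family_alpha=0.05):
    """Infer recombination if any test passes the Bonferroni-corrected level."""
    threshold = family_alpha / len(p_values)
    return min(p_values) < threshold


# Hypothetical P values for the 45 pairwise comparisons among 10 sequences.
weak_signal = [0.002] + [0.5] * 44
strong_signal = [0.0005] + [0.5] * 44

print(bonferroni_detects(weak_signal))                       # 0.05 / 45 is about 0.0011
print(bonferroni_detects(strong_signal))
print(bonferroni_detects(strong_signal, family_alpha=0.01))  # 0.01 / 45 is about 0.0002
```

Lowering the family level from 0.05 to 0.01, as was done for Stephens' method, makes the per-test threshold five times stricter.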

Results

Different methods for detecting recombination showed very distinct performance in different conditions (Fig. 1).

Figure 1. Power (Left) and rate of false positives (Right) corresponding to 14 recombination detection algorithms. The probability of detecting recombination is plotted against increasing levels of recombination (ρ) and nucleotide diversity (θ). Sequences were evolved under the Hasegawa–Kishino–Yano model of evolution.

Power.

At very low divergence (θ = 10), the homoplasy test seems to be the most powerful method, attaining 80 and 100% detection levels when ρ equals 16 and 64, respectively. reticulate follows with 49 and 88% detection, respectively. pist attained 73% power only when ρ equals 64. Substitution methods like chimaera, maxchi, and geneconv show similar power compared with the methods above at low recombination levels, but do not increase detection even with increasing amounts of recombination after ρ = 16. These substitution methods (chimaera, maxchi, and geneconv) and phypro were the most powerful in detecting recombination, followed by reticulate. Phylogenetic methods performed the worst. Some of them increased their overall power (rdp, recpars, triple) with increasing amounts of recombination (but the power was always lower relative to the substitution methods), whereas others detected recombination only when it was frequent (plato, bootscanning). At medium levels of divergence (θ = 100), power slightly increased for all methods, especially for low recombination rates, except for the homoplasy test, which detected recombination only when it was extremely frequent. At a high level of divergence (θ = 200), power again increased for low levels of recombination, except for the homoplasy test.

False Positives.

Most methods showed false positive rates around the expectation of 5%. However, the homoplasy test inferred recombination 30–86% of the time with extreme levels of rate variation (α = 0.05). At high levels of divergence, methods like pist and, to a lesser extent, recpars, rdp, and triple, also inferred recombination 11–49% of the time when rate variation was extreme. This trend toward more false positives with high rate variation was more evident with increasing levels of divergence.

Models of Nucleotide Substitution.

The model of substitution used in the simulations did not have significant impact on the power of the different recombination detection methods (see supporting information on the PNAS web site). However, it seems that more complex models slightly increased the power and rate of false positives for some methods.

Discussion

Most methods showed more power with increased rates of recombination, which is the expected behavior for efficient methods. However, some methods are more efficient than others. Most methods showed better performance at higher levels of divergence, probably because in such cases there is more information available to recognize the footprint of recombination. The only method that showed decreased power with more sequence divergence was the homoplasy ratio. For most methods, a minimum sequence divergence of 5% seems necessary to attain substantial power. When the number of recombination events in the history of the sequences is around 3 (ρ = 1), the most powerful methods inferred the presence of recombination only 50% of the time, which indicates that several recombination events are needed in order for these methods to detect recombination. However, it should be kept in mind that around 10% of the data sets simulated under ρ = 1 contain no recombination events. For the most powerful methods to detect recombination around 80% of the time, 12 recombination events (ρ = 4) are needed.

In this simulation, methods based on the patterns of substitutions and in-site compatibility worked better than phylogenetic methods, a result also obtained by Brown et al. (16) and Wiuf et al. (15) in their comparison of four recombination detection methods. Simple implementations of methods like chimaera or maxchi, based on summary statistics, performed the best. Indeed, phylogenetic methods can detect only recombination events that change the topology (see supporting information on the PNAS web site), although at high recombination rates, there should be plenty of such events.

To maximize the chances of detecting recombination when it is present and to avoid, at the same time, the inference of recombination when it is absent, it will be useful to estimate the amount of divergence and rate heterogeneity in the data. Given those estimates and the performance graphs shown here, the most suitable methods for detecting recombination for the data at hand might be readily selected. For example, if variation is around 1%, the homoplasy test could be the method of choice, as long as there is not much rate variation. Indeed, this method was intended to work at low levels of divergence (42). For the more divergent data sets (5–20%), methods like chimaera, maxchi, phypro, reticulate, and geneconv are more powerful and do not infer false positives in excess.

The power of different methods for detecting recombination is not superb, but recombination is not an easy problem. Several methods seem to often capture the presence of recombination but detect far less recombination than possible, a fact also pointed out by Wiuf et al. (15). Fortunately, recombination methods do not seem to infer many false positives. Here we study only the different methods from a qualitative point of view. Of course, the problem of recombination is much more complex than that, and includes the identification of parentals and recombinant individuals (sequences) and the localization of the recombinational breakpoint/s. Indeed, in many cases it is important not only to detect the presence of recombination but also to measure its frequency (17).

There are two different contexts in which we may wish to detect recombination: rare recombination or frequent repeated recombination (17). In this study, we have tackled both problems by simulating data over a wide range of recombination rate values. Not surprisingly, most methods have trouble detecting rare recombinational events, especially when sequence divergence is low. Indeed, recent events should be more easily identifiable than older events, as the latter may be obscured by subsequent mutation. On the other hand, when recombination rates are very high (higher than those simulated here), leading to situations close to linkage equilibrium, substitution methods might have trouble identifying site patterns (17).

Current recombination methods do not seem to make use of the information contained in the substitution pattern in the data (i.e., model of evolution). Nevertheless, this information could be used to better distinguish between those homoplasies produced by recombination and those produced by mutation.

Our results are based on computer simulations, which are simplifications of the problem. However, analysis of real data (D.P., unpublished work) seems to confirm and validate the conclusions obtained here. It should be noted that we used a limited number of replicates (100) to explore a reasonable parameter space within a practical computing time. The 95% confidence limits for the estimate of power, x, are given by $x \pm 1.96\sqrt{x(1 - x)/100}$. This confidence interval will be largest when x = 0.5, being ≈0.402–0.598, implying that the power of some methods would not be statistically distinguishable in some situations (see Fig. 1).
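The confidence limits quoted above follow from the normal approximation to a binomial proportion estimated from 100 replicates. A short Python sketch (the function name is hypothetical) reproduces the widest interval and shows how it narrows toward the extremes:

```python
import math


def power_confidence_interval(x, replicates=100, z=1.96):
    """Normal-approximation 95% limits for a power estimate x."""
    half_width = z * math.sqrt(x * (1.0 - x) / replicates)
    return x - half_width, x + half_width


intervals = {x: power_confidence_interval(x) for x in (0.05, 0.5, 0.8)}
for x, (low, high) in intervals.items():
    print(x, round(low, 3), round(high, 3))
```

With 100 replicates, two methods whose estimated power differs by less than about 10 percentage points near x = 0.5 have overlapping intervals.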

Indeed, the accurate inference of recombination is a key to understanding the role of the different molecular evolutionary processes and the architecture of genes and genomes. Hopefully, the results shown here will provide some guidance in the selection of the most appropriate method/s for analysis of the particular data at hand.

Acknowledgments

Part of this work was accomplished during Fall 2000 at Eddie Holmes' group at the University of Oxford. We benefited from discussions with Eddie Holmes, Carsten Wiuf, Mike Worobey, Andrew Rambaut, David Robertson, Korbinian Strimmer, John Huelsenbeck, and John Maynard Smith. Two anonymous reviewers provided useful comments. Special thanks to Darren Martin and Stuart Ray for modifying their programs for us. Thanks to Andrew Rambaut and Mike Worobey for giving us access to pist before it was published. Thanks to Mary Kuhner for sending us the triple code. This work was supported by a Brigham Young University Graduate Studies Award and by a National Science Foundation Doctoral Dissertation Improvement Grant (National Science Foundation Department of Environmental Biology 0073154).

Footnotes

    • * To whom reprint requests should be addressed. E-mail: dposada@variagenics.com.

    • This paper was submitted directly (Track II) to the PNAS office.

    • Received July 18, 2001.
    • Copyright © 2001, The National Academy of Sciences

    Evaluation of methods for detecting recombination from DNA sequences: Computer simulations
    David Posada, Keith A. Crandall
    Proceedings of the National Academy of Sciences Nov 2001, 98 (24) 13757-13762; DOI: 10.1073/pnas.241370698

    Copyright © 2019 National Academy of Sciences. Online ISSN 1091-6490