Boyle et al. 10.1073/pnas.0510319103.
Supporting Table 1
Supporting Text
Supporting Figure 5
Fig. 5. Simulated data were used to estimate the cross model (either one intercross or multiple subsequent backcrosses) under which the peak probability was observed for the genotype blocks present in the type I and III lineages. Strains derived from either a single experimental cross or an experimental backcross were also analyzed for comparison. (A) The probabilities of observing the genotype blocks found in the type I strain (proposed to be a progeny of a cross between strain a and type II) or the type III strain (progeny of a cross between b and type II) under a single cross model (1) or multiple subsequent backcrosses (2-6). (B) Combined analysis of the type I (a × type II; thick red line) and type III genotypes (b × type II; thick blue line) as well as 26 F1 (one cross, orange lines) and 15 F2 (two crosses, green lines) progeny created experimentally. To display all probability curves on the same graph, the data were normalized by dividing the probability for each strain under each cross scenario by the mean probability for that strain across all cross scenarios.
Supporting Text
Genetic Cross Simulations. To test the viability of our proposed genealogy, the program GENOMEMIXER (1) was used to simulate genetic cross data based on the known recombination parameters of Toxoplasma gondii (2). A marker density of one marker per 60 kb was used to correspond to the number of EST assemblies that contained SNPs (1,022 in the 65-Mb genome). Markers were evenly distributed across each chromosome based on their genetic and physical sizes (2). Multiple cross models, from 1 to 6 (with cross 1 an intercross between a or b and type II and crosses 2-6 being backcrosses between progeny of the previous cross and the a or b genotype, respectively), were each simulated 20,000 times, and the number of chromosomal breakpoints and percentage of each genotype were calculated under each cross scenario. These data were used to calculate the overall probability of observing the genotype block patterns identified in the present study. Chromosomal breakpoints were as detailed in the third row of Fig. 1. For example, for the a × II cross, type I strains have one breakpoint on chromosome XI (at the distal end), and 92% is derived from the a genotype (see Fig. 1). The probabilities associated with the observed genotype blocks on each chromosome were then multiplied together under each cross scenario to arrive at a cumulative probability of the observed data. This analysis was also carried out for 26 progeny from an experimental cross between a type II and a type III strain (2), as well as for 15 progeny derived from a backcross between strain S23 (an F1 progeny of a II × III cross) and its type III parent. These two data sets were used to examine the distribution of probabilities for progeny derived from one or two experimental crosses, respectively. To allow all strains to be displayed on a single graph, the calculated probability for each strain under each cross scenario was divided by the average probability for that strain across all cross scenarios.
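The probability calculation described above can be illustrated with a minimal Python sketch. It does not reproduce GENOMEMIXER or the analysis code used in the study. The function names, the matching rule (same breakpoint count and a donor fraction within a tolerance `tol`), and all numbers are hypothetical and serve only to show the three steps: estimating a per-chromosome probability from simulated replicates, multiplying across chromosomes, and dividing by the strain's mean across cross scenarios.

```python
from math import prod
from statistics import mean


def block_probability(simulated, observed_breaks, observed_frac, tol=0.05):
    """Fraction of simulated replicates that match the observed pattern.

    `simulated` is a list of (breakpoint_count, fraction_from_one_parent)
    tuples for one chromosome under one cross scenario. The matching rule
    used here is an assumption of this sketch, not the published method.
    """
    hits = sum(
        1
        for breaks, frac in simulated
        if breaks == observed_breaks and abs(frac - observed_frac) <= tol
    )
    return hits / len(simulated)


def cumulative_probability(per_chromosome_probs):
    """Multiply per-chromosome probabilities for one cross scenario."""
    return prod(per_chromosome_probs)


def normalize_by_strain_mean(probs_by_scenario):
    """Divide each scenario's probability by the strain's mean over scenarios."""
    avg = mean(list(probs_by_scenario.values()))
    return {scenario: p / avg for scenario, p in probs_by_scenario.items()}


# Made-up replicates for one chromosome under two cross scenarios.
sim_one_cross = [(1, 0.92), (1, 0.90), (0, 1.00), (2, 0.60)]
sim_two_crosses = [(1, 0.95), (3, 0.70), (2, 0.80), (0, 1.00)]

p1 = block_probability(sim_one_cross, observed_breaks=1, observed_frac=0.92)
p2 = block_probability(sim_two_crosses, observed_breaks=1, observed_frac=0.92)

# Combine with a made-up probability (0.4) for a second chromosome.
cumulative = {
    1: cumulative_probability([p1, 0.4]),
    2: cumulative_probability([p2, 0.4]),
}

# Values above 1 mark scenarios that are more probable than the strain's average.
normalized = normalize_by_strain_mean(cumulative)
print(normalized)
```

With many chromosomes or very small probabilities, summing logarithms instead of multiplying would avoid floating-point underflow. The plain product is used here to mirror the description in the text.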
1. Williams, A. G. & Williams, R. W. (2004) Bioinformatics 20, 2491-2492.
2. Khan, A., Taylor, S., Su, C., Mackey, A. J., Boyle, J., Cole, R., Glover, D., Tang, K., Paulsen, I. T., Berriman, M., et al. (2005) Nucleic Acids Res. 33, 2980-2992.