Globalization and the population structure of Toxoplasma gondii
- *Division of Parasitic Diseases, Centers for Disease Control and Prevention, 4770 Buford Highway, Chamblee, GA 30341;
- †Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, 12735 Twinbrook Parkway, Rockville, MD 20852; and
- **Animal Parasitic Diseases Laboratory, Animal and Natural Resources Institute, Agricultural Research Service, U.S. Department of Agriculture, Beltsville, MD 20705
Edited by Francisco J. Ayala, University of California, Irvine, CA, and approved May 30, 2006 (received for review February 21, 2006)
¶D.H.G. and E.R.D. contributed equally to this work.
Fig. 1.
Schematic showing the geographic origin of the samples (n = 275). Vertical bars over sites (where n ≥ 5) depict sample composition with respect to the four populations identified by the program structure: SA1 and SA2 predominate in South America; RW is common on all continents except South America, where it is rare; and WW is cosmopolitan (see Results for details). Numbers indicate the sample size from each locale, and their colors correspond to that of the dominant population in the sample.
Fig. 2.
Clustering of populations by genetic diversity, performed with standard principal component (PC) analysis on the correlation matrix of the original variables. Per-locus estimates of expected heterozygosity (27), allele richness (28), and variance in allele size for the STR loci were used for each population (Table 2). Coordinates are the first (vertical) and second (horizontal) PCs. The first PC represented overall diversity because the loadings of its eigenvector were positive and similar in magnitude (data not shown). The first PC alone accounted for 40% of the total variation; together with the second PC, it captured 53%. Brz Rio, Brazil Rio; Brz Sao, Brazil Sao Paulo; Argen, Argentina; Brz Par, Brazil Parana; Venezu, Venezuela; C. Amer, Central America; Brz Amaz, Brazil Amazon; MidEast, Middle East.
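As a rough illustration of PC analysis on a correlation matrix, the following Python sketch standardizes each variable, takes the eigendecomposition of the correlation matrix, and reports the fraction of variation explained by each PC. The function name `pca_correlation` and the random `diversity` table are hypothetical stand-ins; they are not the study's data or software.

```python
import numpy as np

def pca_correlation(X):
    """PCA of the rows of X using the correlation matrix of its columns."""
    X = np.asarray(X, dtype=float)
    # Standardize each variable; the covariance of Z is the correlation matrix of X.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()   # fraction of total variation per PC
    scores = Z @ eigvecs                  # coordinates of each row on the PCs
    return scores, eigvecs, explained

# Hypothetical table: 8 populations (rows) by 6 diversity estimates (columns).
rng = np.random.default_rng(0)
diversity = rng.normal(size=(8, 6))
scores, loadings, explained = pca_correlation(diversity)
```

Under this sketch, `explained[0]` and `explained[:2].sum()` correspond to the kind of percentages quoted in the caption, and a first column of `loadings` whose entries share one sign would indicate an overall-diversity axis.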
Fig. 3.
Phylogenetic network showing relationships among five-locus STR haplotypes in relation to their geographic origin and lineage (marked by 1, 2, and 3). (Inset) Magnification of the area showing the tight cluster of lineage III haplotypes (arrow). The network was derived by using the median-joining algorithm (18) (ε = 0) after processing the data with the reduced median method (19) as implemented by network 4.1. The network incorporated variation in the number of repeats (assuming the stepwise mutation model) of STR loci that were weighted inversely to their variance (M33, 9; M6, M48, and M102, 4; and M163, 3). S. America, South America; C. America, Central America; N. America, North America; M. East, Middle East.
Fig. 4.
Within-group divergence between haplotypes observed on two or more continents (Upper) and those observed on one continent (Lower). Divergence was measured by the distribution of allele-sharing distance across seven loci (including the minisatellite M95, which was not included in the STR network). To avoid sampling bias, we excluded identical haplotypes. Because all but one of the multicontinent haplotypes were of lineage III, the comparison included only haplotypes of this lineage; when all lineages were included, the difference between the distributions was even larger.
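A minimal sketch of an allele-sharing distance for haploid multilocus haplotypes follows. It assumes the distance is the proportion of loci at which two haplotypes carry different alleles, and it drops duplicate haplotypes before comparing pairs, as the caption describes. The function names and the example haplotypes are hypothetical.

```python
from itertools import combinations

def allele_sharing_distance(h1, h2):
    """Proportion of loci at which two haploid haplotypes differ (assumed definition)."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must be typed at the same loci")
    return sum(a != b for a, b in zip(h1, h2)) / len(h1)

def pairwise_distances(haplotypes):
    """Distances among distinct haplotypes; identical copies are collapsed first."""
    unique = sorted(set(map(tuple, haplotypes)))
    return [allele_sharing_distance(a, b) for a, b in combinations(unique, 2)]

# Hypothetical seven-locus haplotypes (allele sizes in repeat units).
haps = [
    (10, 12, 7, 5, 9, 3, 2),
    (10, 12, 7, 5, 9, 3, 2),   # duplicate, removed before comparison
    (10, 13, 7, 5, 9, 3, 2),
    (11, 13, 8, 5, 9, 3, 2),
]
dists = pairwise_distances(haps)
```

Computing `dists` separately for multicontinent and single-continent haplotypes would give the two distributions being compared.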
Fig. 5.
Neighbor-joining tree of haplotypes based on the shared-allele distance across six loci, showing lineage (branch color) and populations identified by structure (tip color). The observed frequency of each haplotype is shown if >1. Pie charts show geographical distribution of haplotypes found on two or more continents (color key as in Fig. 3). The distribution of multicontinent haplotypes differs slightly from the network (Fig. 3) because an additional locus (M95) is included in generating the shared-allele distance tree. (Inset) Determination of the number of populations (K) in the T. gondii gene pool using the admixture model (with independent allele frequencies) implemented by structure, based on the likelihood of observing the data and the assignment certainty (the fraction of isolates assigned into any population with probability >75%). The results are averages across three independent simulations (with 10^5 burn-in iterations, followed by 10^6 MCMC iterations).
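The assignment certainty defined in the inset can be sketched as follows, assuming each isolate has one membership probability per population and counts as assigned when its largest probability exceeds 0.75. The function name and the membership values are hypothetical; they are not output from structure.

```python
def assignment_certainty(memberships, threshold=0.75):
    """Fraction of isolates whose largest membership probability exceeds threshold."""
    if not memberships:
        raise ValueError("need at least one isolate")
    assigned = sum(1 for row in memberships if max(row) > threshold)
    return assigned / len(memberships)

# Hypothetical membership probabilities for four isolates at K = 4.
q_example = [
    [0.90, 0.05, 0.03, 0.02],  # assigned to population 1
    [0.40, 0.35, 0.15, 0.10],  # not assigned
    [0.10, 0.80, 0.05, 0.05],  # assigned to population 2
    [0.25, 0.25, 0.25, 0.25],  # not assigned
]
certainty = assignment_certainty(q_example)
```

Repeating this for each candidate K, alongside the likelihood of the data, gives the two criteria the inset uses to choose the number of populations.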
Fig. 6.
Within-population divergence measured by the shared-allele distance (Upper) and posterior F_ST distributions measuring divergence of the populations identified by structure from the “ancestral” population (Lower), with lines depicting the central 95% of the values of each posterior distribution. The two models identified nearly the same populations (only 13 of 275 individuals, or 4.7%, were clustered differently). Mean F_ST values are shown in the center of each distribution (n = 500). The mean shared-allele distance of each population is shown with an arrow. Statistically distinct groups (P < 0.001, Wilcoxon two-sample test adjusted for multiple comparisons) are indicated by letters.
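The central 95% of a posterior distribution can be read from its 2.5th and 97.5th percentiles. The sketch below illustrates this on 500 simulated draws; the beta-distributed values are a hypothetical stand-in for posterior F_ST samples and do not reproduce the study's results.

```python
import numpy as np

def central_interval(samples, mass=0.95):
    """Lower and upper bounds enclosing the central `mass` of the samples."""
    tail = (1.0 - mass) / 2.0 * 100.0
    lo, hi = np.percentile(samples, [tail, 100.0 - tail])
    return float(lo), float(hi)

# Hypothetical posterior draws (n = 500) standing in for F_ST samples.
rng = np.random.default_rng(1)
fst_draws = rng.beta(2, 8, size=500)
lo, hi = central_interval(fst_draws)
mean_fst = float(fst_draws.mean())
```

Here `lo` and `hi` play the role of the line ends in the lower panel, and `mean_fst` the value printed at the center of each distribution.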
Footnotes
- ‡To whom correspondence should be addressed at the present address: Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, 12735 Twinbrook Parkway, Room 2W13A, Rockville, MD 20852. E-mail: tlehmann{at}niaid.nih.gov











