Linking parasite populations in hosts to parasite populations in space through Taylor's law and the negative binomial distribution
Contributed by Joel E. Cohen, November 16, 2016 (sent for review July 25, 2016; reviewed by Kevin Lafferty and Ross McVinish)

Significance
The spatial distribution of individuals of any species is a basic concern of ecology. The spatial distribution of parasites matters to control and conservation of parasites that affect human and nonhuman populations. This paper develops a quantitative theory to predict the spatial distribution of parasites based on the distribution of parasites in hosts and the spatial distribution of hosts. The theory is tested using observations of metazoan hosts and parasites in the littoral zone of four lakes in Otago, New Zealand. We infer that the spatial distribution of parasites depends crucially on high local correlations of hosts' parasite loads. If so, local hotspots of correlated parasite loads should be considered in parasite control and conservation.
Abstract
The spatial distribution of individuals of any species is a basic concern of ecology. The spatial distribution of parasites matters to control and conservation of parasites that affect human and nonhuman populations. This paper develops a quantitative theory to predict the spatial distribution of parasites based on the distribution of parasites in hosts and the spatial distribution of hosts. Four models are tested against observations of metazoan hosts and their parasites in littoral zones of four lakes in Otago, New Zealand. These models differ in two dichotomous assumptions, constituting a 2 × 2 theoretical design. One assumption specifies whether the variance function of the number of parasites per host individual is described by Taylor's law (TL) or the negative binomial distribution (NBD). The other assumption specifies whether the numbers of parasite individuals within each host in a square meter of habitat are independent or perfectly correlated among host individuals. We find empirically that the variance–mean relationship of the numbers of parasites per square meter is very well described by TL but is not well described by NBD. Two models that posit perfect correlation of the parasite loads of hosts in a square meter of habitat approximate observations much better than two models that posit independence of parasite loads of hosts in a square meter, regardless of whether the variance–mean relationship of parasites per host individual obeys TL or NBD. We infer that high local interhost correlations in parasite load strongly influence the spatial distribution of parasites. Local hotspots could influence control and conservation of parasites.
The spatial distribution of individuals of any species is a basic concern of ecology. The spatial distribution of parasites matters to control and conservation of parasites that affect human and nonhuman populations. Despite the basic scientific and practical significance of the spatial distribution of parasites, investigations of parasite populations are often founded on their distributions and dynamic processes within and among hosts. A scientific justification for this approach is that the number of parasite individuals per host individual is likely to affect the parasite's impact on the host in theory (1) and empirically (2–5). A practical motivation for this approach is that a field investigator can collect hosts and study the parasite populations in them without the need to describe in detail the spatial distribution of the hosts or their abundance.
Why is the variation of parasite population density from 1 m2 of space to another important? Kuris et al. (6) suggested that, because parasites contribute substantial biomass and productivity to estuaries, parasite ecology should be fully integrated into the general body of ecological theory. The spatial ecology of free-living species has long been a central topic in empirical and theoretical ecology but has not been fully explored for parasites. Moreover, parasites’ spatial variation is likely to influence the conservation and control of parasites, especially those that affect human health, wildlife, and game. On basic scientific and practical grounds, the spatial ecology of parasites deserves fuller development.
To investigate how parasites are distributed in space, this paper develops a theoretical framework and four models that link the distribution of parasites in hosts, the distribution of hosts in space, and the distribution of parasites in space. The four models are tested against observations of the metazoan hosts in the littoral zone of four lakes in Otago, New Zealand.
Prior empirical studies of parasite populations have commonly estimated the number of parasite individuals per host individual from a sample of host individuals. For example, in a study (figure 5 in ref. 2) of macroparasites in wild vertebrate hosts and a study (figure 7A in ref. 7, p. 569) of parasitic nematodes in terrestrial mammalian hosts, the sample variance of the number of parasites per individual host was well described (r = 0.98) by the equation log(sample variance) ≈ log(a) + b × log(sample mean), a > 0, which is a log–log form of Taylor’s law (TL) (8):
\[ \text{variance} \approx a \times (\text{mean})^{b}, \quad a > 0. \tag{1} \]
The approximation of the sample variance of the number of individuals by a power function of the sample mean, across multiple sets of observations, was proposed long before Taylor (8) and illustrated with entomological examples (9), a pest plant (10), and the cabbage aphid (11). Without reference to these discoveries, Taylor (8) brought the approximate power law relationship in Eq. 1 to general attention as a widespread empirical pattern. This pattern has since become known among many ecologists as TL.
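The log–log form of TL suggests a simple way to estimate a and b: regress log10(variance) on log10(mean). The following is a minimal sketch of that idea, assuming ordinary least squares; the function name `fit_taylors_law` and the example numbers are hypothetical and are not taken from the data analyzed in this paper.

```python
import numpy as np


def fit_taylors_law(means, variances):
    """Least-squares fit of log10(variance) = log10(a) + b * log10(mean)."""
    x = np.log10(np.asarray(means, dtype=float))
    y = np.log10(np.asarray(variances, dtype=float))
    b, log10_a = np.polyfit(x, y, 1)  # polyfit returns the slope first, then the intercept
    residuals = y - (log10_a + b * x)
    r_squared = 1.0 - residuals.var() / y.var()
    return 10.0 ** log10_a, b, r_squared


# Hypothetical mean-variance pairs generated exactly from variance = 2 * mean**1.5
example_means = np.array([0.5, 1.0, 4.0, 16.0, 64.0])
example_variances = 2.0 * example_means ** 1.5
a_hat, b_hat, r2 = fit_taylors_law(example_means, example_variances)
print(a_hat, b_hat, r2)
```

Means or variances equal to zero have no logarithm, so such pairs would have to be excluded or handled separately before a fit of this kind.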
Hechinger et al. (12) investigated populations of parasites and co-occurring free-living species. They measured population density for both parasitic and free-living species in number of individuals⋅hectare−1. This measure of population density enabled them to compare allometric power laws in parasitic and free-living species. Measuring population density by individuals⋅meter−2, Lagrue et al. (13) showed that the variance and mean of population density were well approximated by Eq. 1 but that the parameters a and b differed among three so-called “lifestyles”: parasites, free-living species that were hosts of parasites (henceforth called “hosts” here), and unparasitized free-living species.
Here, we address theoretical and empirical questions about the relation of populations of parasites in hosts to populations of parasites in physical space. What are the relations between two measures of parasite population density: (i) parasite individuals per host and (ii) parasite individuals per square meter? Can mathematically transparent, empirically testable models describe accurately the variance–mean relationships of hosts per square meter, parasites per host individual, and parasites per square meter? The answer to the last question is not obvious, because any two of these distributions constrain the third.
Theoretical Methods and Results
General Notation and Definitions.
We investigate four models that share a common framework and three assumptions. In all of these models, we specify a single parasite species and a single host species. When a parasite species infects multiple species of hosts or when a host species is infected by multiple species of parasites, we organized the data to consider all possible pairs consisting of a single parasite species and a single host species. In our models, we analyze theoretically the set of all such single-species parasite–host pairs. In the following empirical analyses, we analyze the single-species parasite–host pairs statistically.
Assumption i: the number of parasites (of a selected single species) infecting a host (of a selected single species) in 1 m2 of habitat is the sum of the numbers of parasites in all host individuals (of that species) in that square meter of habitat. We specify two variant forms of this assumption: one for models 1 and 2 and another for models 3 and 4. To spell out these details of assumption i, we now define additional variables and notation.
Let H be a random variable with nonnegative integer values {0, 1, 2, …}. H represents the number of individuals of a particular host species per square meter of habitat. A generalist parasite may infect more than one species of host. Here, H refers to counts of only one selected host species. H is not a fixed number, but a random variable that may differ from 1 m2 to another and differ over time within the same square meter. We assume that the mean and the variance of the probability distribution of H are positive and finite.
Let P be another random variable with nonnegative integer values {0, 1, 2, …}. P represents the number of parasites (of a selected species) in one host individual. We assume that the mean and the variance of the probability distribution of P are positive and finite. The quantity P is sometimes called the “parasite load” (ref. 3, p. 606). If the host individual is parasitized by more than one species, we count here the individuals of only one selected parasite species. Assume H and P are independent in any square meter of habitat. Different square meters of habitat may have different distributions of H and P. Let Pi for i = 1, 2, …, H be random variables that represent the number of parasites in the ith individual host in a square meter of habitat. We assume that Pi, i = 1, 2, …, H, all have the distribution of P and are independent of H and independent of one another.
Let S be the number of individuals of the selected parasite species in all individuals of the selected host species per square meter of habitat. The symbol S is a mnemonic for sum of parasites in 1 m2 of habitat space; S recalls “Sum over Space.” When H = 0, define S = 0. When H > 0, models 1 and 2 assume that the total number of parasites in 1 m2 of habitat is a sum of a random number H of independent random variables Pi, i = 1, 2, …, which are each independent of H. In brief,
\[ S = \sum_{i=1}^{H} P_i = P_1 + P_2 + \cdots + P_H. \]
At the opposite extreme, models 3 and 4 assume that the numbers of parasites per host individual in a square meter of habitat are perfectly correlated among all host individuals, although they are independent of the number of hosts H. Then, the total number of parasites in 1 m2 of habitat is a product Z = H × P of independent random variables. If hosts are absent (i.e., H = 0), then parasites are necessarily absent (i.e., Z = 0) in parallel with models 1 and 2.
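The contrast between the sum S of independent parasite loads and the product Z = H × P of perfectly correlated loads can be illustrated by simulation. The sketch below is only an illustration under assumed distributions: Poisson hosts per square meter and Poisson parasites per host are hypothetical choices, not the distributions estimated in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_quadrats = 100_000  # number of simulated square meters

# Hypothetical distributions, chosen only for illustration.
hosts = rng.poisson(3.0, size=n_quadrats)  # H: hosts per square meter

# Independent loads: draw one load per potential host, then keep the first H of them.
max_hosts = int(hosts.max())
loads = rng.poisson(2.0, size=(n_quadrats, max_hosts))
occupied = np.arange(max_hosts) < hosts[:, None]
s_independent = (loads * occupied).sum(axis=1)  # S = P_1 + ... + P_H (0 when H = 0)

# Perfectly correlated loads: one load P shared by every host in the square meter.
shared_load = rng.poisson(2.0, size=n_quadrats)
z_correlated = hosts * shared_load  # Z = H * P (0 when H = 0)

print("mean of S, mean of Z:", s_independent.mean(), z_correlated.mean())
print("variance of S, variance of Z:", s_independent.var(), z_correlated.var())
```

In a simulation of this kind, the two means should agree up to sampling error, whereas the variance of Z should exceed the variance of S.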
These two equations,
We now prepare assumption ii. For a random variable X that has a finite positive population mean and finite positive population variance, let the population mean be E(X) and the population variance be Var(X).
Variance functions.
A variance function is a standard statistical concept (14). Suppose that the probability distributions of H and P depend on a parameter θ, such as temperature, nutrient concentration, light availability, or other factors that vary from 1 m2 to another. Then, the moments
In empirical tests of TL, the sample mean and sample variance differ from one block of observations to another, and θ could be interpreted as a label of each block. The interpretations of θ may be illustrated by published examples. In one test of TL (figure 5 in ref. 2, p. S118), each value of θ specified 1 of 263 pairs of (mean abundance per host, variance of abundance per host) of macroparasites in wildlife host populations. In another test of TL (figure 7A in ref. 7, p. 569), each value of θ specified one pair of (mean abundance per host, variance of abundance per host) of adult nematode worms recorded from individual guts of 66 terrestrial mammalian species (n = 104 values of θ corresponding to 104 reported samples). In a third test of TL (figure 1 in ref. 15, p. 543), each value of θ specified one pair of (mean abundance per host, variance of abundance per host) of helminth parasites of fish from 410 samples (with 180 parasitic helminth species and 68 fish host species from 62 different published papers). In our earlier test of TL (13), each value of θ specified one pair of (mean population density per square meter, variance of population density per square meter) from a specified lake sampled in a specified season counting a specified species of parasite (253 mean–variance pairs) or host (151 mean–variance pairs).
Assumption ii: H, the number of host individuals⋅meter−2 (of a specified species), satisfies TL. Here, θ may be interpreted as a label associated with each square meter or the collection of square meters in each of a set of samples. Explicitly, for some constants a > 0, b (b is not necessarily positive),
\[ \mathrm{Var}(H) = a\,[E(H)]^{b}. \tag{2} \]
Assumption iii: the mean number of parasites per host is a power function of the mean host density. Explicitly, for some constants f > 0 and g,
\[ E(P) = f\,[E(H)]^{g}. \tag{3} \]
If g = 0 in Eq. 3, then E(P) = f, so the mean number of parasites per host does not depend on the mean host density.
This assumption is a flexible quantitative formulation of the possibility that there may be no relation (g = 0) between the mean number of parasites per host and the mean host density per square meter; that there may be a negative relation (g < 0: greater host abundance per square meter is associated with a reduced mean parasite burden per host) as an effect of herd immunity, dilution, or body size (bigger hosts are rarer and can accommodate more parasites); or that there may be a positive relation (g > 0: greater host abundance is associated with an increased parasite burden per host) as an effect of contagion or reduced host resistance from crowding. We introduce this assumption in the models and ask the data to reveal the relationship, while leaving the mechanism of the relationship (if g ≠ 0) for future research.
Four alternative models.
Four models differ in each of two assumptions, each of which has two alternatives. Thus, four models may be summarized by a 2 × 2 table (Table 1). The first assumption specifies whether the variance function of parasites per host P comes from TL or the negative binomial distribution (NBD). The NBD has traditionally been widely confirmed (3) and assumed for the abundance of metazoan parasites in individual hosts since the work in ref. 16. The second assumption specifies whether Pi, the number of parasites in the ith host, i = 1, 2, …, H, is independent (implying zero correlation) among host individuals in a square meter or identical for all hosts in this square meter (implying correlation one). These two alternatives correspond to the complete absence of synchrony and perfect synchrony, respectively, of the parasite loads of hosts in 1 m2. (In models of statistical physics, the analogous difference is called “annealed” vs. “quenched.”) For brevity, we do not analyze here the obvious possibility of correlations among Pi that are intermediate between zero and one.
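Table 1. Assumptions that differentiate models 1–4
Variance function of P | Parasite loads independent among hosts | Parasite loads identical among hosts
TL | Model 1 | Model 3
NBD | Model 2 | Model 4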
Models 1 and 2: Independence of Parasites per Host.
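Let Pi, i = 1, 2, … be independently and identically distributed (iid) random variables with the distribution of P, also independent of H. It is well known (equation 7.2 in ref. 17, p. 119; equation 3 in ref. 18, p. 122; and ref. 19, p. 110 gives a detailed elementary derivation) that, if H and P have finite means and variances, then the sum S of a random number H of the Pi has mean E(S) = E(H)E(P) and variance
\[ \mathrm{Var}(S) = E(H)\,\mathrm{Var}(P) + [E(P)]^{2}\,\mathrm{Var}(H). \tag{5} \]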
We now use Eq. 5 to express the variance of S as a function of the mean of S under two alternative assumptions about the variance function of P. Model 1 assumes that the parasites per host satisfy TL. Model 2 assumes that the parasites per host satisfy the NBD.
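The step from the variance of a random sum to a variance function of S can be sketched numerically. The sketch assumes the framework stated above: Var(H) = a × E(H)^b, E(P) = f × E(H)^g, E(S) = E(H) × E(P), and the standard random-sum identity Var(S) = E(H) × Var(P) + E(P)^2 × Var(H). The function names are hypothetical. The values of a, b, f, and g are the rounded estimates quoted later in the text, 1/ρ = 2 is the value quoted in the Fig. 2 caption, and c and d are arbitrary placeholders.

```python
def predicted_variance_independent(mean_s, a, b, f, g, var_p_of_mean):
    """Variance of S predicted from the mean of S when parasite loads are independent."""
    mean_h = (mean_s / f) ** (1.0 / (1.0 + g))  # invert E(S) = f * E(H)**(1 + g)
    mean_p = f * mean_h ** g                    # assumed scaling of E(P) with E(H)
    var_h = a * mean_h ** b                     # TL for hosts per square meter
    return mean_h * var_p_of_mean(mean_p) + mean_p ** 2 * var_h


def tl_variance(c, d):
    """Return a TL variance function Var(P) = c * E(P)**d."""
    return lambda mean_p: c * mean_p ** d


def nbd_variance(inv_rho):
    """Return an NBD variance function Var(P) = E(P) + inv_rho * E(P)**2."""
    return lambda mean_p: mean_p + inv_rho * mean_p ** 2


# a, b, f, g: rounded estimates quoted in the text; c and d are hypothetical placeholders.
a, b, f, g = 0.9676, 2.0856, 1.2175, -0.2458
model_1 = predicted_variance_independent(10.0, a, b, f, g, tl_variance(c=3.0, d=1.5))
model_2 = predicted_variance_independent(10.0, a, b, f, g, nbd_variance(inv_rho=2.0))
print(model_1, model_2)
```

Passing the variance function of P as an argument keeps the shared framework in one place, so the TL and NBD alternatives differ only in the function supplied.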
Model 1: Independent number P of parasites per host and power law variance function (TL) of P.
Assume P obeys TL; i.e., there exist constants c > 0 and d such that
\[ \mathrm{Var}(P) = c\,[E(P)]^{d}. \tag{6} \]
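Then, we prove in SI Appendix that the exact predicted variance of parasites per square meter is
\[ \mathrm{Var}(S) = c f^{d} \left(\frac{E(S)}{f}\right)^{(1+dg)/(1+g)} + a f^{2} \left(\frac{E(S)}{f}\right)^{(b+2g)/(1+g)}, \tag{7} \]
where a and b are the TL parameters of H (Eq. 2), f and g are the parameters of Eq. 3, and c and d are the TL parameters of P (Eq. 6).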
The variance
As
Properties of the NBD.
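To specify the NBD (ref. 20, p. 306), let ρ > 0 be a positive real number. When ρ is not an integer, the NBD is sometimes called the Pólya distribution. Let p be a positive probability, 0 < p ≤ 1, and let q = 1 – p, 0 ≤ q < 1. A random variable X taking only the nonnegative integer values 0, 1, 2, … has the NBD if and only if
\[ \Pr\{X = k\} = \frac{\Gamma(\rho + k)}{\Gamma(\rho)\,k!}\,p^{\rho} q^{k}, \quad k = 0, 1, 2, \ldots \]
With this parameterization, E(X) = ρq/p and Var(X) = ρq/p^2.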
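A family of NBDs is a collection of NBDs in which one or both of its parameters ρ, p vary. The variance function of the NBD depends on which parameter is assumed to vary. If θ = ρ varies and p is constant, then the variance is proportional to the mean: Var(X) = E(X)/p. If instead ρ is constant and θ = p varies, then the variance is a quadratic function of the mean:
\[ \mathrm{Var}(X) = E(X) + \frac{[E(X)]^{2}}{\rho}. \tag{8} \]
We now show that, when ρ is constant and θ = p varies, a family of NBDs is not consistent with TL, except asymptotically in the extremes of large E(X) and small E(X). As is standard, we use the notation x ≪ y to mean that x is much smaller than y or that y is much larger than x. When Eq. 8 holds and 0 < ρ ≪ E(X), then 1 ≪ E(X)/ρ; therefore, Var(X) = E(X)[1 + E(X)/ρ] ≈ [E(X)]^2/ρ, which has the form of TL with a = 1/ρ and b = 2.
When Eq. 8 holds and ρ ≫ E(X) > 0, then 1 ≫ E(X)/ρ; therefore, Var(X) ≈ E(X), which has the form of TL with a = 1 and b = 1.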
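The two asymptotic regimes can be checked numerically. For a family of NBDs with constant ρ, the variance written as a function of the mean is mean + mean^2/ρ, so the local slope of log(variance) against log(mean) is (mean + 2 × mean^2/ρ)/(mean + mean^2/ρ). The sketch below evaluates that slope; the function names are hypothetical, and ρ = 0.5 is used only because it corresponds to the value 1/ρ = 2 quoted in the Fig. 2 caption.

```python
def nbd_variance(mean, rho):
    """Variance of an NBD with constant rho, written as a function of its mean."""
    return mean + mean ** 2 / rho


def local_loglog_slope(mean, rho):
    """Derivative of log(variance) with respect to log(mean) for the NBD variance function."""
    return (mean + 2.0 * mean ** 2 / rho) / (mean + mean ** 2 / rho)


rho = 0.5  # corresponds to 1/rho = 2
for mean in (1e-4, 0.5, 1e4):
    print(mean, nbd_variance(mean, rho), local_loglog_slope(mean, rho))
```

The slope moves from near 1 at small means to near 2 at large means, which is why a single TL exponent cannot describe the whole NBD variance function on log–log scales.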
In Eq. 8,
For a family of NBDs with constant ρ and varying p (equation 6 in ref. 21, p. 162),
The self-contradictory assumption that
Model 2: Independent number P of parasites per host and NBD variance function of P.
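Model 2 assumes that P obeys the variance function Eq. 8 of a family of NBDs with constant ρ > 0:
\[ \mathrm{Var}(P) = E(P) + \frac{[E(P)]^{2}}{\rho}. \tag{10} \]
Then, instead of Eq. 7, the variance of S is a sum of a linear term plus two power functions:
\[ \mathrm{Var}(S) = E(S) + \frac{f^{2}}{\rho} \left(\frac{E(S)}{f}\right)^{(1+2g)/(1+g)} + a f^{2} \left(\frac{E(S)}{f}\right)^{(b+2g)/(1+g)}, \tag{11} \]
where a and b are the TL parameters of H (Eq. 2) and f and g are the parameters of Eq. 3.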
The exponents of all three terms are independent of
Models 3 and 4: Identical Numbers of Parasites per Host.
The next two models assume that, when H > 0, every host individual in 1 m2 of habitat has an identical number P of parasites. This number of parasites will differ from 1 m2 to another, but the same number P of parasites resides in every host in a square meter. Then the number of parasites per square meter of habitat is Z = H × P.
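S and Z have the same mean (25), E(Z) = E(H)E(P) = E(S), but the variance of the product Z of the independent random variables H and P is
\[ \mathrm{Var}(Z) = \mathrm{Var}(H)\,\mathrm{Var}(P) + [E(H)]^{2}\,\mathrm{Var}(P) + [E(P)]^{2}\,\mathrm{Var}(H). \tag{12} \]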
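A short calculation shows how much perfect correlation of parasite loads inflates the variance relative to independence. The sketch below uses two standard identities: the variance of a sum of a random number H of iid loads, and the variance of a product of two independent random variables. The function names and the moment values are hypothetical.

```python
def variance_of_random_sum(mean_h, var_h, mean_p, var_p):
    """Var(S) for S = P_1 + ... + P_H with iid loads that are independent of H."""
    return mean_h * var_p + mean_p ** 2 * var_h


def variance_of_product(mean_h, var_h, mean_p, var_p):
    """Var(Z) for Z = H * P with H and P independent."""
    return var_h * var_p + mean_h ** 2 * var_p + mean_p ** 2 * var_h


# Hypothetical moments: E(H) = Var(H) = 3 and E(P) = Var(P) = 2.
var_s = variance_of_random_sum(3.0, 3.0, 2.0, 2.0)
var_z = variance_of_product(3.0, 3.0, 2.0, 2.0)
excess = var_z - var_s  # equals Var(P) * (Var(H) + E(H)**2 - E(H))
print(var_s, var_z, excess)
```

The difference Var(Z) − Var(S) equals Var(P) × E[H(H − 1)], which cannot be negative when H takes nonnegative integer values, so the perfectly correlated models never predict a smaller variance than the independent models for the same moments of H and P.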
Model 3: Identical numbers P of parasites per host and power law variance function (TL) of P.
Model 3 assumes that P obeys TL. In Eq. 12, we replace
Model 4: Identical numbers P of parasites per host and NBD variance function of P.
Model 4 assumes that P obeys the variance function Eq. 10 of a family of NBDs with constant
Empirical and Statistical Methods
Empirical Methods.
Lagrue et al. (13) described the field sites and the methods of collecting the data. In brief, all metazoan species in the littoral zones of four lakes in Otago, New Zealand were collected and classified as parasitic, free-living with parasites (here hosts), and free-living without parasites. Each lake was sampled multiple times (depending on the sampling method) in each of three field seasons at multiple locations in each lake. Because of the mobility of fish and the impossibility of counting entire fish populations of large areas, estimates of population densities of fish species as individuals⋅meter−2 may be subject to larger errors than those of, for example, sessile invertebrates. We measured the population density of each parasite species as individuals⋅meter−2 separately for each distinct combination of host species and parasite species. The data structured in this way have not been analyzed previously. We illustrate this method by an example.
In Lake Hayes in September, two species of host insects, Oecetis sp. and Triplectides sp., were infected with metacercariae of the parasite Microphalloidea sp. In 199 samples of Microphalloidea sp., 91 were found in Oecetis sp., and 108 were found in Triplectides sp. To estimate the density per square meter of the parasite Microphalloidea sp., we distinguished combinations of Microphalloidea sp. with different host species and found 24.9 individuals⋅meter−2 Microphalloidea sp. in 91 samples in Oecetis sp. and 45 individuals⋅meter−2 Microphalloidea sp. in 108 samples in Triplectides sp. [Another method would have been to pool all 199 samples of Microphalloidea sp. This method would have yielded 69.9 individuals⋅meter−2 Microphalloidea sp., regardless of host. We rejected this method in preliminary analyses, because the resulting values of
Here, we do not use the data on free-living species without parasites. As noted above, unparasitized free-living species had different body size distributions and taxonomic distributions from both parasites and hosts (13). Based on large sample sizes and careful searches for parasites within free-living species, we think that it is unlikely that our distinction between hosts and unparasitized free-living species is artifactual. The data reported in this paper are in Dataset S1.
Statistical Methods.
We obtained 209 measurements of seven variables for different combinations of host species and parasite species: the mean and the variance of host individuals⋅meter−2, the mean and the variance of parasite individuals per host individual, the mean and the variance of parasite individuals⋅meter−2, and the minimum number of host individuals captured in a sample. This minimum ranged from 0, when a host did not occur in a sample at a particular locality, lake, and season, to 58 hosts. This minimum sample size influenced one of the relationships analyzed below.
Following ref. 13 and many others, we tested power law relationship
In all figures, data are solid dots, and theoretical curves are lines (solid, dash-dotted, dashed, or dotted). Computations used Matlab R2015a (28) running under Microsoft Windows 7.
Empirical Results
Descriptive Summary of Relationships in Data.
The mean and the variance of host abundances H per square meter were distinctly bimodal (SI Appendix, Fig. S1, two upper left diagonal histograms). The less abundant mode corresponded to the larger and less abundant fishes, whereas the more abundant mode corresponded to the smaller and more abundant invertebrate hosts. The tightest relationships among six main variables (excluding minimum sample size) were those between the mean and the corresponding variance of each of three measures of abundance: H (host individuals per square meter), P (parasites per host individual), and S (parasites per square meter) (SI Appendix, Fig. S1, off-diagonal scatterplots). In addition, there were clear positive associations between the mean hosts per square meter and mean parasites per square meter and between the variance of hosts per square meter and variance of parasites per square meter.
Empirical Tests of the Framework Assumptions.
The variance and mean of the number H of hosts per square meter are described well by TL (Fig. 1A) (R2 = 0.9876). This finding is qualitatively consistent with the finding of table 2 in ref. 13 that TL described well (R2 = 0.9810) what they called “free-living parasitized” species but differs slightly in parameter estimates. Whereas ref. 13 estimated slope = 2.0193 with 95% CI = 1.9739, 2.0646 and intercept = 0.2903, we estimated slope = 2.0856 with 99% CI = 2.0434, 2.1278 and intercept = −0.014318. The discrepancy is because of a different way of organizing the data as described in Empirical and Statistical Methods.
Tests of assumptions of the models. (A) Test of TL for hosts H per square meter. The solid line is the log–log form of TL (Eq. 2) for host individuals per square meter. It is superposed on the dotted line of the quadratic generalization of TL. (B) Test of host–parasite density scaling. The mean number of parasites per host P is a decreasing power law function of the mean number of hosts per square meter H. The solid line is the log–log form of the power law (Eq. 3) model of host–parasite density scaling. The dotted line is a quadratic generalization, which is not significantly better. (C) Test of the product rule. The mean number of parasites per square meter is closely approximated by the product of the mean number of hosts per square meter times the mean number of parasites per host individual as predicted by the product rule (Eq. 4). The intercept does not differ significantly from zero, and the slope does not differ significantly from one. (D) Test of TL for S, the number of parasites per square meter. The power law relationship, linear on log–log coordinates (solid black line), approximates well the relation between the sample mean and sample variance of S (solid black dots). Table 2 gives parameter estimates of all linear and some quadratic relationships in the text, some of their 99% CIs, and measures of goodness of fit (
On average, the larger the mean number of hosts per square meter, the smaller the mean number of parasites per host (Fig. 1B). On log–log coordinates, the slope −0.24575 of a linear approximation to this relationship is not statistically distinguishable from −1/4, which is a scaling exponent that plays a major role in the metabolic theory of ecology (ref. 29, p. 1775), and a quadratic approximation is not a significant improvement over a linear relationship. The scatter around a linear relationship is the largest among the relationships examined here (R2 = 0.1849), and the error variance 1.229 on the log10 scale is more than an order of magnitude.
This negative relationship between the mean number of hosts per square meter and the mean number of parasites per host is qualitatively consistent with a finding (figure 3 in ref. 30) in which host density was calculated by pooling individuals of all host species used by a parasite species. Our analysis involves one host species and one parasite species.
The mean number of parasites per square meter is very close to the product of the mean number of hosts per square meter times the mean number of parasites per host (Fig. 1C) as predicted by Eq. 4.
The variance and mean of S, the number of parasites per square meter, are described well (R2 = 0.9838) by the empirical TL for parasites per square meter (Fig. 1D):
This finding is qualitatively consistent with the finding of table 2 in ref. 13 that TL described parasitic species well (R2 = 0.9708) but differs slightly in parameter estimates. Whereas ref. 13 estimated slope = 2.1020 with 95% CI = 2.0568, 2.1473 and intercept = 0.4333, we estimated slope = 2.1166 with 99% CI = 2.0675, 2.1657 and intercept = 0.26315. The discrepancy is because of a different way of organizing the data as described in Empirical and Statistical Methods.
A consequence of the good agreement with TL with values of the intercept not far from 1 is that the point (0, 0) in Fig. 1D roughly separates mean values of S greater than 1 (on the right) from mean values of S less than 1 (on the left) at the same time that it separates variances of S greater than 1 (above) from variances of S less than 1 (below).
The parameter estimates of all linear and some quadratic relationships in the text, some of their 99% CIs, and measures of goodness of fit (
Parameter estimates and associated statistics
Empirical Results for Model 1.
TL approximated (R2 = 0.9142) the variance of the number P of parasites per host individual as a function of the mean number of parasites per host individual (Fig. 2A), but on log–log scales, a convex (curved upward) quadratic relationship was significantly better than TL.
Variance function of parasites P per host individual. Variance of the number P of parasites per host (⋅) as a function of the mean number of parasites per host. (A) The quadratic generalization of TL (dotted line) provides a significantly better fit than TL (Eq. 6) (solid line). All 209 data points, regardless of sample size, are included. (B) Variance function of P (⋅) compared with mean + 2 × mean^2 (solid black line). The NBD of the number P of parasites per host (which was posited in models 2 and 4) predicts that variance = mean + 2 × mean^2 according to Eq. 10. The value ρ^−1 = 2 was estimated by numerical experimentation. Calculations involving the NBD were carried out on the original scale of measurement (parasite individuals per host) and are plotted here on log–log scales for comparability with other figures. Data (⋅) are from all hosts without regard to the minimum number of hosts per estimate of the mean and variance of P. Number of mean–variance pairs (data points), 209; root-mean-squared error (RMSE) on the original scale of measurement (parasite individuals per host) = 3.22 × 10^5. (C) Minimum sample size of 15 hosts per estimate of mean and variance of P. Number of mean–variance pairs (data points), 23; RMSE on the original scale of measurement (parasite individuals per host) = 165.
For model 1, using the empirical estimates of the parameters from Table 2 gives the exponent of
Another way to arrive at the same conclusion is to observe that the exponent of
For mean densities or variances of the parasites per square meter greater than one (S log10 mean > 0), the variance of the number of parasites per square meter is well approximated by a sum of two power functions of the mean number of parasites per square meter (Fig. 3A) as predicted by Eq. 7. Both the predicted slope 2.1135 and the level of the predicted variance
Predictions of the variance of S, the number of parasites per square meter, from four models. For each value of the horizontal axis, which is the log10 sample variance of S in all four panels, the black dot falling along the diagonal shows the same value on the vertical axis as a standard of perfect agreement for comparison with the predicted log10 variance of S from each model shown by the red continuous curve. The predicted variance of S is computed from the observed sample mean of S according to the variance function of S derived theoretically for each model using the parameter estimates a, b, c, d, f, g, and ρ estimated independently from other relationships. There is no curve fitting or adjustment of parameters between the sample variance of S (black dots) and the theoretical variance of S (red curve). The red and black points would be superimposed if the model's predicted variance of S matched perfectly the sample variance of S and if there were no sampling variability in the observations. (A) Model 1 assumes that parasite numbers per host P are independent with TL variance function. (B) Model 2 assumes that parasite numbers per host P are independent with negative binomial variance function. (C) Model 3 assumes that parasite numbers P are identical in all hosts with TL variance function. (D) Model 4 assumes that parasite numbers P are identical in all hosts with negative binomial variance function. Only models 3 and 4 predict a variance of S close and linearly related to the observed sample variance of S over the whole range of the observed sample variance of S. All panels have 209 data points. obs., observed; pred., predicted.
Empirical Results for Model 2.
The empirical number of parasites per host individual seems statistically to have a strictly convex variance function on log–log scales (Fig. 2). The quadratic variance function Eq. 10 of the NBD is closer to the empirical variance function (Fig. 2 B and C) than the straight line predicted on log–log scales from TL (Fig. 2A).
For each combination of host species and parasite species (or life stage), the number of hosts used to estimate the mean and variance of the number of parasites per host varied among four lakes and three seasons sampled. When all 209 combinations of host species and parasite species (or life stage) are plotted (Fig. 2B), regardless of the number of hosts sampled, the deviations from the quadratic variance function Eq. 10 of the NBD are greater than when only the 23 host–parasite pairs that had a minimum of 15 hosts sampled in every lake and season are included (Fig. 2C). This comparison suggests that small sample sizes may be at least partly responsible for the deviations from the mean–variance relation of the NBD. This inference does not exclude the possibility that other factors, such as location or season, may be correlated with sample size and may partially explain the deviations from the quadratic variance function Eq. 10 of the NBD.
Testing the Predicted Variance of the Number of Parasites per Square Meter.
To test whether the empirical variance of the number of parasites per square meter is well described by the variance predicted by Eq. 11 requires estimates of the parameters on the right side of Eq. 11. Table 2 gives (after rounding to four decimal places) a = 0.9676, b = 2.0856, f = 1.2175, g = −0.2458, and
For low densities (
The NBD gives a better model of the variance function of parasites per host (Fig. 2 B and C) than TL (Fig. 2A). Neither TL (model 1) nor NBD (model 2) accurately approximates the empirical variance of the number of parasites per area at low mean densities and low variances. Both models successfully describe the observed variances at high densities of parasites per square meter.
Empirical Results for Model 3.
Model 3 assumes perfect correlation of the parasite loads in different hosts in the same square meter, leading to variance function Eq. 13. In this case,
Empirical Results for Model 4.
Model 4 assumes perfect correlation of the parasite loads in different hosts in a square meter, leading to variance function Eq. 14. In this case,
Summary Comparisons of Four Models.
The variance functions for S or Z, the number of parasites per square meter, of all four models have the same general mathematical form: they are a sum of powers of
To test this suggestion, the exponents of every term in each model are assembled in Table 3. The ranges of each model’s exponents (i.e., the largest exponent minus the smallest exponent) are shown below the exponents along with two measures of the lack of fit between the variances predicted by each model and the observed sample variances. The first measure is the SD of the residuals (differences) between the log10 sample variance of S and the log10 predicted variance of S. Here, the prediction is based on the sample mean of S associated with each sample variance, when this sample mean is inserted into the formula for the variance of S derived for each model. To convert this measure on the log10 scale to the original scale on which the variance of S is measured, the last line of Table 3 shows 10^SD.
Table 3. Exponents of μ_S and their numerical values, the ranges of the exponents, and summaries of the deviations between log10 sample variance of S and log10 predicted variance of S in models 1–4
The ranges of exponents of models 1 and 2 are roughly 10 times larger than the range of exponents of model 3 and three or four times larger than the range of exponents of model 4. As the argument above suggests that they should be, the SDs of models 1 and 2 are roughly three times the SDs of models 3 and 4 on the log10 scale, and 10^SD is more than an order of magnitude larger for models 1 and 2 than for models 3 and 4. This qualitative difference between models 1 and 2 on the one hand and models 3 and 4 on the other hand is reflected in the systematic difference in shape between the theoretical and observed variance functions in Fig. 3. Model 3, the best fitting model, has the smallest range of exponents and the lowest values of SD and 10^SD, but its advantage over model 4 is small. These results suggest that the decisive difference between the more successful models 3 and 4 and the less successful models 1 and 2 is the assumption in models 3 and 4 that parasite loads of different hosts in 1 m² are highly (here, perfectly) correlated by contrast with the assumption in models 1 and 2 that parasite loads of different hosts in 1 m² are uncorrelated. This difference matters far more than whether the variance function of parasites per host obeys TL or NBD.
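The two lack-of-fit summaries used in this comparison, the SD of the log10 residuals and its back-transformed value (10 raised to the power SD), can be sketched as follows. The function name `lack_of_fit` and the input arrays are hypothetical; the numbers are made up for illustration and are not the data of this study. The sketch uses the sample SD (divisor n − 1), which is an assumption because the divisor is not stated in this section.

```python
import numpy as np

def lack_of_fit(sample_var, predicted_var):
    """Return (SD, 10**SD) of residuals between log10 sample and log10 predicted variances."""
    residuals = np.log10(sample_var) - np.log10(predicted_var)
    sd = np.std(residuals, ddof=1)  # sample SD; the divisor n - 1 is an assumption
    return sd, 10.0**sd

# Made-up variances, for illustration only (not the paper's data).
sample_var = np.array([1.0, 10.0, 100.0, 1000.0])
predicted_var = np.array([2.0, 10.0, 50.0, 1000.0])
sd, factor = lack_of_fit(sample_var, predicted_var)
print(f"SD = {sd:.3f} log10 units, 10**SD = {factor:.2f}")
```

A value of 10**SD near 1 means the predicted variances are typically within a small multiplicative factor of the sample variances; a value of 10 means they are typically off by an order of magnitude.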
Discussion
Motivated by a desire to embed the ecology of parasites more firmly within the framework of general ecology, we developed data and models to link the distribution of parasites in hosts with the distribution of parasites in space.
Discussion of Empirical Results.
Empirically, we confirmed TL for hosts per square meter (Fig. 1A) and parasites per square meter (Fig. 1D) after organizing field data by pairing each host species with each parasite species as the basic unit of analysis. We also found empirically that the log variance of parasites per host was better described as a strictly convex function of the log mean parasites per host (Fig. 2), contrary to TL, but in accordance with the NBD. The nonlinear (log–log) relationship became clearer when small sample sizes with fewer than 15 observations were excluded (Fig. 2C). This convexity in the log variance of the number of parasite individuals per host as a function of the log mean of the number of parasite individuals per host differs from some prior findings (2, 7, 15) but is consistent with the variance function of the NBD, which has been widely confirmed (3, 16) or assumed for the distribution of parasite individuals per host individual.
We showed empirically that the product rule Eq. 4 holds. To high precision, the mean number of parasites per square meter is the product of the mean number of parasites per host times the mean number of hosts per square meter (Fig. 1C). This agreement is not a tautology or accounting identity. The empirical agreement with the product rule is compatible with the assumption of independence (conditional on the square meter or value of θ) between P (parasites per host) and H (hosts per square meter) or perfect correlation of P among hosts in a square meter.
Prompted by the goal of developing a theory to relate the number of parasites per host individual to the number of parasites per square meter of habitat, we posited on theoretical grounds a relationship (Eq. 3) called “host–parasite density scaling,” which was consistent with our data (Fig. 1B). This negative relationship summarizes the broad tendency of the mean parasite density per host to decline as the mean host density per square meter increases. For mathematical convenience in working with the power law of TL, we picked a power law for the mathematical form of this relationship, recognizing that the widely scattered data are compatible with other ways of expressing it. Except for a qualitatively similar finding from different analyses of the raw data (30), we are not aware that host–parasite density scaling (Eq. 3) has been previously posited theoretically or supported empirically.
The tendency of the mean parasite density per host to decline as the mean host density per square meter increases may be caused by multiple mechanisms, including herd immunity, dilution (when a constant input of infectious propagules is distributed over a larger number of potential hosts), or host body size (bigger hosts are rarer, and each individual host can accommodate more parasites). Determining which of these mechanisms or others accounts for negative host–parasite density scaling remains a project for future research.
Discussion of Theoretical Results.
Our four theoretical models make three assumptions. The first assumption is that the total number of parasites in 1 m² is the sum of the numbers of parasites in all of the hosts in that 1 m². Models 1 and 2 assume that the number of parasites in one host individual is independent of the numbers of parasites in all other host individuals in the same 1 m². Models 3 and 4 assume that the numbers of parasites in every host individual are identical, although independent of the numbers of hosts in the 1 m². The second assumption is that the mean numbers of parasites per host are a power law function of the mean numbers of hosts per square meter. The third assumption is that the numbers of hosts per square meter obey TL. We verified the second and third assumptions empirically.
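The effect of the first assumption on the variance of parasites per square meter can be sketched with two textbook formulas for the variance of a random sum. These are not the paper's Eqs. 11–14, which are not reproduced here and which also substitute the power-law relationships of the second and third assumptions. The function names and the numerical moments below are hypothetical.

```python
def var_sum_independent(mean_h, var_h, mean_p, var_p):
    """Var(S) when S is a sum of H i.i.d. loads P_i that are independent of H."""
    return mean_h * var_p + var_h * mean_p**2

def var_sum_identical(mean_h, var_h, mean_p, var_p):
    """Var(S) when every host carries the same load P, so S = H * P with P independent of H."""
    return var_h * var_p + var_h * mean_p**2 + var_p * mean_h**2

# Hypothetical moments (not estimates from the paper).
mean_h, var_h = 20.0, 400.0  # hosts per square meter
mean_p, var_p = 3.0, 12.0    # parasites per host
v_ind = var_sum_independent(mean_h, var_h, mean_p, var_p)
v_idn = var_sum_identical(mean_h, var_h, mean_p, var_p)
print(v_ind, v_idn)
```

The difference between the two results equals Var(P) × (E[H²] − E[H]), which is nonnegative when H is a count, so identical loads never give a smaller variance than independent loads with the same moments.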
Under these assumptions, we found mathematically that, if the numbers of parasites per host obey TL, then the variance function of the numbers of parasites per square meter is a sum of two (model 1) or three (model 3) power functions of the mean numbers of parasites per square meter with different exponents and, therefore, could not in general satisfy TL exactly. However, asymptotically for large mean numbers of parasites per square meter and also asymptotically for small mean numbers of parasites per square meter, the variance function approaches linearity on log–log coordinates. The slopes differ in the large and small limits when the numbers of parasites per host are uncorrelated (model 1). The slopes differ little in the large and small limits, at least for the observed parameter values, when the numbers of parasites per host are perfectly correlated (model 3).
Under the same assumptions, we showed that, if the variance function of parasites per host obeys the quadratic relationship (without constant term) of NBD, then the variance function of the numbers of parasites per square meter is a sum of three (model 2) or four (model 4) power functions of the mean numbers of parasites per square meter with different exponents and, therefore, could not in general satisfy TL exactly. Again, however, asymptotically for large mean numbers of parasites per square meter and also asymptotically for small mean numbers of parasites per square meter, the variance function approaches linearity on log–log coordinates. The slopes differ considerably in the large and small limits when the numbers of parasites per host are uncorrelated (model 2). The slopes differ little in the large and small limits, at least for the observed parameter values, when the numbers of parasites per host are perfectly correlated (model 4).
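The asymptotic behavior described in the two preceding paragraphs follows from a general property of sums of power functions with positive coefficients: the local slope on log–log coordinates is a weighted average of the exponents, so it tends to the smallest exponent for small means and to the largest exponent for large means. The sketch below illustrates this with hypothetical coefficients and exponents, not the values in Table 3.

```python
import numpy as np

def loglog_slope(mu, coeffs, exponents):
    """Local slope d log(v) / d log(mu) for v(mu) = sum_i coeffs[i] * mu**exponents[i]."""
    coeffs = np.asarray(coeffs, dtype=float)
    exponents = np.asarray(exponents, dtype=float)
    terms = coeffs * mu**exponents  # contribution of each power term at this mean
    return float(np.sum(exponents * terms) / np.sum(terms))

# Hypothetical positive coefficients and exponents, for illustration only.
coeffs, exponents = [1.0, 0.5, 2.0], [1.2, 1.8, 2.1]
small = loglog_slope(1e-8, coeffs, exponents)  # approaches the smallest exponent
large = loglog_slope(1e8, coeffs, exponents)   # approaches the largest exponent
print(round(small, 3), round(large, 3))
```

Because the slope always lies between the smallest and largest exponents, a narrow range of exponents keeps the variance function close to a straight line on log–log coordinates over the whole range of means.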
Although TL cannot simultaneously hold exactly for P, H, and S under our general assumptions, model 3 [which posits TL for P (parasites per host) and perfect correlation of P for different hosts in a square meter] and model 4 (which posits an NBD variance function for P and perfect correlation of P for different hosts in a square meter) reproduce reasonably well the observed TL for the distribution of parasites in space. According to model 3, TL can hold approximately for parasites per host and parasites per square meter. The widely analyzed, empirically supported NBD fits the observed variance function of parasites per host better than TL, although model 4, which incorporates the NBD variance function of parasites per host, fits the observed variance function of parasites per square meter slightly worse than model 3.
The predictions of models 3 and 4 of the variance function of the number of parasites per square meter have the right shape, unlike those of models 1 and 2, but are slightly and systematically too high relative to the empirical variance of the number of parasites per square meter (Fig. 3 C and D). It is a standard fact in statistics that the variance of a sum of correlated random variables increases with the average correlation among them. Therefore, the excess in the predicted variance is very likely to be caused by the assumption of perfect rather than high but imperfect average correlation of the parasite loads per host individual. A slight lowering of that assumed level of correlation should adjust the level of the predicted variance to that observed.
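The standard fact invoked above can be written out for the simplest case. The following is a sketch that assumes a fixed number n of hosts whose loads X_1, …, X_n share a common variance σ² and have average pairwise correlation ρ̄; these symbols are introduced here for illustration and are not the paper's notation. In the four models the number of hosts is itself random, so this is only the conditional picture for a given n.

```latex
\begin{align*}
\operatorname{Var}\Bigl(\sum_{i=1}^{n} X_i\Bigr)
  &= \sum_{i=1}^{n} \operatorname{Var}(X_i)
     + \sum_{i \neq j} \operatorname{Cov}(X_i, X_j) \\
  &= n\sigma^2 + n(n-1)\,\bar{\rho}\,\sigma^2
   = n\sigma^2\bigl[1 + (n-1)\bar{\rho}\bigr].
\end{align*}
```

The variance equals nσ² when ρ̄ = 0 (independent loads) and n²σ² when ρ̄ = 1 (identical loads), and it increases linearly in ρ̄ between these limits, which is why a correlation slightly below 1 would lower the predicted variance.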
Conclusions
This analysis draws attention to the key importance of interhost correlations in parasite loads in accounting for the spatial variance of parasite population densities. Our empirical experience, not formalized here, strongly suggests that the parasite loads of different host individuals within a small area, such as 1 m², are very likely to be more similar to each other than to the parasite loads of host individuals from a distant square meter, because there are hotspots of infection even on small spatial scales. The correlation among parasite loads of different host individuals from the same square meter will never be 1 but will be somewhere between 0 and 1. We are unable to point to a field study that measures this correlation specifically. Future empirical research should measure directly interhost correlations in parasite loads at local (square meter) and large spatial scales.
Acknowledgments
J.E.C. acknowledges the assistance of Priscilla K. Rogerson. R.P. and C.L. thank Anne Besson, Isa Blasco-Costa, Manna Warburton, and Kim Garrett for assistance with field collection and laboratory processing of samples. We thank the referees for constructive criticisms and Bob Lester for a useful suggestion. This work was supported by US National Science Foundation Grant DMS-1225529 (to J.E.C.). A grant from the Marsden Fund (R.P.) funded the empirical portion of this study.
Footnotes
- ¹To whom correspondence should be addressed. Email: cohen@rockefeller.edu.
Author contributions: J.E.C. designed research; J.E.C., R.P., and C.L. performed research; J.E.C. contributed new reagents/analytic tools; R.P. supervised data collection; C.L. collected data; J.E.C. analyzed data; and J.E.C. wrote the paper.
Reviewers: K.L., University of California, Santa Barbara; and R.M., University of Queensland.
The authors declare no conflict of interest.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1618803114/-/DCSupplemental.
References
4. Fredensborg BL, Mouritsen KN, Poulin R
9. Bliss CI
10. Fracker SB, Brischle HA
11. Hayman BI, Lowe AD
12. Hechinger RF, Lafferty KD, Dobson AP, Brown JH, Kuris AM
13. Lagrue C, Poulin R, Cohen JE
17. Pielou EC
19. Ross SM
20. Ghosh JK, Delampady M, Samanta T
21. Yamamura K
26. Cohen JE, Lai J, Coomes DA, Allen RB
28. MathWorks
30. Lagrue C, Poulin R
31. Kingman JFC