Mixed model analysis of quantitative trait loci

Edited by M. T. Clegg, University of California, Riverside, CA, and approved October 6, 2000 (received for review May 22, 2000)
Abstract
We develop a mixed model approach to quantitative trait locus (QTL) mapping for a hybrid population derived from crosses of two or more distinct outbred populations. Under the mixed model, we treat the mean allelic value of each source population as a fixed effect and the allelic deviations from the mean as random effects, so that we can partition the total genetic variance into between- and within-population variances. Statistical inference of the QTL parameters is obtained by using the Bayesian method implemented by Markov chain Monte Carlo (MCMC). This unified QTL mapping algorithm treats the fixed and random model approaches as special cases of the general mixed model methodology. Utility and flexibility of the method are demonstrated by using a set of simulated data.
Studies of the genetic basis for population differentiation are usually performed by methods of quantitative trait loci (QTL) analysis in line crossing experiments (1), each population being treated as an inbred line. Unfortunately, most natural populations are not inbred. Developing inbred lines and then conducting QTL analysis is unrealistic for some organisms. A common practice is to select a single parent from each population to form a cross. This approach may be practical for plants, but is not applicable to most animals because of their low fertility. In addition, a single parent may not be a good representative of the population from which it is sampled. Results obtained from such a single cross may not represent the actual population difference, but may largely reflect genetic sampling error. These problems can be solved by sampling multiple parents from each population. Unfortunately, an optimal statistical method has not been available for such a design. Haley et al. (2) developed a least-squares method to map QTL in crosses between segregating populations, assuming that alternative QTL alleles are fixed in the two populations. The least-squares method will not detect QTL that have similar allele frequencies in the two populations, primarily because its information comes from mean differences between populations. Another obvious flaw of the least-squares method is that the within-population variances will not disappear simply because they are not included in the model; rather, they will be absorbed by the residual variance. A large residual variance will decrease the power of QTL detection.
In this study, we develop a mixed model framework that allows the partitioning of the total genetic variance into within- and between-population variances. We show that the mixed model approach provides a unified QTL mapping algorithm with which we can analyze data collected from any complicated mating design.
Mixed Model
Throughout the study, the genetic parameters are defined exclusively in terms of allelic rather than genotypic values. We consider only a single locus in the description of the mixed model methodology, although multiple loci will be used in the simulation study. For simplicity, we consider two source populations only. Let us define the expectation and variance of the allelic values for population one by b_{1} and σ_{1}^{2}, respectively, and the corresponding parameters for population two by b_{2} and σ_{2}^{2}. For diploid organisms, both the mean and variance of the additive genetic values take twice the values of their allelic counterparts. The total additive genetic variance of the combined population in the current generation (before the cross) is σ_{A}^{2} = σ_{1}^{2} + σ_{2}^{2} + (b_{1} − b_{2})^{2}.
Let ½n_{1} and ½n_{2} be the numbers of founders from populations one and two, respectively. Assume that a parent from one population has an equal chance of mating with any parent from the other population. Matings among the F_{1} individuals are completely arbitrary, so that the alleles of the two original populations are well integrated into the hybrid population. We can take the F_{2} as our mapping population, but including advanced generations can be more efficient because alleles from different populations are better integrated. Unfortunately, such a mating design produces complex pedigrees that prevent the use of a simple statistical method. In the next section, we will introduce a Bayesian method for mapping QTL in complex pedigrees. Assume that there are N individuals in the mapping population. We define the effects of the paternal and maternal alleles of individual j by v_{j}^{p} and v_{j}^{m}, respectively, for j = 1, … , N. The phenotypic value of individual j, y_{j}, can be described by the following linear model:
y_{j} = μ + v_{j}^{p} + v_{j}^{m} + ɛ_{j}, [1]
where μ is the population mean (fixed effect) and ɛ_{j} is the residual error with a N(0, σ_{ɛ}^{2}) distribution. Using the notation of Fernando and Grossman (3), we define v_{p}^{p} and v_{p}^{m} as the paternal and maternal alleles of the father of j so that v_{j}^{p} = z_{j}^{p}v_{p}^{p} + (1 − z_{j}^{p})v_{p}^{m}, where z_{j}^{p} = 1 if j inherited the paternal allele of its father and z_{j}^{p} = 0 otherwise. Similarly, define v_{m}^{p} and v_{m}^{m} as the paternal and maternal alleles of the mother and v_{j}^{m} = z_{j}^{m}v_{m}^{p} + (1 − z_{j}^{m})v_{m}^{m}, where z_{j}^{m} = 1 if j inherited the paternal allele of its mother and z_{j}^{m} = 0 otherwise. The above model can be rewritten as
y_{j} = μ + z_{j}^{p}v_{p}^{p} + (1 − z_{j}^{p})v_{p}^{m} + z_{j}^{m}v_{m}^{p} + (1 − z_{j}^{m})v_{m}^{m} + ɛ_{j}. [2]
We have now expressed the allelic values of the current generation as linear functions of the allelic values of their parents. The parental alleles can be further expressed as linear functions of the allelic values of their own parents.
With such a recursive process, each allele can be traced back to its origin in the two founder populations. Let us group the effects of the n = n_{1} + n_{2} founder alleles into an n × 1 vector named v. The elements of v are sorted by source population, the identification number (ID) of each founder within a source population, and the parental origin of each allele within a founder (paternal followed by maternal). Note that the IDs of founders are numbered from 1 to ½n. Consider a hybrid population originating from the crosses of 5 founders from population one and 3 founders from population two. In this case, n_{1} = 10 and n_{2} = 6, and vector v has n = 16 elements. The first 10 elements store the allelic values from population one and the last 6 elements store those from population two. If a founder has an ID of 4, we know that it comes from the first source population and the paternal and maternal alleles of this founder are stored as the 7th and 8th elements of v, respectively. In general, the two alleles of the ith founder are stored at elements 2i − 1 and 2i of v, respectively. Since each allele of individual j can be traced back to one of the founder alleles, we can express the phenotypic values of all individuals by a linear model,
y = 1μ + (A^{p} + A^{m})v + ɛ, [3]
where y is the N × 1 vector of phenotypic values, 1 is an N × 1 vector of ones, ɛ is the N × 1 vector of residual errors, and A^{p} or A^{m} is an N × n indicator matrix connecting the paternal or maternal alleles of all individuals to the founder alleles. Each row of A^{p} or A^{m} contains one and only one nonzero (unity) element. The positions of the nonzero elements in the matrices correspond to the founder alleles that have been passed to the mapping individual through their parents.
Let us define v_{i} as the ith element of v. The distribution of v_{i} depends on which source population v_{i} comes from. If v_{i} comes from population one, then v_{i} ∼ N(b_{1}, σ_{1}^{2}) is assumed. Otherwise, we assume v_{i} ∼ N(b_{2}, σ_{2}^{2}). Define w_{i} = 1 if i comes from population one and w_{i} = 0 otherwise. These w_{i}s are the source population indicators. We can express v_{i} by the following linear model, v_{i} = w_{i}b_{1} + (1 − w_{i})b_{2} + u_{i}, where u_{i} is the deviation of the ith allelic value from its corresponding population mean and is distributed as u_{i} ∼ N(0, w_{i}σ_{1}^{2} + (1 − w_{i})σ_{2}^{2}). Let W be a known n × 2 matrix storing the source population indicators (row i is [w_{i}, 1 − w_{i}]), b = [b_{1}, b_{2}]′, and u be an n × 1 vector of all the u_{i}s. The model for v_{i} can be rewritten in matrix notation as v = Wb + u. Substituting this equation into Eq. 3, we have
y = 1μ + (A^{p} + A^{m})(Wb + u) + ɛ. [4]
Let X = (A^{p} + A^{m})W and Z = A^{p} + A^{m}. The above model is then expressed as a typical mixed model:
y = 1μ + Xb + Zu + ɛ, [5]
where b is the vector of fixed effects and u is the vector of random effects, both being effects of QTL.
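As a rough illustration of how the design matrices of model 5 follow from the founder allele bookkeeping, the sketch below builds W, A^{p}, A^{m}, Z, and X for the 5 + 3 founder example in the text. The three mapping individuals and their founder allele identifiers are made up for illustration, and the helper names are hypothetical rather than taken from the authors' program.

```python
def source_indicator_matrix(n1, n2):
    """W (n x 2): row i is [w_i, 1 - w_i]; w_i = 1 for population-one alleles."""
    return [[1, 0]] * n1 + [[0, 1]] * n2

def allele_indicator_matrix(ids, n):
    """N x n matrix with a single 1 per row at the (1-based) founder allele id."""
    rows = []
    for i in ids:
        row = [0] * n
        row[i - 1] = 1
        rows.append(row)
    return rows

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Example from the text: 5 founders from population one, 3 from population two.
n1, n2 = 10, 6
n = n1 + n2
W = source_indicator_matrix(n1, n2)

# Hypothetical founder allele identifiers of three mapping individuals.
paternal_ids = [1, 7, 12]
maternal_ids = [4, 16, 11]

Ap = allele_indicator_matrix(paternal_ids, n)
Am = allele_indicator_matrix(maternal_ids, n)
Z = mat_add(Ap, Am)   # Z = A^p + A^m
X = mat_mul(Z, W)     # X = (A^p + A^m) W
print(X)              # [[2, 0], [1, 1], [0, 2]]
```

Each row of X simply counts how many of an individual's two alleles trace back to population one and to population two, which is why b enters the model as a fixed effect once the gene flow is known.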
Paths of Gene Flow
Model 5 is different from the usual mixed model in that the design matrices are unknown because they are determined by the unobserved paths of gene flow. A complete description of the paths of gene flow is called a genetic descent graph (4). A probability statement of a genetic descent graph can be inferred by using marker information. From the inferred probability, we can draw a realization of the descent graph, which is then used to infer QTL parameters. In this section, we introduce a recursive algorithm to draw a descent graph.
Define i_{j}^{p} = 1, … , n as the founder allele identifier for the paternal allele of individual j and i_{j}^{m} = 1, … , n as that for the maternal allele of j. For example, if v_{j}^{p} is a copy of the first founder allele and v_{j}^{m} is a copy of the fourth founder allele, then i_{j}^{p} = 1 and i_{j}^{m} = 4. Using the founder allele identifiers, we can rewrite Eq. 1 as
y_{j} = μ + v(i_{j}^{p}) + v(i_{j}^{m}) + ɛ_{j}. [6]
Instead of using the awkward nested-subscript expression v_{i_{j}^{p}} for element i_{j}^{p} of vector v, here we adopt a pseudocode expression v(i_{j}^{p}). We have now formulated the problem of QTL mapping as that of finding the appropriate subscripts of v that an individual can possibly take. A complete description of the subscripts for all individuals represents a genetic descent graph.
There may be many generations from j to the founders, and thus it may be difficult to directly sample i_{j}^{p} and i_{j}^{m}. We use a recursive algorithm to sample the founder allele identifiers. The algorithm requires individuals to be entered into the pedigree in chronological order so that the allele identifiers of parents are sampled before those of their children. The recursive algorithm is performed as follows:
If individual j is a founder and it is the ith founder, then i_{j}^{p} = 2i − 1 and i_{j}^{m} = 2i for i = 1, … , ½n. If j is not a founder, we use the following recursive equations:
i_{j}^{p} = z_{j}^{p}i_{p}^{p} + (1 − z_{j}^{p})i_{p}^{m} and i_{j}^{m} = z_{j}^{m}i_{m}^{p} + (1 − z_{j}^{m})i_{m}^{m}, [7]
where i_{p}^{p} and i_{p}^{m} are the allele identifiers for the father of j and i_{m}^{p} and i_{m}^{m} are those for j's mother. The allelic transmission indicator, z_{j}^{p} or z_{j}^{m}, reflects the event of only one meiosis, and thus is easy to sample. We have now turned the problem of identifying the founder alleles into that of finding the allelic transmission indicators from parents to children, a much simpler problem.
The formation of a zygote requires two meioses that must be inferred jointly. Define the ordered genotypes of the father and mother of j by Q_{p}^{p}Q_{p}^{m} and Q_{m}^{p}Q_{m}^{m}, respectively. Individual j can take one of the four possible genotypes, {Q_{p}^{p}Q_{m}^{p}, Q_{p}^{p}Q_{m}^{m}, Q_{p}^{m}Q_{m}^{p}, Q_{p}^{m}Q_{m}^{m}}. Define another variable, U_{j} = 1, … , 4, to indicate one of the four ordered genotypes. For example, U_{j} = 2 if j is of type Q_{p}^{p}Q_{m}^{m}. The values of z_{j}^{p} and z_{j}^{m} are determined solely by U_{j} using z_{j}^{p} = I_{(U_{j}=1)} + I_{(U_{j}=2)} and z_{j}^{m} = I_{(U_{j}=1)} + I_{(U_{j}=3)}, where I_{(U_{j}=k)} = 1 if U_{j} = k and I_{(U_{j}=k)} = 0 otherwise.
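The recursion for the founder allele identifiers can be sketched as follows, assuming the transmission indicators z_{j}^{p} and z_{j}^{m} have already been sampled. The function name, the pedigree encoding (a chronologically ordered list of parent index pairs, with None for founders), and the small example pedigree are hypothetical. Founders are numbered in their order of appearance, so the sort-by-population convention of the text must be respected by the input ordering.

```python
def founder_allele_ids(pedigree, z):
    """Trace each individual's two alleles back to founder alleles.

    pedigree: list of (father_index, mother_index) in chronological order;
              founders are entered as (None, None).
    z:        dict mapping a non-founder index j to (z_p, z_m), each 0 or 1.
    Returns a list of (i_p, i_m) founder allele identifiers (1-based).
    """
    ids = []
    founder_count = 0
    for j, (father, mother) in enumerate(pedigree):
        if father is None and mother is None:
            founder_count += 1
            # The ith founder owns alleles 2i - 1 (paternal) and 2i (maternal).
            ids.append((2 * founder_count - 1, 2 * founder_count))
        else:
            z_p, z_m = z[j]
            fp, fm = ids[father]   # identifiers of the father's two alleles
            mp, mm = ids[mother]   # identifiers of the mother's two alleles
            i_p = z_p * fp + (1 - z_p) * fm
            i_m = z_m * mp + (1 - z_m) * mm
            ids.append((i_p, i_m))
    return ids

# Two founders, two F1 offspring, and one F2 offspring of the two F1s.
pedigree = [(None, None), (None, None), (0, 1), (0, 1), (2, 3)]
z = {2: (1, 0), 3: (0, 1), 4: (0, 1)}
ids = founder_allele_ids(pedigree, z)
print(ids)  # [(1, 2), (3, 4), (1, 4), (2, 3), (4, 2)]
```

Because parents always precede their children in the list, a single forward pass is enough; no explicit recursion over generations is needed.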
We have now turned the problem of generating z_{j}^{p} and z_{j}^{m} into that of generating U_{j}. The joint distribution of U_{j} with marker genotypes is then used in the following Bayesian modeling.
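The mapping from the ordered genotype indicator U_{j} to the two transmission indicators is small enough to write out directly. The sketch below assumes the coding given in the text (1 through 4 in the order listed); the function name is hypothetical, and the final draw uses only the Mendelian prior, ignoring marker and phenotype information.

```python
import random

def transmission_indicators(u):
    """Convert U_j in {1, 2, 3, 4} to (z_p, z_m)."""
    z_p = 1 if u in (1, 2) else 0   # father's paternal allele transmitted
    z_m = 1 if u in (1, 3) else 0   # mother's paternal allele transmitted
    return z_p, z_m

table = {u: transmission_indicators(u) for u in (1, 2, 3, 4)}
print(table)  # {1: (1, 1), 2: (1, 0), 3: (0, 1), 4: (0, 0)}

# A draw from the Mendelian prior p(U_j = k) = 1/4.
u_prior = random.Random(0).randint(1, 4)
```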
Bayes and the Markov Chain Monte Carlo (MCMC) Algorithm
In Bayesian analysis, we first classify each item in the mixed model into one of two classes. The class of observables (also called data) includes the arrays of phenotypic values y and marker genotypes M. The class of unobservables includes the parameters of interest {μ, b, σ_{1}^{2}, σ_{2}^{2}, σ_{ɛ}^{2}, λ}, where λ is the position of the QTL, and the missing values, U = {U_{j}} and u. Define the collection of unobservables by θ = {μ, b, σ_{1}^{2}, σ_{2}^{2}, λ, σ_{ɛ}^{2}, U, u} and use p(x) as a generic notation for probability density whose actual form depends on the argument x rather than p. The joint posterior probability density of θ is
p(θ | y, M) ∝ p(y, M | θ)p(θ), [8]
where
p(y, M | θ) = p(y | θ)p(M | θ) [9]
is the likelihood and
p(θ) = p(μ)p(b)p(σ_{1}^{2})p(σ_{2}^{2})p(σ_{ɛ}^{2})p(λ)p(U)p(u | σ_{1}^{2}, σ_{2}^{2}) [10]
is the joint prior. A uniform prior is chosen for each unobservable except u, which takes p(u | σ_{1}^{2}, σ_{2}^{2}) ∝ 1/(σ_{1}^{n_{1}}σ_{2}^{n_{2}}) exp{−½u′G^{−1}u}, where G = diag{I_{n_{1}}σ_{1}^{2}, I_{n_{2}}σ_{2}^{2}}. In practice, the priors should be customized according to the data structure and the experience of the researcher. The uniform priors selected in this study purely reflect our ignorance of the true parameters. With the uniform priors, the likelihood will play a major role in determining the posterior distribution of θ and the results will be more objective.
The actual Bayesian inference is to obtain the marginal posterior probability density for each parameter (θ_{i}) rather than the joint posterior density of all parameters. This requires multiple integration, p(θ_{i} | y, M) = ∫ p(θ_{i}, θ_{−i} | y, M)dθ_{−i}, where θ_{−i} stands for the array of remaining elements of θ that excludes θ_{i}. Unfortunately, there is no explicit form for the multiple integration. Here we adopt the MCMC algorithm to generate samples from the joint posterior distribution, from which a marginal distribution can be easily inferred.
Given X and Z, model 5 is a standard mixed model. Bayesian inference of variance components under the standard mixed model has been extensively studied (e.g., refs. 5 and 6). Herein, we describe only methods of evaluating the likelihood, generating U, and simulating b and u.
Evaluating the Likelihood.
Conditional on their genotypic values, the phenotypic values of any two individuals are independent. Therefore, p(y | θ) = ∏_{j=1}^{N} p(y_{j} | μ, v(i_{j}^{p}), v(i_{j}^{m}), σ_{ɛ}^{2}), where the sampled values of founder allele identifiers, i_{j}^{p} and i_{j}^{m}, are used to calculate the genotypic value of j. We evaluate the likelihood value for each individual immediately after its founder allele identifiers have been sampled (described in the next paragraph). Therefore, the algorithm requires individuals to be entered into the pedigree in the chronological order of their birth so that the likelihoods of parents are always evaluated before their children (7).
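A minimal sketch of this likelihood evaluation is given below, working on the log scale for numerical stability. It assumes normal residuals as stated for the linear model and that founder allele identifiers are already available as 1-based index pairs; the function name and the toy numbers are illustrative only.

```python
import math

def log_likelihood(y, ids, v, mu, sigma2_e):
    """Sum of log p(y_j | mu, v(i_p), v(i_m), sigma2_e) over individuals.

    y:   phenotypic values
    ids: (i_p, i_m) founder allele identifiers (1-based) for each individual
    v:   founder allelic values, v[i - 1] holding the value of allele i
    """
    total = 0.0
    for y_j, (i_p, i_m) in zip(y, ids):
        expected = mu + v[i_p - 1] + v[i_m - 1]   # genotypic value of j
        total += (-0.5 * math.log(2.0 * math.pi * sigma2_e)
                  - (y_j - expected) ** 2 / (2.0 * sigma2_e))
    return total

# Toy data: two individuals whose phenotypes equal their expected values.
v = [0.5, -0.5, 1.0, 0.0]
ids = [(1, 3), (2, 4)]
y = [11.5, 9.5]
ll = log_likelihood(y, ids, v, mu=10.0, sigma2_e=1.0)
print(ll)
```

With zero residuals and unit residual variance, the result reduces to −(N/2) log(2π) for N = 2, which gives a quick sanity check on the implementation.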
Sampling Founder Allele Identifiers.
Founder allele identifiers are the keys of the proposed method. Each allele is connected to one of the founder alleles through its founder allele identifier. Sampling allele identifiers is accomplished by sampling U, which is then converted into z_{j}^{p} and z_{j}^{m}, which are eventually used for computing the founder allele identifiers.
We use a Gibbs sampler (8, 9) to sample U_{j} from its conditional posterior distribution. For simplicity, we describe only the posterior probability conditional on the genotypes of two flanking markers. Similar to U_{j}, we denote the genotype indicator vectors for the left and right markers by M_{j}^{L} and M_{j}^{R}, respectively. Then the posterior distribution of U_{j} is
p(U_{j} = k | y_{j}, M_{j}^{L} = t, M_{j}^{R} = s, θ) ∝ p(U_{j} = k)p(y_{j} | θ)p(M_{j}^{L} = t, M_{j}^{R} = s | U_{j} = k, λ) [11]
for k, t, s = 1, … , 4, with p(y_{j} | θ) evaluated at U_{j} = k and the right-hand side normalized over the four values of k. Calculation of p(y_{j} | θ) is straightforward. Conditional on U_{j}, M_{j}^{L} and M_{j}^{R} are independent so that p(M_{j}^{L} = t, M_{j}^{R} = s | U_{j} = k, λ) = p(M_{j}^{L} = t | U_{j} = k, λ)p(M_{j}^{R} = s | U_{j} = k, λ), where p(M_{j}^{L} = t | U_{j} = k, λ) or p(M_{j}^{R} = s | U_{j} = k, λ) is obtained from the following transition matrix (rows indexed by the QTL state k, columns by the marker state, writing c for c_{MU}):
(1 − c)^{2}   c(1 − c)   c(1 − c)   c^{2}
c(1 − c)   (1 − c)^{2}   c^{2}   c(1 − c)
c(1 − c)   c^{2}   (1 − c)^{2}   c(1 − c)
c^{2}   c(1 − c)   c(1 − c)   (1 − c)^{2}
where c_{MU} is the recombination fraction between the QTL and the left or right marker. It is calculated from λ by using the Haldane (10) map function. Finally, we take the Mendelian prior p(U_{j} = k) = 1/4 for k = 1, … , 4 and j = 1, … , N.
Since only two flanking markers are used to calculate the posterior probability of U_{j}, the approach is called interval mapping (1). In pedigree analysis, the markers are usually not fully informative. In this situation, we generate a realization of M_{j}^{R} and M_{j}^{L} based on loci flanking them. A flanking locus can be a marker or a QTL, depending on which one is closer to the marker of interest. In fact, our computer program has been equipped with this utility. Alternatively, we can take the multipoint method to infer the probability of U_{j} (11). The multipoint method can improve the mixing property of the Markov chain, but it is hard to implement in the program. However, both methods would ultimately generate the same result if the chains are sufficiently long.
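The Gibbs update of U_{j} from two fully informative flanking markers can be sketched as below. The 4 × 4 transition matrix used here is the standard one for two independent meioses without interference (both non-recombinant, one recombinant, or both recombinant); it is written from that assumption rather than copied from the authors' program, and all function names are hypothetical. Marker states are assumed to use the same 1 to 4 coding as U_{j}.

```python
import math
import random

def haldane(distance_cM):
    """Haldane map function: map distance (cM) to recombination fraction."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cM / 100.0))

def transition(c):
    """p(marker state t | QTL state k) for ordered genotypes coded 1..4."""
    a, b, d = (1 - c) ** 2, c * (1 - c), c ** 2
    return [[a, b, b, d],
            [b, a, d, b],
            [b, d, a, b],
            [d, b, b, a]]

def posterior_u(lik, t, s, c_left, c_right):
    """Normalized posterior of U_j given flanking marker states t and s.

    lik[k - 1] is p(y_j | U_j = k, other unobservables).
    """
    t_left, t_right = transition(c_left), transition(c_right)
    weights = [0.25 * lik[k] * t_left[k][t - 1] * t_right[k][s - 1]
               for k in range(4)]
    total = sum(weights)
    return [w / total for w in weights]

def gibbs_draw_u(probs, rng=random):
    """Draw U_j in {1, 2, 3, 4} from its conditional posterior."""
    return rng.choices([1, 2, 3, 4], weights=probs)[0]

# QTL lying 5 cM from each flanking marker, flat phenotype likelihood.
probs = posterior_u([1.0, 1.0, 1.0, 1.0], t=1, s=1,
                    c_left=haldane(5.0), c_right=haldane(5.0))
u_j = gibbs_draw_u(probs, random.Random(0))
```

When both markers are in state 1 and the phenotype is uninformative, nearly all posterior mass falls on U_{j} = 1, as expected for a QTL bracketed by two tightly linked non-recombinant markers.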
Updating Values of the Founder Alleles.
The effect of the ith founder allele (i = 1, … , n) has been expressed as a fixed effect, w_{i}b_{1} + (1 − w_{i})b_{2}, plus a random deviation, u_{i} (see the model for v_{i} given after Eq. 3). Because w_{i} is known, updating the effects of founder alleles is actually accomplished by updating b and u. Although b_{1} and b_{2} can be drawn independently if an informative joint prior is chosen, to increase the speed of convergence, we set b_{1} = 0 and draw only b_{2}. The Metropolis–Hastings algorithm (12, 13) is used here for drawing b_{2}. Define θ_{i}^{(t)} = b_{2}^{(t)} as the current value of b_{2} and θ_{−i}^{(t)} as the current values of the remaining unobservables. We want to generate a θ_{i} from the following conditional posterior distribution, p(θ_{i} | y, M, θ_{−i}) ∝ p(y, M | θ_{i}, θ_{−i})p(θ_{i}, θ_{−i}). A random walk Metropolis algorithm is used to generate the new value of θ_{i}. First, a θ^{*}_{i} is proposed from θ^{*}_{i} ∼ N(θ_{i}^{(t)}, Δ), where Δ is a predetermined proposal variance for b_{2}, a small positive value (tuning parameter). The transition probability from θ_{i}^{(t)} to θ^{*}_{i} is q(θ^{*}_{i}, θ_{i}^{(t)}) = N(θ_{i}^{(t)}, Δ), which is identical to q(θ_{i}^{(t)}, θ^{*}_{i}) = N(θ^{*}_{i}, Δ). Therefore, the proposal densities cancel and the acceptance probability for the candidate value θ^{*}_{i} is min{1, α}, where α is
α = [p(y, M | θ^{*}_{i}, θ_{−i}^{(t)})p(θ^{*}_{i}, θ_{−i}^{(t)})] / [p(y, M | θ_{i}^{(t)}, θ_{−i}^{(t)})p(θ_{i}^{(t)}, θ_{−i}^{(t)})]. [12]
If θ^{*}_{i} is accepted, θ_{i}^{(t+1)} = θ^{*}_{i}; otherwise, θ_{i}^{(t+1)} = θ_{i}^{(t)}, i.e., the current value is retained.
The random deviations, u, are drawn one pair at a time. In this case, θ_{i} = {u_{2i−1}, u_{2i}} is a 2 × 1 vector for i = 1, … , ½n. The proposal value is sampled from a bivariate normal distribution, θ^{*}_{i} ∼ N(θ_{i}^{(t)}, I_{2}δ), where δ is the proposal variance common to both u_{2i−1} and u_{2i}. The two u values of a pair are accepted or rejected simultaneously according to the Metropolis–Hastings rule (see Eq. 12).
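The random-walk Metropolis update described above for b_{2} can be sketched for a single scalar parameter as follows. The target here is a toy log posterior (a normal density centered at 0.8) standing in for the log of p(y, M | θ)p(θ), which in the real algorithm would be recomputed from the data; the function names, seed, and chain length are arbitrary choices for illustration.

```python
import math
import random

def metropolis_step(current, log_post, proposal_var, rng):
    """One random-walk Metropolis update with a symmetric normal proposal."""
    candidate = rng.gauss(current, math.sqrt(proposal_var))
    # Symmetric proposal: the q terms cancel, leaving the posterior ratio.
    log_alpha = log_post(candidate) - log_post(current)
    if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
        return candidate   # accept
    return current         # reject: keep the current value

def toy_log_post(b2):
    """Stand-in for log p(y, M | theta) + log p(theta); NOT the QTL model."""
    return -0.5 * (b2 - 0.8) ** 2 / 0.04

rng = random.Random(1)
b2 = 0.0
draws = []
for cycle in range(5000):
    b2 = metropolis_step(b2, toy_log_post, proposal_var=0.04, rng=rng)
    if cycle >= 500:       # discard burn-in cycles
        draws.append(b2)
posterior_mean = sum(draws) / len(draws)
print(round(posterior_mean, 2))
```

Working with the log of the acceptance ratio avoids numerical underflow when the likelihood is a product over many individuals. The pairwise update of u follows the same pattern with a two-dimensional proposal.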
The population mean, the variance components, and the QTL position are updated by following the same Metropolis–Hastings rule. Detailed steps are described in ref. 14, in which the marker linkage phases and the number of QTL are also treated as unknown variables. Note that sampling the number of QTL involves change in the dimension of the model. We adopted the reversible jump MCMC algorithm developed by Green (15) to add or delete a QTL in each MCMC step.
A Simulation Study
For illustration, we simulated a hybrid population derived from the cross of two outbred populations. Ten parents were randomly selected and genotyped from each population and formed a complete cross experiment in which each parent from one population was mated to every parent from the other population, leading to a total of 100 full-sib families in the F_{1} generation. One individual from each full-sib family was genotyped and phenotyped. From the 100 F_{1} individuals, we formed 50 pairs of matings in a completely random fashion. Each mating pair produced 10 progeny, leading to a total of 500 F_{2} individuals. The total sample size in the three-generation pedigree was 500 + 100 + 20 = 620. Note that all the families are interrelated, forming a large complicated pedigree with a total of 20 founders.
We then simulated two chromosomes 90 and 60 centimorgans (cM) long, respectively. The marker coverage is one marker in every 10 cM. Each marker allele in the founders was sampled from one of six equally frequent alleles. We put two QTL on chromosome I at positions 25 cM and 77 cM, respectively, and one QTL on chromosome II at position 32 cM. The effects of the three QTL are given in Table 1. The environmental error is distributed as N(0, 1). Given this setup, the first QTL explains 24% of the phenotypic variance, all due to the between-population variance. The second QTL explains 23% of the phenotypic variance, due to the between-population variance and the variance within population one, and the third QTL explains 15% of the phenotypic variance, due to only the within-population variances. The QTL allelic effects in the founders were sampled from normal distributions. One data set was simulated and analyzed by using two different models, the mixed model and the fixed model. To implement the fixed model analysis, we simply added one restriction to the same computer program: σ_{1}^{2} = σ_{2}^{2} = 0 for all QTL fitted to the model. The analysis was then identical to QTL mapping in an F_{2} line cross.
For the mixed model analysis, the MCMC was started with no QTL in the model. The distribution of the number of QTL appeared to reach its stationary distribution quickly. The total length of the chain was 10^{6} cycles. With the removal of 1,000 cycles for the burn-in period and the saving of one observation for every 50 cycles, the total number of saved data points was 20,000. These observations were subject to the post-Bayesian analysis. We recorded the number of hits by QTL within a short interval, say 1 cM, of the chromosome and defined the proportion of the hits among the posterior sample as the QTL intensity. We then plotted the QTL intensity against the chromosomal position and formed a QTL intensity profile for each chromosome (see Fig. 1 a and b). The intensity profiles indicated three possible QTL. The estimated positions of the QTL are close to the true locations. For each effect, we calculated the average QTL effect for each short interval (1 cM long). We then plotted the average effect against the chromosomal position, forming a profile for each QTL effect (see Fig. 1 c and d). As Sillanpää and Arjas (16) stated, the effect profile is meaningful only in the region where the QTL intensity is reasonably high. For example, the first QTL intensity is concentrated on (10, 30) cM on chromosome I. The population difference of the first QTL shows an average effect around 0.75 in that region. Similarly, the second QTL effect shows an average effect around 0.60 in the region corresponding to the peak of the second QTL. Interestingly, the population difference profile shows an average effect around −0.5 in the region between the two QTL. However, this region was rarely hit by QTL, and thus the negative effect does not mean anything.
We proposed a method to partition the QTL intensity profile into various components, each corresponding to one specific effect. These effect-specific intensity profiles are also called the weighted intensity profiles because they are the QTL intensity weighted by the effects. The weighted profiles allow us to visualize the sources of genetic variation for the detected QTL. For instance, the weighted profiles for chromosome I (Fig. 2a) show that the two QTL are primarily caused by the population difference. In contrast, the weighted profiles for chromosome II (Fig. 2b) show that the QTL is caused primarily by the variance within population one, rather than by the population difference.
For the fixed model analysis in which σ_{1}^{2} = σ_{2}^{2} = 0 has been assumed, only two QTL were detected (Fig. 3a) and the third QTL on chromosome II was completely missing (Fig. 3b). By ignoring the within-population segregation, we not only missed the third QTL but also got a confused estimate of the position for the second QTL (Fig. 3a). The posterior mean of the QTL number is 2.4 rather than 2.0. This is because 28% of the posterior sample shows three QTL. The position of the third QTL detected is highly concentrated at the end (80–90 cM) of chromosome I rather than on chromosome II. This faulty QTL is essentially due to the split of the second QTL, another piece of evidence that the fixed model is inferior. The effect profiles for the fixed model analysis are given in Fig. 3 c and d. The weighted intensity profiles are given in Fig. 4, which shows no sign of QTL on chromosome II.
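The pedigree structure of this design can be sketched in a few lines, which also confirms the sample-size arithmetic. The sketch only builds parent links and does not simulate genotypes or phenotypes. The function name and record layout are hypothetical, sex is ignored when pairing F_{1} individuals, and which population supplies the father of each F_{1} is an arbitrary assumption.

```python
import random

def build_pedigree(n_per_pop=10, n_pairs=50, n_progeny=10, seed=0):
    """Return (individual, father, mother) records in chronological order."""
    rng = random.Random(seed)
    pop_one = list(range(n_per_pop))
    pop_two = list(range(n_per_pop, 2 * n_per_pop))
    pedigree = [(i, None, None) for i in pop_one + pop_two]  # founders

    # Complete cross: one recorded F1 individual per full-sib family.
    f1 = []
    for a in pop_one:
        for b in pop_two:
            child = len(pedigree)
            pedigree.append((child, a, b))
            f1.append(child)

    # Random pairing of F1 individuals (sex is ignored in this sketch).
    rng.shuffle(f1)
    for k in range(n_pairs):
        sire, dam = f1[2 * k], f1[2 * k + 1]
        for _ in range(n_progeny):
            pedigree.append((len(pedigree), sire, dam))
    return pedigree

pedigree = build_pedigree()
n_founders = sum(1 for _, f, m in pedigree if f is None and m is None)
print(len(pedigree), n_founders)  # 620 20
```

Because records are appended generation by generation, the list is already in the chronological order that the sampling algorithm requires.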
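The QTL intensity profile can be computed from the saved MCMC draws with a simple binning step. The sketch below assumes each saved draw is a list of QTL positions (in cM) on one chromosome, possibly empty; the function name and the four toy draws are invented for illustration.

```python
import math

def qtl_intensity(position_samples, chrom_length_cM, bin_width=1.0):
    """Proportion of saved draws hitting each bin of the chromosome.

    position_samples: one list of QTL positions (cM) per saved MCMC draw.
    """
    n_bins = int(math.ceil(chrom_length_cM / bin_width))
    hits = [0] * n_bins
    for draw in position_samples:
        for pos in draw:
            k = min(int(pos // bin_width), n_bins - 1)
            hits[k] += 1
    n_draws = len(position_samples)
    return [h / n_draws for h in hits]

# Four toy draws on a 90-cM chromosome.
samples = [[25.2, 77.1], [24.8], [25.9, 76.5], []]
intensity = qtl_intensity(samples, chrom_length_cM=90.0)
print(intensity[24], intensity[25], intensity[76], intensity[77])
# 0.25 0.5 0.25 0.25
```

Summing the intensity over all bins gives the average number of QTL per draw on that chromosome (1.25 in the toy example), which links the profile to the posterior mean of the QTL number.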
Bayesian mapping allows the number of QTL to change. This involves the change in the dimension of the model. We adopted the reversible jump algorithm of Green (15) to infer the posterior distribution for the number of QTL (see Table 2). The posterior mean of the QTL number in the mixed model analysis is ∼3.0, which coincides with the true value. The posterior mean in the fixed model analysis is ∼2.4, which is obviously inferior to the mixed model analysis.
Discussion
We recently proposed a Bayesian mapping method under the random model framework. This method can analyze data collected from arbitrary mating designs, including selfed and related founders, without any approximation (14). The method, however, is a pure random model approach and applicable only to situations where the founders are randomly sampled. The mixed model approach developed in this study is an extension of our random model to handle populations with a hybrid origin. Most of the sampling techniques used in this study—e.g., sampling the number and locations of QTL—have been described by Yi and Xu (14).
Bayesian analysis is preferable for its convenience and flexibility in regard to using pedigree data and mapping multiple QTL (17). It takes full account of the uncertainty associated with all unknowns, including the number and locations of QTL, and the genotypes of QTL. Earlier works of Bayesian mapping include Satagopan et al. (18) and Sillanpää and Arjas (16) for line crossing data, and Uimari and Hoeschele (19), Heath (20), and Bink and van Arendonk (21) for pedigree data. The works for line crossing data always use the fixed model approach. The works for pedigree data usually use the random model, but most often assume two alleles (20). When multiple alleles are assumed, the genotypes of QTL are not sampled, but the conditional expectations of allelic IBD (identical-by-descent) are calculated and used in place of the covariance matrix at the QTL of interest (21). This expected IBD method is an approximation to the Bayesian analysis because it uses an approximate likelihood. Nonetheless, the above Bayesian methods cannot handle data with arbitrarily complicated mating designs, especially when selfing is involved in pedigree data.
The mixed model approach provides a unified QTL mapping algorithm. It can analyze data collected from any complicated mating design. As demonstrated in the simulation study, when the within-population variances are set to zero, the algorithm becomes a fixed model approach and automatically analyzes a typical F_{2} cross family. On the other hand, if we disregard the population difference and simply set b = 0, the algorithm will turn into a random model approach and automatically analyze an outbred population.
Under the mixed model framework, we treat the mean effects of the source populations as fixed effects and the allelic values within each population as random effects. If the number of founders within each population is small, a meaningful estimate of the within-population variance is impossible. In this case, the allelic values of the founders may be treated as fixed effects with the allelic variance, σ_{k}^{2}, treated as a prior variance. As a consequence, the model is considered as a fixed model. If the mapping population is derived from the hybrid of many source populations, it is not convenient to estimate the b_{k}s. Instead, we can treat b_{k} as a random variable sampled from a N(0, σ_{b}^{2}) distribution. In this case, σ_{b}^{2} is one of the parameters of interest. The within-population variances may not be estimated separately for individual populations; instead, they may be pooled as a consensus estimate of the within-population variance. This results in a hierarchical random model analysis of QTL. Therefore, the difference between a fixed model and a random model is vague in the context of Bayesian mapping. When the variances of effects are treated as hyperparameters (prior variances), the model is fixed. If the variances of effects are treated as the parameters of interest, the model is random. Both the fixed and random models can be implemented in the same mixed model computer program, with one additional statement to turn on/off the fixed/random option. The proposed mixed model approach provides a unified QTL mapping algorithm suitable for all kinds of populations.
Acknowledgments
We thank Dr. Claus Vogl for his comments on the first draft of the manuscript. The presentation of the manuscript has been significantly improved after incorporating the comments of two anonymous reviewers. This research was supported by National Institutes of Health Grant GM55321 and the U.S. Department of Agriculture National Research Initiative Competitive Grants Program Grant 97352055075 to S.X.
Footnotes

† To whom reprint requests should be addressed. E-mail: xu@genetics.ucr.edu.

This paper was submitted directly (Track II) to the PNAS office.

Article published online before print: Proc. Natl. Acad. Sci. USA, 10.1073/pnas.250235197.

Article and publication date are at www.pnas.org/cgi/doi/10.1073/pnas.250235197
Abbreviations
 QTL, quantitative trait locus or loci; MCMC, Markov chain Monte Carlo; cM, centimorgans
 Received May 22, 2000.
 Copyright © 2000, The National Academy of Sciences
References
1. Lander E S, Botstein D
2. Haley C S, Knott S A, Elsen J M
3. Fernando, Grossman
4. …
5. …
6. …
7. van Arendonk J A M, Tier B, Kinghorn B P
8. …
9. …
10. Haldane J B S
11. Lander E S, Green P
12. …
13. Hastings W K
14. Yi, N. & Xu, S. (2001) Genetics, in press.
15. Green P
16. Sillanpää M J, Arjas E
17. Hoeschele I, Uimari P, Grignola F E, Zhang Q, Gage K M
18. Satagopan J M, Yandell B S, Newton M A, Osborn T C
19. Uimari P, Hoeschele I
20. Heath
21. Bink M C, van Arendonk J M