New Research In
Physical Sciences
Social Sciences
Featured Portals
Articles by Topic
Biological Sciences
Featured Portals
Articles by Topic
 Agricultural Sciences
 Anthropology
 Applied Biological Sciences
 Biochemistry
 Biophysics and Computational Biology
 Cell Biology
 Developmental Biology
 Ecology
 Environmental Sciences
 Evolution
 Genetics
 Immunology and Inflammation
 Medical Sciences
 Microbiology
 Neuroscience
 Pharmacology
 Physiology
 Plant Biology
 Population Biology
 Psychological and Cognitive Sciences
 Sustainability Science
 Systems Biology
Model criticism based on likelihoodfree inference, with an application to protein network evolution

Edited by Elizabeth A. Thompson, University of Washington, Seattle, WA, and approved March 26, 2009 (received for review August 13, 2008)
Abstract
Mathematical models are an important tool to explain and comprehend complex phenomena, and unparalleled computational advances enable us to easily explore them without any or little understanding of their global properties. In fact, the likelihood of the data under complex stochastic models is often analytically or numerically intractable in many areas of sciences. This makes it even more important to simultaneously investigate the adequacy of these models—in absolute terms, against the data, rather than relative to the performance of other models—but no such procedure has been formally discussed when the likelihood is intractable. We provide a statistical interpretation to current developments in likelihoodfree Bayesian inference that explicitly accounts for discrepancies between the model and the data, termed Approximate Bayesian Computation under model uncertainty (ABCμ). We augment the likelihood of the data with unknown error terms that correspond to freely chosen checking functions, and provide Monte Carlo strategies for sampling from the associated joint posterior distribution without the need of evaluating the likelihood. We discuss the benefit of incorporating model diagnostics within an ABC framework, and demonstrate how this method diagnoses model mismatch and guides model refinement by contrasting three qualitative models of protein network evolution to the protein interaction datasets of Helicobacter pylori and Treponema pallidum. Our results make a number of model deficiencies explicit, and suggest that the T. pallidum network topology is inconsistent with evolution dominated by link turnover or lateral gene transfer alone.
 Bayesian inference
 intractable likelihoods
 Markov chain Monte Carlo
 Approximate Bayesian Computation
 model uncertainty
In the quest to comprehend complex observations, hypotheses about underlying mechanisms are formalized in terms of precise mathematical models (1). Much of statistical reasoning then proceeds in an iterative process between data acquisition, data analysis, and model development (2). At the i th iteration, the interpretation of observed data x_{0} in terms of some target parameters θ conditional on a tentative, probabilistic model M_{i} has a long tradition in Bayesian inference (3). The focus is typically on the posterior density f(θ  x_{0},M_{i}), which is related to the likelihood f(x_{0}  θ,M_{i}) and the prior π_{θ}(θ  M_{i}) via Bayes' Theorem: To explore whether the current model M_{i} is consonant with x_{0}, and to guide further model development, Bayesian predictive diagnostics (4, 5) ask whether x_{0} can be viewed as a random observation from a predictive distribution m(xM_{i}) in terms of a chosen discrepancy function ρ(x,x_{0}), x ∼ m; interpretation of such diagnostics has been the subject of lively debate (6, 7). Application of this machinery for complex models is provided by the workhorses of Bayesian inference (8), such as Markov Chain Monte Carlo (MCMC), as long as the likelihood is readily evaluable up to a normalizing constant.
In many areas of science, such as econometrics (9), molecular genetics (10), epidemiology (11), and evolutionary systems biology (12), the likelihood is sometimes intractable. Nevertheless, given a value of θ, it is typically easy to simulate data from f(·θ,M_{i}). Approximate Bayesian Computation (ABC), reviewed in ref. 10, proposes to infer θ by comparing simulated data x to the observed data x_{0}, in terms of a (realvalued) univariate discrepancy ρ that combines a set of (computationally tractable) summaries s= (S_{1},…,S_{k},…,S_{K}). In its simplest form, values of θ for which the discrepancies are within τ ≥ 0 are retained to define the “approximate likelihood” in the sense that as τ → 0, t_{τ}(θ) should approach the likelihood of the summaries, f(s(x_{0})θ,M_{i}); see Fig.1A and the supporting information (SI) Appendix, subsection S1.2. ABC may be embedded into Bayesian methods to formally select one model from a specified collection of models, or to average them (13–15); see Fig. 1B. However, relative comparisons between models do not convey whether models correspond adequately to the observed data and, without exploring the adequacy of models to explain the data, the meaning of reporting θ from Eq. 2 remains unclear, see Fig. 2.
We continue in believing that “all models are wrong but some are useful” (2), which prompts us to interpret several ρ_{k}(S_{k}(x),S_{k}(x_{0})) as realizations of realvalued error terms, denoted by ɛ = (ɛ_{1},…,ɛ_{K}) (16). Error terms are not observed, and must be estimated from the data; we develop a theoretical framework and provide an algorithm, ABCμ, for this purpose when the likelihood is intractable. We intentionally focus on the posterior distributions of components of ɛ to make probabilistic statements of mismatch between the model and the data (17) and hence to facilitate model criticism, as summarized in Fig. 1C.
Postgenomic data such as protein interaction networks (PINs) are now available for a growing number of organisms, (e.g., refs. 18 and 19). They offer a new perspective on the function of all organisms, and are, in addition to individual gene or genomic approaches, increasingly useful to elucidate the evolution of living systems, (e.g., refs. 12, 20, and 21), despite being noisy, incomplete, and static descriptions of the real, transient protein network (22). To elucidate the network evolution of prokaryotes, we here analyze the compatibility of the Treponema pallidum and Helicobacter pylori PIN datasets with a set of competing models inspired by fundamentally different modes of network evolution.
ABC under Model Uncertainty*
Joint Posterior Density of Model Parameters and Summary Errors.
For the purpose of model criticism in situations where the likelihood is intractable, define the unknown error ɛ as the random variable with conditional probability distribution Next, we assume that ℙ_{θ,x0}(ɛ ≤ e) has a density ξ_{θ, x0} with respect to an appropriate measure for ρ.† It is natural to suggest using this quantity as an augmented likelihood for x_{0} under ρ while adhering to the current model, We thus capture the direct information brought by the discrepancies ρ on θ and/or model M_{i} in a scalar value. For a given prior π_{θ,ɛ}(θ,ɛM_{i}), we embrace two aspects of statistical reasoning, parameter inference and model criticism, simultaneously by the joint posterior density using the data once.‡ In practice, we take π_{θ,ɛ} = π_{ɛ} × π_{θ},§ reflecting our inability to quantify a priori model adequacy for a value of θ. The posterior relationship Eq. 5 exploits the dependence between model error and model parameterization. ABC only infers model parameterization from realized model errors after simulation and does not question the adequacy of the likelihood model.
The simplest algorithm to sample from Eq. 5 is:
StdABCμ1 Sample θ ∼ π_{θ}(θ  M_{i}), simulate x ∼ f(·  θ,M_{i}) and compute ɛ = ρ(s(x),s(x_{0})).
StdABCμ2 Accept (θ,ɛ) with probability proportional to π_{ɛ}(ɛ  M_{i}), and go to StdABCμ 1.
Interpretation of the Marginals: Parameter Inference and Model Criticism.
The thrust of this article is to recognize the utility of the unknown error ɛ for model criticism. By design, nonzero values of ρ indicate discrepancies between the model and the data, so that intuitively, only if the model matches the data, we expect the mode of ξ_{θ, x0}(ɛ) to be on average zero for some value of θ if the summaries behave sufficiently well. Parameter inference based on the marginal posterior distribution f_{ρ}(θx_{0},M_{i}) is justified in an “approximate likelihood” sense, because, under regularity assumptions (SI Appendix, S1.1) on ξ_{θ, x0}(ɛ) which we assume throughout this section, Setting π_{ɛ}(ɛM_{i}) = 1{ɛ ≤ τ /2}/τ, we recover the “standard” ABC approximation Eq. 2; please see SI Appendix, S1.2 for more details and examples. We can interpret the variety of ABC kernels as exerting a particular prior belief on the adequacy of the current model (23). In agreement with methods of ABC (SI Appendix, S1.2), we always choose a prior π_{ɛ}(ɛM_{i}) with mode at zero to accommodate a prior belief that the model is plausible. For the purpose of parameter inference, it is sufficient to “plugin” realized errors in Eq. 6, but here we also focus on the marginal posterior error f_{ρ}(ɛx_{0},M_{i}). For the prior predictive error density L_{ρ}(ɛ) = ∫δ{ρ(s(x),s(x_{0})) = ɛ}π(xM_{i})dx we have that (see SI Appendix S1.1). Hence, f_{ρ}(ɛx_{0},M_{i}) can be understood as an error density under the prior predictive distribution that is weighted according to error magnitude. Small error boosts the prior belief for a particular value of θ, see Eq. 6. We thus prefer model criticism based on Eq. 7 rather than L_{ρ}(ɛ) as it focuses on those θ actually inferred from the perspective of Eq. 5, and attenuates the dependence of L_{ρ}(ɛ) on π_{θ}(θM_{i}). This dependence is undesirable in that a faultless model could appear questionable under unfortunate prior choice (SI Appendix, S1.3). For the practitioner, we provide a computationally feasible method for model criticism within the prior predictive setting as an alternative to using datasplitting techniques that are here difficult or too expensive to construct (5, 7, 17).
Multidimensional Error Terms ɛ.
The complexity of the settings to which ABC is typically applied makes it difficult to think of a universal discrepancy function ρ. The joint posterior distribution of multiple errors ɛ = (ɛ_{1},…,ɛ_{K}), corresponding to K discrepancies ρ_{k}(S_{k}(x),S_{k}(x_{0})) (SI Appendix, S1.4), facilitates to diagnose model mismatch more systematically and comprehensively; we have In the ABC literature, it has been recognized that attempting to match jointly a set of summaries is too conservative, and instead a linear combination of summaries is typically employed (14). Nonetheless, we believe that each summary captures aspects of model discrepancy. To control several summaries stringently for accurate and robust parameter inference (12), min_{k}ξ_{k,θ, x0}(ɛ_{k}) here supersedes ξ_{θ, x0}(ɛ) (see Materials and Methods, section 3, and SI Appendix, S1.6).
Algorithm.
The major impediment in ABC—that the likelihood surface is turned into a “bouncy castle,” see Fig. 1—is in the multivariate case exacerbated by the fact that the unknown error terms are correlated by design, and easily outnumber the θ's. To obtain a smoothed, stabilized approximation to ξ_{k,θ, x0}(ξ_{k}) that better controls the volatility of the simulated datasets, we employ kernel density estimates ξ̂_{k}(ɛ_{k};x):= 1/(Bh_{k})∑_{b = 1}^{B}K([ɛ_{k} − ρ_{k}(S_{k}(x_{b}),S_{k}(x_{0}))]/h_{k}) in line with ABC (SI Appendix, S1.5). In theory, this corresponds to replacing ξ_{θ, x0}(ɛ) in Eq. 8 with min_{k} ∫h_{k}^{−1}K ((ɛ_{k} − ν_{k})/h_{k})ξ_{θ,x0}(v)dv. In practice (SI Appendix, S1.7), we set B = 50 and attain under technical modifications (see Material and Methods, section 3) a smoothing approximaion on the auxiliary space (θ,ɛ,x). Various Monte Carlo strategies (8, 24) may be devised to sample from Eq. 9 (SI Appendix, S1.8); our MCMC implementation (SI Appendix, S1.9), particularly addresses the codependencies of ρ_{k} with a careful choice of q(ɛ → ɛ′). Suppose an initial sample (θ,ɛ) and prior specifications (SI Appendix, S1.10);
ABCμ1 if now at θ, move to θ′ according to q(θ → θ′) (SI Appendix, S1.12).
ABCμ2 Generate x′ ∼ f(·θ′,M_{i}), and construct ξ̂_{k}(·x′) for all k. If now at ɛ, move to ɛ′ according to q(ɛ → ɛ′). We guide this proposal with ξ̂_{k}(·x) and ξ̂_{k}(·x′) (SI Appendix, S1.12).
ABCμ3 Accept (θ′,ɛ′, x′) with probability and otherwise stay at (θ,ɛ,x), then return to ABCμ 1.
Please see Materials and Methods, section 4, for a technical discussion and Tables 1 and 2 for a comparison of the efficiency of ABCμ with related samplers.
Model Criticism by Revealing Model Inconsistency Across Discrepancies.
For large datasets and/or complex models, the discrepancies ρ_{k} are often codependent (5, 17). Our approach to model criticism capitalizes on the fact that the codependencies among S_{k}(x) under the predictive distribution π(xM_{i}) are typically different from those among S_{k}(x_{0}) if the model is not adequate, revealing model inconsistency in terms of conflicting, codependent summaries. As exemplified in Fig. 2, only a comprehensive set of summaries may enable model criticism; it is our view that choosing comprehensive summaries and discrepancy functions is crucial to ensure the approximation quality of Eq. 6 to the likelihood (12) as well as for model criticism based on posterior densities of summary errors (Eq. 7). To explore model adequacy, we recommend investigating various posterior quantities of f̂_{ρ}(ɛx_{0}, M_{i}) and using centrality measures such as high probability density (HPD) intervals; we remark at this point that marginal properties are in our setting typically not independent and caution against the overinterpretation of marginal diagnostics (see further Fig. 3).
Example: Criticizing Models of PIN Evolution
The structure of PINs derives from multiple stochastic processes over evolutionary timescales, and a number of mechanisms, based on randomly growing graphs, have been proposed to capture aspects of network growth (ref. 29 and references therein). We briefly motivate three models of network evolution. Recent comprehensive analyses across 181 prokaryotic genomes suggest that lateral gene transfer probably occurs at a low rate, but that, cumulatively, ≈80% of all genes in a prokaryotic genome are involved in lateral gene transfer (30); model PAP¶ is inspired by this scenario as it proposes network evolution in terms of attachment processes only (Materials and Methods, section 2). At least 40% of genes in prokaryotes appear to be products of gene duplication (31). Model DDA+PA (Materials and Methods, section 2) is designed to quantify the potential role of duplication and divergence in network evolution (12). At least for eukaryotes, the formation or degeneration of functional links between proteins (link turnover) is estimated to occur at a fast rate of ≈10^{−5} changed interactions per My per protein pair (20). We extend model DDA+PA into Model DD+LNK+PA (Materials and Methods, section 2), which includes link turnover in terms of preferential loss and gain of protein interactions. Crucially, ABC enables us to account for data incompleteness. Previously, we modeled missing data by randomly sampling proteins from the simulated data (RS1) (12). Here, we examine an alternative model that randomly samples from those proteins that have an interaction in the simulated data (RS2). (For the former, we add “r” to the model acronyms.) Necessarily, all models remain conceptually limited and must be cautiously interpreted; for example, the assumption that the network as a whole evolves at homogeneous rates has been questioned (32).
Models of Network Evolution Inspired by Horizontal Gene Transfer, DuplicationDivergence, and Link Turnover.
We ask whether the T. pallidum PIN topology is compatible with a number of fundamentally different modes of network evolution, in guise of simplified models. We successfully checked (Materials and Methods, section 5) and applied ABCμ (Materials and Methods, section 6) to sample from f̂_{ρ}(θ,ɛx_{0},·) for the models PAP, DDA+PA, and DD+LNK+PA (under both sampling schemes RS1 and RS2); see Fig. 3 and SI Appendix, Figs. S5 and S6. Based on RS1 (12), all models depart significantly in FRAG as exemplified for PAPr in Table 3. This motivated us to consider alternative models of missing data, and we found no significant departures in FRAG for any of the considered evolution models under RS2; see Table 3. Turning to the 6 remaining discrepancy functions, we observe (Table 3 and Fig. 3), that only model DDA+PA matches the Treponema pallidum PIN adequately, suggesting that an evolutionary mode of duplicationdivergence is most consistent with the T. pallidum PIN dataset. Repeating our analysis based on RS2 for the Helicobacter pylori PIN dataset, we could not substantiate our results further, because all considered models provide an adequate fit to the data. This is surprising, because we expect a similar power of our method on both datasets (SI Appendix, S1.19), and may point to qualitative differences among the two PINs owing to different, underlying experimental protocols.
Conclusion
The growing complexity of realistic models renders Bayesian model criticism increasingly important and difficult. In this article, we provide Bayesian techniques to comprehensively quantify discrepancies between the likelihood model and the data, simultaneously with parameter inference in situations when the likelihood is intractable, thus providing valuable guidance on the interpretability of parameter estimates and on how to improve models. We found this methodology helpful in iteratively identifying an adequate model of network evolution in terms of a large number of summaries; in particular, the PIN topology of the prokaryote T. pallidum provides little support for network evolution dominated by link turnover, or by lateral gene transfer alone. We close by cautioning that it is difficult to convincingly associate a formal framework to our high probability density intervals on multiple diagnostic error terms (6, 7). The presented methods will be useful in the initial stages of model and data exploration (16), and in particular, in efficiently scrutinizly several models by direct, inspection of their summary errors (5), prior to more formal analyses (14).
Materials and Methods
1. Summaries.
PINs can be described as graphs that contain a set of nodes, interacting proteins, and undirected binary edges, representing the observed interactions between the proteins. We consider the following topological summary statistics of PINs: Order, the number of nodes; size, the number of edges; node degree, the number of edges associated with a node;
2. Algorithmic Details of the Models of Network Evolution.
Given a PIN x_{0}, we simulate a network under a given model to the number of genes in the respective genome, and account for incompleteness in x_{0} by either RS1 or RS2. In model PAP evolution proceeds only by preferential attachment (33); at each step the number of attachments minus one is Poisson distributed with mean m. DDA+PA (12) features preferential attachment of a new node to one node of the existing network with probability α, or, with probability 1−α, a step of node duplication and immediate link divergence. In the latter case, a parent node is randomly chosen and its edges are duplicated. For each parental edge, the parental and duplicated one are then lost with probability δ_{Div} each, but not both; moreover, at least one link is retained to any node. The parent node may be attached to its child with probability δ_{Att}. DD+LNK+PA is a mixture of duplicationdivergence (as above with parameter δ_{Div} but fixed δ_{Att} = 0), link addition and deletion, and preferential attachment as in DDA+PA. Link addition (deletion) proceeds by choosing a node randomly, and attaching it preferentially to another node (deleting it preferentially from its interaction partners) (20). At each step unnormalized weights are calculated as follows. For duplicationdivergence, the rate κ_{Dup} is multiplied by the order of the current network; for link addition, the rate κ_{LnkAdd} is multiplied by
3. Combining Multiple Error Terms.
It is difficult to compare ξ̂_{k,θ,x0}(ɛ_{k}) across k without further transformation, because summaries differ in their sensitivity to changes in θ (12) so that the scales of the density estimates vary across summaries and (to a lesser extent) across θ; see SI Appendix, Fig. S1. In RejectionABCμ (SI Appendix, S1.8), summaries may be precomputed and standardized, but this is not applicable in MCMC. We propose to standardize the variance of each ξ̂_{k} to one, bearing in mind that this might reduce approximation quality in some cases.
4. Details of ABCμ.
ABCμ is similar to the MCMC algorithm proposed in ref. 27; the latter also extends the state space, but includes a scalar τ (circumventing the need to design an efficient proposal q(ɛ → ɛ′)). We show that ABCμ eventually samples from f̂_{ρ}(θ,ɛx), provide convergence results for the smoothing approximation (SI Appendix, S1.5 and S1.11), discuss our nonstandard proposal kernel (SI Appendix, S1.12), and provide final details (SI Appendix, S1.13). We do not claim that our smoothing approach based on repeated sampling from f(·θ,M_{i}) comes at no cost. What we contend is that (i) we obtain improved mixing quality relative to ABC within MCMC, owing to a stabilized, numerical approximation of the likelihood with Eq. 9, and (ii) that we can construct more efficient proposal kernels, a point particularly relevant for ABCμ, where the number of error terms easily exceeds the dimensionality of θ. With respect to (i), we compared ABCμ with ABCMCMC (26), zoomABCMCMC (12), and AUXABC (27) on the H. pylori PIN dataset (SI Appendix, S1.15). Table 1 illustrates that ABCμ results in much improved acceptance rates and better mixing; this has already been suggested by Becquet and Przeworski (28) when f(s(x)θ,M_{i}) is available in closed form. As for (ii), we devised a guided, asymmetric random walk in ɛ (SI Appendix, S1.12). This greatly improved both overall acceptance rate and mixing in ɛ compared to AUXABC and ABCμ with a (symmetric) Gaussian random walk (GRW) in ɛ (Table 2), exemplifying that effectively, repeated sampling may improve the efficiency of standard MCMC methods.
5. Testing ABCμ on PINs.
It was unclear whether our implementation is efficient enough to sample from Eq. 9. First, estimates of Eq. 7 might be inherently biased as for technically similar algorithms, and/or the PIN topology, in terms of the chosen summaries, might not be informative enough to evidence discrepancies between the model and the data. Second, given our smoothing approximation based on an adaptively chosen bandwidth h = h(x), we might be worried that posterior quantities of θ may be unreliable. We have addressed both concerns empirically (SI Appendix, S1.16), comforting that ABCμ provides useful numerical estimates of f̂_{ρ}(ɛx_{0},M_{i}) to criticize the models of network evolution considered here, and suggesting that samples θ  x_{0} from the marginal of Eq. 9, obtained by ABCμ, provide a good approximation to f_{ρ}(θ  x_{0},M_{i}).
6. Criticizing Models of Network Evolution.
To contrast models PAP, DDA+PA, and DD+LNK+PA to the T. pallidum PIN dataset, ABCμ based on the summaries CONN, WR, ODBOX, DIA, CC,
Acknowledgments
We thank M.P.H. Stumpf for stimulating discussions, T. Hinkley for providing an efficient C + + library to evaluate network summaries, and M. Sternberg for comments on an earlier version of the manuscript, and two anonymous referees for valuable comments on an earlier version of this article. Computations were performed at the Imperial College High Performance Computing Centre http://www3.imperial.ac.uk/ict/services/teachingandresearchservices/highperformancecomputing. O.R. was supported by the Wellcome Trust; C.A. by an Advance Research Fellowship from the Engineering and Physical Sciences Research Council, C.W. by the Danish Cancer Society and the Danish Research Councils, and S.R. by the Biotechnology and Biological Sciences Research Council and the Centre for Integrative Systems Biology at Imperial College.
Footnotes
 ^{1}To whom correspondence should be addressed. Email: oliver.ratmann{at}imperial.ac.uk

Author contributions: O.R. and S.R. designed research; O.R. performed research; O.R., C.A., and S.R. analyzed data; and O.R., C.A., C.W., and S.R. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/cgi/content/full/0807882106/DCSupplemental.

↵* For ease of exposition, we start with a scalar error term ɛ corresponding to a univariate discrepancy ρ, and later generalize to multidimensional error terms. In ABC, a set s of summaries is commonly combined into the univariate ρ; at a first reading it may help to think of s as a single summary. In particular, it may be useful to take f(xθ,M_{i}) as the onedimensional Gaussian density with mean θ and fixed variance, and ρ(s(x), s(x_{0})) as the difference x − x_{0}.

↵† We denote the Indicator function with 1, and particular limits of a sequence of functions with δ (see Eq. S1 in SI Appendix, S1.1). If ρ is continuous, ξ_{θ, x0} is taken with respect to the Lebesgue measure; in many applications, X is a finite set and ξ_{θ, x0} is then understood with respect to a counting measure.

↵‡ Our developments are subject to the integrability of Eq. 5.

↵§ For clarity, we subscript π_{θ} and π_{ɛ} to denote the priors in θ and ɛ, respectively. From now on, we drop the conditioning of π_{ɛ} on θ. Finally, we denote with π(xM_{i}) the prior predictive density ∫f(xθ,M_{i})π(θM_{i})dθ.

↵¶ Model acronyms are explained in Materials and Methods, section M2, with underlined characters.

Freely available online through the PNAS open access option.
References
 ↵
 May RM
 ↵
 ↵
 Bernado JM,
 Smith AFM
 ↵
 Box GEP
 ↵
 Bernardo JM,
 Berger JO,
 Dawid AP,
 Smith AFM
 Gelfand AE,
 Dey DK,
 Chang H
 ↵
 ↵
 Bernardo JM,
 Berger JO,
 Dawid AP,
 Smith AFM
 Bayarri MJ,
 Berger JO
 ↵
 Liu JS
 ↵
 Gouriéroux C,
 Monfort A
 ↵
 ↵
 Riley S,
 et al.
 ↵
 ↵
 Wilkinson RD
 ↵
 Fagundes NJR,
 et al.
 ↵
 Toni T,
 Welch D,
 Strelkowa N,
 Ipsen A,
 Stumpf MPH
 ↵
 ↵
 Green PJ,
 Hjort NL,
 Richardson S
 O'Hagan A
 ↵
 ↵
 ↵
 ↵
 Pinney JW,
 Amoutzias GD,
 Rattray M,
 Robertson DL
 ↵
 ↵
 Wilkinson RD
 ↵
 Sisson SA,
 Fan Y,
 Tanaka MM

 Gilks WR,
 Richardson S,
 Spiegelhalter DJ
 ↵
 Marjoram P,
 Molitor J,
 Plagnol V,
 Tavaré S
 ↵
 ↵
 Becquet C,
 Przeworski M
 ↵
 Knudsen M,
 Wiuf C
 ↵
 Dagan T,
 ArtzyRandrup Y,
 Martin W
 ↵
 Chothia C,
 Gough J,
 Vogel C,
 Teichmann SA
 ↵
 Davidson EH,
 Erwin DH
 ↵
 Barabási A,
 Albert R
Citation Manager Formats
Sign up for Article Alerts
Article Classifications
 Physical Sciences
 Statistics
 Biological Sciences
 Evolution