Model criticism based on likelihood-free inference, with an application to protein network evolution
- (a) Department of Public Health and Epidemiology, Imperial College London, London W2 1PG, United Kingdom;
- (b) Department of Mathematics, University of Bristol, Bristol BS8 1TW, United Kingdom;
- (c) Bioinformatics Research Center, University of Aarhus, 8000 Aarhus C, Denmark; and
- (d) Centre for Biostatistics, Imperial College London, London W1 1PG, United Kingdom
Edited by Elizabeth A. Thompson, University of Washington, Seattle, WA, and approved March 26, 2009 (received for review August 13, 2008)
Abstract
Mathematical models are an important tool to explain and comprehend complex phenomena, and unparalleled computational advances enable us to explore them easily with little or no understanding of their global properties. In fact, the likelihood of the data under complex stochastic models is often analytically or numerically intractable in many areas of science. This makes it even more important to simultaneously investigate the adequacy of these models (in absolute terms, against the data, rather than relative to the performance of other models), but no such procedure has been formally discussed when the likelihood is intractable. We provide a statistical interpretation of current developments in likelihood-free Bayesian inference that explicitly accounts for discrepancies between the model and the data, termed Approximate Bayesian Computation under model uncertainty (ABCμ). We augment the likelihood of the data with unknown error terms that correspond to freely chosen checking functions, and provide Monte Carlo strategies for sampling from the associated joint posterior distribution without the need to evaluate the likelihood. We discuss the benefit of incorporating model diagnostics within an ABC framework, and demonstrate how this method diagnoses model mismatch and guides model refinement by contrasting three qualitative models of protein network evolution with the protein interaction datasets of Helicobacter pylori and Treponema pallidum. Our results make a number of model deficiencies explicit, and suggest that the T. pallidum network topology is inconsistent with evolution dominated by link turnover or lateral gene transfer alone.
- Bayesian inference
- intractable likelihoods
- Markov chain Monte Carlo
- Approximate Bayesian Computation
- model uncertainty
Footnotes
- 1. To whom correspondence should be addressed. E-mail: oliver.ratmann@imperial.ac.uk
- Author contributions: O.R. and S.R. designed research; O.R. performed research; O.R., C.A., and S.R. analyzed data; and O.R., C.A., C.W., and S.R. wrote the paper.
- The authors declare no conflict of interest.
- This article is a PNAS Direct Submission.
- This article contains supporting information online at www.pnas.org/cgi/content/full/0807882106/DCSupplemental.
- * For ease of exposition, we start with a scalar error term ɛ corresponding to a univariate discrepancy ρ, and later generalize to multidimensional error terms. In ABC, a set s of summaries is commonly combined into the univariate ρ; at a first reading it may help to think of s as a single summary. In particular, it may be useful to take f(x|θ,Mi) as the one-dimensional Gaussian density with mean θ and fixed variance, and ρ(s(x), s(x0)) as the difference x − x0.
- † We denote the indicator function with 1, and particular limits of a sequence of functions with δ (see Eq. S1 in SI Appendix, S1.1). If ρ is continuous, ξθ, x0 is taken with respect to the Lebesgue measure; in many applications, X is a finite set and ξθ, x0 is then understood with respect to a counting measure.
- ‡ Our developments are subject to the integrability of Eq. 5.
- § For clarity, we subscript πθ and πɛ to denote the priors in θ and ɛ, respectively. From now on, we drop the conditioning of πɛ on θ. Finally, we denote with π(x|Mi) the prior predictive density ∫f(x|θ,Mi)π(θ|Mi)dθ.
- ¶ Model acronyms are explained in Materials and Methods, section M2, with underlined characters.
- Freely available online through the PNAS open access option.
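
The toy setting suggested in the footnote marked * (a one-dimensional Gaussian with mean θ and fixed variance, with discrepancy x − x0) can be sketched in a few lines of Python. This is a plain ABC rejection sampler that records the realized error alongside each accepted θ; it is not the paper's ABCμ sampler. The uniform prior, the tolerance `tol`, the observed value `x0 = 1.2`, and the function name `abc_rejection_with_errors` are all illustrative assumptions.

```python
import random


def abc_rejection_with_errors(x0, n_draws=20000, tol=0.5, sigma=1.0,
                              prior_lo=-5.0, prior_hi=5.0, seed=0):
    """Likelihood-free rejection sampling for a Gaussian toy model.

    Returns a list of (theta, eps) pairs, where eps = x - x0 is the
    discrepancy between simulated and observed data. The likelihood
    is never evaluated; we only simulate from the model.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_lo, prior_hi)  # assumed uniform prior on theta
        x = rng.gauss(theta, sigma)              # simulate x ~ N(theta, sigma^2)
        eps = x - x0                             # error term: rho = x - x0
        if abs(eps) <= tol:                      # keep draws close to the data
            accepted.append((theta, eps))
    return accepted


# Hypothetical observed value, chosen only for illustration.
samples = abc_rejection_with_errors(x0=1.2)
thetas = [t for t, _ in samples]
errors = [e for _, e in samples]
theta_mean = sum(thetas) / len(thetas)
error_mean = sum(errors) / len(errors)
```

Keeping `errors` next to `thetas` is the design choice of interest here: the accepted pairs approximate a joint sample over parameter and error, so the error values can be inspected as a rough diagnostic rather than discarded after the accept/reject step.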










