Spatial and temporal dynamics of superspreading events in the 2014–2015 West Africa Ebola epidemic
Edited by David Cox, Nuffield College, Oxford, United Kingdom, and approved January 5, 2017 (received for review September 8, 2016)

Significance
For many infections, some infected individuals transmit to disproportionately more susceptibles than others, a phenomenon referred to as “superspreading.” Understanding superspreading can facilitate devising individually targeted control measures, which may outperform population-level measures. Superspreading has been described for a recent Ebola virus (EBOV) outbreak, but systematic characterizations of its spatiotemporal dynamics are still lacking. We introduce a statistical framework that allows us to identify core characteristics of EBOV superspreading. We find that the epidemic was largely driven and sustained by superspreadings that are ubiquitous throughout the outbreak and that age is an important demographic predictor for superspreading. Our results highlight the importance of control measures targeted at potential superspreaders and enhance understanding of causes and consequences of superspreading for EBOV.
Abstract
The unprecedented scale of the Ebola outbreak in Western Africa (2014–2015) has prompted an explosion of efforts to understand the transmission dynamics of the virus and to analyze the performance of possible containment strategies. Models have focused primarily on the reproductive numbers of the disease that represent the average number of secondary infections produced by a random infectious individual. However, these population-level estimates may conflate important systematic variation in the number of cases generated by infected individuals, particularly found in spatially localized transmission and superspreading events. Although superspreading features prominently in first-hand narratives of Ebola transmission, its dynamics have not been systematically characterized, hindering refinements of future epidemic predictions and explorations of targeted interventions. We used Bayesian model inference to integrate individual-level spatial information with other epidemiological data of community-based (undetected within clinical-care systems) cases and to explicitly infer distribution of the cases generated by each infected individual. Our results show that superspreaders play a key role in sustaining onward transmission of the epidemic, and they are responsible for a significant proportion (
The outbreak size of the 2014 Ebola virus (EBOV) epidemic in Western Africa was unprecedented, and control measures failed to contain the epidemic at its early rapidly growing stage (1, 2). Mathematical models played a key role in inferring the transmission dynamics of EBOV (3). Modeling work succeeded in inferring, in particular, the basic reproductive number
An important phenomenon in disease transmission is so-called superspreading, in which certain individuals (i.e., superspreaders) disproportionately infect a large number of secondary cases relative to an “average” infectious individual (whose infectivity may be well-represented by
Although contact-tracing data have revealed superspreading of EBOV (10, 11), a systematic understanding of how EBOV superspreading events varied over space and time is still lacking. For instance, it is unclear how the role of EBOV superspreading varied over the course of the outbreak. We aimed to answer, primarily in a spatiotemporal setting, (i) how superspreading may have impacted overall transmission dynamics, and (ii) what the potential drivers of superspreading are. We addressed these questions by analyzing a dataset with individual-level spatial data (to the level of individual houses; Study Data). Such community-based surveillance data offer a unique window to study localized transmissions of EBOV and complement formal surveillance by detecting cases that did not interface with clinical care. In this work, we built an age-specific spatiotemporal framework, which allowed us to explicitly infer the probability distribution of the number of new cases generated by each infected individual (hereafter, offspring distribution). This framework was applied to the community-based EBOV case dataset and deployed to infer transmission dynamics and identify superspreaders. Specifically, we used Bayesian inferential techniques to synthesize individual-level spatial data (i.e., GPS coordinates), age data, symptom onset time, and burial time (Study Data), and to impute unobserved infection time and transmission network (Materials and Methods and SI Text).
Study Data
We analyzed a community-based dataset collected from the Safe and Dignified Burials program conducted by the International Federation of Red Cross, between October 20, 2014, and March 30, 2015, in Western Area (which comprises the capital Freetown and its surrounding area) in Sierra Leone. These data contain GPS locations (collected by mobile phones) of where the bodies of 200 dead who tested positive for Ebola were collected (typically at their homes). Age, sex, time of burial (which was usually performed within 24 h of death), and symptom onset time were also recorded. Symptom onset time was reported retrospectively by next of kin.
Results
Natural History Parameters.
We estimated that
Estimates of reproductive number. (A) Posterior distribution of the basic reproductive number,
The mean infectious period (i.e., duration from symptom onset to death/burial) was estimated to be 3.9 d [3.75, 4.0]. Because the transmission tree and times of infection were imputed (Materials and Methods), we were also able to infer the mean generation time of EBOV, which was estimated to be 10.9 d [9.25, 13.01]. Both estimates were lower than those estimated from cases detected within the clinical care system [e.g., mean infectious period 8 d estimated for patients who received clinical care (13) and mean generation time 15.3 d estimated by the WHO (1)]. These discrepancies potentially highlight systematic differences between community-based cases and cases notified in clinical care systems, with terminal community-based cases progressing significantly more rapidly.
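As a minimal sketch of how these two summaries could be read off a single imputed transmission tree, the following computes the mean infectious period (onset to burial) and the mean generation time (infector's infection to infectee's infection). The function names, the dictionary layout, and the toy values are illustrative assumptions, not the study's code or data; in the Bayesian setting described here, such summaries would be averaged over many posterior samples rather than taken from one tree.

```python
from statistics import mean


def mean_infectious_period(onset_day, burial_day):
    """Mean days from symptom onset to death/burial across cases."""
    return mean(burial_day[c] - onset_day[c] for c in onset_day)


def mean_generation_time(infection_day, infector):
    """Mean days between an infector's infection and each infection it caused.

    `infector[c]` is the case that infected c, or None for an imported case.
    """
    gaps = [infection_day[c] - infection_day[src]
            for c, src in infector.items() if src is not None]
    return mean(gaps)


# Hypothetical toy data (days since an arbitrary origin), not from the study.
infection_day = {"a": 0.0, "b": 9.0, "c": 12.0, "d": 21.0}
onset_day = {"a": 7.0, "b": 16.0, "c": 19.0, "d": 28.0}
burial_day = {"a": 11.0, "b": 20.0, "c": 22.0, "d": 32.0}
infector = {"a": None, "b": "a", "c": "a", "d": "b"}

print(mean_infectious_period(onset_day, burial_day))   # 3.75
print(mean_generation_time(infection_day, infector))   # 11.0
```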
Superspreading in Space and Time.
Fig. 2 A and B show a clear asymmetry in the average number of “offspring” at the individual level, quantifying the impact of superspreading. In particular, most secondary cases generated fewer than one offspring on average. Thus, the epidemic growth appeared to be fueled mostly by only a few superspreaders (i.e., the outliers in the boxplot). A common empirical measure of degree-of-transmission heterogeneity and superspreading is the dispersion parameter
(A) Spatial distribution of mean number of offspring resulting from initial cases at the individual level. An infection is classified as an index case if it has a posterior probability of importation (i.e., not infected by any cases in the data) >0.5; otherwise, it is classified as a secondary case. Lat, latitude; Lon, longitude. (B) Distribution of mean number of offspring by different sources of infection. (C) Proportion of infected individuals who are direct and indirect descendants of the first five superspreaders (i.e., first five individuals with highest number of mean offspring; note that the choice of five is arbitrary here). “Any” includes superspreaders who were also the index cases (i.e., the roots of transmission trees).
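The quantity in panel C can be illustrated with a short tree traversal. This is a sketch under simplifying assumptions: it works on one hypothetical transmission tree and ranks cases by their direct offspring in that tree, whereas the caption ranks by mean offspring across posterior samples. All names and the toy tree are illustrative.

```python
from collections import defaultdict


def children_of(infector):
    """Map each case to the list of cases it directly infected."""
    kids = defaultdict(list)
    for case, src in infector.items():
        if src is not None:
            kids[src].append(case)
    return kids


def descendants(case, kids):
    """All direct and indirect descendants of `case` in the tree."""
    found, stack = set(), list(kids.get(case, []))
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(kids.get(node, []))
    return found


def share_descended_from_top(infector, k):
    """Fraction of all cases descended from the k cases with most offspring."""
    kids = children_of(infector)
    top = sorted(infector, key=lambda c: len(kids.get(c, [])), reverse=True)[:k]
    covered = set()
    for c in top:
        covered |= descendants(c, kids)
    return len(covered) / len(infector)


# Hypothetical tree: infector[c] is who infected c (None = imported case).
tree = {"a": None, "b": "a", "c": "a", "d": "a", "e": "b", "f": None, "g": "f"}
print(share_descended_from_top(tree, k=1))  # 4 of 7 cases descend from "a"
```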
In Fig. 3A, we show the time dependence of superspreading, illustrating that superspreading becomes relatively more important over time (i.e., within
Spatial and temporal dependence of superspreading. (A) Reported weekly deaths and inferred mean offspring distributions and the corresponding empirical estimates of
Heterogeneity of Infectiousness by Age.
Although superspreading in EBOV was evident and may be partly attributed to unsafe burial practice during the early stage of the outbreak (14), other drivers (e.g., social contact pattern) of this process remain unclear. In Fig. 4A, as expected, the infectious period had a clear positive relationship with mean offspring number. Despite the clear relationship between infectious period and the magnitude of superspreading, this covariate cannot be used as a predictor of superspreading, because it is not known a priori. More importantly, there is a significant difference in instantaneous infectious hazard exerted by different age groups (Fig. 4B)
Heterogeneity of infectiousness in age. (A) Relation between mean offspring and infectious period. Note that here the infectious period of an individual refers strictly to the mean of the posterior samples of that individual's imputed infectious period, rather than to the assumed universal infectious period distribution. (B) Instantaneous risk exerted by different age groups.
Sensitivity Analysis.
Underreporting is a ubiquitous feature of epidemiological data (17, 18). In this section, we explore the effect of underreporting on our analysis under two probable scenarios: (i) All unreported cases were circulating in the community and not hospitalized; and (ii) all unreported cases were hospitalized and therefore not reported in our database. In both scenarios, we tested with constant underreporting rates, across the whole study period and region, ranging from a very low (10%) to a very high one (90%). Doing so allowed us to investigate the probable lower and upper bound of our estimates. We also tested with time-varying underreporting rates in both scenarios. Details of how to include underreported cases are provided in Materials and Methods.
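To give a sense of scale for the constant-rate scenarios, the arithmetic below converts an underreporting rate into an implied number of unreported cases. It assumes the rate is the fraction of all true cases that went unreported, which may differ from the exact definition used in Materials and Methods; the function name is hypothetical, and 200 is the number of reported cases given in Study Data.

```python
def unreported_count(reported, rate):
    """Implied number of unreported cases.

    Assumes `rate` is the fraction of ALL true cases that went unreported,
    so reported = (1 - rate) * total and unreported = rate * total.
    """
    if not 0 <= rate < 1:
        raise ValueError("rate must be in [0, 1)")
    return round(reported * rate / (1 - rate))


for rate in (0.1, 0.5, 0.9):
    print(rate, unreported_count(200, rate))
```

Under this reading, a 10% rate adds about 22 cases to the 200 reported, whereas a 90% rate implies roughly 1,800 unobserved cases, which is why the two ends of the range can act as rough bounds.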
We focused on investigating the effect on
Effect of constant underreporting rates on estimates of transmission dynamics. (A) Estimates of
Effect of time-varying underreporting on estimates of transmission dynamics. (A) Estimates of
Our model assumed an isotropic spatial dispersal (Materials and Methods). Spatial infectivity, however, may depend on the population density
Testing the assumption of an isotropic spatial dispersal. (A) The distribution of mean offspring under different scenarios. (B) The distribution of R0 under different scenarios. (C) The distribution of transmission distance under different scenarios. Here we considered three scenarios. In scenario 1 (base scenario), we assumed an isotropic dispersal and did not take into account the potential effect of population density. We considered in scenarios 2 and 3 that the dispersal kernel value was “moderated” by the relative population density of the
Population density and spatial distribution of the cases in the study area. Other than the smaller clusters near the center of the study area, most cases were found in more populated regions. It was noted that the raw grid resolution is
Testing alternative parameterizations of the incubation period
Testing alternative parameterizations of the infectious period
Testing alternative uninformative priors
Discussion
Superspreading is a core process for the transmission of many infections (7, 8). However, the importance of superspreading in driving epidemics varies with context. For instance, its impact depends on how it persists over the course of an epidemic. Quantifying superspreading and identifying scenarios where it is more likely to occur can facilitate refining future epidemic predictions and help in devising targeted intervention strategies that may outperform population-level control measures (9). To date, a systematic understanding of how EBOV has been (super)spreading in the recent outbreak in Western Africa is lacking, particularly in terms of individual-level covariates, and across the spatiotemporal setting. The key contributions of this work are to highlight and quantify the importance of superspreading and to show that it is in some senses systematic.
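One common way to quantify superspreading is the dispersion parameter mentioned in Results. As a rough illustration only, and assuming a negative binomial offspring distribution with mean m and variance m + m²/k (a standard parameterization, assumed here rather than taken from this text), k can be estimated by the method of moments. The function name and the counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_k(offspring_counts):
    """Method-of-moments estimate of the negative binomial dispersion k.

    Uses var = m + m**2 / k, so k = m**2 / (var - m). Returns infinity when
    the sample shows no overdispersion (var <= m), i.e., Poisson-like spread.
    """
    m = mean(offspring_counts)
    v = variance(offspring_counts)  # sample variance (n - 1 denominator)
    if v <= m:
        return float("inf")
    return m * m / (v - m)


# Hypothetical offspring counts: most cases infect nobody, one infects six.
counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 6]
print(dispersion_k(counts))  # about 0.39
```

Smaller values of k indicate greater heterogeneity; values below 1 correspond to a strongly skewed offspring distribution in which a few individuals account for most transmission.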
Community-based surveillance data offer a valuable opportunity to study superspreading, by focusing on nonhospitalized cases that may have been involved in superspreading events and not detected by formal surveillance. Here, we introduce a continuous-time spatiotemporal model that integrates individual spatial information with other epidemiological information of community-based cases and deploy it to quantify superspreading and its drivers for EBOV. Our framework enabled us to sample likely realizations of the unobserved transmission network among cases from which the offspring distribution of each case could be inferred, providing explicitly a machinery for understanding superspreading in space and time.
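The step from sampled transmission networks to per-case summaries can be sketched as follows. Given a collection of sampled trees (here a hypothetical list of dictionaries mapping each case to its infector, with None marking importation), the mean offspring number of a case is its average number of direct infectees across samples, and a case is treated as an index case when its importation probability exceeds 0.5, as in the Fig. 2 caption. The data layout and function names are assumptions for illustration.

```python
from collections import Counter


def mean_offspring(sampled_trees, cases):
    """Average number of direct infectees per case across sampled trees."""
    totals = Counter()
    for infector in sampled_trees:
        totals.update(src for src in infector.values() if src is not None)
    n = len(sampled_trees)
    return {c: totals[c] / n for c in cases}


def importation_probability(sampled_trees, case):
    """Fraction of sampled trees in which `case` has no infector in the data."""
    return sum(t[case] is None for t in sampled_trees) / len(sampled_trees)


# Hypothetical samples of the transmission tree for three cases.
trees = [
    {"a": None, "b": "a", "c": "a"},
    {"a": None, "b": "a", "c": "b"},
    {"a": None, "b": None, "c": "a"},
    {"a": None, "b": "a", "c": "a"},
]
offspring = mean_offspring(trees, ["a", "b", "c"])
print(offspring)  # {'a': 1.5, 'b': 0.25, 'c': 0.0}
print(importation_probability(trees, "b") > 0.5)  # False: a secondary case
```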
Our analysis is broadly consistent with previous work, indicating values of
We also extended previous analyses by showing that a substantial proportion of secondary cases were either direct or indirect descendants of a small number of superspreaders, underscoring the importance of superspreading in driving the epidemic
We reveal that age-dependent social contact structure may play an important role in (super)spreading EBOV in the local community. Specifically, our results identify age groups that have higher instantaneous transmissibility and show that cases in the more infectious age groups tend to be superspreaders when combined with a relatively long infectious duration. One plausible explanation, from the social perspective, may be that the young and old are much more likely to have (and infect) many visitors, compared with other age groups; a related possibility is that the young and old might be more likely to have others caring for them. Also, our results highlight systematic differences between community-based cases and cases notified in clinical care systems, with terminal community-based cases progressing significantly more rapidly. Our results stress the importance of characterizing superspreading of EBOV, enhance current understanding of its spatiotemporal dynamics, and highlight the potential importance of targeted control measures
Our results have limitations. First, although community-based surveillance data complement formal surveillance by detecting cases that did not interface with clinical care, they contain only partial information about the epidemic, with hospitalized cases omitted. Also, it is possible that, because some community cases who generated subsequent cases went unreported, certain reported cases may be falsely identified as sources of infection for those subsequent cases, overestimating the degree of superspreading. Accordingly, our sensitivity analysis evaluated the impact of these sources of underreporting, showing that our estimated degree of superspreading may in fact be conservative and represent a lower bound
Materials and Methods
Spatiotemporal Transmission Model.
We developed a continuous-time spatiotemporal transmission model that allowed us to sample the transmission tree among cases, integrating observed spatial and temporal individual data. This approach allowed us to infer explicitly the mean offspring distribution of each case. Specifically, the total probability of individual
Data Augmentation and Model Fit and Validation.
We estimated
Prior and posterior distributions of model parameters
Assessing the model fit. We used the estimated model to simulate (500 times) forward the transmission path and timings of events (i.e., infection time, onset time, and death time). (A) Comparison of the observed weekly temporal distribution of the cases with that summarized from the simulated data. Gray area represents the 95% C.I., and the black dots and line are the observed data, with 5 of 500 random realizations (colored lines) of the simulated epidemics superimposed. We compared the temporal autocorrelations (at lag = 1 and lag = 2) of the observed and simulated epidemics. We also compared the peak height, the growth rate before peak, and decay rate after peak between the observed and simulated epidemics (the growth and decay rates correspond to the slopes of best-fitted linear lines to the observed or simulated data). Dotted lines represent the values of the summary statistics corresponding to the observed data. (B) Comparison of the observed and simulated spatial autocorrelation. Here we used two common measures, Moran’s I and Geary’s C indices (33, 34), which range from −1 to 1 (a value close to 1 indicates strong clustering and a value close to −1 indicates strong dispersion). Dotted lines represent the values of the summary statistics corresponding to the observed data.
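For readers unfamiliar with the spatial summary statistic, a minimal pure-Python sketch of Moran's I is given below. It assumes binary spatial weights (1 for two distinct locations within a chosen radius, 0 otherwise); the weighting scheme, the observed variable, and all names are illustrative assumptions rather than the choices made in this analysis.

```python
from math import hypot


def morans_i(points, values, radius):
    """Moran's I with binary weights: w_ij = 1 if 0 < distance(i, j) <= radius."""
    n = len(values)
    mean_v = sum(values) / n
    dev = [v - mean_v for v in values]
    cross, w_sum = 0.0, 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
            if d <= radius:
                w_sum += 1.0
                cross += dev[i] * dev[j]
    denom = sum(x * x for x in dev)
    if w_sum == 0 or denom == 0:
        raise ValueError("need at least one neighbor pair and non-constant values")
    return (n / w_sum) * (cross / denom)


# Hypothetical locations on a line, one unit apart.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(morans_i(pts, [1, 1, 0, 0], radius=1.0))  # positive: like values cluster
print(morans_i(pts, [1, 0, 1, 0], radius=1.0))  # negative: values alternate
```

In a posterior predictive check of the kind described in the caption, the same statistic would be computed on the observed data and on each simulated epidemic, and the observed value compared against the simulated distribution.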
Checking the implementation of the inference procedures. We simulated 10 independent pseudodatasets from the model, with the model parameter values close to the posterior means obtained from fitting to the real dataset. The model was then fitted to each of the simulated datasets, and the resultant posterior distributions of the model parameters are shown. The true values of the model parameters are indicated by the red lines.
Testing Underreporting.
We divided the observational period into many 3-d-wide intervals. Within each time interval, we had the total number of unreported cases
SI Text
Likelihood Function.
Let
MCMC Algorithm.
Parameters in
Denote
Note that the background infection can be accommodated by adding a permanent infectious source presenting an additional challenge of strength
Acknowledgments
This work was supported by Bill & Melinda Gates Foundation Grant OPP1091919; the RAPIDD program of the Science and Technology Directorate Department of Homeland Security and the Fogarty International Center, National Institutes of Health; and the UK Medical Research Council (MRC). S.F. was also supported by MRC Career Award in Biostatistics MR/K021680/1.
Footnotes
- ¹To whom correspondence should be addressed. Email: msylau{at}princeton.edu.
Author contributions: M.S.Y.L. designed research; M.S.Y.L., B.D.D., and B.T.G. performed research; M.S.Y.L. analyzed data; and M.S.Y.L., B.D.D., S.F., A.M., A.T., S.R., C.J.E.M., and B.T.G. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1614595114/-/DCSupplemental.
Freely available online through the PNAS open access option.
References
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵.
- Weitz JS,
- Dushoff J
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵.
- Geoghegan JL,
- Senior AM,
- Di Giallonardo F,
- Holmes EC
- ↵.
- Doyle TJ,
- Glynn MK,
- Groseclose SL
- ↵
- ↵.
- Viboud C, et al.
- ↵.
- Yang W, et al.
- ↵
- ↵
- ↵
- ↵
- ↵.
- Haydon DT, et al.
- ↵.
- Cottam EM, et al.
- ↵
- ↵
- ↵.
- Lau MS,
- Marion G,
- Streftaris G,
- Gibson G
- ↵.
- Gibson GJ,
- Renshaw E
- ↵
- ↵
- ↵.
- Getis A
- ↵.
- Lau MSY,
- Marion G,
- Streftaris G,
- Gibson GJ
Article Classifications
- Biological Sciences
- Medical Sciences