Impact of human mobility on the emergence of dengue epidemics in Pakistan
- aDepartment of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA 02115;
- bCenter for Communicable Disease Dynamics, Harvard T. H. Chan School of Public Health, Boston, MA 02115;
- cTelenor Research, Telenor Group, N-1360 Fornebu, Norway;
- dOxford University Clinical Research Unit, Wellcome Trust Major Overseas Programme, Ho Chi Minh City, Vietnam;
- eCentre for Tropical Medicine, Nuffield Department of Clinical Medicine, University of Oxford, Oxford OX3 7FZ, United Kingdom;
- fDivision of Vector-Borne Diseases, Centers for Disease Control, San Juan, Puerto Rico 00920;
- gDepartment of Zoology, University of Peshawar, Peshawar 25120, Pakistan
See allHide authors and affiliations
Edited by Burton H. Singer, University of Florida, Gainesville, FL, and approved August 6, 2015 (received for review April 2, 2015)

Significance
Dengue virus has rapidly spread into new human populations due to human travel and changing suitability for the mosquito vector, causing severe febrile illness and significant mortality. Accurate predictive models identifying changing vulnerability to dengue outbreaks are necessary for epidemic preparedness and containment of the virus. Here we show that an epidemiological model of dengue transmission in travelers, based on mobility data from ∼40 million mobile phone subscribers and climatic information, predicts the geographic spread and timing of epidemics throughout the country. We generate fine-scale dynamic risk maps with direct application to dengue containment and epidemic preparedness.
Abstract
The recent emergence of dengue viruses into new susceptible human populations throughout Asia and the Middle East, driven in part by human travel on both local and global scales, represents a significant global health risk, particularly in areas with changing climatic suitability for the mosquito vector. In Pakistan, dengue has been endemic for decades in the southern port city of Karachi, but large epidemics in the northeast have emerged only since 2011. Pakistan is therefore representative of many countries on the verge of countrywide endemic dengue transmission, where prevention, surveillance, and preparedness are key priorities in previously dengue-free regions. We analyze spatially explicit dengue case data from a large outbreak in Pakistan in 2013 and compare the dynamics of the epidemic to an epidemiological model of dengue virus transmission based on climate and mobility data from ∼40 million mobile phone subscribers. We find that mobile phone-based mobility estimates predict the geographic spread and timing of epidemics in both recently epidemic and emerging locations. We combine transmission suitability maps with estimates of seasonal dengue virus importation to generate fine-scale dynamic risk maps with direct application to dengue containment and epidemic preparedness.
Dengue is the most rapidly spreading mosquito-borne disease worldwide (1, 2). Half the global population now lives in at-risk regions for dengue virus transmission, due to the wide distribution of the mosquito vector, Aedes aegypti, which thrives in peri-urban areas and transmits the virus between humans (3). Dengue virus can cause acute febrile illness and carries the risk of severe disease, hospitalization, and shock syndrome, especially in clinical settings with little experience treating dengue patients. There is currently no specific therapeutic protocol for, or vaccine against, infection (1). Current control measures focus on vector control, although these measures are often logistically difficult and have shown varying efficacy in controlling epidemics (4). In the absence of effective prevention and treatment, public health system preparedness remains the single most important tool for minimizing morbidity and mortality as dengue epidemics spread beyond endemic areas (5, 6).
The introduction of dengue into new populations is mediated by travel of infected individuals to areas that can support transmission, because mosquito vectors move only short distances during their lifespans (3, 7⇓⇓⇓⇓–12). International travel to endemic countries has resulted in imported cases and outbreaks in Europe and the Americas (2, 8, 10, 13). Local variation in transmission, within a single city for example, is also driven by mobility patterns of individuals on short timescales (7). Forecasting methods are needed to spatially target interventions and epidemic preparedness measures that reflect both the changing temporal risks of importation and environmental suitability that go beyond solely climate-based methods (14).
Dengue has long been endemic in most Southeast Asian countries (1), but has more recently emerged in parts of the Middle East and South Asia, including Pakistan (15, 16). In Pakistan, the transmission of dengue viruses was largely confined to the southern city of Karachi until 2011 when a large dengue epidemic with over 20,000 cases occurred in the northeastern city of Lahore (16), causing significant morbidity and mortality. In 2013, a second large epidemic occurred in northeastern Pakistan in Punjab and Khyber-Pakhtunkhwa (KP) provinces, establishing the region as an emerging focus of seasonal dengue epidemics. It has been hypothesized that the recent geographic expansion of A. aegypti mosquito vectors, changing environmental suitability, and human importation of dengue from endemic regions all contributed to the emergence of dengue in northern areas (17). Pakistan is therefore representative of many countries that are on the verge of countrywide endemic dengue transmission and are struggling to contain its emergence into previously dengue-free regions.
Measuring changing risks of importation events that spark epidemics has been extremely challenging on the refined temporal and spatial scales necessary to inform local policies (18). Being able to predict when to prepare surveillance systems and health facilities for dengue outbreaks could dramatically reduce the morbidity and mortality associated with epidemics and would allow policy makers to pinpoint regions that are particularly vulnerable to imported cases, for vector control. Mobile phone data offer direct measures of human aggregation and movement and represent a unique source of information on the human determinants of the geographic expansion of emerging epidemic diseases like dengue. Here, we conduct a retrospective epidemiological analysis of large dengue outbreaks in Pakistan in 2013, to examine the predictive ability of an epidemiological model that integrates human mobility from the largest mobile phone dataset analyzed to date with climate information. We show that within-country human mobility predicts emerging epidemics in Pakistan, and epidemiological models incorporating this type of data can predict the spatial extent and timing of outbreaks, providing a new approach to forecasting.
Results
Human Mobility in Pakistan Does Not Conform to Standard Model Predictions.
To measure human travel patterns underlying the spread of dengue virus across Pakistan in 2013, we estimated the mobility of 39,785,786 mobile phone subscribers between geolocated mobile phone towers in Pakistan between June 1 and December 31 of 2013 (representing ∼22% of the population; Materials and Methods). Daily locations and movements were aggregated to measure travel between 356 small, politically defined areas called tehsils (Fig. 1A and Materials and Methods). We compared our data to gravity models of mobility, developed from transportation theory and commonly used to parameterize infectious disease frameworks, to assess whether observed mobility measured using mobile phone data significantly improves upon this standard approach. We have focused on intracountry mobility patterns (Discussion).
Human mobility dynamics in Pakistan. (A) Population density (red, high density; yellow, low density) and mobile phone tower coverage from the mobile phone operator in Pakistan (colored in gray) per tehsil. (B) The top routes of travel between pairs of tehsils in Pakistan. A line is drawn if at least 20,000 trips occurred between the origin and destination between June and December 2013. The top routes occur between Karachi and cities in northern Punjab province, particularly Lahore tehsil. (C) Relative direction and volume of travel. For each trip, we calculated the distance traveled from the origin and the destination. The origin location was centered at 0,0 and the longitude distance and latitude distance to the destination are shown. Although many trips occurred over short distances, a substantial amount of travel occurred between the southeastern and northern parts of the country, reflecting the geography and population distribution of Pakistan.
The sampled population was extremely mobile. We estimated that between 2.4 million and 4.8 million subscribers traveled between tehsils each day (95% quantile interval: 3.1–4.6 million; details in Materials and Methods). Most travel followed a NW–SE corridor along the major highways (Fig. 1B). Large volumes of travel occurred to and from Karachi, a major population and economic hub of Pakistan, with ∼710,000 subscribers traveling to or from the city each day on average (95% quantile interval: 570,000–813,000; Fig. S1). In contrast to expectations of standard mobility models, there was almost no decay in travel with increasing distance (correlation coefficient: −0.064, P < 0.001), although the most frequent destinations for travel were often in a nearby tehsil (Fig. 1C). This pattern reflects the topography, road infrastructure, and population distribution in Pakistan, with the largest cities outside Karachi being located in the northern part of Punjab province that includes the Rawalpindi/Islamabad metropolitan area, Lahore, and Faisalabad. Although there was a decrease in overall movement during Ramadan, we did not observe a systematic difference in the amount or direction of travel patterns in the weeks before and after the holiday (Fig. S1).
The mobility values derived from the mobile phone data. (A) The mobility estimates for travel from Karachi to Lahore. Using the mobile phone data, we calculated both the number of trips between tehsils (A, Top Left) and the daily number of active subscribers: flux value (A, Top Right). The number of trips was normalized by the daily number of active subscribers (A, Bottom Left) that was used to estimate the importation of infected travelers from Karachi. Our dataset runs from June to December 2013 and we inferred the mobility estimates (gray) for January–June of that year (Materials and Methods) (A, Bottom Right). (B) The most frequently traveled routes based on the mobile phone data. We compared the most traveled routes (top 10,000 routes) 1 mo before Ramadan (B, Left) (June), during Ramadan (July) (B, Center), and after Ramadan (B, Right) (September). The top routes remain consistent throughout the major holiday, in particular the large amount of travel between Karachi and Punjab province. A line is drawn if more than 10,000 trips were taken over a route. (C) Country-wide population flux values per day. Total population flux per day was measured (red line). Ramadan is shown in gray. We have adjusted the mobile phone data (CDRs) (black line shows adjustment) to account for a sharp drop in the raw CDRs after day 145 and fitted the adjusted data [spline, blue line; moving average (MA) trend, purple line], using two common time series methods (a MA and smoothing spline) to smooth the data. For the remaining analysis, we use these trend lines from these time series smoothing methods.
Epidemiological Modeling of Dengue Epidemics in 2013.
There were 15,535 reported dengue cases in 82 tehsils over 7 mo in Pakistan in 2013 (Fig. 2A and Materials and Methods). Peak timing of the epidemic varied by location, and the majority of cases occurred in and around Karachi, in the northern district of Swat, and in the cities of Lahore and Rawalpindi (Fig. 2A and Table S1). About half of the dengue cases reported occurred in the Mingora area of Swat (KP province), marking the first major outbreak in the region [n = 7,950, compared with a previous maximum of 300 cases reported in KP in 2011 (16)].
Dengue epidemiology in Pakistan in 2013. (A) The location of dengue cases throughout Pakistan and the number of cases per week by tehsil. Tehsils that reported at least 15 cases are shown on the map with corresponding color shown in the time series. The majority of cases were reported in Karachi (gray), Lahore (blue), and Mingora (orange). The dengue season in the entire country lasted 35 wk, with the first reported case in Karachi during week 18 (end of April). (B) The reported cases (red), temperature (blue), and model fit (black) for Karachi are shown. Using the case and temperature data, the human and vector population dynamics were modeled (Materials and Methods).
The timing and number of reported dengue cases per tehsil in 2013 used in the analysis
We first fitted an ento-epidemiological model to the reported dengue cases in Southern Pakistan, where transmission occurs year round, and to case data in Lahore and Swat in northern Pakistan, where transmission is seasonal due to climatic variation. Southern Pakistan has year-round climatic suitability for dengue vectors and is therefore the most likely source of exported cases to other parts of the country, where greater seasonal temperature extremes limit suitability (16) (SI Text). Although importation from international regions is technically a possibility, given the fairly restrictive political borders we assume here that importation will be negligible compared with the within-country importation rate. We fitted an ento-epidemiological model of dengue dynamics in Karachi (Fig. 2B), with vector dynamics being determined by temperature (19). We fitted the mosquito-biting rate (
In northern Pakistan, we used our epidemiological framework to estimate the timing of the introduction of the first case that sparked the epidemics in each area, to compare these estimates with our mechanistic model of imported infections using mobile phone data. We performed a sensitivity analysis to estimate an expected range of dates for the introduction of dengue to Lahore and Mingora, where most cases were concentrated (Materials and Methods and Fig. 3 A and B). Interestingly, Lahore had suffered its first major outbreak 2 y before this epidemic, and we expect that immunity may have played a significant role in determining transmission dynamics (16). Mingora, on the other hand, represents an effectively naive population. In the absence of serotype information, we took the simplest approach and assumed each population was immunologically naive; however, we hypothesize that the delay between the first cases and the peak of the epidemic in Lahore may have been caused by immunity from the 2011 outbreak. We estimated that the first case was introduced to Lahore during the second week of May (between days 124 and 130; SI Text and Fig. 3A) a few days earlier than the first reported case (day 133). In Mingora, on the other hand, we estimated that the first introductions likely occurred in August (between days 202 and 231) (Fig. 3B), a few weeks before the first reported case in the city.
Mobility estimates derived from mobile phone data predict the timing of introduced cases around the country that spark epidemics. (A and B) The estimated introduced cases from Karachi to (A) Lahore (total dengue cases: 1,538) and (B) Mingora (total dengue cases: 4,029). The estimated introductions (assuming 30% of individuals travel, a 2% reporting rate, and a probability of 0.01) from the mobile phone data (boxplot in blue), the diffusion model (boxplot in green), actual case data (red), and estimated dengue suitability (gray) are shown. Dengue suitability was defined based on temperature and relative humidity, using a measure that is linearly proportional to vectoral capacity (Materials and Methods). Values near zero are unsuitable for dengue transmission. For Lahore and Mingora, the estimated introduction from the case data alone is shown (red cross-hatched box). In all instances, the mobile phone data were able to predict the timing of the first introduced case in each tehsil. An arrow indicates the week of the first reported case in each tehsil.
Models of Dengue Virus Importation Based on Mobile Phone Data Accurately Predict the Spatial Extent and Timing of Epidemics.
We next modeled the number of infected individuals traveling from the endemic areas in southern Pakistan to all other tehsils, using different approaches to characterize mobility: direct observations from the mobile phone data and various modified gravity models of travel (Materials and Methods) (10, 19). To compare the performance of the mobile phone data against the next best alternative, we used a parameter-free gravity model (referred to as the diffusion model) that is equivalent to a population-weighted spatial diffusion model (Materials and Methods) based on the travel time distance between the origin and the destination. In addition to these models, we fitted a gravity model to the mobile phone data, to determine whether simple adjustments would significantly improve our predictions (SI Text and Fig. S2).
The various distance measures and gravity model fits to the mobile phone data. We calculated the Euclidean distance (kilometers), road distance (gray, in kilometers), and travel time distance (red, in minutes) between tehsil centroids. (A) The three measures were highly correlated (Pearson’s correlation coefficient between Euclidean distance and road distance, 0.988; between Euclidean distance and travel time distance, 0.99). We then fitted gravity models to the mobile phone data, using each distance measure: (B) Euclidean distance, (C) road distance, and (D) travel time distance. Shown is the normalized predicted amount of travel from each gravity model compared to the mobile phone data.
We compared the timing of predicted importation events from endemic areas in southern Pakistan, based on our mobility model, to the estimated first dengue case inferred from case report data from Lahore and Mingora (SI Text). The timing of the first dengue case estimated from the epidemiological data overlapped well with the predicted introductions from southern Pakistan to Lahore from the importation model (Fig. 3A), with the first introductions occurring approximately 1 mo earlier. In Mingora, the predicted timing of imported infections using our model occurred 2 wk before the first reported case, consistent with the serial interval for dengue and the estimates of the first case from epidemiological data (Fig. 3B). Crucially, the diffusion model does not predict any introductions from endemic areas in southern Pakistan to Mingora. Thus, travel patterns measured using mobile phone data predict introduction events consistent with outbreaks in both emerging (Lahore) and previously dengue-free (Mingora) regions. Our ability to measure these importations in more remote places like Mingora was somewhat sensitive to the Karachi model fit, in particular the reporting rate, although the mobile phone data are always able to predict earlier, more frequent, and more accurate introductions than the diffusion model (SI Text). The modeled interaction between seasonal variations in vectorial capacity and the dynamics of importation events provide accurate predictions about the location and timing of epidemics in different epidemiological settings and regions of the country.
We combined our estimates of imported cases with an index of climatic suitability for dengue vectors (19) across Pakistan to predict the potential for spread of the virus in areas with introduced cases and assessed how well the spatiotemporal dynamics of the epidemic were captured. In general, the spatial extent and epidemic timing of cases were predicted more accurately using mobile phone data (Fig. 4) compared with the diffusion models, which both predicted early spread along the major highway connecting Karachi to other highly populated areas.
The estimated spatial spread of imported dengue. Using the modeled dengue dynamics in Karachi and mobility measured from the mobile phone data or a diffusion model, we estimated the time of the first introduced case to the rest of the country. The mobile phone data predict the earliest introductions in eastern Pakistan near Lahore and inland toward Swat Valley (Mingora). In comparison, the mobility model predicts early introductions in southern Pakistan with few introductions in Mingora. These differences are highlighted in the difference in predictions plot—the number of days earlier (red) from the mobile phone predictions or earlier (yellow) from the diffusion model (without the mobile phone data).
A New Approach to Dynamic Risk Mapping for Epidemic Preparedness.
We next constructed risk maps to identify areas of the country that were vulnerable to epidemics due to a combination of seasonally varying climatic suitability based on temperature and relative humidity (vectorial capacity index) for dengue transmission and seasonal patterns of importation of dengue virus from southern Pakistan. By combining the daily suitability index (Fig. 5A) with estimated introductions of infected travelers from endemic areas we created a composite epidemic risk measure that varied in space and time and showed a distinctively different pattern of risk than a climate-based suitability map alone (Fig. 5B). In southern Pakistan outside of Karachi, for example, a low risk of epidemic dengue results from low numbers of infected travelers visiting the area. Along the Afghanistan border, on the other hand, low risk of epidemics is due to temperature extremes that prevent continuous dengue transmission. The highest-risk regions are in Punjab and KP provinces in the northeast of the country, where a combination of frequent introductions from Karachi and long periods of sustained climatic suitability for transmission create the necessary conditions for outbreaks.
Dynamic risk mapping for dengue epidemics in Pakistan. (A) Average dengue suitability in Pakistan during 2013 based on temperature and humidity. The southern part of the country is the most suitable for dengue (high values are shown in red). (B) Epidemic risk, measured as a composite of vector suitability and frequency of introductions from endemic areas in southern Pakistan. Places in red are the most at risk whereas those in blue have low risk of epidemics. (C) Variation in the timing of dengue risk by location (per tehsil). A box is drawn if suitability multiplied by estimated introduction events is greater than 0, indicating a nonnegative risk. Each line shown is a tehsil, with the y axis corresponding to the ranked distance from Karachi. The dengue risk is shown using the mobile phone data (blue) or the diffusion model (green) to estimate introduction events from Karachi. In general, the mobile phone data predict earlier risk to tehsils farther away from Karachi, in particular tehsils in Punjab (Faisalabad and Lahore, for example), than the diffusion model data. The diffusion model predicts the earliest risk to nearby tehsils such as Malir.
The timing of epidemic risk varied considerably in different tehsils (Fig. 5C), with the risk of outbreaks varying by up to 4 mo in different regions. Based on the mobile phone data, epidemic risk occurred much earlier in tehsils in northern regions relatively far from Karachi, in contrast to the diffusion model. Because preparation of local health facilities and surveillance for dengue cases are key components of national response to outbreaks, this variation in timing has important implications for decision making. In particular, given the 2-wk serial interval of dengue (20), mobile phone and disease surveillance data could be used to produce dynamic risk maps and provide location-specific predictions about future risk of epidemics in near real time, giving policy makers time to prepare.
Discussion
For emerging epidemic diseases like dengue, changing environments and dynamic population movements create conditions for geographic expansion of pathogens into naive populations. Predicting the changing patterns of risk due to these emerging public health threats will be critical to containing their spread and preparing for outbreaks. Mobile phone data provide dynamic population mobility estimates that can be combined with infectious disease surveillance data and seasonally varying environmental data to map these changing patterns of vulnerability in a country where dengue outbreaks are emerging and irregular in many regions. Because these data are continuously being collected by mobile phone operators, these methods could be integrated into national control programs in near real time.
A major limitation of mobile phone data generated by national operators is the difficulty in capturing cross-border travel patterns. International travel may be less relevant for dengue dynamics in Pakistan than in other dengue-endemic countries, however. Less than 2% of the entire Pakistan population are migrants (where their country of origin is not Pakistan) (21). More than half of these migrants are from Afghanistan (2.3 million) (21). Although Pakistan borders four countries (India, Afghanistan, China, and Iran), all aside from India are not endemic for dengue near the Pakistan borders, making importations of dengue cases to Pakistan unlikely (3, 22⇓–24). Estimates of travel between India and Pakistan are difficult to obtain, but travel restrictions between the countries and a limited number of international flights may make travel difficult. Another potential issue with this type of data is the sampling frame, because mobile phone ownership may not represent a representative sample of the population, although we have previously found in an African context that mobility estimates from this kind of data were not significantly affected by ownership bias (25). We expect that the substantial market share and widespread coverage of the mobile phone operator make the possibility of systematic skew in our estimates unlikely, but this cannot be ruled out.
Uncertainties are inherent to any epidemiological model of dengue transmission, in particular regarding the large asymptomatic reservoir and the multiple circulating serotypes of the virus, which impact the ability to estimate both the fraction of the population that is susceptible and the reporting rate accurately (26, 27). In the absence of serotype information, for example, we chose not to include immunological covariates in our model. As Pakistan moves toward an endemic transmission setting in the northeast of the country, however, serotype data will be increasingly important to determine how strain dynamics are likely to affect the likelihood of outbreaks (28). Indeed, we believe that the substantial lag time in Lahore we observed was probably due to the recent large outbreak in 2011, which may have generated substantial immunity and delayed the 2013 epidemic (16, 29). These uncertainties will continue to make forecasting challenging, although serotype data could improve model accuracy.
Given these difficulties, forecasting dengue epidemics will always need to encompass substantial stochastic variation, despite its importance for targeting limited public health resources. Currently, vector control programs in Pakistan begin during the monsoon season uniformly across the country. We believe that the large estimated lead times this approach offers could aid control managers, providing an early warning system. This approach provides policy-relevant, real-time information about where and when to expect dengue epidemics and therefore how to effectively target interventions, surveillance, and clinical response.
Materials and Methods
Population Data.
Pakistan’s large population (182 million) is broadly divided into one capital territory, Federally Administered Tribal Area (FATA), Gilgit-Baltistan, Azad Jammu and Kashmir, and four provinces, which are further subdivided into 388 tehsils (equivalent to administrative unit 3 from 2008; Fig. 1A). Karachi is the most populated city in the country and located along the southern coast. The majority of the population in southern Pakistan lives along the Indus river, whereas in the northern half the population lives in an arc between Faisalabad and Peshawar that includes the major population centers of Lahore, Islamabad, and Rawalpindi. We used population data obtained from worldpop.org.uk.
Dengue Data.
Data for Punjab province were collected by the Provincial Health Department, whereas the District Health Office collected case data for Swat District (Table S1) daily. All public and private hospitals, health clinics, and laboratories reported any case of a patient presenting with dengue symptoms. The data listed each patient who presented to a hospital or clinic, regardless of whether the patient was admitted in the province or district, seeking treatment for high fever, body aches, petechiae, and low platelet counts (with a cutoff value for thrombocytopenia of <50,000/mm3). These cases were then confirmed by using IgM or NS 1-Ag enzyme-linked immunosorbent assay (ELISA). The data reported from Karachi were based on official public releases from the Sindh Health department. All case data were deidentified and aggregated to the tehsil level.
Mobile Phone Data.
We analyzed all voice-based, originated, call data records (CDRs) from 39,785,786 subscriber SIMs (security information management) over a 7-mo period, from June 1 to December 31, 2013. The mobile operator has the largest coverage of tehsil headquarters (352 in the dataset of 388 total tehsils) across Pakistan, particularly in rural areas (two-thirds of the network). To comply with national laws and regulations of Pakistan and the privacy policy of the Telenor group, the following measures were implemented to preserve the privacy rights of Telenor Pakistan’s customers: (i) The CDR/mobility data were processed on a backup and recovery server made available by Telenor Pakistan. Only Telenor employees have access to the detailed CDR/mobility data. (ii) Given the server arrangements, no detailed CDR/mobility data were taken out of Pakistan or left the premises of Telenor Pakistan. (iii) The processing of the detailed CDR/mobility data resulted in aggregations of the data on a tower-level granularity that was accessed only by Telenor employees. Further spatial aggregations to the level of the tehsil were made available to the remaining coauthors.
On average, 28 million subscriber SIMs were recorded as active on a given day, and of these 15.2 million subscribers generated outgoing, voice-event CDRs that encoded location information. At the time of data acquisition, the mobile phone operator had approximately a 25% market share (22% of the population) and was the second largest provider of mobile telecommunication services in Pakistan. Multi-SIM activity is common in Pakistan, but we believe that this should not create a systematic bias in mobility estimates because the geographic coverage of the operator is so extensive.
Quantifying Travel Using the Mobile Phone Data.
Every caller was assigned to his or her most frequently used base station/mobile phone tower on a given day, as in previous studies (30). For a given location, defined through the location of a base station, the flux of that location is defined through the number of active callers assigned to the base station and was then aggregated to the tehsil level. On an average day, ∼12% of the total population of Pakistan made a call. We measured daily travel between mobile phone towers relative to subscriber location on the previous day. Trips were aggregated to each tehsil based on the location of the origin and destination tower. We normalized trip counts by the origin tehsils’ number of active subscribers on each day (SI Text and Fig. S1). Aggregated forms of these mobility patterns can be made available upon request.
On a country-wide scale there were two significant decreases in the recorded CDR activity, one occurring during Ramadan (July 9 to August 7, 2013), reflecting less subscriber activity (Fig. S1). The decrease in the use of mobile communication services during Ramadan has been confirmed by the mobile operator as an expected effect, and this effect is seen every year. The other decrease in activity was after October 25, due to a major system upgrade to the core mobile system infrastructure, which impacted all of the customers of the mobile operator. In the collected dataset, this led to a drop in the number of location-generating subscribers recorded per day of ∼3.38 million. We adjusted the time series under the assumption that the system upgrade was a 1-d event only, based on expert opinion from the operator, and that the population in the customer base did not change its overall behavior. The flux values were adjusted after this date, assuming that the average would remain the same as in the beginning of the dataset (Fig. S1). To analyze the relationship between mobility and dengue dynamics, we approximated travel patterns between January 1 and June 1. We simulated travel, assuming that the mean number of normalized trips (normalized by flux) remained the same, and added noise [
Climate Data.
A. aegypti entomological and dengue viral factors are highly influenced by temperature (19). Daily mean temperature and total precipitation were recorded at 39 weather stations across Pakistan, obtained from the National Oceanic and Atmospheric Administration National Climate Data Center (Fig. S3). Temperatures peak in the middle of the year, June, July, and August; are lowest in January and December; and are variable across the country (daily averages between 9 °C and 27 °C) (for example, Fig. S3). We converted temperature to dew-point temperature based on the Numerical Terradynamic Simulation Group proposed model at the University of Montana (31). Dew-point values and temperature were then converted to relative humidity.
The temperature data from weather stations in Pakistan. (A) The location of weather stations in Pakistan. (B) Daily temperature (in Celsius) from selected weather stations in Pakistan. (C) The temperature-dependent parameters used in the ento-epidemiological framework for Karachi.
Ento-Epidemiological Framework.
To model dengue dynamics, we used an ordinary differential equation model based on a model by Lourenco and Recker to describe a dengue outbreak in Madeira, Portugal (10). This model captures the dynamics of dengue between human and mosquito hosts where the mosquito dynamics are dependent on temperature and relative humidity (SI Text and Fig. S3). Here we assume that individuals can be infected only once and we do not consider multiple serotypes. We assigned temperature-dependent epidemiological variables as in Lourenco and Recker (SI Text and Fig. S3) (10).
The relationship between relative humidity and dengue suitability has been explored in a number of environments (3, 32). We have added an additional variable to the temperature-dependent epidemiological variable, the mortality factor that is based on relative humidity (33). In contrast to ref. 10, we analyzed the reporting rate
Sensitivity analysis on the carrying capacity (K) fitted to the biting rate (a) for each value of K, using a maximum-likelihood framework. The case data for Karachi are shown as points.
Importation of Infected Travelers from Endemic Areas in Southern Pakistan.
To estimate the number and role of importation of cases from endemic areas in southern Pakistan (Karachi) to all other tehsils, we first modeled the dengue dynamics in Karachi, using the ento-epidemiological framework described above. We are focused only on the role of travel within Pakistan as opposed to cross-border migration (Discussion). Dengue is endemic in Karachi (16), due to year-round climatic suitability and availability of vectors, and thus we assumed the date of the first introduced was day 1 (
We then estimated the flow of infected travelers from Karachi. We first estimated the number of infected travelers per day who have left Karachi (
Sensitivity analysis of estimated introductions to tehsils based on the CDR varying the percentage that travel or the probability of introduction. We analyzed the estimated introductions, using the mobile phone data to quantify travel from Karachi to other tehsils. We varied the percentage of travel from Karachi (Left) and the probability of introduction (Right).
Here, the amount of travel between two locations,
The raw estimates of infected travelers from Karachi to all other tehsils per day represent an upper limit. To account for variation in epidemiological and individual factors, such as the within-host viral dynamics, we assumed that a smaller subset of this upper bound of infected visitors could contribute to transmission. We sampled the actual number of effective infected travelers from a binomial distribution with a fixed probability [
Mobility models derived from mobile phone data predict the timing of introduced cases around the country that spark epidemics. Shown are the estimated introduced cases from Karachi to all tehsils that have reported at least 15 cases. The estimated introductions from the mobile phone data (blue), the diffusion model (orange), the gravity model (green), and actual case data (red) are shown. Introductions are estimated assuming that 30% of individuals travel consistent with the mobile phone data (prob = 0.01). If there were no estimated introductions from both the mobile phone data and the diffusion model, no boxes are drawn. In the majority of the KPK tehsils, both the mobile phone data and the diffusion model did not predict any introductions, likely because these cases were associated with the initial outbreak in Mingora (Alai–Peshawar). However, in the Punjab tehsils (Faisalabad, Ferozawala, Rawalpindi, and Sheikhupura) the mobile phone data consistently predicted an earlier introduction than the diffusion model and before the date of the first reported case.
The estimated introductions for Mingora/Babuzai tehsil and Lahore while varying parameters. (A and B) The estimated introductions for Mingora/Babuzai and Lahore assuming various reporting rates for Karachi. Shown are the estimated introduced cases from Karachi to Mingora and Lahore (assuming 30% of individuals travel and a probability of 0.01) with various reporting rates: 2%, 3%, 5%, and 10%. Introduced cases were estimated using the mobile phone data (blue), a gravity model fitted to the mobile phone data (orange), or a diffusion model (green). The mobile phone data consistently predict earlier introductions than either the gravity model or the diffusion model, with the diffusion model predicting the latest introductions of the three methods. In particular, the diffusion model does not report introductions into Mingora and for high reporting rates (10%), there are no predicted introductions into this tehsil. (C and D) The estimated introductions for Mingora/Babuzai and Lahore, assuming various percentages of individuals travel from Karachi. Shown are the estimated introduced cases from Karachi to Mingora and Lahore (assuming a 2% reporting rate and a probability of 0.01) with various percentages of individuals traveling from Karachi: 10%, 20%, and 30%. Unsurprisingly, as the percentage of individuals who travel increases, the number of importations to both Mingora and Lahore increases. The timing of these introductions also increases as the percentage of individuals who travel from Karachi increases. (E and F) The estimated introductions for Mingora and Lahore, assuming various probabilities. Shown are the estimated introduced cases from Karachi to Mingora and Lahore (assuming 30% of individuals travel and a 2% reporting rate) with various probabilities: 0.001, 0.01, 0.1, and 0.9 (Materials and Methods).
The estimated introductions to Lahore. Using the mobile phone data, the number of introductions from (A) Karachi and (B) Mingora to Lahore are shown. As in Fig. 3, we estimated the introduced cases from Karachi or Mingora (Swat) to Lahore. (C) A comparison of the median number of introductions per week.
Environmental Suitability and Epidemic Risk.
We defined dengue suitability as a function of temperature
We created a composite measure of environmental suitability and introduction of infected travelers from Karachi,
SI Text
Mobility Dynamics Derived from Mobile Phone Data
As described in Materials and Methods, for all pairs of consecutive days over the timeframe of the data, we calculated the number of trips between all pairs of tehsils. This created a time series of travel between tehsils. We normalized this time series by the number of daily active subscribers in each origin tehsil (flux value) (Fig. S1).
Dengue in Pakistan
The first confirmed case of dengue in Pakistan occurred in 1994 (16) in Karachi. Before 2008, the majority of confirmed cases were confined to Karachi. Since then, there have been cases reported in a number of provinces with the most severe outbreak occurring in 2011 with over 20,000 confirmed cases, the majority of these in the city of Lahore in Punjab province.
The dengue season and peak have varied by year, although the peak season often occurs in the fall (October–November). A comprehensive countrywide seroprevalence survey has not been conducted in Pakistan, although smaller studies have been conducted in Karachi. In a study from 1994, DEN-1 and DEN-2 were found in Karachi (36). In a later study, authors found that DEN-2 (56.2%) was slightly more prevalent than DEN-3 (43.8%) (16) based on serum samples collected at a tertiary care hospital in Karachi from the 2006 outbreak. Based on samples collected during the 2011 outbreak in Punjab, researchers found 100% prevalence of DEN-2 in dengue-positive samples (495 total samples) (36). In the same study, 26 of these samples had concurrent infection with DEN-2 and DEN-3 with no evidence of DEN-1 or DEN-4.
Gravity Model
We also fitted a gravity model (three parameters) to the average number of trips from Karachi and all other tehsils per day (Fig. S2):
Converting Temperature to Relative Humidity
Relative humidity data are not regularly collected at the 39 weather stations in our dataset (see Fig. S3 for the location of weather stations). We converted temperature to relative humidity by first converting temperature to dew point (
Ento-Epidemiological Framework
To model dengue dynamics, we used an ordinary differential equation model based on a model by Lourenco and Recker to describe a dengue outbreak in Madeira, Portugal (10). This model captures the dynamics of dengue between human and mosquito hosts, where the mosquito dynamics are dependent on temperature and relative humidity (SI Text and Fig. S3). Here we assume that individuals can be infected only once and we do not consider multiple serotypes.
For a fully susceptible, fixed human population size
The dynamics of the human population are defined by the following equations:
The vector to human (
The value of dengue’s basic reproduction number is
Temperature-Dependent Parameters
The ento-epidemiological dengue model included seven entomological parameters that were dependent on temperature (
Similar to ref. 10, we used a number of fixed parameters including the human latency period (
We fitted the biting rate and the reporting rate, using a maximum-likelihood framework.
Fitting the First Introduced Case into Lahore and Mingora from Karachi, Using the Epidemiological Data
To compare our importation estimates from the mobile phone data, we estimated the first introduced case into both Mingora (Babuzai tehsil) and Lahore, using only the dengue case data for these locations. Using the same ento-epidemiological framework, we estimated the date of the first introduced case to Mingora or Lahore (
Using a range of reporting rates (for comparison the fitted reporting rate for Karachi is 0.03), we estimated the date of the first introduced case (
Introductions into Other Tehsils from Karachi
We simulated the introduction of infected travelers from Karachi to all other tehsils based on either the mobile phone data or a gravity model. We first estimated the number of infected individuals in Karachi each day, using the ento-epidemiological framework described above. In each simulation we selected the percentage of infected individuals in Karachi who traveled to another tehsil (range of 10%, 20%, or 30%). This percentage provided an estimate on the number of infected travelers from Karachi. The destination for each infected traveler was determined based on the actual distribution of trip destinations from Karachi per day. We then varied the probability that these infected visitors in all other tehsils would impact the dengue dynamics in that location and sampled a binomial distribution with a fixed probability (probability varied from 0.001 to 0.9). For each fixed percentage of infected individuals who travel and probability (P), we ran 200 simulations.
Introductions from Swat to Lahore
The Swat epidemic peaked 2 mo earlier than the Lahore epidemic; we also investigated the possible role of importation from Swat to Lahore. First we compared travel from Swat tehsils that have had at least 15 cases of dengue (Alai, Alpuri, Babuzai, Bahrain, Barikot, Charbagh, Chiksar, Dassu, Kabal, Kalam, Khwazakhela, Mamund, Matta Khararai, Matta Sebujni, and Peshawar) to the amount of travel from Karachi to Lahore. Based on the CDR data, the amount of travel from Karachi to Lahore is an order of magnitude greater than the amount of travel from Swat to Lahore (number of trips per day from Karachi to Lahore, median, 18,980; min, 10,130; max, 21,940) (number of trips per day from Swat to Lahore, median, 4,463; min, 2,556; max, 5,345).
We modeled the dengue dynamics in Swat, using the same ento-epidemiological framework described in Materials and Methods with
A. aegypti Suitability in Pakistan
A. aegypti have been observed in many parts of Pakistan, including at least 13 major cities, based on a genetic population structure study conducted in 2009 and 2010 (38). In Karachi, the vector has been present in all parts of the city since at least 2000 (39), with a follow-up study in 2010 (40). It has been hypothesized that malaria control programs in the 1950s reduced the A. aegypti population and since the early 2000s the tire trade has spread the vector to other parts of Pakistan from Karachi.
Dengue Temperature Suitability
We used a relative dengue suitability index, a function proportional to vectoral capacity. This measure is based on temperature and relative humidity and does not take into account other factors such as population density or percentage urban, which has been shown to impact Aedes environmental suitability (19).
We defined dengue suitability as a function of temperature
Acknowledgments
A.W. is funded by a James S. McDonnell Foundation postdoctoral fellowship. M.F.B. is supported by Wellcome Trust Grant 098511/Z/12/Z. C.O.B., A.W., and M.A.J. were supported by the Models of Infectious Disease Agent Study program (Cooperative Agreement 1U54GM088558).
Footnotes
- ↵1To whom correspondence should be addressed. Email: cbuckee{at}hsph.harvard.edu.
Author contributions: A.W. and C.O.B. designed research; A.W., T.Q., M.F.B., K.E.-M., and C.O.B. performed research; A.W., M.F.B., M.A.J., S.B.R., and K.E.-M. contributed new reagents/analytic tools; A.W., T.Q., P.R.S., S.B.R., and K.E.-M. analyzed data; and A.W., T.Q., M.F.B., P.R.S., M.A.J., S.B.R., K.E.-M., and C.O.B. wrote the paper.
Conflict of interest statement: M.F.B. has worked as a paid consultant to Visterra, Inc. in Cambridge, MA.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1504964112/-/DCSupplemental.
Freely available online through the PNAS open access option.
References
- ↵.
- WHO
- ↵
- ↵
- ↵.
- Heintze C,
- Velasco Garrido M,
- Kroeger A
- ↵.
- Halloran ME,
- Longini IM Jr
- ↵
- ↵.
- Stoddard ST, et al.
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵.
- Amarasinghe A,
- Letson GW
- ↵
- ↵.
- Ali A, et al.
- ↵.
- Bharti N, et al.
- ↵
- ↵
- ↵.
- United Nations, Department of Economic and Social Affairs, Population Division
- ↵
- ↵
- ↵
- ↵.
- Wesolowski A,
- Eagle N,
- Noor AM,
- Snow RW,
- Buckee CO
- ↵
- ↵
- ↵.
- Reich NS, et al.
- ↵.
- Koo CN, et al.
- ↵.
- Wesolowski A, et al.
- ↵
- ↵Hales S, deWet N, Maindonald J, Woodward A (2002) Potential effect of population and climate changes on global distribution of dengue fever: An empirical model. Lancet 360(9336):830–834.
- ↵.
- Focks DA,
- Haile DG,
- Daniels E,
- Mount GA
- ↵Organization WH (2010) Situation Update of Dengue in the SEA Region 2010 (World Health Organization, Geneva).
- ↵
- ↵
- ↵
- ↵
- ↵.
- Tariq RMZS
- ↵.
- Tariq RMAI,
- Qadri SS
Citation Manager Formats
Article Classifications
- Biological Sciences
- Ecology
Sign up for Article Alerts
Jump to section
- Article
- Abstract
- Results
- Discussion
- Materials and Methods
- SI Text
- Mobility Dynamics Derived from Mobile Phone Data
- Dengue in Pakistan
- Gravity Model
- Converting Temperature to Relative Humidity
- Ento-Epidemiological Framework
- Temperature-Dependent Parameters
- Fitting the First Introduced Case into Lahore and Mingora from Karachi, Using the Epidemiological Data
- Introductions into Other Tehsils from Karachi
- Introductions from Swat to Lahore
- A. aegypti Suitability in Pakistan
- Dengue Temperature Suitability
- Acknowledgments
- Footnotes
- References
- Figures & SI
- Info & Metrics