Assessment of the Legionnaires’ disease outbreak in Flint, Michigan

Edited by Andrea Rinaldo, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, and approved January 5, 2018 (received for review October 27, 2017)
February 5, 2018
115 (8) E1730-E1739

Significance

Unresolved is the etiology of the 2014–2015 Legionnaires’ disease outbreak in Genesee County, MI. Flint is the most populous city in Genesee County, and the outbreak coincided with damaged water infrastructure and the subsequent Flint water crisis. The unprecedented disturbance in water quality within Flint’s drinking water distribution system allowed the evaluation of the statistical relationship between free chlorine residual and Legionnaires’ disease risk within a full-scale drinking water system. Through the integration of multiple datasets, results from numerous causal inference tests implicate changes in water quality, as reflected by changes in free chlorine residual, in the City of Flint as responsible for the outbreak. These findings provide public health professionals and engineers unparalleled scientific evidence to reduce waterborne disease.

Abstract

The 2014–2015 Legionnaires’ disease (LD) outbreak in Genesee County, MI, and the outbreak resolution in 2016 coincided with changes in the source of drinking water to Flint’s municipal water system. Following the switch in water supply from Detroit to Flint River water, the odds of a Flint resident presenting with LD increased 6.3-fold (95% CI: 2.5, 14.0). This risk subsided following boil water advisories, likely due to residents avoiding water, and returned to historically normal levels with the switch back in water supply. During the crisis, as the concentration of free chlorine in water delivered to Flint residents decreased, their risk of acquiring LD increased. When the average weekly chlorine level in a census tract was <0.5 mg/L or <0.2 mg/L, the odds of an LD case presenting from a Flint neighborhood increased by a factor of 2.9 (95% CI: 1.4, 6.3) or 3.9 (95% CI: 1.8, 8.7), respectively. During the switch, the risk of a Flint neighborhood having a case of LD increased by 80% per 1 mg/L decrease in free chlorine, as calculated from the extensive variation in chlorine observed. In communities adjacent to Flint, the probability of LD occurring increased with the flow of commuters into Flint. Together, the results support the hypothesis that a system-wide proliferation of legionellae was responsible for the LD outbreak in Genesee County, MI.
Over the last decade in the United States, the leading cause of disease outbreaks due to drinking water has shifted from gastrointestinal microbes to the respiratory pathogen Legionella pneumophila (1). Despite this trend, federal drinking water regulations target only microorganisms that indicate risk of gastrointestinal disease. Altogether, waterborne diseases in the United States are estimated to generate annual hospitalization expenses of more than $1 billion, an economic demand compounded by the costs of outpatient treatments, lost productivity, and life (2). Among people with waterborne diseases, patients with Legionnaires’ disease (LD) are the most expensive to treat because they typically require hospital care for ∼10 d (2). Accordingly, an analysis of the effect of water quality changes on risk of LD can advance evidence-based public policies and cost-effective infrastructure investments to reduce the social and economic burden of disease due to L. pneumophila and other waterborne pathogens, especially as more municipalities are challenged by aging infrastructure vulnerable to corrosion events.
In 2014 and 2015, the residents of Genesee County, MI, endured the third largest recorded LD outbreak in American history. The 87 disease cases coincided with changes in the source and treatment of drinking water in Flint’s municipal water system (i.e., changes in water regime). In April 2014, treated water from Lake Huron supplied by the Detroit Water and Sewerage Department (DWSD) was replaced with locally treated water from the Flint River (3, 4). Almost immediately following the switch, residents complained of rashes, odors, and red water, a sign of iron corrosion. Several boil water alerts were issued in the summer of 2014 when coliform bacteria were detected in the distribution system. During that time, the utility struggled to control disinfection by-products within the distribution system, and General Motors Corporation complained that the water was corroding parts at their Flint engine plant (5). Within weeks of the switch in water regime, the number of LD cases in Genesee County increased (Fig. 1A). During the summer of 2015, the LD incidence was again elevated. After considerable public outcry triggered by lead contamination of the water, on October 16, 2015, Flint switched back to purchasing treated Lake Huron water from DWSD. Within weeks of returning to DWSD water, the LD outbreak in Genesee County subsided.
Fig. 1.
Spike in LD cases coincident with switch in water supply and increased variation observed in the Flint water distribution system. (A) Quarterly LD incidence in Genesee County, MI, 2010 through 2016. Counts of LD cases in Genesee County are shown as compiled in the Michigan Disease Surveillance System at a quarterly time step. Bars in gray correspond to the preswitch period, bars in maroon correspond to the postswitch period, and bars in navy correspond to the switch back period. (B) Free chlorine at eight monitoring locations in Flint’s water distribution system, 2013–2016. Free chlorine (mg/L as Cl2) was reported weekly during the three water regime phases defined above (vertical lines) and the periods and dates (year/week) shown at eight locations in Flint.
Large fluctuations in water quality, including free chlorine residual, within Flint’s municipal water system occurred with the switch from DWSD-supplied water (4). Free chlorine reacts with a number of chemicals [e.g., naturally occurring organic matter (NOM); reduced metals, such as ferrous iron (Fe2+) and manganese (II); and ammonia] and microorganisms. Furthermore, free chlorine residual levels are influenced by contact time, pH, turbidity, water temperature, corrosion rate(s), and pipe wall effects. Ultimately, the loss of free chlorine in a distribution system is complex, and the myriad of factors leading to it are commonly referred to, collectively, as chlorine demand. With limited historical data available, it is impossible to identify which constituents caused chlorine demand in the Flint system or had a direct effect on biofilm and legionellae growth. Because some of these potentially confounding water quality and water system variables promote the growth of legionellae, the influence of free chlorine alone on LD would likely be underestimated. Therefore, although the literature indicates insufficient chlorine is a contributing factor to LD outbreaks (6–11), we do not attempt to identify the lack of chlorine residual as the sole mechanistic cause of LD. Instead, here we use free chlorine concentration as an indicator of the potential for legionellae growth. Utilities are required by law to measure disinfectant residual, which is an easily measured value; therefore, we posit that disinfectant residual concentration may be a useful surrogate for indicating LD risk.
Before the switch in Flint’s water source, the concentration of free chlorine (mg/L as Cl2) across eight water monitoring locations in the city was similar (Fig. 1B), as demonstrated by the strong between-monitor correlation in free chlorine (r > 0.70; Table S1). Within weeks of the switch, significant fluctuations in the concentration of free chlorine were observed both between monitors (spatial variation) and at each monitor over time (temporal variation), with the mean between-monitor correlation falling by ∼30% and the average within monitor SD increasing from 0.200 to 0.416 mg/L as Cl2. For example, throughout the postswitch period, a sustained collapse of free chlorine below 0.5 mg/L was observed at monitoring location 6 for all but a few weeks, the chlorine residual measured at monitoring location 8 was persistently >1.5 mg/L, and that at monitoring location 7 varied greatly. With the switch back to DWSD-supplied water, the extreme variation in free chlorine between locations subsided, with the exception of location 6.
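The two summary statistics quoted above, the mean between-monitor correlation and the average within-monitor SD, can be illustrated with a minimal sketch. The readings below are hypothetical and stand in for the weekly monitor data; the function names are assumptions introduced here, not part of the study's code.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def monitor_summary(readings):
    """readings maps monitor id -> weekly free chlorine (mg/L as Cl2).

    Returns (mean pairwise correlation, mean within-monitor SD).
    """
    pairs = combinations(sorted(readings), 2)
    mean_r = mean(pearson(readings[a], readings[b]) for a, b in pairs)
    mean_sd = mean(stdev(values) for values in readings.values())
    return mean_r, mean_sd


# Hypothetical readings for three monitors (illustrative only, not study data).
pre = {1: [0.9, 1.0, 1.1, 1.2], 2: [0.8, 0.9, 1.0, 1.1], 3: [1.0, 1.1, 1.2, 1.3]}
post = {1: [0.2, 1.0, 0.4, 1.3], 2: [1.5, 1.6, 1.4, 1.7], 3: [0.1, 0.3, 0.2, 0.1]}

r_pre, sd_pre = monitor_summary(pre)
r_post, sd_post = monitor_summary(post)
```

In this toy example the monitors move together before the switch and diverge afterward, so the mean correlation falls while the mean SD rises, which is the qualitative pattern the text describes.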
During the period that treated Flint River water was distributed to Flint residents, poor water quality and extended periods of low chlorine residual may have enabled legionellae growth in the distribution system (4, 12). Residual chlorine is maintained in water distribution systems to inhibit the growth of pathogens, including L. pneumophila (6, 7). Free-living L. pneumophila are inactivated within 15 min of exposure to 0.4 mg Cl2/L (8). However, this pathogen also resides in biofilms attached to pipe walls and replicates within predatory free-living protozoa (13, 14), two habitats that require higher doses of chlorine to kill legionellae (9–11). To reduce the risk associated with bacterial growth in water distribution systems, regulatory agencies recommend a minimum free chlorine residual of 0.2–0.5 mg/L (15–17). The effectiveness of chlorine disinfection depends on system conditions and chemistry; for example, iron and assimilable organic carbon can both consume chlorine and support L. pneumophila growth (18). However, because chlorine residual is one of the most common measurements of water conditions within distribution systems, here we exploit the chlorine residual values reported in Flint from 2013 to 2016 to investigate how these levels were associated with the occurrence of LD.
Analytically, the timing of changes in Flint’s source water and treatment, the accompanying spatiotemporal variations in free chlorine, and the enhanced level of monitoring allow us to statistically calculate the effect of water disinfection on LD risk at the scale of a municipal water system. To evaluate the hypothesis that changes in Flint’s source water and treatment resulted in the Genesee County LD outbreak, we develop a series of statistical tests that exploit spatiotemporal details for the complete inventory of LD cases that occurred from 2010 to 2016 in Genesee and neighboring Wayne and Oakland Counties. LD case data obtained from the Michigan Department of Health and Human Services (MDHHS) included relevant epidemiological information on dates of symptom onset and referral to the Michigan Disease Surveillance System, as well as residence of LD cases by census tract. In analyses that follow, we construct a series of regression models that capture the variation in LD risk attributable to four distinct phases of exposure to water regimes in Flint’s municipal water system. Using these models, we derive the incidence of human LD as a function of residual chlorine concentration in a full-scale municipal water distribution system. We assess the robustness of our models by excluding all likely hospital-acquired LD cases and by ascertaining if the risk of LD in census tracts adjacent to Flint increased as a function of commuter flow into Flint. Results of this analysis can inform the management of water systems dependent on chlorine disinfection.

Water Exposure Risk

To capture the effects of changes in source water and treatment, we constructed a series of difference-in-differences regression models that exploit spatial variation (in Flint versus outside of Flint) and four distinct phases of water-related LD exposure risk during the Flint water crisis (Fig. 2A and Table S2). The difference-in-differences method infers an exposure risk effect by comparing the difference between periods on the outcome of interest (the presence of an LD case in a census tract in a given week) for treated census tracts relative to not treated census tracts (Table S2). In the preswitch period, LD risk in Flint and non-Flint census tracts was similar (Fig. S1). Thus, the difference-in-differences method posits that the LD risk in non-Flint census tracts represents what would have occurred in Flint census tracts if not for the switch in source water and treatment.
Fig. 2.
Probability of observing a case of LD in Genesee County during four phases of the Flint water crisis. (A) The four phases of water exposure risk are defined. Phase A is the period before switch with water supplied by the DWSD from Lake Huron. Phase B is the period after switch to Flint River water treated by the City of Flint and before water boil advisories. Phase C is the period after switch to treated Flint River water and after boil advisories. Phase D is the period after switch back to water derived from Lake Huron. Start and end dates for each phase are indicated. (B) The probability of observing a case of LD in Flint and non-Flint census tracts by phases in the Flint water crisis. The estimated probability (with 95% confidence intervals) of observing a case of LD in a census tract in each of the four phases of water regime exposure risk in Oakland and Wayne census tracts (control group, non-Flint tracts, navy) and in Flint census tracts (treatment group, Flint, maroon) are shown. Estimated probabilities are derived with all other model covariates (i.e., meteorological and demographic) fixed at sample means.
The association between a census tract in Flint presenting with an LD case and the switch in water supply from DWSD to the Flint River is measured in weekly periods as an odds ratio (OR). Throughout, the estimated treatment effect is the coefficient of the interaction between space and time. In Table 1, models 1, 2, and 3 contrast Flint × postswitch, phase B vs. C vs. A, and phase D vs. A, respectively, as defined in Fig. 2. In model 1, the postswitch period combines phases B and C. Other factors held equal, model 1 of Table 1 shows that the switch in source water and treatment increased the odds of a census tract in Flint having a case of LD by a factor of 7.3 [95% confidence interval (CI): 3.5–15.0]. When the non-Flint Genesee County census tracts are included in our control group (Table S3, model 1), the switch in water regime increased the odds of LD incidence in Flint by 440% (OR = 5.4, 95% CI: 2.7–11.2).
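The conversion between an odds ratio and a percent change in odds used in this paragraph is simple arithmetic, sketched below. The helper name is an assumption introduced for illustration.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in odds implied by an odds ratio (OR - 1, as a percentage)."""
    return (odds_ratio - 1.0) * 100.0


# OR = 5.4 (Table S3, model 1) corresponds to a 440% increase in odds.
increase = round(percent_change_in_odds(5.4))

# An OR below 1 gives a negative change: OR = 0.211 (Table 2, model 1)
# corresponds to a reduction in odds of about 79%.
reduction = round(percent_change_in_odds(0.211), 1)
```

Note that the percent change applies to the odds, which approximate risk only when the outcome is rare, as it is for weekly LD cases in a census tract.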
Table 1.
Odds ratios of tract presenting with case of Legionnaires’ disease: water regime exposure effects
In all three models, non-Flint Genesee census tracts are excluded from the control group.

| Variables | Model 1: phases B and C vs. A | Model 2: phase B vs. C vs. A | Model 3: phase D vs. A |
| --- | --- | --- | --- |
| Flint | 0.603 [0.325, 1.118] | 0.603 [0.325, 1.118] | 0.584* [0.314, 1.085] |
| Postswitch | 0.822* [0.674, 1.002] | | |
| Flint × postswitch | 7.245*** [3.504, 14.979] | | |
| Postswitch/preadvise | | 0.779 [0.567, 1.071] | |
| Postswitch/postadvise | | 0.844 [0.670, 1.063] | |
| Flint × postswitch/preadvise | | 10.007*** [4.211, 23.782] | |
| Flint × after switch/postadvise | | 5.854*** [2.595, 13.204] | |
| Switch back | | | 1.381*** [1.135, 1.680] |
| Flint × switch back | | | 0.990 [0.310, 3.165] |
| N | 309,192 | 309,192 | 277,480 |
| N (tracts) | 991 | 991 | 991 |
| Log likelihood | −3,930.58 | −3,929.78 | −3,784.70 |
| Wald χ² | 286.79 | 293.15 | 279.08 |
Notes: 95% confidence intervals in brackets; ***P < 0.01, **P < 0.05, *P < 0.1. Models 1 through 3 control for average temperature, average humidity, average precipitation, percent of households in a census tract receiving public assistance, and percent of population >50 y of age and include a census tract random effect.
Next, we test whether boil water advisories issued by authorities in Flint attenuated the risk of LD. After positive tests for Escherichia coli contamination, public boil water advisories increased water avoidance by residents across the city (19, 20). Indeed, the odds of an LD case in a Flint census tract increased by a factor of 10 (OR = 10.0, 95% CI: 4.2–23.8) in the postswitch preadvisory period compared with a roughly sixfold increase (OR = 5.9, 95% CI: 2.6–13.2) in the postswitch postadvisory period (Table 1, model 2). Although the difference in LD risk in Flint between the preadvisory versus postadvisory periods is epidemiologically substantive, it is not statistically significant.
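A rough check of why the two odds ratios are not statistically distinguishable can be made from the reported confidence intervals alone. The sketch below recovers each standard error on the log-odds scale from its 95% CI and forms a z statistic for the difference. It assumes the two estimates are independent, which is not strictly true because both come from the same model, so it is an approximation and not the test the authors ran.

```python
from math import log, sqrt

Z95 = 1.96  # normal quantile behind a two-sided 95% interval


def log_or_se(lower, upper):
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (log(upper) - log(lower)) / (2 * Z95)


def z_for_difference(or_a, ci_a, or_b, ci_b):
    """z statistic for log(or_a) - log(or_b), ignoring their covariance (assumption)."""
    se_a = log_or_se(*ci_a)
    se_b = log_or_se(*ci_b)
    return (log(or_a) - log(or_b)) / sqrt(se_a ** 2 + se_b ** 2)


# Values from Table 1, model 2: preadvisory vs. postadvisory interaction terms.
z = z_for_difference(10.007, (4.211, 23.782), 5.854, (2.595, 13.204))
significant_at_5_percent = abs(z) > Z95
```

Under these assumptions z falls well below 1.96, in line with the statement that the preadvisory and postadvisory effects do not differ significantly.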
In October 2015, the MDHHS and the Genesee County Health Department jointly announced a state of emergency and instructed residents to avoid drinking the water. On October 16, Flint reconnected to the DWSD water system. This switch back in water source and treatment provides another test of the water system hypothesis. In particular, we compare LD risk in the switch back phase D versus phase A both in Flint (Table 1, model 3) and in control neighborhoods of Oakland and Wayne Counties (Table S3, model 3). In both models, the risk of an LD case appearing in a Flint census tract during the switch back period is indistinguishable from the preswitch period, indicating that the switch back in water supply ended the LD outbreak in Flint.
The estimated probabilities of observing an LD case in a census tract inside or outside Flint through the four phases of water exposure risk (Fig. 2A) are plotted in Fig. 2B. Before the switch to the Flint River water source, there is only negligible difference in the estimated probabilities of LD incidence between Flint and non-Flint census tracts. In contrast, the LD risk in Flint increases significantly in the postswitch water regime period and then lessens somewhat following water advisories. After the switch back to the DWSD water source, the LD risk in Flint returns to the level before the regime switch. These distinct shifts in estimated probabilities in Flint versus non-Flint neighborhoods between each water exposure phase support the hypothesis that changes in Flint’s municipal water system were responsible for the outbreak and subsidence of LD incidence.

Free Chlorine

The extreme temporal and spatial variation in free chlorine induced by the switch in Flint’s water supply (Fig. 1B and Table S1) provides an unprecedented opportunity to analyze the relationship between a water quality parameter and LD incidence in a full-scale municipal water distribution system. For this purpose, we develop a monitor-to-parcel assignment algorithm that leverages best available information on parcel occupancy/vacancy, residence time of water (i.e., water age), and the Flint water distribution system pipe network (Fig. S2).
Table 2 reports ORs of a census tract in Flint presenting with an LD case by the estimated chlorine residual in water delivered to residents. In addition to controlling for demographic and meteorological factors that influence LD outcomes, this model captures other factors that may contribute to neighborhood variation in LD risk, including socioeconomic status and age >50 y. Model 1 shows results where free chlorine is measured as a continuous variable. We find that a unit increase in free chlorine (1 mg/L) reduced the odds (OR = 0.21, 95% CI: 0.07–0.62) of an LD case being reported by about 80%. Models 2 and 3 show results where the concentration of free chlorine is measured categorically as <0.5 mg/L and <0.2 mg/L, respectively. In 43% and 17% of census tract-week observations in the postswitch period in Flint, we observe free chlorine concentrations of <0.5 mg/L and <0.2 mg/L, respectively. The likelihood of a neighborhood in Flint presenting with an LD case increases by factors of 2.9 (95% CI: 1.36–6.34) and 3.9 (95% CI: 1.77–8.73) when the average weekly chlorine levels were <0.5 mg/L and <0.2 mg/L, respectively.
Table 2.
Odds ratios of a census tract presenting with a case of Legionnaires’ disease: free chlorine residual effects
| Variables | Model 1: mg Cl2/L | Model 2: mg Cl2/L < 0.5 | Model 3: mg Cl2/L < 0.2 |
| --- | --- | --- | --- |
| Free chlorine (mg Cl2/L) | 0.211*** [0.071, 0.624] | | |
| Free chlorine (mg Cl2/L < 0.5) | | 2.933*** [1.357, 6.342] | |
| Free chlorine (mg Cl2/L < 0.2) | | | 3.932*** [1.772, 8.725] |
| Postswitch | 5.491*** [1.626, 18.543] | 5.014*** [1.494, 16.829] | 4.210** [1.203, 14.730] |
| Switch back | 1.883 [0.407, 8.717] | 1.518 [0.330, 6.992] | 1.264 [0.274, 5.834] |
| N | 8,000 | 8,000 | 8,000 |
| N (tracts) | 40 | 40 | 40 |
| Log likelihood | −198.75 | −199.52 | −198.30 |
| Wald χ² | 39.92 | 39.08 | 43.88 |
Notes: 95% confidence intervals in brackets; ***P < 0.01, **P < 0.05, *P < 0.1. Models 1 through 3 control for average temperature, average humidity, average precipitation, percent of households in a census tract receiving public assistance, and percent of population >50 y of age and include a census tract random effect.
The relationship between free chlorine in water delivered to residents in Flint and the probability of an LD case before, during, and after the change in water supply and treatment is displayed in Fig. 3. As expected, the relationship is downward sloping. Both before the switch and after the switch back in water supply, the relationship between LD risk and the concentration of free chlorine is similar statistically. However, during the period when Flint River water was treated and distributed, when iron and natural organic matter concentrations were likely to be elevated, complete suppression of LD risk is not observed until water contains a free chlorine residual concentration of 1.4 mg/L, a level slightly above the typical concentration of 1 mg/L entering water distribution systems (21). For treated Flint River water that contained a free chlorine concentration of 1.4 mg/L, the LD risk was equivalent to that observed before the switch when the water contained a free chlorine concentration of 0.3 mg/L, as judged by the corresponding point estimates. Accordingly, our statistical analyses (Fig. 3 and Table 2) support the hypothesis that the loss of free chlorine in Flint’s municipal water system during the crisis accounts for both the time and place variations of the LD outbreak in Flint.
Fig. 3.
The probability of an LD case being observed in a given week as a function of free chlorine residual. The estimated probability is calculated for each census tract within the Flint water distribution system for a given week as a function of free chlorine (mg/L as Cl2) before, during, and after the change in water supply. The probabilities are estimated with other observed model covariates (i.e., meteorological and demographic) fixed at sample means. Bars indicate 95% confidence intervals.
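Predicted probabilities of this kind come from passing the model's linear predictor through the inverse logit. The sketch below shows the mechanics for a single chlorine slope, using the odds ratio from Table 2, model 1. The baseline log-odds value is hypothetical (the fitted intercept and covariate means are not reported here), and the sketch ignores the period-specific curves and the tract random effect.

```python
from math import exp, log


def inverse_logit(eta):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + exp(-eta))


CHLORINE_LOG_OR = log(0.211)  # slope per 1 mg/L, from Table 2, model 1
BASELINE_LOG_ODDS = -4.0      # hypothetical: log-odds at 0 mg/L, covariates at means


def predicted_probability(chlorine_mg_per_l, baseline=BASELINE_LOG_ODDS):
    """Probability of an LD case in a tract-week at a given free chlorine level."""
    return inverse_logit(baseline + CHLORINE_LOG_OR * chlorine_mg_per_l)


def odds(p):
    """Convert a probability back to odds."""
    return p / (1.0 - p)


# Probabilities fall as chlorine rises; each 1 mg/L multiplies the odds by 0.211.
curve = [predicted_probability(c) for c in (0.0, 0.5, 1.0, 1.5)]
```

Whatever baseline is chosen, the ratio of odds between two chlorine levels 1 mg/L apart stays at 0.211, which is what the downward slope in the figure encodes.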

Exclusion of Plausible Hospital Cases

In 2016, the US Centers for Disease Control and Prevention established a genetic link between L. pneumophila strains isolated from sputum of three Flint residents diagnosed with LD and water samples collected in a Flint hospital (hospital A) (22). Two of the three infected residents were patients at this hospital, raising questions as to whether the observed spike in LD incidence in Flint was caused by a nosocomial outbreak. To test this alternative hypothesis, we repeated statistical analyses after excluding all LD cases that could be plausibly related by case report data to hospital A. Even when these purportedly nosocomial cases are omitted, the switch in water regime increases the odds of a census tract in Flint presenting with an LD case by a factor of 5.7 (OR = 5.7, 95% CI: 2.7–12.1; Table S4, Model 1). Similar to results observed for all LD cases, a unit increase in free chlorine residual (1 mg/L) suppresses the odds of an LD case appearing in a Flint neighborhood by ∼70% (OR = 0.29, 95% CI: 0.09–0.88). Furthermore, when the average weekly free chlorine concentration was <0.5 mg/L, the likelihood of an LD case occurring in a Flint neighborhood increases by a factor of 4.0 (OR = 4.0, 95% CI: 1.2–13.5). Therefore, the hospital outbreak hypothesis cannot fully account for the increase in LD cases in Flint during the water crisis.

Commuter Flow into Flint Predicts LD Risk in Adjacent Neighborhoods

Genesee County neighborhoods adjacent to the City of Flint also experienced a measurable increase in LD cases in 2014–2015 (Fig. S1). To test the hypothesis that persons residing outside the Flint water distribution system were exposed to Flint water while inside Flint, we analyzed data on commuters to Flint from neighboring census tracts obtained from the Longitudinal Employer-Household Dynamics Employment Statistics dataset. An estimated 15,857 workers commute into Flint every day, with 12,843 of them residing in neighboring areas in Genesee County, constituting 62% of all employed persons working inside Flint. Although commuting data captures inflow of workers, leisure and shopping activities also expose nonresidents to the Flint water distribution system. Indeed, the risk of an LD case appearing in a non-Flint Genesee County census tract increases as a function of the flow of commuters into Flint after the city switched to the Flint River as its municipal water source but not before (Fig. 4). This statistical relationship indicates that exposure to Flint water partially accounts for the observed increase in LD in neighboring municipalities.
Fig. 4.
LD incident risk in non-Flint census tracts by commuter flow to Flint. LD incident risk is shown as a function of the number of commuters from Genesee County locations other than Flint either before (navy) or after (maroon) the switch to the Flint River as the Flint municipal water source. Probabilities are estimated with all other model covariates fixed at their sample means, and bars indicate 95% confidence intervals.

Discussion

That a sustained and widespread inability to maintain adequate free chlorine residuals in Flint’s municipal water system was responsible for the LD outbreak in Genesee County in 2014 and 2015 is supported by this ensemble of causal inference tests, integration of multiple datasets, and repeated substantiation of hypotheses. The odds of a neighborhood (i.e., census tract) in Flint reporting a case of LD increased by a factor of 7.3 in the period after the switch to the Flint River water source (Table 1, model 1a). The relative risk between Flint and non-Flint census tracts in this postswitch period was over 6 to 1, with an estimated 80% of LD cases in Flint attributable to the change in water source and treatment. When boil water advisories increased water avoidance by residents, the odds of an LD case being reported from a Flint neighborhood subsided from an OR of 10.0 to an OR of 5.9 (Table 1, model 2). The advisories, along with General Motors Corporation’s statement that the water was too corrosive to use at their engine plant, likely confirmed residents’ suspicions that the water was unsafe and resulted in behavior change that reduced their exposure. Furthermore, the risk of LD returned to pre-Flint water crisis levels after the switch back to the Lake Huron water supply (Table 1, model 3).
During the period when water was drawn from the Flint River, the free chlorine residual that associated with mitigation of LD risk was nearly five times greater than it was before the switch in water supply (1.4 versus 0.3 mg/L; Fig. 3). This response in our model is indicative of an increase in free chlorine demand that is consistent with reports during this period of enhanced levels of iron and assimilable organic matter, both of which promote legionellae growth and react chemically with free chlorine, thereby reducing its availability for disinfection reactions.
Exploiting the extraordinary variation in water quality in the Flint distribution system, we developed an analysis of human LD incidence as a function of free chlorine residual in a community-scale distribution system. When water was supplied by the Flint River, a 1 mg/L increase in free chlorine reduced the risk of an LD case in a neighborhood by about 80% (Table 2, model 1). Conversely, the odds of an LD case increased by factors of 2.9 and 3.9 when the average weekly chlorine levels in a census tract were <0.5 mg/L and <0.2 mg/L, respectively (Table 2, models 2 and 3). Thus, the relationship demonstrates that a free chlorine residual of 0.2–0.5 mg/L (15–17) was insufficient to protect public health during the Flint corrosion event. Our results are consistent with the hypothesized etiology: changes in drinking water composition and distribution system condition, which is represented by a reduced free chlorine residual, enhanced legionellae growth in the water distribution system and increased the risk of LD.
Predictions of the LD risk in Genesee County based on insufficient chlorine residuals in the Flint municipal water system hold against other potential explanations. The hospital-based outbreak hypothesis cannot fully account for variations in when and where LD cases resided in Flint during the postswitch period (Table S4). The incidence of LD in non-Flint Genesee County census tracts during the postswitch period increased proportionally with the flow of commuters into Flint (Fig. 4), suggesting that exposure to Flint water caused the LD outbreak in neighboring municipalities. Finally, based on the best available data for urinary antigen test kit sales, we do not find evidence of detection bias due to a differential increase in market demand for such tests in the postswitch period in Michigan (Table S5).
The public health value of this quantitative analysis of free chlorine residual in the Flint water supply is hard to overstate. More than two-thirds of the 156,000 public drinking water systems in the United States are estimated to rely on free chlorine disinfection (23, 24). Because most people are exposed every day to municipal water with moderate levels of chlorine, it is challenging to investigate illnesses linked to drinking water using classical epidemiological tools, such as case-control studies. Chlorine residual is a widely reported measure of disinfection potential within distribution systems, underscoring the broad utility of our analyses. In municipal water systems similar to that in Flint, increasing the amount of free chlorine residual above trace levels at all points in the distribution network is likely to reduce LD risk. The optimal level of chlorine residuals must take into account potentially detrimental effects, such as formation of disinfection by-products and increased rates of corrosion. However, our analyses establish that other things held equal, maintaining disinfectant residual at all points within water distribution systems can substantially minimize the risk of Legionnaires’ disease.

Materials and Methods

Data.

Deidentified data on LD cases from 2010 to 2016 were obtained from the MDHHS by Data Use and Confidentiality Agreements following approval from the Institutional Review Board for the Protection of Human Research Subjects (MDHHS IRB 201608-01-EA, Wayne State University IRB 067016B3E). Data represent a complete inventory of LD cases in Genesee, Oakland, and Wayne Counties over this 6-y period. Each case is time-stamped with dates of symptom onset, patient diagnosis, and referral to the Michigan Disease Surveillance System (MDSS). The precise date of referral to the MDSS is available for all LD cases. Only 623 of the 833 (25.2% missing) LD cases in Genesee, Oakland, and Wayne Counties have a recorded diagnosis date. Although MDHHS staff generously provided enhanced onset date data collected from case investigations, only 694 cases (16.7% missing) had a verified date of symptom onset. In consultation with scientific personnel at MDHHS, the date of the referral is the most reliable and valid indication of timing. Median difference in elapsed time between referral and diagnosis dates is 1 d, and 5 d between referral and symptom onset dates. These lags inform our use of referral timing data in chlorine analyses. Given the very high correlation between referral and onset date (R² = 0.9973), using referral date with a time lag adjustment in chlorine models resolves the timing error and preserves maximum information (limiting the missing information bias that arises with use of onset date). MDHHS data are also referenced geographically by the residence of the LD case at the census tract scale. Of the 833 LD cases observed over this time period, all but 27 cases included a verifiable address (or census tract residential indicator), including 3 in Genesee, 2 in Oakland, and 22 in Wayne County.
To test the water regime hypotheses, we exploited the temporal and spatial properties of MDHHS case data to develop an outcome variable, LD incidence, that is observable in time (before and after the switch in water regime) and space (in and outside regime and chlorine-treated neighborhoods). LD incidence is a binary variable equal to 1 if a confirmed case of LD is observed in census tract i in week t and 0 otherwise.
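As a minimal sketch of how such a binary tract-by-week outcome could be assembled from referral-dated case records: the tract identifiers, dates, and function names below are hypothetical and are not drawn from the MDHHS data.

```python
from datetime import date

def week_index(day, origin):
    """Number of whole weeks elapsed between origin and day."""
    return (day - origin).days // 7

def build_panel(cases, tracts, origin, n_weeks):
    """Return {(tract, week): 0 or 1}, equal to 1 if at least one case
    was referred from that tract in that week and 0 otherwise."""
    panel = {(tract, week): 0 for tract in tracts for week in range(n_weeks)}
    for tract, referral_day in cases:
        key = (tract, week_index(referral_day, origin))
        if key in panel:  # ignore cases outside the panel window
            panel[key] = 1
    return panel

# Hypothetical records, for illustration only.
origin = date(2010, 1, 3)
cases = [
    ("tract_A", date(2010, 1, 5)),
    ("tract_A", date(2010, 1, 6)),   # second case in the same tract-week
    ("tract_B", date(2010, 1, 20)),
]
panel = build_panel(cases, ["tract_A", "tract_B"], origin, n_weeks=4)
```

Two cases in the same tract and week still yield a single 1, because the outcome records whether any case was observed rather than how many.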
To estimate LD effects from the switch in water source and treatment, and the ensuing variability in chlorine residuals, we also collected a suite of demographic and meteorological control variables. With respect to neighborhood (census tract) demography, two variables from the US Census Bureau are used: percent of population receiving public assistance and percent of population ≥50 y of age. Both socioeconomic status and age (≥50 y) are known correlates of LD risk. Three county-level meteorological variables are used: average weekly temperature, average weekly humidity, and average weekly precipitation. The thermal forces of temperature, humidity, and precipitation are known to govern the observed seasonality of LD incidence rates through growth effects on legionellae bacteria (25–27).

Capturing Water Regime Effects.

To capture effects of changes to source water and treatment, or water regime, we deployed a quasi-experimental method called difference-in-differences. An illustration of the method is summarized in Table S2. Our first difference is spatial, corresponding to whether a census tract is located inside Flint (F), and therefore treated by the shift in water regime, or not in Flint (NF). Our second difference is period-based, corresponding to whether a census tract is observed before (A) or after (B) the switch in water regime. The difference-in-differences method infers a causal effect by comparing the difference between A and B on the outcome of interest (LD incidence) for treated census tracts (F) relative to not treated census tracts (NF).
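The two-by-two logic can be sketched with a simple difference of differences in incidence rates. The rates below are hypothetical and serve only to show the arithmetic; the models estimated in this paper are logistic and include covariates.

```python
def diff_in_diff(f_before, f_after, nf_before, nf_after):
    """Change in the treated group (F) minus change in the control group (NF)."""
    change_treated = f_after - f_before
    change_control = nf_after - nf_before
    return change_treated - change_control

# Hypothetical weekly incidence rates, for illustration only.
did = diff_in_diff(f_before=0.002, f_after=0.010,
                   nf_before=0.002, nf_after=0.003)
print(round(did, 6))
```

Here the control-group change stands in for what would have happened in Flint without the switch, so only the excess change in Flint is attributed to the treatment.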
A key assumption of the difference-in-differences method, known as the parallel paths requirement, posits that the average period difference (A − B) in control group (NF) census tracts constitutes the counterfactual average difference between A and B in F census tracts if not for the treatment or switch in water regime. In this analysis, the preperiod parallel paths requirement is satisfied (Supporting Information). The differential behavior of LD incidence rates in the postswitch period shown in Fig. S1 is indicative of a powerful place-specific period effect. Note in Fig. S1 the extraordinary increase in the LD incidence rate in Genesee County. Although it is analytically tempting to conclude that the switch in water regime governs the LD spike in Genesee, our analysis plan aimed to rule out forces coincidental with the regime switch, evaluate plausible alternative explanations, and identify a potential causal mechanism for the observed increase in LD in Genesee County.
Our analysis begins by identifying regime effects. In the regime effect equations detailed below, our first difference is always geographic, corresponding to whether a census tract is located in Flint (and is therefore a recipient of Flint water) or is not in Flint (located in Oakland County, Wayne County, or parts of Genesee County outside Flint). Our second difference is a period indicator that is variously defined by whether parameters from a given census tract are observed before the first switch in water source (phase A), after the switch to the Flint River but before the issuance of water quality alerts (phase B), after the switch and after the issuance of water quality alerts (phase C), or after the switch back to the DWSD water supply (phase D). Fig. 2A summarizes the precise timing of each phase. Expectations of parameter behavior involving various statistical comparisons by phase are described below.

Water Regime Effects: Preswitch vs. Postswitch.

We begin with a more global test of a water regime effect, estimating a baseline census tract random effects logistic equation for the probability of census tract i in week t presenting with a case of LD (1 = yes, and 0 = no):
$$
\operatorname{Prob}(LD_{it}=1 \mid F_i, P_t, R_{it}, X_{it}) = \Lambda\left[\beta_0 + \beta_1 F_i + \beta_2 P_t + \delta (F_i \times P_t) + \Gamma_1 R_{it} + \Gamma_2 X_{it} + \zeta_i\right], \tag{1}
$$
where $\Lambda[\cdot]$ is the cumulative distribution function (CDF) of the logistic distribution; $R_{it}$ is a vector of temperature, precipitation, and humidity measures (from Weather Underground); $X_{it}$ is a vector of census tract control variables including percentage of population ≥50 y of age and percent of households receiving public assistance (from US Census Bureau); $\zeta_i$ is the random effect of census tract $i$; $F_i$ is an indicator variable equal to 1 if the census tract is in Flint; and $P_t = 1$ if the census tract is observed in the postswitch period (combining phases B and C detailed in Fig. 2A). The treatment effect of the regime switch is captured by the estimated coefficient $\delta$ on the interaction of $F_i$ and $P_t$, which constitutes our difference-in-differences estimate. In the presentation of logit model results below, we exponentiate the estimated coefficient $\delta$ to derive an odds ratio, with the expectation that $\exp(\delta) > 1$, indicating that the switch from Detroit to Flint River water caused an increase in the risk of a census tract in Flint presenting with an LD case.
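To see why the exponentiated interaction coefficient reads as a difference-in-differences odds ratio, consider a stripped-down version of Eq. 1 with no covariates and no random effect. The coefficient values below are hypothetical and are not estimates from this study.

```python
import math

def logistic(z):
    """CDF of the logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    return p / (1.0 - p)

# Hypothetical coefficients, for illustration only.
b0, b1, b2, delta = -6.0, 0.1, 0.2, 1.0

def prob(F, P):
    """Probability of an LD case for a tract with Flint indicator F
    in period indicator P, omitting covariates and the random effect."""
    return logistic(b0 + b1 * F + b2 * P + delta * F * P)

# Post-vs-pre odds ratio inside Flint and outside Flint.
or_flint = odds(prob(1, 1)) / odds(prob(1, 0))
or_control = odds(prob(0, 1)) / odds(prob(0, 0))

# The ratio of the two odds ratios recovers exp(delta).
ratio = or_flint / or_control
print(ratio, math.exp(delta))
```

The period effect common to both groups cancels in the ratio, leaving only the Flint-specific change.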

Water Regime Effects: Division of the Postswitch Period.

Next, we test whether boil water advisories issued by authorities in Flint attenuated the risk of LD, providing an additional probe of whether the switch in water regime caused the observed spike in LD incidence in Flint. We divide the postswitch period into two phases, B and C (as detailed in Fig. 2A); compare LD outcomes in census tracts in phase B (after switch, before advisories) and phase C (after switch, after advisories) to phase A (before the switch in water regime); and then compare outcomes in phase C to B. These comparisons assume that the issuance of boil water advisories induced a meaningful reduction in water use by residents. Boil advisories were not issued to address the presence of legionellae in the Flint water supply: authorities issued advisories after positive tests for E. coli contamination. However, the advisories may have confirmed suspicion among residents that the drinking water was unsafe, thereby unintentionally increasing water avoidance by residents. Two sources indicate that the advisories meaningfully affected residential water exposure. First, Google Trend search interest data on water contamination in the Flint–Saginaw–Bay City metropolitan area increased measurably around boil advisory dates, indicating awareness of the official warnings among the local population (see ref. 19). Second, a large, statistically significant, and sustained increase in sales of bottled water in Genesee County corresponded with the issuance of advisories (18).
To examine whether advisories reduced LD risk (through a water avoidance pathway), we estimate a census tract random effects logistic equation for the probability of census tract i in week t presenting with a case of LD (1 = yes, and 0 = no):
$$
\operatorname{Prob}(LD_{it}=1 \mid F_i, PB_t, PC_t, R_{it}, X_{it}) = \Lambda\left[\beta_0 + \beta_1 F_i + \beta_2 PB_t + \beta_3 PC_t + \delta_1 (F_i \times PB_t) + \delta_2 (F_i \times PC_t) + \Gamma_1 R_{it} + \Gamma_2 X_{it} + \zeta_i\right], \tag{2}
$$
where all terms carry from Eq. 1, with the exception of $PB_t$, which is equal to 1 if the census tract is observed in the postperiod but before boil water advisories, and $PC_t$, which assumes a value of 1 if the census tract is observed in the postperiod and after boil water advisories. The comparison of estimated coefficients $\delta_1$ and $\delta_2$ indicates whether water avoidance behavior of residents in Flint helped attenuate the LD outbreak and provides support for the water regime hypothesis. Insofar as waterborne exposure to legionellae in Flint is linked to LD risk in a given census tract in time, and boil water advisories helped to reduce LD risk in Flint by inducing water avoidance in the resident population, it is expected that $\exp(\delta_1) > 1$, $\exp(\delta_2) > 1$, and $\exp(\delta_1) > \exp(\delta_2)$.
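The period indicators used across the regime models can be sketched as a function of the week's start date. The switch-back date is the one stated in this section; the switch and advisory dates below are placeholder assumptions for illustration, since the precise phase boundaries are those given in Fig. 2A.

```python
from datetime import date

def phase_indicators(week_start, switch, advisory, switch_back):
    """Return the period dummies P, PB, PC, and PD for a given week.
    P combines phases B and C; all four are 0 in phase A."""
    in_b = switch <= week_start < advisory
    in_c = advisory <= week_start < switch_back
    in_d = week_start >= switch_back
    return {"P": int(in_b or in_c), "PB": int(in_b),
            "PC": int(in_c), "PD": int(in_d)}

SWITCH = date(2014, 4, 25)        # assumption, not taken from this section
ADVISORY = date(2014, 9, 1)       # placeholder assumption; see Fig. 2A
SWITCH_BACK = date(2015, 10, 16)  # stated in the text

for day in (date(2013, 6, 1), date(2014, 6, 1), date(2015, 1, 1), date(2016, 1, 1)):
    print(day, phase_indicators(day, SWITCH, ADVISORY, SWITCH_BACK))
```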

Water Regime Effects: Switch Back in Water Supply.

We analyze whether the switch back in water supply on October 16, 2015, caused a reduction in LD incidence by estimating a census tract random effects logistic equation for the probability of census tract i in week t presenting with a case of LD (1 = yes, and 0 = no):
$$
\operatorname{Prob}(LD_{it}=1 \mid F_i, PD_t, R_{it}, X_{it}) = \Lambda\left[\beta_0 + \beta_1 F_i + \beta_2 PD_t + \delta_1 (F_i \times PD_t) + \Gamma_1 R_{it} + \Gamma_2 X_{it} + \zeta_i\right], \tag{3}
$$
where all terms carry from Eq. 1, with the exception of $PD_t$, which is equal to 1 if the census tract is observed in the switch back period (phase D in Fig. 2A) and 0 if observed in the preswitch period (phase A in Fig. 2A). The causal effect of the switch back in water regime is captured by the estimated coefficient $\delta_1$. Insofar as the rise and fall of LD incidence in Flint was caused by a city-wide failure in water treatment, this test is expected to yield $\exp(\delta_1) \approx 1$, indicating that the switch back to Detroit water returned the risk of a census tract in Flint presenting with an LD case to precrisis levels.

Chlorine Residual.

Free chlorine (mg/L as Cl2) measurements at eight monitoring locations in Flint from 2013 to 2016 were obtained from the Monthly Operating Reports provided by the Flint Water Department (Supporting Information). Chlorine was measured at each location two to three times per week. Fig. 1B illustrates the behavior of average free chlorine at the weekly time step at each monitor site. Vertical lines bisecting the space correspond to water regime switch moments. Note the high between-monitor agreement in the level of free chlorine in the preswitch period, indicating negligible spatial variation in water quality across the City of Flint. In the postswitch period, we observe extraordinary temporal (or within monitor) and spatial (or between monitor) variation in the level of free chlorine.
Table S1 summarizes the statistical behavior of free chlorine within and between monitors in time illustrated in Fig. 1B. Analytically, the unprecedented exogenous variation in free chlorine levels observed in Fig. 1B and Table S1 is what we exploit to identify statistically the effect of changes in water quality on LD incidence in Flint, MI.
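One simple way to summarize within-monitor (temporal) and between-monitor (spatial) variation is sketched below. The readings are hypothetical and the summary statistics are our choice for illustration; they are not necessarily those reported in Table S1.

```python
from statistics import mean, pstdev

def variation_summary(readings):
    """readings maps each monitor to its list of weekly mean free chlorine
    values (mg/L). Returns (between_sd, within_sd): the SD of monitor means
    and the average of each monitor's own SD over time."""
    monitor_means = [mean(values) for values in readings.values()]
    between_sd = pstdev(monitor_means)
    within_sd = mean([pstdev(values) for values in readings.values()])
    return between_sd, within_sd

# Hypothetical readings, for illustration only.
preswitch = {"m1": [0.8, 0.8, 0.8], "m2": [0.8, 0.8, 0.8]}
postswitch = {"m1": [0.9, 0.1, 0.5], "m2": [0.2, 0.0, 0.1]}

pre_between, pre_within = variation_summary(preswitch)
post_between, post_within = variation_summary(postswitch)
print(pre_between, pre_within, post_between, post_within)
```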

Free Chlorine Data Assignment.

To test the chlorine residual hypothesis, we needed to select a method for associating a location in the City of Flint map with chlorine residual monitoring points. Although a number of approaches could be used, we chose to develop a physically relevant monitor-to-parcel assignment algorithm that leverages the best available information on parcel occupancy/vacancy, residence time of water, and the Flint water distribution system pipe network. We also tried a number of common approaches, including various proximity and Thiessen polygon-based methods, which delivered quantitatively similar results. A hydraulic-based chemical transport model of the water distribution system could be used to estimate free chlorine residual. However, it is unclear whether this would have a meaningful effect on the results given the spatial and epidemiologic limitations of surveillance data. The algorithm begins by finding the shortest path (or spine) from the centroid of each parcel to the Flint Water Plant (FWP) via the pipe network, obeying the water age gradient. The water age gradient used is correlated with spatial variation in blood lead levels during the switch in water supply (28), demonstrating the utility of this metric to account for physical variability within the water distribution system. Water age is dynamic and likely varied spatially during the study period. Utilizing the water age gradient helps to incorporate major hydraulic constraints that proximity-based methods fail to accommodate (e.g., flow restrictions due to pipe size). This step results in 41,286 parcel-to-FWP spines. Next, the algorithm finds the shortest path from each monitor to each spine, again obeying the water age gradient. Each parcel is then assigned the (weekly average) chlorine value of the monitor with a spine juncture nearest to the parcel.
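The assignment steps can be illustrated on a toy pipe network. This is a simplified sketch of our reading of the description above, not the authors' archived source code: the graph, water ages, node names, and the rule that a path may only step to a node of equal or lower water age are all assumptions made for illustration.

```python
import heapq

def gradient_dijkstra(graph, age, source):
    """Shortest distances and predecessors from source, stepping only to
    neighbours whose water age does not exceed the current node's age."""
    dist, prev = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for neighbour, length in graph[node]:
            if age[neighbour] > age[node]:
                continue  # would move against the water age gradient
            candidate = d + length
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                prev[neighbour] = node
                heapq.heappush(heap, (candidate, neighbour))
    return dist, prev

def parcel_spine(graph, age, parcel, plant):
    """Spine: shortest gradient-obeying path from parcel to plant.
    Assumes the plant is reachable from the parcel."""
    dist, prev = gradient_dijkstra(graph, age, parcel)
    path = [plant]
    while path[-1] != parcel:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist

def assign_monitor(graph, age, parcel, plant, monitors):
    """Pick the monitor whose juncture with the spine lies nearest the parcel."""
    path, dist_from_parcel = parcel_spine(graph, age, parcel, plant)
    best = None
    for monitor in monitors:
        dist_m, _ = gradient_dijkstra(graph, age, monitor)
        reachable = [node for node in path if node in dist_m]
        if not reachable:
            continue
        juncture = min(reachable, key=lambda node: dist_m[node])
        candidate = (dist_from_parcel[juncture], dist_m[juncture], monitor)
        if best is None or candidate < best:
            best = candidate
    return None if best is None else best[2]

# Toy network (hypothetical): pipe lengths on edges, water age per node.
edges = [("plant", "a", 1.0), ("a", "b", 1.0), ("b", "p", 1.0),
         ("m1", "a", 0.5), ("m2", "b", 0.5)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))
    graph.setdefault(v, []).append((u, w))
age = {"plant": 0, "a": 2, "b": 4, "p": 6, "m1": 3, "m2": 5}

spine, _ = parcel_spine(graph, age, "p", "plant")
assigned = assign_monitor(graph, age, "p", "plant", ["m1", "m2"])
weekly_chlorine = {"m1": 0.6, "m2": 0.2}  # hypothetical weekly averages, mg/L
parcel_value = weekly_chlorine[assigned]
```

In this toy case monitor m2 joins the spine at node b, one pipe segment from the parcel, whereas m1 joins at node a, so the parcel inherits the m2 reading.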
Because LD incidence data are organized at the census tract scale, we average parcel chlorine to the census tract to generate 8,000 fully observed census tract (i) by week (t) observations in Flint from 2013 to 2016. The outcome of the monitor-to-parcel assignment algorithm is provided in Fig. S2.
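The parcel-to-tract aggregation can be sketched as a grouped mean for a single week. The parcel and tract identifiers are hypothetical, and the unweighted mean is an assumption, since the text does not state whether any weighting was applied.

```python
from collections import defaultdict

def tract_means(parcel_chlorine, parcel_tract):
    """Average parcel-level chlorine (mg/L) within each census tract."""
    sums, counts = defaultdict(float), defaultdict(int)
    for parcel, value in parcel_chlorine.items():
        tract = parcel_tract[parcel]
        sums[tract] += value
        counts[tract] += 1
    return {tract: sums[tract] / counts[tract] for tract in sums}

# Hypothetical parcels and tracts for one week, for illustration only.
parcel_chlorine = {"p1": 0.5, "p2": 1.0, "p3": 0.25}
parcel_tract = {"p1": "T1", "p2": "T1", "p3": "T2"}
weekly_tract_chlorine = tract_means(parcel_chlorine, parcel_tract)
print(weekly_tract_chlorine)
```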

Effect of Free Chlorine.

At adequate levels (commonly assumed to be concentrations of at least 0.2 mg/L as Cl2), chlorine effectively suppresses legionellae growth. Under normal circumstances, it is nearly impossible to identify statistically a chlorine residual → legionellae → LD incidence pathway because of insufficient time and space variation in chlorine within water distribution systems. As observed in Fig. 1B, the switch in water regime in Flint induced striking temporal and spatial variation of free chlorine in the Flint water distribution system. Our estimation strategy analytically leverages this quasi-random behavior in free chlorine throughout the city. We estimate a census tract random effects logistic equation for the probability of census tract i in week t presenting with a case of LD (1 = yes, and 0 = no):
$$
\operatorname{Prob}(LD_{it}=1 \mid C_{i,t-1}, P_t, PD_t, R_{it}, X_{it}) = \Lambda\left[\beta_0 + \beta_1 C_{i,t-1} + \Gamma_1 P_t + \Gamma_2 PD_t + \Gamma_3 R_{it} + \Gamma_4 X_{it} + \zeta_i\right], \tag{4}
$$
where all terms carry from Eqs. 1 and 3, with the exception of $C_{i,t-1}$, denoting the average weekly free chlorine (mg Cl2/L) at census tract $i$ in week $t-1$. The 1-wk lag in free chlorine is included to account for the difference between symptom onset and referral date information in MDHHS case data. Recall that the use of referral date information was necessitated by missing and imprecise data for symptom onset date. In addition to a continuous measure of free chlorine, we examine threshold effects of free chlorine, with $C_{i,t-1}$ redefined as an indicator equal to 1 if free chlorine is <0.5 mg/L (or, alternatively, <0.2 mg/L). Insofar as the loss of free chlorine increases the risk of LD, our expectation is that $\exp(\beta_1) < 1$ in the continuous model and $\exp(\beta_1) > 1$ in threshold models.
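The lagged and thresholded chlorine regressors can be sketched for one tract's weekly series. The values are hypothetical; the two thresholds are those named in the text.

```python
def lagged(series, lag=1):
    """Shift a weekly series back by `lag` weeks; early weeks have no value."""
    if lag <= 0:
        return list(series)
    return [None] * lag + list(series[:-lag])

def below(values, threshold):
    """Indicator equal to 1 where the (lagged) value is under the threshold."""
    return [None if v is None else int(v < threshold) for v in values]

# Hypothetical weekly mean free chlorine (mg/L) for one census tract.
chlorine = [0.9, 0.4, 0.1, 0.6]

c_lag1 = lagged(chlorine)          # continuous regressor
low_05 = below(c_lag1, 0.5)        # threshold regressor, <0.5 mg/L
low_02 = below(c_lag1, 0.2)        # threshold regressor, <0.2 mg/L
print(c_lag1, low_05, low_02)
```

The opposite expected signs follow from the coding: the continuous measure rises with chlorine, whereas the indicators switch on when chlorine is low.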
The census tract-specific residual, $\zeta_i$, in the random effects model is meant to capture the combined effect of all omitted census tract-specific covariates that cause neighborhood variation in LD susceptibility. Omitted variables may include the underlying health frailty of residents or other sources of water chemistry parameters that affect legionellae growth, such as iron, pH, water temperature, or assimilable organic carbon. The census tract-specific random effect measures the difference in LD risk in a given census tract versus LD risk across all tracts in the City of Flint. Results from a Hausman specification test ($\chi^2 = 2.54$, $p = 0.96$) indicate that model coefficients are efficiently estimated by random as opposed to census tract fixed effects.

Robustness Test: Hospital Hypothesis.

To test the plausibility of the hospital-based outbreak hypothesis, we reestimate Eqs. 1 through 4 but limit analysis to non-hospital A-related LD cases. Potential hospital A-related cases are identified by screening (i) all cases in the MDHHS case file indicating admission to hospital A and LD between 2014 and 2016 and/or whether MDHHS staff, on the basis of case analysis, override the reported admission indication and assign the case as a hospital A admission (n = 64) and (ii) all non-hospital A hospitalized cases with epidemiological investigation notes indicating a prior admission to hospital A between 2014 and 2016 (n = 19). All 19 cases in screen ii appear in screen i, giving a total of 64 cases potentially related to hospital A. Granting the hospital outbreak hypothesis maximum explanatory power, we assume that all 19 cases with a prior admission to hospital A contracted LD at hospital A. Of the 45 cases remaining from screen i, all but 6 were returned to our analysis pool because the recorded symptom onset date was before the hospital admission date. Adding 6 and 19 gives 25 cases plausibly resulting from a hospital-based outbreak.
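The case accounting in this screening procedure can be traced step by step. All inputs below are counts stated in the text; the variable names are ours.

```python
screen_i = 64    # cases flagged as hospital A admissions
screen_ii = 19   # cases with a prior hospital A admission, all also in screen i

# Screen ii is a subset of screen i, so the union is still 64 cases.
potentially_related = screen_i

# Cases in screen i that are not in screen ii.
remaining = screen_i - screen_ii

# Of those, only 6 had symptom onset on or after admission and stay excluded.
kept_as_hospital = 6
returned_to_pool = remaining - kept_as_hospital

# Cases treated as plausibly hospital acquired and dropped from the analysis.
plausibly_hospital = kept_as_hospital + screen_ii
print(remaining, returned_to_pool, plausibly_hospital)
```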
By limiting the analysis to non-hospital A-related cases, we test whether estimated water regime and free chlorine suppression effects appreciably change. If hospital A-related cases fully govern LD outcomes in Flint, then our statistical results pertaining to water regime and loss of free chlorine effects ought to disappear. However, if water regime and chlorine coefficients do not appreciably change with the exclusion of hospital A-related LD cases, then it is highly unlikely that the LD spike in Flint resulted from a hospital-based outbreak only. Although the hospital exposure thesis is not incompatible with our water regime/chlorine hypothesis—the hospital is similarly drawing water from a portion of Flint’s water distribution system where chlorine residual was often very low—our case exclusion tests allow one to rule out an exclusively hospital-based argument.

The Genesee County (Outside of Flint) Epidemic.

To test the hypothesis that the sizeable increase in LD incidence in neighborhoods (or census tracts) adjacent to Flint, in Genesee County, was due to water exposure in Flint, we utilized the Longitudinal Employer-Household Dynamics Employment Statistics dataset (https://lehd.ces.census.gov/data/). This dataset estimates that 15,857 workers flow into Flint every day, with 12,843 of them residing in neighboring areas in Genesee County, constituting a remarkable 61.7% of all employed persons working inside Flint. Although commuting data capture inflows for the purposes of work, it is reasonable to assume exposure to the Flint water distribution system through leisure and other activities as well.
Restricting to non-Flint Genesee County census tracts, we test the commuter flow hypothesis by estimating the following census tract random effects logistic equation for the probability of census tract i in week t presenting with a case of LD (1 = yes, and 0 = no):
$$
\operatorname{Prob}(LD_{it}=1 \mid GC_i, P_t, R_{it}, X_{it}) = \Lambda\left[\beta_0 + \beta_1 GC_i + \beta_2 P_t + \delta (GC_i \times P_t) + \Gamma_1 R_{it} + \Gamma_2 X_{it} + \zeta_i\right], \tag{5}
$$
where all terms carry from Eq. 1, with the exception of $GC_i$, which is equal to the observed count of daily commuters to Flint originating in non-Flint Genesee County census tract $i$. Insofar as exposure to Flint water is the source of the non-Flint outbreak, it is expected that $\delta > 0$, so that the postswitch risk of LD increases monotonically in $GC_i$.
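Because the commuter count enters Eq. 5 as a continuous variable, the postswitch-versus-preswitch odds ratio for a tract depends on its commuter flow. A minimal sketch under hypothetical coefficient values (not estimates from this study), holding covariates and the random effect fixed:

```python
import math

def post_vs_pre_odds_ratio(beta2, delta, commuters):
    """Odds ratio of LD (postswitch vs. preswitch) for a tract with a given
    daily commuter count, other terms in Eq. 5 held fixed."""
    return math.exp(beta2 + delta * commuters)

# Hypothetical coefficients, for illustration only.
beta2, delta = 0.1, 0.002

ratios = [post_vs_pre_odds_ratio(beta2, delta, gc) for gc in (0, 100, 200, 400)]
print([round(r, 3) for r in ratios])

# Two tracts differing by k commuters differ by a factor of exp(delta * k).
step = ratios[2] / ratios[1]
```

With a positive interaction coefficient, each additional block of commuters multiplies the postswitch odds ratio by the same factor, which is the monotonic pattern the hypothesis predicts.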

Data Availability

Data deposition: Chlorine residual data derived from Monthly Operating Reports is archived by the Michigan Department of Environmental Quality pursuant to the US Environmental Protection Agency’s (EPA) order issued January 21, 2016: www.michigan.gov/flintwater/0,6092,7-345–377816–,00.html#Monthly Operation Reports. The assignment algorithm developed to assign residential parcels to relevant monitoring stations within the water distribution system was written in Python 2.7.13. The source code is available at https://figshare.com/s/2628b3393ac7a4c0b127.

Acknowledgments

We are grateful for the assistance of all members of the Flint Area Community Health and Environment Partnership which helped guide the development of this manuscript. Specifically, Marcus Zervos (Henry Ford) provided guidance on epidemiologic surveillance and LD pathogenesis. Lead investigators of this group not already identified include (listed alphabetically): Carol Miller [Wayne State University (WSU)], Jessica Robbins-Ruszkowski (WSU), Joanne Smith-Darden (WSU), Judith Moldenhauer (WSU), Lara Treemore-Spears (WSU), Ben Pauli (Kettering), Joanne Sobeck (WSU), Poco Kernsmith (WSU), Susan Lebold (WSU), Tam E. Perry (WSU), Yongli Zhang (WSU), Matt Seeger (WSU), and Laura Sullivan (Kettering). Mariana Runho and Mohammed Dardona (WSU) assisted in compiling the chlorine dataset. The work reported was supported by MDHHS under Contract 20163753-00 and National Institute of Environmental Health Sciences of the National Institutes of Health (NIH) under Award R21 ES027199-01. As contractually mandated, the manuscript was submitted to the MDHHS for review more than 30 d in advance of being submitted for publication. The content is solely the responsibility of the authors and does not necessarily represent the official views of the MDHHS or NIH.

Supporting Information

Supporting Information (PDF)

References

1
KD Beer, et al., Surveillance for waterborne disease outbreaks associated with drinking water—United States, 2011–2012. MMWR Morb Mortal Wkly Rep 64, 842–848 (2015).
2
SA Collier, et al., Direct healthcare costs of selected diseases primarily or partially transmitted by water. Epidemiol Infect 140, 2003–2013 (2012).
3
G Liu, et al., Potential impacts of changing supply-water quality on drinking water distribution: A review. Water Res 116, 135–148 (2017).
4
MB Rosen, LR Pokhrel, MH Weir, A discussion about public health, lead and Legionella pneumophila in drinking water supplies in the United States. Sci Total Environ 590–591, 843–852 (2017).
5
R Fonger, General Motors shutting off Flint River water at engine plant over corrosion worries. MLive. Available at www.mlive.com/news/flint/index.ssf/2014/10/general_motors_wont_use_flint.html. Accessed December 6, 2017. (October 13, 2014).
6
P Muraca, JE Stout, VL Yu, Comparative assessment of chlorine, heat, ozone, and UV light for killing Legionella pneumophila within a model plumbing system. Appl Environ Microbiol 53, 447–453 (1987).
7
Z Zhang, et al., Legionella control by chlorine dioxide in hospital water systems. J Am Water Works Assoc 101, 117–127 (2009).
8
E Yabuuchi, et al., An outbreak of Pontiac fever due to Legionella pneumophila serogroup 7. II. Epidemiological aspects. Kansenshogaku Zasshi 69, 654–665 (1995).
9
NJ Ashbolt, Environmental (saprozoic) pathogens of engineered water systems: Understanding their ecology for risk assessment and management. Pathogens 4, 390–405 (2015).
10
S Cervero-Aragó, S Rodríguez-Martínez, A Puertas-Bennasar, RM Araujo, Effect of common drinking water disinfectants, chlorine and heat, on free Legionella and amoebae-associated Legionella. PLoS One 10, e0134726 (2015).
11
V Thomas, et al., Amoebae in domestic water systems: Resistance to disinfection treatments and implication in Legionella persistence. J Appl Microbiol 97, 950–963 (2004).
12
GF Craun, et al., Causes of outbreaks associated with drinking water in the United States from 1971 to 2006. Clin Microbiol Rev 23, 507–528 (2010).
13
JO Falkinham 3rd, ED Hilborn, MJ Arduino, A Pruden, MA Edwards, Epidemiology and ecology of opportunistic premise plumbing pathogens: Legionella pneumophila, Mycobacterium avium, and Pseudomonas aeruginosa. Environ Health Perspect 123, 749–758 (2015).
14
M Taylor, K Ross, R Bentham, Legionella, protozoa, and biofilms: Interactions within complex microbial systems. Microb Ecol 58, 538–547 (2009).
15
ML Davis Water and Wastewater Engineering: Design Principles and Practice. Professional Edition (McGraw-Hill, New York, 2010).
16
Great Lakes – Upper Mississippi River Board of State and Provincial Public Health and Environmental Managers, Recommended Standards for Water Works (Health Research Inc, Albany, NY, 2012).
17
World Health Organization, Water Safety in Distribution Systems (WHO, Geneva, 2014).
18
C Manske, H Hilbi, Metabolism of the vacuolar pathogen Legionella and implications for virulence. Front Cell Infect Microbiol 4, 125 (2014).
19
P Christensen, DA Keiser, GE Lade, Economic effects of environmental crisis: Evidence from Flint, Michigan (Iowa State University, Ames, IA), Mimeo. (2017).
20
S Zahran, SP McElmurry, RC Sadler, Four phases of the Flint water crisis: Evidence from blood lead levels in children. Environ Res 157, 160–172 (2017).
21
American Water Works Association Disinfection Systems Committee, Committee report: Disinfection survey, Part 1-Recent changes, current practices, and water quality. J Am Water Works Assoc 100, 76–90 (2008).
22
R Fonger, CDC finds first genetic link between Legionnaires’ outbreak, Flint water. MLive. Available at www.mlive.com/news/flint/index.ssf/2017/02/cdc_finds_first_genetic_link_b.html. Accessed January 18, 2018. (February 16, 2017).
23
AWWA Disinfection Systems Committee, Committee report: Disinfection survey, Part 2-Alternatives, experiences, and future plans. J Am Water Works Assoc 100, 110–124 (2008).
24
US Environmental Protection Agency, FACTOIDS: Drinking water and ground water statistics for 2007 (US Environmental Protection Agency, Office of Water, Washington, DC), EPA 816-K-07-004. (2008).
25
C Garcia-Vidal, et al., Rainfall is a risk factor for sporadic cases of Legionella pneumophila pneumonia. PLoS One 8, e61036 (2013).
26
LE Garrison, et al., Vital signs: Deficiencies in environmental control identified in outbreaks of Legionnaires’ disease–North America, 2000–2014. MMWR Morb Mortal Wkly Rep 65, 576–584 (2016).
27
LA Hicks, et al., Increased rainfall is associated with increased risk for legionellosis. Epidemiol Infect 135, 811–817 (2007).
28
RC Sadler, J LaChance, M Hanna-Attisha, Social and built environmental correlates of predicted blood lead levels in the Flint water crisis. Am J Public Health 107, 763–769 (2017).