Linking fecal bacteria in rivers to landscape, geochemical, and hydrologic factors and sources at the basin scale

Edited* by Rita R. Colwell, University of Maryland, College Park, MD, and approved June 29, 2015 (received for review August 15, 2014)
August 3, 2015
112 (33) 10419-10424


New microbial source-tracking tools can be used to elucidate important nonpoint sources of water quality degradation and potential human health risks at large scales. Pollution arising from septic system discharges is likely more important than previously realized. Identifying these sources and providing reference levels for water quality provides a basis to assess water quality trends and ultimately remediate degraded areas.


Linking fecal indicator bacteria concentrations in large mixed-use watersheds back to diffuse human sources, such as septic systems, has met limited success. In this study, 64 rivers that drain 84% of Michigan’s Lower Peninsula were sampled under baseflow conditions for Escherichia coli, Bacteroides thetaiotaomicron (a human source-tracking marker), landscape characteristics, and geochemical and hydrologic variables. E. coli and B. thetaiotaomicron were routinely detected in sampled rivers and an E. coli reference level was defined (1.4 log10 most probable number⋅100 mL−1). Using classification and regression tree analysis and demographic estimates of wastewater treatments per watershed, septic systems seem to be the primary driver of fecal bacteria levels. In particular, watersheds with more than 1,621 septic systems exhibited significantly higher concentrations of B. thetaiotaomicron. This information is vital for evaluating water quality and health implications, determining the impacts of septic systems on watersheds, and improving management decisions for locating, constructing, and maintaining on-site wastewater treatment systems.
Water quality degradation influenced by diffuse sources at large watershed scales has been difficult to describe. Human modifications of natural landscapes can permanently alter hydrologic cycles and affect water quality (1, 2). Deforestation (3) and increased impervious surface area (4) have been linked with decreased infiltration and thus increased surface runoff. Overland flows concentrate pollutants and rapidly transport them down gradient where they eventually enter surface water systems and affect water quality (5, 6). A number of models have been developed to calculate overland and surface water flows (7, 8) and nutrient/chemical transport (9), but few studies have focused on microbial movement from land to water, particularly nontraditional fecal indicator bacteria that can be used to track human sources of pollution.
Microbial contamination poses one of the greatest health risks to swimming areas, drinking water intakes, and fishing/shellfish harvesting zones where human exposures are highest (10–12). These highly visible areas often receive more attention than sources of contamination because identifying the origin of pollution in complex watersheds requires costly comprehensive investigation of environmental and hydrologic conditions across temporal and spatial scales (13). Grayson et al. (14) suggest using a “snapshot” approach that captures water quality characteristics at a single point in time across broad areas to provide information frequently missed during routine monitoring. Compared with long-term comprehensive investigations, the snapshot approach reduces the number of samples, cost, and personnel required to examine pollution sources.
Escherichia coli concentrations are commonly used to describe the relative human health risk during water quality monitoring in lieu of pathogen detection. Studies attempting to trace pollution in water back to a specific land use with E. coli have rarely produced definitive conclusions (15, 16). Using molecular approaches, specific source targets can be isolated in complex systems and have recently been used to investigate land use and water quality impairments (17). Furtula et al. (18) demonstrated ruminant, pig, and dog fecal contamination in an agriculturally dominated watershed in Canada using Bacteroides markers. The Bacteroides thetaiotaomicron α-1–6 mannanase (B. theta) gene has a high human specificity (19–22), but no studies to date have linked its presence to land use patterns.
Reference conditions have been established for minimally disturbed environments based on measurements of macroinvertebrates, fish, and diatoms (23–25), but microbial reference conditions have not been adequately explored or defined. Based on 15 unimpaired California streams, microbial reference conditions for E. coli [1.0 log10 most probable number (MPN)⋅100 mL−1] and enterococci (1.2 log10 MPN⋅100 mL−1) were defined as being below state water quality thresholds (26). In the Great Lakes, a human health threshold of 2.37 log10 E. coli MPN⋅100 mL−1 (27), or a level equally protective of human health, has been adopted by all state governments. However, this health-associated reference level was derived from epidemiological studies undertaken at beaches throughout the United States (28, 29) with limited knowledge of local implications.
In response to water quality degradation from human stressors and the poorly understood microbial conditions in large-scale fresh water systems such as the Great Lakes basin, this paper aims to (i) examine the spatial distribution of E. coli and a human specific source marker (B. theta) in 64 river systems that drain most of the state’s Lower Peninsula under baseflow conditions, (ii) identify baseflow reference levels of fecal contamination in rivers, and (iii) determine how key chemical, physical, environmental, hydrologic, and land use variables are linked to river water quality at large scales.

Results and Discussion

To address microbial water quality impairment, this study examined fecal bacteria source tracking across a large spatial scale, using the classification and regression tree (CART) statistical method to link fecal contamination in rivers to landscape, geochemical, and hydrologic factors as well as to potential human fecal sources, such as septic systems and sewage effluent, at the basin scale. The B. theta results suggest human fecal contamination was affecting 100% of the studied river systems. These results have significant implications for water and environmental quality managers. Further details on hydrologic, geochemical, and land use characteristics, as well as a CART analysis of the reduced dataset, are described in SI Materials and Methods.

Microbial Water Quality and Reference Conditions.

This project measured E. coli and B. theta concentrations in 64 rivers under baseflow conditions. Across all sites, E. coli concentrations ranged from 0.3 to 3.0 log10 MPN⋅100 mL−1 (geometric mean of 1.4 log10 MPN⋅100 mL−1) and B. theta ranged from 4.2 to 5.9 log10 cell equivalents (CE)⋅100 mL−1 (geometric mean of 5.1 log10 CE⋅100 mL−1). E. coli levels were below the detection limit (<1 MPN⋅100 mL−1) in four rivers, whereas B. theta was detected in all samples (Fig. 1 and Table S1). Nine rivers (14% of sites) exceeded the US Environmental Protection Agency (USEPA) suggested E. coli criterion for safe contact (2.37 log10 MPN⋅100 mL−1), ranging in concentrations from 2.4 to 3.0 log10 MPN⋅100 mL−1. In these same nine rivers, B. theta concentrations ranged from 4.6 to 5.6 log10 CE⋅100 mL−1. These nine E. coli values were significantly different (P < 0.001) from those of the other 55 sites, which had a geometric mean of 1.3 log10 MPN⋅100 mL−1. In contrast, there was no statistically significant difference (P = 0.433) between B. theta concentrations from these two sets of sites.
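The summary statistics above work in log10 space: the log10 of a geometric mean is the arithmetic mean of the log10 concentrations, and an exceedance is a sample whose log10 value is above the 2.37 criterion. The following is a minimal sketch of that arithmetic, not the authors' code; the concentrations and the function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical E. coli concentrations (MPN per 100 mL); NOT data from this study.
ecoli_mpn = [10.0, 25.0, 40.0, 250.0, 1000.0]

# USEPA criterion quoted in the text, expressed in log10 units.
CRITERION_LOG10 = 2.37


def log10_geometric_mean(concentrations):
    """Return log10 of the geometric mean, i.e. the mean of the log10 values."""
    logs = [math.log10(c) for c in concentrations]
    return sum(logs) / len(logs)


def count_exceedances(concentrations, criterion_log10=CRITERION_LOG10):
    """Count samples whose log10 concentration is above the criterion."""
    return sum(1 for c in concentrations if math.log10(c) > criterion_log10)


gm_log10 = log10_geometric_mean(ecoli_mpn)
n_exceed = count_exceedances(ecoli_mpn)
print(f"geometric mean = {gm_log10:.2f} log10 MPN/100 mL")
print(f"{n_exceed} of {len(ecoli_mpn)} samples exceed the criterion")
```

Averaging in log space keeps a few very high counts from dominating the summary, which is why bacterial concentrations are usually reported this way.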
Fig. 1.
(A) E. coli (log10 MPN⋅100 mL−1) and (B) B. theta (log10 CE⋅100 mL−1) concentrations measured in 64 rivers under baseflow conditions. Areas in black were not represented with samples.
Table S1.
Description of land, discharge, and E. coli and B. theta concentrations at 64 Michigan rivers under baseflow
River system | E. coli, MPN⋅100 mL−1 | B. theta, CE⋅100 mL−1 | Discharge, m3⋅s−1 | Area, km2 | Urban, % | Agriculture, % | Rangeland, % | Forest, % | Water, % | Wetland, % | Barren, % | No. of septic systems | Average annual WWTP discharge, MGD
St. Joseph River1.05.357.311,06114.359.,64324.2
Paw Paw River1.85.78.01,02711.547.52.621.11.415.60.318,1062.3
Kalamazoo River1.65.943.65,00214.147.81.821.,94849.2
Grand River1.55.550.712,85412.755.,03388.1
Muskegon River1.85.437.86,4187.619.69.640.53.918.60.147,1156.7
White River2.15.87.71,0495.220.39.749.70.714.30.16,2240.0
Pere Marquette River2.25.812.21,7905.09.38.361.,0940.0
Big Sable River0.75.32.94765.311.,9750.0
Little Manistee River1.45.44.55264.73.912.568.,2280.0
Manistee River0.24.742.83,5595.79.615.956.41.410.90.111,9510.0
Bear Creek2.15.93.03506.313.820.137.62.319.70.11,6560.0
Betsie River2.05.822.86188.17.613.146.29.915.00.16,1400.0
Platte River0.25.24.84716.69.913.456.,6970.0
Boardman River1.15.36.771610.810.418.846.,8244.0
Elk-Torch River0.24.512.81,3087.614.413.645.411.37.40.29,5950.0
Cheboygan River0.24.926.12,3176.48.211.651.,6600.0
Black River1.24.811.61,5095.54.412.147.13.927.00.12,7470.3
Thunder Bay River0.54.912.62,2416.311.08.840.22.731.00.17,2831.6
Au Sable River1.25.632.55,2878.43.214.558.,1560.6
Au Gres River2.15.31.19876.423.37.637.,5730.4
Rifle River1.65.55.08589.316.58.944.21.619.40.16,3221.6
Black River1.15.40.661,2506.,5494.1
Pine River2.65.40.054409.,4300.0
Belle River2.05.61.45129.559.71.719.,4163.2
Clinton River2.05.5n/a1,88051.520.21.314.,581152.9
River Rouge2.75.50.321,03382.,34510.9
Huron River1.95.87.72,29832.524.,12243.9
Raisin River1.45.44.72,68310.867.40.811.,99914.0
South Branch Black River2.35.91.13139.145.84.422.,0991.7
North Branch Black River2.25.61.53987.043.65.624.81.717.10.24,0730.6
Macatawa River1.35.60.1629223.567.,9601.1
Pine Creek2.14.90.274848.430.,4870.0
Pigeon River2.75.00.0810211.,7610.0
Rush Creek2.35.40.1215256.431.,6070.0
Buck Creek2.24.60.64391.
Sand Creek2.34.90.3114219.160.80.811.,7800.0
Bass River2.15.10.2012711.,5940.0
Little Pigeon Creek2.54.90.031418.916.
Black Creek3.04.80.6613614.934.85.329.94.810.10.24,1970.0
Silver Creek1.74.40.374111.70.615.
Flower Creek2.54.60.347910.245.610.727.
Stony Lake Outlet0.55.21.316010.137.711.735.,6350.0
Swan Creek2.54.90.34545.557.
Lincoln River2.25.00.662155.633.211.830.,0870.0
Crystal River0.74.61.61104.73.48.753.723.73.32.45180.0
Belangers Creek1.34.70.08256.738.412.730.
Mitchell Creek2.24.80.293828.322.816.319.,6080.0
Jordan River0.94.44.11743.27.86.570.
Monroe Creek1.24.50.08274.222.38.844.
Boyne River1.25.41.71998.316.110.854.,6470.0
Bear River1.54.81.62936.413.,2760.0
Carp River1.05.01.81196.28.67.722.
Ocqueoc River0.94.72.43694.76.511.543.
Trout River1.34.80.41824.613.59.528.
Little Trout River2.44.90.06285.427.87.514.
Long Lake Creek1.84.20.031625.711.77.120.715.739.10.11,1420.0
Tawas River1.24.51.64038.47.16.951.,5970.9
Harrington Drain2.04.50.015399.,7140.0
Marsh Creek2.45.30.137872.04.71.715.,8380.0
Sandy Creek2.24.90.028226.258.71.310.,5060.5
Cass River1.25.42.02,1746.957.,9992.8
Flint River1.95.76.33,20621.,16875.6
Shiawassee River2.04.74.41,51715.752.50.717.,04313.5
Tittabawassee River1.95.617.56,2118.632.87.330.61.519.10.255,63516.3
n/a, not applicable.
E. coli concentrations (geometric mean of 1.4 log10 MPN⋅100 mL−1) were generally below USEPA recreational water quality criteria and consistent with previously measured ranges in Great Lakes tributary rivers (30–32). A comprehensive review (33) found that E. coli levels in freshwater below 2.23 log10 MPN⋅100 mL−1 were associated with low relative risks of gastrointestinal illness for swimmers compared with nonswimmers. Because the E. coli geometric mean concentration observed in this study was below the safety level reported by Wade et al. (33), we suggest a reference condition for E. coli of 1.4 log10 MPN⋅100 mL−1 for Michigan’s Lower Peninsula rivers under baseflow conditions in the absence of recent storm runoff. Wade et al. (28) reported positive associations between occurrence of illness and molecularly detected Bacteroides at one Great Lakes beach with a geometric mean concentration of 3.08 log10 CE⋅100 mL−1, while noting that the associations were statistically weak (P < 0.1). Yampara-Iquise et al. (19) reported B. theta levels ranged from 5.8 to 9.8 log10 copies⋅100 mL−1 in multiple urban, agricultural, and small-town creek systems that represented various levels of human impact. In the current study, B. theta concentrations (range = 4.2–5.9 log10 CE⋅100 mL−1; geometric mean = 5.1 log10 CE⋅100 mL−1) averaged 1.6 times higher than levels reported by Wade et al. (28) but slightly lower than those reported by Yampara-Iquise et al. (19). Establishing B. theta reference conditions for Michigan rivers under other flow conditions would require additional sample analysis and a greater understanding of the bacterial distributions because comparative B. theta datasets are small relative to available E. coli data, a key aspect to defining reference levels (34). Reference levels are important for establishing acceptable levels of disturbances, defining long-term water quality changes, and supporting management decisions (34).
Although the concept of a reference condition lies in the notion of minimal impact, it is recognized that few streams or rivers are truly unimpaired because most receive treated sewage effluent, and the current study supports this premise.

CART Analysis of Microbial Water Quality.

A primary goal of this study was to address diffuse pollution sources, historically a significant challenge in managing water quality. Major sources of nutrient loads from point and nonpoint sources of contamination were previously examined for Michigan’s Lower Peninsula and shown to vary significantly between watersheds (35). The current study examines these drivers under baseflow conditions, where groundwater inputs dominate flows and wastewater effluent generally provides only a small fraction of total river discharges (Table S1). Effects of wastewater treatment plant (WWTP) effluent on microbial water quality were examined using multiple approaches (see Supporting Information for details), and it was ultimately determined that WWTPs were not a driving factor of microbial water quality in the studied watersheds. Future analysis of the seasonal efficacy of WWTPs could improve the understanding of wastewater impact on water quality by quantifying effluent discharge contributions in key urban areas.
The initial hypothesis of this research was that land use would best explain fecal bacterial concentrations in water. Instead, we found that septic systems and nutrients were the primary explanatory factors of microbial water quality. The influence of septic systems on microbial water quality, measured by E. coli, at a smaller watershed scale has also been reported in other regions (36, 37). In the current study, E. coli concentrations were linked primarily to total phosphorus and potassium. B. theta concentrations were primarily associated with the total number of septic systems in the watershed and within a 60-m buffer. Because WWTPs were not a driving factor of microbial water quality in the studied watersheds, these results indicate that under low flow conditions septic systems are a significant source of human fecal contamination to surface water in the studied watersheds.
CART analysis was used to evaluate the influence of the independent variables on E. coli and B. theta. Results from CART analyses for E. coli and B. theta concentrations at the full and reduced watersheds are summarized in Fig. 2 and Fig. S1, respectively. The CART outputs indicated complex causes of river water quality variability under baseflow conditions. For instance, E. coli concentrations at the full watershed scale were mainly related to total phosphorus (TP) concentrations, which is consistent with results by Carrillo et al. (38). TP concentrations accounted for 48% of E. coli variance with a threshold of 19.0 µg⋅L−1. Although TP is essential for bacterial growth, the authors acknowledge that treated wastewater effluent includes high levels of both E. coli and TP. However, as stated above, WWTPs were not a driving factor of microbial water quality in the studied watersheds. Phosphorus, like E. coli, may be derived from sediments in the rivers, soil, plants, animal wastes, or manure and thus, unlike B. theta, is not exclusive to fecal pollution.
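A regression tree's first split can be illustrated with a small exhaustive search: for one predictor, try every threshold between consecutive values and keep the one that most reduces the squared error of the response. The sketch below is a simplified, single-variable illustration of that idea, not the software or settings used in this study. The data are hypothetical, the function names are invented for the example, and PRE is assumed here to mean 1 − (SSE of the two child groups)/(SSE of the parent group).

```python
def sse(values):
    """Sum of squared errors around the mean of a list."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_first_split(x, y):
    """Return (threshold, PRE) for the single split of x that most reduces the SSE of y.

    Candidate thresholds are midpoints between consecutive distinct x values.
    PRE = 1 - (SSE_left + SSE_right) / SSE_total (assumed definition).
    """
    pairs = sorted(zip(x, y))
    total = sse([v for _, v in pairs])
    if total == 0.0:
        return None, 0.0  # constant response: nothing to explain
    best_threshold, best_pre = None, 0.0
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate tied x values
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        pre = 1.0 - (sse(left) + sse(right)) / total
        if pre > best_pre:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best_pre = pre
    return best_threshold, best_pre


# Hypothetical predictor (e.g. a septic system count) and response
# (a log10 marker concentration); NOT data from this study.
septic_counts = [200, 500, 900, 1500, 2500, 6000, 12000, 40000]
log_marker = [4.5, 4.6, 4.4, 4.7, 5.4, 5.5, 5.6, 5.8]

threshold, pre = best_first_split(septic_counts, log_marker)
print(f"first split at {threshold:.0f}, PRE = {pre:.2f}")
```

A full CART analysis repeats this search over all candidate predictors and then recursively within each child group, usually with pruning or stopping rules that this sketch omits.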
Fig. 2.
CART analyses for (A) E. coli and (B) B. theta concentrations as dependent variables and land use, nutrient, chemical, hydrologic, and environmental parameters as independent variables in watersheds. PRE, proportion of reduction in error.
Fig. S1.
CART analyses for log-transformed (A) E. coli and (B) B. theta concentrations as dependent variables and land use, nutrient, chemical, hydrologic, and environmental parameters as independent variables in reduced watersheds (n = 52).
The full watershed CART outputs and correlation analysis indicated B. theta concentrations were strongly associated with total numbers of septic systems in the watershed (r = 0.364, P = 0.002) and in the 60-m buffer (r = 0.357, P = 0.004). B. theta concentrations were not correlated with septic system density in the watershed (P = 0.361) or in the 60-m buffer (P = 0.520). Interestingly, the total number of septic systems in the watershed accounted for 36% of the B. theta concentration variance with a threshold count of 1,621 systems per watershed, as shown in Figs. 2B and 3. The snapshot sampling strategy used in this study focused on a spatial composite of the watersheds near the drainage point toward the Great Lakes. Thus, the total number of people on septic tanks equates to the level of feces entering each watershed, and these levels are potentially dominated by failing septic systems contributing high concentrations of bacteria to nearby water systems. A Michigan health department reported a 26% on-site wastewater failure rate during time of sale or transfer inspections that discharged an estimated 65,000 gallons of untreated fecal waste each year to nearby water bodies (39). Future watershed-based studies should include analysis of total septic systems in the watershed and septic density, because it would be possible to overlook failing septic systems if the sample size were small or the focus were only on septic density. Additional efforts aimed at the condition of septic systems, their ability to remove bacteria, and microbial transport to nearby surface waters are required.
Fig. 3.
B. theta versus septic systems illustrating the CART output from the first split of Fig. 2B.
The direct and significant correlation between estimated number of septic systems and the human-specific marker B. theta in water (Fig. 3) illustrates a major issue for water quality of Michigan’s streams and rivers, with an estimated 1.4 million on-site septic systems statewide (35, 40). In this study, the overall B. theta geometric mean was one log10 unit higher than secondary treated sewage effluent, whereas the highest measured concentrations were 1.5 logs higher than biologically treated septage effluent (20). Interestingly, when the CART analysis considered the entire upstream drainage area, including lakes, 2.5 times fewer septic systems were required to produce B. theta levels similar to when these drainage areas were restricted to downstream of the nearest lake, potentially indicating increased failure rates of septic systems surrounding lakes compared with rivers (see Supporting Information for details). Habteselassie et al. (41) identified that surface water and groundwater near failing on-site wastewater treatment systems contained higher concentrations of E. coli and enterococci than water surrounding properly functioning on-site wastewater treatment systems (P < 0.001). Combined, these results illustrate the importance and need for responsible development and septic system maintenance along lake and river riparian zones to protect water quality. Future analysis should include incremental spatial assessment of B. theta with respect to septic systems in watersheds to assess the fate and transport of bacteria from septic systems and define their acute/chronic impacts on water quality.
E. coli and B. theta Z-scores [(observed – mean)/SD] were compared using CART, as shown in Fig. 4, to identify the characteristics that could differentiate between E. coli and B. theta concentrations. Positive values of the Z-score differences occur when E. coli concentrations are higher, relative to their population mean, than B. theta concentrations. Negative values imply the opposite, with relatively higher B. theta concentrations. In catchments with discharge <0.66 m3⋅s−1 and with fewer than 294 septic systems in the 60-m buffer, E. coli concentrations were much higher than those of B. theta. In contrast, B. theta concentrations were much higher than those of E. coli in rivers with discharge >0.66 m3⋅s−1, particularly in catchments with dissolved organic carbon >5.4 mg⋅L−1. E. coli, which occurs in the feces of all warm-blooded mammals and birds, has been shown to persist and regrow in the environment under some conditions and has been associated with suspended particles that have low settling rates (42–45). Therefore, in watersheds with low discharge it is possible that E. coli can attach to particles and persist longer than B. theta, which is an anaerobic organism with a faster decay rate in rivers (46).
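The Z-score difference described above can be sketched in a few lines: standardize each organism's log10 concentrations, then subtract site by site. This is an illustrative sketch only. The values are hypothetical, and the use of the sample SD (rather than the population SD) is an assumption, because the text does not say which was used.

```python
import statistics


def z_scores(values):
    """Standardize values as (observed - mean) / SD, using the sample SD (assumed)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical log10 concentrations at four sites; NOT data from this study.
ecoli_log10 = [1.0, 2.0, 3.0, 2.0]
btheta_log10 = [5.5, 5.0, 4.5, 5.0]

# Positive difference: E. coli is high relative to its own mean compared with B. theta.
# Negative difference: B. theta is relatively higher.
z_diff = [ze - zb for ze, zb in zip(z_scores(ecoli_log10), z_scores(btheta_log10))]

for site, d in enumerate(z_diff, start=1):
    print(f"site {site}: Z-score difference = {d:+.2f}")
```

Standardizing first puts the two markers on a common scale, so the difference is not dominated by B. theta simply being measured at concentrations several log10 units higher.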
Fig. 4.
CART of E. coli and B. theta Z-scores illustrating conditions associated with different concentrations between these two microbes. PRE, proportion of reduction in error.
We compared the concentrations and loads of E. coli and B. theta across all sites (Fig. S2). No statistically significant relationship was identified between E. coli and B. theta concentrations (r = 0.18; P = 0.16). The CART results (Fig. 2) nonetheless suggest that both organisms were entering rivers during baseflow from some of the same diffuse sources, including septic systems. However, each organism was influenced by different environmental parameters, as identified by the Z-score CART analysis (Fig. 4). E. coli was detected in most rivers, and its concentrations were primarily associated with TP and K levels. This study indicates that B. theta can be used as a source-tracking marker to investigate diffuse sources of human-derived contaminants from septic systems under baseflow hydrologic conditions at watershed scales.
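As a rough check on a reported correlation, the coefficient can be converted to a t statistic with n − 2 degrees of freedom, t = r·sqrt((n − 2)/(1 − r²)). The sketch below assumes a Pearson correlation, which the text does not state explicitly, and the helper names are hypothetical. It stops at the t statistic and does not compute a P value.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0 with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# r and n as reported in the text for E. coli versus B. theta concentrations.
t_value = t_statistic(0.18, 64)
print(f"t = {t_value:.2f} with {64 - 2} degrees of freedom")
```

A t value of this size on 62 degrees of freedom is well below the usual two-sided 5% critical value of about 2.0, in line with the nonsignificant result reported above.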
Fig. S2.
Scatter plots of B. theta versus E. coli (A) concentrations (n = 64) and (B) loads (n = 63).

SI Materials and Methods

Landscape Characteristics.

The land use composition across the project area can be split into two groups (Fig. S3): The northern area has more forest and wetlands (P < 0.01) and the southern area has more urban and agricultural land (P < 0.01). In the southern watersheds, grouped by latitude, E. coli (r > 0.345, P < 0.005) and B. theta (r > 0.250, P < 0.05) concentrations were significantly correlated with agriculture at the reduced watershed and 60-m buffer scales.
The estimated number of on-site septic systems was highly variable across the study area, with SDs roughly twice the mean for each of the three scales. In contrast, septic system densities were similar across all three scales. Interestingly, impervious surface coverage in the 60-m buffer (average = 5.5%) and full watersheds (average = 7.5%) was correlated with septic density at the same spatial scale (r ≥ 0.370, P < 0.001). The number and density of on-site septic systems were higher in the southern sites compared with northern sites at both the full watershed and the 60-m buffer scales (P < 0.006). The land use composition and classification of each river system including septic systems at the full watershed, reduced watershed, and 60-m riparian buffer are defined in Table S1 and summarized in Table S4.
The USEPA DMR Pollutant Loading Tool was used to estimate the ratio of average annual WWTP effluent to measured baseflow. The total sum of WWTP discharges (million gallons per day, MGD) in each watershed was calculated in Esri ArcMap GIS software. This total WWTP discharge was compared with the measured baseflow river discharge to produce the ratio of average annual WWTP effluent to measured baseflow. Because this ratio combines annual averages of WWTP discharge with field measurements, values greater than 100% were possible; any watersheds exceeding 100% were removed from calculations. Although estimated levels of bacterial discharge are reported to the DMR, it was not appropriate to calculate a proportion of measured bacteria attributable to WWTP effluent because bacteria concentrations can change quickly (65) and the concentrations reported from WWTPs are generally annual estimates of fecal coliforms.
The percentage of measured flow attributable to WWTP effluent was estimated to be between 0% and 52%, with a mean of 4%. Only seven watersheds had WWTP contributions above 10% of measured flow. Our analysis also included mean population densities served by WWTPs as estimated from census blocks and wastewater service boundaries. When the 28 watersheds with >80% of the population relying on WWTP service were excluded from our CART analysis, the primary split variables remained the same for E. coli and B. theta concentrations. Furthermore, sources of human bacteria could not be distinguished because no statistical difference (P > 0.1) in bacterial concentrations was identified between these two groups of watersheds [i.e., WWTP-reliant (>80% of the population living inside the WWTP service area, n = 28) or septic-reliant (>80% of the population living outside the WWTP service area, n = 36)]. In this case, we could not statistically differentiate the impacts of the point source WWTP effluent on receiving water bodies from the plethora of nonpoint sources measured during baseflow sample collection. Previous studies from Michigan demonstrated that B. theta concentrations in untreated sewage averaged 7.2 log10 CE/100 mL and were reduced by 3.1 logs through secondary treatment before discharge (66).
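Because WWTP discharge is reported in MGD and river baseflow in m3⋅s−1, the ratio requires a unit conversion. The sketch below shows one way to do that arithmetic and is not the authors' GIS workflow. It assumes US gallons (1 gal = 3.785411784 L), the function name is hypothetical, and the example values are those read from the Paw Paw River row of Table S1.

```python
# 1 US gallon = 3.785411784 L, so 1 MGD = 3785.411784 m^3 per day (assumes US gallons).
M3_PER_S_PER_MGD = 3785.411784 / 86400.0


def wwtp_percent_of_baseflow(wwtp_mgd, baseflow_m3_s):
    """Percentage of measured baseflow that the average annual WWTP discharge represents."""
    return 100.0 * wwtp_mgd * M3_PER_S_PER_MGD / baseflow_m3_s


# Paw Paw River row of Table S1: 2.3 MGD average annual WWTP discharge, 8.0 m^3/s baseflow.
pct = wwtp_percent_of_baseflow(2.3, 8.0)
print(f"{pct:.1f}% of measured baseflow")
```

Since the numerator is an annual average and the denominator a single field measurement, small streams sampled at very low flow can yield values above 100%, as the text notes.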

Hydrogeologic and Geochemical Properties.

The watersheds included in our study were characterized under baseflow conditions to ensure precipitation was not significantly influencing stream flow. Six-hour cumulative precipitation totals were generally low with a mean of 0.14 mm. River discharge and discharge per area ranged from 0.01 to 57 m3⋅s−1 and 1.1 × 10−4 to 2.2 × 10−1 m3⋅s−1⋅km−2, respectively. Discharge for each river system is provided in Table S1. CART analysis identified the pH, total phosphorus, water temperature, potassium, and septic system numbers in the watershed as significantly related to microbial water quality. Descriptive statistics for all measured hydrogeologic variables are provided in Table S2.

Reduced Dataset Analysis.

In the reduced watersheds (Fig. S1), the highest B. theta concentrations were driven by septic systems in the watershed, similar to the full watershed models, but with a tipping point of 3,927 septic systems in the watershed, much higher than the full watershed CART models. The highest concentrations of E. coli in the reduced watersheds (Fig. S1) were associated with potassium levels greater than 0.91 mg⋅L−1.


To address impaired waters and restore them to designated uses, the process for total maximum daily loads (TMDLs) has been developed under the Clean Water Act. According to Stiles (47) there are currently 65,000 TMDLs and 43,000 listings that need to be addressed. Many stretches of water systems are impaired due to fecal pollution and E. coli, but there have been no established approaches or tools to identify nonpoint sources. This study provides a path forward to assess and ultimately improve water quality at large scales. More importantly, this study provides reference conditions for a large number of watersheds that, in the event of major landscape disturbance, could be used to measure remediation progress. Using a synoptic sampling approach for regional water quality assessment, this study found that human fecal contamination was prevalent under baseflow conditions. Baseflow in the study watersheds was generally dominated by groundwater and not by wastewater treatment effluent. Results suggest a regional E. coli reference condition below the current USEPA freshwater recreational criterion could be established. However, identifying specific sources of fecal contamination in rivers cannot be achieved using ubiquitous bacteria, such as E. coli. Assessing water quality using solely E. coli may mislead water quality managers and severely limit the ability to remediate impaired waterways. However, microbial source-tracking markers, such as the human-specific B. theta marker, can provide a more refined tool to identify the impacts of nonpoint sources of human fecal pollution, which could help prioritize restoration activities that should be implemented at watershed scales. The high variability of water quality measurements illustrates complex relationships between bacteria and landscape, geochemical, and hydrologic properties. 
The influence of septic systems in riparian zones also indicates that additional localized control measures, including septic system maintenance and construction, should be implemented to protect water quality and human health.

Materials and Methods

Study Area.

This study investigated 64 watersheds draining Michigan’s Lower Peninsula to the Great Lakes (Fig. S3). Watersheds were selected using the following criteria: (i) the 30 largest watersheds that represent >80% of Michigan’s Lower Peninsula land area and (ii) 34 smaller watersheds randomly selected across the state from locations near their outlet to the lake. All sampling sites were located at bridge crossings and were selected so that each was reasonably accessible, had adequate flow, had discharge dominated by river water, and captured the maximum amount of upstream land use while meeting the above criteria.
Fig. S3.
Watersheds of sampled river systems that drain Michigan's Lower Peninsula and states to the south, colored by 2006 NLCD land use classes.

Water Sample Collection.

A synoptic sampling scheme was used to capture water quality characteristics under a single flow condition (i.e., baseflow) across broad spatial areas (14). Compared with long-term comprehensive investigations, this approach reduces the number of samples, cost, and personnel resources required to address pollution sources while providing essential information missed during routine monitoring.
Grab samples were collected from each river sampling site between October 1 and 13, 2010, a period chosen as groundwater-dominated baseflow based on historical hydrographs and antecedent precipitation. Groundwater-driven baseflow is critical to the preservation of water quality and quantity in the Great Lakes and provides year-round support for aquatic habitats. Before sampling each watershed, meteorological conditions were monitored to ensure that no significant precipitation had occurred within several days and hydrographs from nearby US Geological Survey (USGS) stream gauges were inspected to check that sampled rivers were at baseflow. October was chosen for the sampling period because the late growing season baseflow period is least likely to have large variability in water quality because flows are dominated by groundwater in the region. There is variability in water quality between baseflow periods (i.e., fall versus summer), but this variability is small relative to the variability between baseflow and other periods due to overland flow and dilution effects (48, 49). Water temperature (degrees Celsius), specific conductance (microsiemens per centimeter), and dissolved oxygen (milligrams per liter) were measured on-site using a YSI 600R Sonde (YSI Incorporated). Field samples were placed on ice in coolers and transported to Michigan State University for other analyses, including bacterial testing (described below) within 24 h.

Water Analysis.

Each sample was assayed for water chemistry as summarized in Table S2. The methods for assaying chemicals and nutrients are described in Table S3. E. coli analyses were performed within 24 h of collection using IDEXX Colilert Quanti-Tray 2000. Following incubation at 35 °C (±0.5 °C) for 24 h (±2 h), fluorescent wells were scored positive for E. coli, and concentrations were reported as most probable number (MPN) per 100 mL. E. coli C-3000 (American Type Culture Collection 15597) was used as a positive control to verify media integrity. Sterile water was used for negative controls to verify method integrity. E. coli measurements below the detection limit (1.0 MPN⋅100 mL−1) were assigned the value of the detection limit.
Table S2.
Descriptive statistics of physical, chemical, and hydrologic variables measured during baseflow conditions at 64 rivers
Parameter | Count | Minimum | Mean | Maximum | SD | 5th percentile | 95th percentile
Ammonia, µg⋅L−1 | 63 | 0.0 | 23.6 | 280.
Calcium, mg⋅L−1 | 63 | 30.0 | 62.4 | 160.6 | 21.6 | 33.8 | 98.2
Chlorine (Cl), mg⋅L−1 | 63 | 3.4 | 42.3 | 291.8 | 54.4 | 5.9 | 174.8
Dissolved oxygen, mg⋅L−1 | 64 | 5.9 | 9.8 | 13.
Dissolved organic carbon (NPOC), mg⋅L−1 | 63 | 1.
Magnesium, mg⋅L−1 | 63 | 7.0 | 18.4 | 34.2 | 6.3 | 10.3 | 29.1
Nitrate/nitrite (NOx), µg⋅L−1 | 64 | 0.0 | 858.3 | 5,638.9 | 1,310.3 | 0.0 | 4,095.6
Pheophytin corrected chlorophyll a, µg⋅L−1 | 59 | 0.
Potassium, mg⋅L−1 | 63 | 0.
Sodium, mg⋅L−1 | 63 | 3.0 | 27.0 | 199.3 | 36.9 | 3.4 | 113.0
Soil hydraulic conductivity (Ksat), m⋅d−1 | 64 | 0.
Specific conductance, μS⋅cm−1 | 63 | 257.0 | 527.0 | 1,589.0 | 264.2 | 265.2 | 1,039.8
Soluble reactive P, µg⋅L−1 | 64 | 0.9 | 23.3 | 266.
Sulfate, µg⋅L−1 | 63 | 2.4 | 32.1 | 169.8 | 30.5 | 5.6 | 89.6
Total dissolved N, µg⋅L−1 | 64 | 0.0 | 1,423.3 | 6,033.7 | 1,346.5 | 337.6 | 5,414.1
Total dissolved P, µg⋅L−1 | 64 | 3.1 | 25.2 | 292.3 | 38.6 | 3.9 | 58.0
Total N, µg⋅L−1 | 64 | 81.8 | 1,082.1 | 5,583.1 | 1,129.3 | 110.8 | 3,610.6
Total P, µg⋅L−1 | 64 | 7.7 | 37.8 | 395.5 | 52.4 | 8.9 | 102.5
Total chlorophyll a, µg⋅L−1 | 59 | 0.
Precipitation, mm
  6 h | 64 | 0.
  12 h | 64 | 0.0 | 1.9 | 77.9 | 10.2 | 0.0 | 7.9
  18 h | 64 | 0.0 | 3.4 | 78.6 | 11.6 | 0.0 | 30.8
  24 h | 64 | 0.0 | 4.4 | 78.6 | 11.9 | 0.0 | 31.0
  2 d | 64 | 0.0 | 6.0 | 78.6 | 11.9 | 0.0 | 31.0
  3 d | 64 | 0.0 | 7.7 | 80.
  4 d | 64 | 0.0 | 8.3 | 80.5 | 13.5 | 0.0 | 34.2
  6 d | 64 | 0.0 | 9.0 | 87.3 | 14.1 | 0.0 | 34.2
  8 d | 64 | 0.0 | 11.6 | 92.6 | 16.5 | 0.0 | 57.2
Discharge, m3⋅s−1 | 63 | 0.0 | 6.7 | 57.3 | 12.5 | 0.0 | 43.4
Water temperature, °C | 64 | 7.
Precipitation measured at hourly averages from 16-km2 NEXRAD cells and reported in cumulative millimeters per time.
Table S3.
Summary of chemical and nutrient methods
Assay | Method description | Refs.
Ammonia, µg⋅L−1 | Phenate method | Standard methods 4500-NH3-G (68)
Calcium, mg⋅L−1 | Flame atomic absorption spectrophotometry | 65
Chlorine (Cl), mg⋅L−1 | Dionex membrane-suppression ion chromatography | 65, 66
Magnesium, mg⋅L−1 | Flame atomic absorption spectrophotometry | 65
Nitrate/nitrite, µg⋅L−1 | Cadmium reduction | Standard methods 4500-NO3-E (68)
Pheophytin corrected chlorophyll a, µg⋅L−1 | Fluorometry with pheophytin correction following ethanol extraction | Standard methods 10200.H (68)
pH | Hydrolab multisonde | 66
Potassium, mg⋅L−1 | Flame atomic absorption spectrophotometry (0.5% HNO3 preservative) | 66
Sodium, mg⋅L−1 | Flame atomic absorption spectrophotometry (0.5% HNO3 preservative) | 66
Soluble reactive phosphorus, µg⋅L−1 | Ascorbic acid method | Standard methods 4500-P.E. (68)
Sulfate (SO4), µg⋅L−1 | Dionex membrane-suppression ion chromatography | 66
Total dissolved nitrogen, µg⋅L−1 | Second derivative spectroscopy following persulfate digestion | 67
Total dissolved phosphorus, µg⋅L−1 | Ascorbic acid method following persulfate digestion | Standard methods 4500-P.E and 4500-N.C (68)
Total nitrogen, µg⋅L−1 | Second derivative spectroscopy following persulfate digestion | 67
Total phosphorus, µg⋅L−1 | Ascorbic acid method following persulfate digestion | Standard methods 4500-P.E and 4500-N.C (68)
Total chlorophyll a, µg⋅L−1 | Fluorometry following ethanol extraction | Standard methods 10200.H (68)
Samples were analyzed for the human-specific marker B. theta, which has been shown to have a high sensitivity comparable to other human-associated markers in a multilaboratory evaluation (50). Compared with B. theta, HF183 and other source markers had greater false positive rates in animal feces collected in the same region as our study area (21). BacHum exhibited an even greater false positive rate than HF183 (51). Laboratories associated with our team and others have demonstrated that B. theta is a suitable human-specific marker and is related to human health outcomes (19–21, 52).
Analysis of the human-specific marker B. theta α-1–6 mannanase (5′CATCGTTCGTCAGCAGTAACA3′; 5′CCAAGAAAAAGGGACAGTGG3′) was performed according to Yampara-Iquise et al. (19), specifically by filtering 900 mL of water through a 0.45-µm hydrophilic mixed cellulose esters filter. Each filter was placed into a 50-mL centrifuge tube containing 20 mL of sterile phosphate-buffered water, vortexed, and centrifuged (30 min; 4,000 × g; 21 °C). Eighteen milliliters were decanted from the tube and the remaining eluent and pellet were stored at −80 °C. DNA was extracted from 200 µL of the thawed pellet via the QIAamp DNA mini kit protocol. Quantitative PCR (qPCR) was performed on extracted DNA following Yampara-Iquise et al. (19) with a probe modification (20) using a Roche Light-Cycler 2.0 Instrument (Roche Applied Sciences). Each B. theta assay was carried out with 10 µL of LightCycler 480 Probe Mastermix (Roche Applied Sciences), 0.4 µL forward and reverse primers, 0.2 µL probe 62 (6FAM-ACCTGCTG-NFQ; Roche Applied Sciences Universal Probe Library), 1.0 µL BSA, 3.0 µL nuclease-free water, and 5.0 µL of extracted DNA and processed in triplicate. The qPCR analyses included a 15-min, 95 °C preincubation cycle, followed by 50 amplification cycles, and a 0.5-min 40 °C cooling cycle. A diluted plasmid standard was included during each qPCR run as a positive control and molecular-grade water was used in place of DNA template for negative controls. One copy of the targeted B. theta gene is assumed present per cell, and thus one gene copy was taken to correspond to one cell equivalent (CE) (19, 20). B. theta gene copies were converted to CE and reported as qPCR CE⋅100 mL−1.
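The text does not give the formula used to convert gene copies per reaction to CE per 100 mL, so the following Python sketch shows only one plausible back-calculation. The template volume (5 µL), extracted pellet volume (200 µL), concentrate volume (about 2 mL, i.e., 20 mL minus the 18 mL decanted), and filtered volume (900 mL) are taken from the description above. The DNA eluate volume (`dna_eluate_ul`) is a hypothetical value that is not reported here, and complete recovery at every step is assumed.

```python
def ce_per_100ml(copies_per_reaction,
                 template_ul=5.0,            # from the text: 5.0 uL of extracted DNA per reaction
                 dna_eluate_ul=100.0,        # ASSUMPTION: eluate volume is not stated in the text
                 pellet_extracted_ul=200.0,  # from the text: 200 uL of thawed pellet extracted
                 concentrate_ul=2000.0,      # from the text: 20 mL eluent minus 18 mL decanted
                 filtered_ml=900.0,          # from the text: 900 mL of river water filtered
                 copies_per_cell=1.0):       # from the text: one gene copy per cell
    """Back-calculate cell equivalents (CE) per 100 mL of river water.

    Assumes 100% recovery through filtration, elution, and extraction.
    """
    # Scale copies in the reaction up to the whole DNA eluate.
    copies_in_eluate = copies_per_reaction * dna_eluate_ul / template_ul
    # Scale from the extracted aliquot up to the whole concentrate.
    copies_in_concentrate = copies_in_eluate * concentrate_ul / pellet_extracted_ul
    # Convert copies to cells, then express per 100 mL of filtered water.
    cells = copies_in_concentrate / copies_per_cell
    return cells / filtered_ml * 100.0


if __name__ == "__main__":
    # Hypothetical reading of 9 gene copies per reaction.
    print(ce_per_100ml(9.0))
```

Under these assumptions, each gene copy detected per reaction corresponds to roughly 22 CE per 100 mL, which illustrates how strongly the reported concentration depends on the concentration and elution volumes.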

Climate and Hydrology.

Hourly precipitation data were extracted from the Grand Rapids, Gaylord, and Detroit (Michigan) Next Generation Radar (NEXRAD) stations through the National Climatic Data Center, with a base reflectivity of 0.50°, an elevation range of 124 nautical miles, and 16-km2 cells. Hourly precipitation averages across each watershed were used to calculate total rainfall weighted by the proportion of each NEXRAD cell within the sampled watershed. Precipitation was categorized into cumulative hourly totals (millimeters) before sample collection at intervals of 6, 12, 18, and 24 h and 2, 3, 4, 6, and 8 d, reported as millimeters per time before sample collection.
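The weighting and accumulation steps described above can be illustrated with a short Python sketch. This is not the authors' workflow: the function names and data layout are hypothetical, and the weights are assumed to be the area of each NEXRAD cell that falls inside the watershed, normalized to sum to one.

```python
def area_weighted_hourly(cell_series_mm, cell_weight):
    """Area-weighted hourly precipitation (mm) for one watershed.

    cell_series_mm: dict of cell id -> list of hourly totals (mm),
                    ordered oldest to newest, ending at the sampling time.
    cell_weight:    dict of cell id -> area of that cell inside the
                    watershed (any unit); normalized here (an assumption).
    """
    total_weight = sum(cell_weight.values())
    if total_weight <= 0:
        raise ValueError("cell weights must sum to a positive value")
    n_hours = len(next(iter(cell_series_mm.values())))
    return [
        sum(cell_weight[c] * cell_series_mm[c][h] for c in cell_weight) / total_weight
        for h in range(n_hours)
    ]


def cumulative_totals(hourly_mm, windows_h=(6, 12, 18, 24, 48, 72, 96, 144, 192)):
    """Cumulative precipitation (mm) in each window ending at sampling.

    Default windows are 6, 12, 18, and 24 h and 2, 3, 4, 6, and 8 d in hours.
    """
    return {w: sum(hourly_mm[-w:]) for w in windows_h}


if __name__ == "__main__":
    # Two hypothetical cells covering 75% and 25% of a watershed.
    series = {"cell_a": [0.0, 4.0], "cell_b": [8.0, 0.0]}
    weights = {"cell_a": 3.0, "cell_b": 1.0}
    hourly = area_weighted_hourly(series, weights)
    print(hourly, cumulative_totals(hourly, windows_h=(1, 2)))
```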
Real-time river discharge was measured at each site during sample collection using an Acoustic Doppler Current Profiler (53), colocated USGS stream gauges, or a current meter via wading following USGS protocol (54). River discharge is reported as cubic meters per second.

Land Use.

Watersheds were delineated and then land use and septic system statistics were calculated for each watershed using Esri ArcMap GIS software (Table S4). The spatial analyst watershed tool was used to develop surface watersheds for each sampling point at 1 arc-second. Two watersheds were defined for each river site: full watersheds, which include the entire upstream drainage area (n = 64), and reduced watersheds, which only include drainage areas upstream of the sampling site to the nearest lake, reservoir, or pond (n = 52). The full watershed analysis (n = 64) included 12 sites that were at or near lake outlets, resulting in significantly smaller watersheds (average = 108 km2) than the other 52 watersheds (average = 366 km2). These 12 sites were removed in the reduced watershed analysis because it was originally hypothesized that longer retention time in the lentic water systems would likely reduce microbe concentrations owing to environmental decay. A digital map of land cover from 30-m resolution Landsat imagery and the National Land Cover Database (NLCD 2006) was used to define land use in each watershed and buffer. Land use was categorized both with the 16-category NLCD classification system and with the seven-category Anderson Level 1 Land Cover Classification System (55); Table S5 describes the Anderson classifications and equivalent NLCD categories. A 60-m riparian buffer was applied to streams in both full and reduced watersheds because land parcels are generally located adjacent to roads and require a buffer between surface waters and septic tanks. The average septic system setback from surface waters in Michigan is 15 m. Additionally, the 60-m riparian buffer ensured all riparian land uses were accounted for if the land use/river/septic system GIS layers were not completely matched under the 30-m resolution.
Table S4.
Land use summary for full watersheds, reduced watersheds, and 60-m buffers
Scale parameter | Minimum | Mean | Maximum | SD
Full watershed    
 Area, km22.881,37712,8542,431
 Estimated septic systems0.019,579246,03341,902
 Septic density, no. per km20.015.7113.719.5
 Population density, persons per km271311,597281
 Population density on WWTP0981,589279
 Population density on septic71141,567236
 Impervious surface, km20.415.1356.99.8
 Urban, %3.1616.799.70.21
 Agriculture, %0.02874.20.22
 Open, %0.06.9720.10.05
 Forest, %0.1931.470.70.18
 Water, %0.02.6823.70.04
 Wetland, %0.071448.30.1
 Barren, %0.00.312.450.0
Reduced watershed    
 Area, km20.153664,065630
 Estimated septic systems0.022,299246,03345,592
 Septic density, no. per km20.016.111418.8
 Population density, persons per km271471,597306
 Population density on WWTP01151,589307
 Population density on septic71241,567259
 Impervious surface, km20.47.555.913.6
 Urban, %3.121.399.726.2
 Agriculture, %
 Open, %0.06.1618.85.27
 Forest, %0.02971.219.5
 Water, %0.01.6115.43.32
 Wetland, %0.013.947.912.1
 Barren, %0.00.7731.13.87
60-m riparian buffer    
 Area, km20.064649778.3
 Estimated septic systems0.02,67228,2565,596
 Septic density, no. per km20.01510421
 Population density, persons per km271241,567259
 Population density on WWTP0000
 Population density on septic03210524
 Impervious surface, km20.05.542.79.64
 Urban, %0.018.998.323
 Agriculture, %0.021.472.121.7
 Open, %0.03.6419.43.8
 Forest, %
 Water, %0.06.0963.212
 Wetland, %0.027.376.317.9
 Barren, %0.00.5924.93.12
Entire upstream drainage area including lakes (n = 64).
Watersheds were defined as the total upstream area to the nearest lake draining to each respective river sampling point (n = 52).
Table S5.
Anderson level 1 land use classifications and descriptions
Classification | Description | Examples | Associated NLCD classifications (code)
Urban | Intensive use with structures covering the majority of land | Cities, shopping, industrial, and commercial centers | Developed open space (21); Developed low intensity (22); Developed medium intensity (23); Developed high intensity (24)
Agricultural | Land used for food production | Pasture, row crop, orchards, confined feeding operations | Pasture and hay (81); Cultivated crops (82)
Open | Predominant natural vegetation is grass or shrubs | Herbaceous, shrub, brush | Shrub and scrub (52); Grassland and herbaceous (71)
Forest | Closed canopy at least 10% from timber quality trees | Deciduous, coniferous, and mixed forested | Deciduous forest (41); Evergreen forest (42); Mixed forest (43)
Water | Area predominantly covered by water throughout year | Streams, lakes, bays, and reservoirs | Water (11)
Wetland | Land with water table near land surface for significant portion of year | Marshes, swamps, perched bogs | Woody wetland (90); Emergent herbaceous wetland (95)
Barren | Land that has less than one-third vegetative cover | Beaches, exposed rock, gravel pits | Barren (31)
A map of households that likely use on-site septic systems to treat wastewater was previously developed for this study region (35). Briefly, septic system totals and locations were estimated following the cumulative examination of WWTP infrastructure, incorporated municipality areas, household location according to 2010 census blocks, 2006 NLCD and road layers, and residential drinking water well information. Estimated septic system numbers (per watershed) and densities (per square kilometer) in each watershed and 60-m-wide buffer around surface water bodies were calculated for the 64 river systems.
Estimates of total population and population relying on WWTPs for water treatment were performed for each watershed and 60-m buffer. The total population in each watershed was estimated by multiplying the number of households (based on 2010 census data, described above during septic system estimates) by the average household size in each census block. The number of people relying on WWTPs was estimated by overlaying census block information and wastewater treatment plant service area boundaries. Additionally, the USEPA Discharge Monitoring Report (DMR) Pollutant Loading Tool was used to estimate the ratio of average annual WWTP effluent to measured baseflow. A full description of this method is provided in Supporting Information.
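As a rough illustration of the bookkeeping described above, the Python sketch below sums households times average household size over census blocks and splits the total between WWTP and septic service. The record fields (`households`, `avg_household_size`, `wwtp_fraction`) are hypothetical stand-ins for the results of the GIS overlay and are not names used in the study.

```python
def estimate_population(blocks):
    """Return (total, on_wwtp, on_septic) population for one watershed.

    blocks: iterable of dicts with hypothetical keys
      households          number of households in the census block
      avg_household_size  average persons per household in the block
      wwtp_fraction       share of the block inside a WWTP service area (0 to 1)
    """
    total = 0.0
    on_wwtp = 0.0
    for block in blocks:
        people = block["households"] * block["avg_household_size"]
        total += people
        on_wwtp += people * block["wwtp_fraction"]
    return total, on_wwtp, total - on_wwtp


def density_per_km2(count, area_km2):
    """Convert a watershed total to a density per square kilometer."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return count / area_km2


def effluent_to_baseflow_ratio(effluent_m3s, baseflow_m3s):
    """Ratio of average annual WWTP effluent to measured baseflow (same units)."""
    if baseflow_m3s <= 0:
        raise ValueError("baseflow must be positive")
    return effluent_m3s / baseflow_m3s


if __name__ == "__main__":
    # Two hypothetical census blocks in a 10 km2 watershed.
    blocks = [
        {"households": 100, "avg_household_size": 2.5, "wwtp_fraction": 1.0},
        {"households": 40, "avg_household_size": 3.0, "wwtp_fraction": 0.0},
    ]
    total, wwtp, septic = estimate_population(blocks)
    print(total, wwtp, septic, density_per_km2(septic, 10.0))
```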

Statistical Analysis.

A constant value of 1 was added to E. coli and B. theta concentrations before log transformation and analysis. Soil hydraulic conductivity values were log10-transformed before statistical analyses. Spearman correlation tests were used to examine relationships among physical, geochemical, and microbial measurements. Descriptive statistics were performed using IBM SPSS Statistics software (Version 19.0) with a significance threshold (α) of 0.01.
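The transformation and rank correlation described above can be sketched in plain Python as follows. The study itself used SPSS; this block only illustrates the arithmetic, assumes a base-10 logarithm for the bacterial concentrations, and does not compute the P value that would be compared against α = 0.01.

```python
import math


def log10_plus_one(values, detection_limit=None):
    """log10(x + 1) transform; values below the detection limit are first
    set to the detection limit (as described for E. coli in the text)."""
    out = []
    for v in values:
        if detection_limit is not None and v < detection_limit:
            v = detection_limit
        out.append(math.log10(v + 1.0))
    return out


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical E. coli counts (MPN per 100 mL) and septic densities.
    ecoli = log10_plus_one([0.5, 9.0, 99.0, 30.0], detection_limit=1.0)
    septic_density = [2.0, 10.0, 40.0, 15.0]
    print(spearman_rho(ecoli, septic_density))
```

Because Spearman correlation depends only on ranks, the log transformation does not change the coefficient; it matters for the tree models and Z-scores that use the transformed values directly.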
CART analysis was used to compare E. coli and B. theta (dependent variables) data to the independent geochemical, hydrologic, environmental, and land use variables. CART has been used to investigate pathogenic bacteria and parasite relationships with environmental and land use factors (56), to classify lakes based on chemistry and clarity (57), and to predict the occurrence of fecal indicator bacteria with respect to physiochemical variables (58). CART was selected because it allows for robust nonlinear model development using multiple potentially interacting predictor variables (59) that splits dependent variables into categories based on the influence of independent variables. Following previously published methods (56, 57), CART recursively split dependent variables using a recursive partitioning algorithm (rpart) and a 10-fold cross-validation criterion. The 10-fold cross-validation approach breaks all data into 10 subsets and calculates the split based on 9 of the 10 subsets. This method is used for each group until reaching a minimum stopping criterion of five observations per subgroup.
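To make the partitioning step concrete, the Python sketch below searches one predictor for the threshold that most reduces the sum of squared errors of a continuous response, while keeping at least five observations in each subgroup. It is a simplified, single-split illustration of regression-tree logic and not a reimplementation of rpart: the function names are hypothetical, and cross-validation, surrogate splits, and recursion are omitted.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y, min_leaf=5):
    """Best binary split of response y on predictor x.

    Returns (threshold, sse_reduction), or None when no split leaves at
    least min_leaf observations on each side.
    """
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    parent_sse = sse(ys)
    best = None
    for i in range(min_leaf, len(ys) - min_leaf + 1):
        if xs[i - 1] == xs[i]:
            continue  # cannot place a threshold between identical values
        reduction = parent_sse - sse(ys[:i]) - sse(ys[i:])
        if best is None or reduction > best[1]:
            threshold = (xs[i - 1] + xs[i]) / 2.0
            best = (threshold, reduction)
    return best


if __name__ == "__main__":
    # Hypothetical data: response jumps once the predictor exceeds 6.
    predictor = list(range(1, 13))
    response = [1.0] * 6 + [3.0] * 6
    print(best_split(predictor, response))
```

Applying the same search to every predictor, keeping the best one, and repeating within each resulting subgroup gives the recursive structure of a tree. Dividing a split's error reduction by the root-node error gives its proportional reduction of error.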
Fully developed CART outputs often required pruning to remove insignificant splits and ensure significant variable associations were not missed due to the splitting and stopping criteria (60). We first pruned CART outputs using the 1-SE rule (61–63), and, if needed, a subsequent pruning step was performed if splits did not reduce error by 5% or more. This rule minimized the cross-validated error of the model, which has been shown to produce optimal sized trees that are stable across replications (61, 64).
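As commonly defined, the 1-SE rule selects the smallest candidate tree whose cross-validated error is no more than one SE above the minimum cross-validated error. The Python sketch below applies that definition to a table of candidate subtrees. The tuple layout (number of splits, cross-validated error, SE of that error) is an assumption chosen for illustration and the values are invented.

```python
def one_se_choice(candidates):
    """Pick a subtree by the 1-SE rule.

    candidates: list of (n_splits, xerror, xstd) tuples, one per candidate
                subtree, where xerror is the cross-validated error and
                xstd its standard error.
    Returns the tuple with the fewest splits whose xerror is within one
    SE of the minimum xerror.
    """
    if not candidates:
        raise ValueError("need at least one candidate subtree")
    lowest = min(candidates, key=lambda row: row[1])
    threshold = lowest[1] + lowest[2]
    eligible = [row for row in candidates if row[1] <= threshold]
    return min(eligible, key=lambda row: row[0])


if __name__ == "__main__":
    # Hypothetical cross-validation results for trees with 0 to 4 splits.
    table = [
        (0, 1.00, 0.10),
        (1, 0.70, 0.08),
        (2, 0.62, 0.07),
        (3, 0.60, 0.07),
        (4, 0.61, 0.07),
    ]
    print(one_se_choice(table))
```

In this invented table the minimum error occurs at three splits, but the two-split tree falls within one SE of it and is preferred because it is simpler.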
Detailed CART outputs were investigated to identify competitor and surrogate variables for each node. Competitor splits are ranked according to the reduction in model error from other potential splits, whereas surrogate splits are ranked according to how similar the resultant groups are relative to the primary split groups. Model accuracy was assessed by summing the proportional reduction of error from each split. All CART analyses were performed using the R software system (R Foundation for Statistical Computing).
To compare concentrations of the two organisms at each site relative to the average concentration of each organism, the Z-score of each sample was calculated. Z-scores [(observed − mean)/SD] for E. coli and B. theta were calculated in R using the “scale(dataset, center=TRUE, scale=TRUE)” command; that is, each Z-score is the sample concentration minus the population mean, divided by the population SD. In this case, the Z-score of the log-transformed concentration was calculated. Positive Z-scores indicate samples with concentrations greater than the population mean, whereas negative Z-scores indicate the opposite. A CART analysis of the difference in Z-scores, calculated as E. coli − B. theta, was then performed using the same set of predictor variables as in the single-organism models.
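For readers working outside R, a minimal Python equivalent of the Z-score step is sketched below. It uses the sample SD (n − 1 denominator), which matches the default behavior of R's scale(); the input values are hypothetical log-transformed concentrations.

```python
import statistics


def z_scores(values):
    """(observed - mean) / SD, using the sample SD as R's scale() does."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def z_difference(ecoli_log, btheta_log):
    """Per-site difference in Z-scores, E. coli minus B. theta."""
    return [e - b for e, b in zip(z_scores(ecoli_log), z_scores(btheta_log))]


if __name__ == "__main__":
    # Hypothetical log-transformed concentrations at three sites.
    ecoli = [1.0, 2.0, 3.0]
    btheta = [3.0, 2.0, 1.0]
    print(z_difference(ecoli, btheta))
```

A positive difference marks a site where E. coli is high relative to its own distribution while B. theta is comparatively low, and a negative difference marks the reverse.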


We thank Steve Hamilton, Emily Luscz, Bobby Chrisman, Rebecca Ives, Sarah AcMoody, and Seth Hunt for vital support during this project and Drs. Shannon Briggs and Jon Bartholic for their technical support during the development of this manuscript. Partial funding for this project came from National Oceanic and Atmospheric Administration Great Lakes Environmental Research Laboratory Grant “Land Use Change and Agricultural Lands Indicators and Tipping Points.” Partial support was also provided by the Environmental Protection Agency Grants 112013 and 118539.



L Breuer, et al., Assessing the impact of land use change on hydrology by ensemble modeling (LUCHEM). I: Model intercomparison with current land use. Adv Water Resour 32, 129–146 (2009).
C Vörösmarty, S Dork, Anthropogenic disturbance of the terrestrial water cycle. Bioscience 50, 753–765 (2000).
S Germer, et al., Implications of long-term land-use change for the hydrology and solute budgets of small catchments in Amazonia. J Hydrol (Amst) 364, 349–363 (2009).
CL Arnold, CJ Gibbons, Impervious surface coverage: The emergence of a key environmental indicator. J Am Plann Assoc 62, 243–258 (1996).
M Falkenmark, Water—A reflection of land use: Understanding of water pathways and quality genesis. Int J Water Resour Dev 27, 13–32 (2011).
DC Evers, et al., Mercury in the Great Lakes region: Bioaccumulation, spatiotemporal patterns, ecological risks, and policy. Ecotoxicology 20, 1487–1499 (2011).
DM Katz, FJ Watts, ER Burroughs, B Daniel, MK Associate, Effects of surface roughness and rainfall impact on overland flow. J Hydraul Eng 121, 546–553 (1995).
DK Ray, JM Duckles, BC Pijanowski, The impact of future land use scenarios on runoff volumes in the Muskegon River Watershed. Environ Manage 46, 351–366 (2010).
Y Cha, CA Stow, KH Reckhow, C DeMarchi, TH Johengen, Phosphorus load estimation in the Saginaw River, MI using a Bayesian hierarchical/multilevel model. Water Res 44, 3270–3282 (2010).
T Kistemann, et al., Microbial load of drinking water reservoir tributaries during extreme rainfall and runoff. Appl Environ Microbiol 68, 2188–2197 (2002).
M Wong, et al., Evaluation of public health risks at recreational beaches in Lake Michigan via detection of enteric viruses and a human-specific bacteriological marker. Water Res 43, 1137–1149 (2009).
C Almeida, F Soares, Microbiological monitoring of bivalves from the Ria Formosa Lagoon (south coast of Portugal): A 20 years of sanitary survey. Mar Pollut Bull 64, 252–262 (2012).
PA Soranno, et al., Quantifying regional reference conditions for freshwater ecosystem management: A comparison of approaches and future research needs. Lake Reserv Manage 27, 138–148 (2011).
RB Grayson, CJ Gippel, BL Finlayson, BT Hart, Catchment-wide impacts on water quality: The use of “snapshot” sampling during stable flow. J Hydrol (Amst) 199, 121–134 (1997).
M Vega, R Pardo, E Barrado, L Debán, Assessment of seasonal and polluting effects on the quality of river water by exploratory data analysis. Water Res 32, 3581–3592 (1998).
E Wheeler, J Burke, E Hagan, EW Alm, Persistence and potential growth of the fecal indicator bacteria, Escherichia coli, in shoreline sand at Lake Huron. J Great Lakes Res 32, 401–405 (2006).
LA Peed, et al., Combining land use information and small stream sampling with PCR-based methods for better characterization of diffuse sources of human fecal pollution. Environ Sci Technol 45, 5652–5659 (2011).
V Furtula, et al., Inorganic nitrogen, sterols and bacterial source tracking as tools to characterize water quality and possible contamination sources in surface water. Water Res 46, 1079–1092 (2012).
H Yampara-Iquise, G Zheng, JE Jones, CA Carson, Use of a Bacteroides thetaiotaomicron-specific alpha-1-6, mannanase quantitative PCR to detect human faecal pollution in water. J Appl Microbiol 105, 1686–1693 (2008).
S Srinivasan, A Aslan, I Xagoraraki, E Alocilja, JB Rose, Escherichia coli, enterococci, and Bacteroides thetaiotaomicron qPCR signals through wastewater and septage treatment. Water Res 45, 2561–2572 (2011).
A Aslan, JB Rose, Evaluation of the host specificity of Bacteroides thetaiotaomicron alpha-1-6, mannanase gene as a sewage marker. Lett Appl Microbiol 56, 51–56 (2013).
BA Layton, et al., Performance of human fecal anaerobe-associated PCR-based assays in a multi-laboratory method evaluation study. Water Res 47, 6897–6908 (2013).
TB Reynoldson, RH Norris, VH Resh, KE Day, DM Rosenberg, The reference condition: A comparison of multimetric and multivariate approaches to assess water-quality impairment using benthic macroinvertebrates. J N Am Benthol Soc 16, 833–852 (1997).
SP Davies, SK Jackson, The biological condition gradient: A descriptive model for interpreting change in aquatic ecosystems. Ecol Appl 16, 1251–1266 (2006).
DM Carlisle, CP Hawkins, MR Meador, M Potapova, J Falcone, Biological assessments of Appalachian streams based on predictive models for fish, macroinvertebrate, and diatom assemblages. J N Am Benthol Soc 27, 16–37 (2008).
LL Tiefenthaler, ED Stein, GS Lyon, Fecal indicator bacteria (FIB) levels during dry weather from Southern California reference streams. Environ Monit Assess 155, 477–492 (2009).
United States Environmental Protection Agency, Recreational Water Quality Criteria (US Environmental Protection Agency, Washington, DC, 2012).
TJ Wade, et al., Rapidly measured indicators of recreational water quality are predictive of swimming-associated gastrointestinal illness. Environ Health Perspect 114, 24–28 (2006).
LJ Wymer, TJ Wade, AP Dufour, Equivalency of risk for a modified health endpoint: A case from recreational water epidemiology studies. BMC Public Health 13, 459 (2013).
M Byappanahalli, M Fowler, D Shively, R Whitman, Ubiquity and persistence of Escherichia coli in a Midwestern coastal stream. Appl Environ Microbiol 69, 4549–4555 (2003).
MN Byappanahalli, RL Whitman, DA Shively, MJ Sadowsky, S Ishii, Population structure, persistence, and seasonality of autochthonous Escherichia coli in temperate, coastal forest soil from a Great Lakes watershed. Environ Microbiol 8, 504–513 (2006).
MB Nevers, RL Whitman, WE Frick, Z Ge, Interaction and influence of two creeks on Escherichia coli concentrations of nearby beaches: Exploration of predictability and mechanisms. J Environ Qual 36, 1338–1345 (2007).
TJ Wade, N Pai, JNS Eisenberg, Jr JM Colford, Do U.S. Environmental Protection Agency water quality guidelines for recreational waters prevent gastrointestinal illness? A systematic review and meta-analysis. Environ Health Perspect 111, 1102–1109 (2003).
RC Russo, Development of marine water quality criteria for the USA. Mar Pollut Bull 45, 84–91 (2002).
EC Luscz, AD Kendall, DW Hyndman, High resolution spatially explicit nutrient source models for the Lower Peninsula of Michigan. J Great Lakes Res 41, 618–629 (2015).
R Sowah, H Zhang, D Radcliffe, E Bauske, MY Habteselassie, Evaluating the influence of septic systems and watershed characteristics on stream faecal pollution in suburban watersheds in Georgia, USA. J Appl Microbiol 117, 1500–1512 (2014).
CW Oliver, et al., Quantifying the contribution of on-site wastewater treatment systems to stream discharge using the SWAT model. J Environ Qual 43, 539–548 (2014).
M Carrillo, E Estrada, TC Hazen, Survival and enumeration of the fecal indicators Bifidobacterium adolescentis and Escherichia coli in a tropical rain forest watershed. Appl Environ Microbiol 50, 468–476 (1985).
Barry-Eaton District Health Department (2011) Time of Sale or Transfer (TOST) Program: The first three years. Available at THREE YEARS OF TOST.pdf. Accessed June 8, 2015.
Michigan Department of Environmental Quality, Michigan’s Nonpoint Source Program Plan (Michigan Department of Environmental Quality, Lansing, MI, 2009).
MY Habteselassie, et al., Tracking microbial transport through four onsite wastewater treatment systems to receiving waters in eastern North Carolina. J Appl Microbiol 111, 835–847 (2011).
S Ishii, MJ Sadowsky, Escherichia coli in the Environment: Implications for Water Quality and Human Health. Microbes Environ 23, 101–108 (2008).
H Kobayashi, T Pohjanvirta, S Pelkonen, Prevalence and characteristics of intimin- and Shiga toxin-producing Escherichia coli from gulls, pigeons and broilers in Finland. J Vet Med Sci 64, 1071–1073 (2002).
LR Fogarty, SK Haack, MJ Wolcott, RL Whitman, Abundance and characteristics of the recreational water quality indicator bacteria Escherichia coli and enterococci in gull faeces. J Appl Microbiol 94, 865–878 (2003).
J Wilkinson, A Jenkins, M Wyer, D Kay, Modelling faecal coliform dynamics in streams and rivers. Water Res 29, 847–855 (1995).
E Ballesté, AR Blanch, Persistence of Bacteroides species populations in a river as measured by molecular and culture techniques. Appl Environ Microbiol 76, 7608–7616 (2010).
T Stiles, Lightening a new candle: A new long-term vision for the Clean Water Act Section 303(d) program. Water Resour IMPACT 16, 3–8 (2014).
CJ Poor, JJ McDonnell, The effects of land use on stream nitrate dynamics. J Hydrol (Amst) 332, 54–68 (2007).
KG Wayland, et al., Identifying relationships between baseflow geochemistry and land use with synoptic sampling and R-mode factor analysis. J Environ Qual 32, 180–190 (2003).
AB Boehm, et al., Performance of forty-one microbial source tracking methods: A twenty-seven lab evaluation study. Water Res 47, 6812–6828 (2013).
LC Van De Werfhorst, B Sercu, PA Holden, Comparison of the host specificities of two bacteroidales quantitative PCR assays used for tracking human fecal contamination. Appl Environ Microbiol 77, 6258–6260 (2011).
VJ Harwood, C Staley, BD Badgley, K Borges, A Korajkic, Microbial source tracking markers for detection of fecal contamination in environmental waters: Relationships between pathogens and human health outcomes. FEMS Microbiol Rev 38, 1–40 (2014).
DS Mueller, CR Wagner, MS Rehmel, KA Oberg, F Rainville, Measuring discharge with acoustic Doppler current profilers from a moving boat. Book 3: Applications of Hydraulics (US Geological Survey, Reston, VA), Sect A, Chap 22, Version 2.0, p 95. (2013).
RD Jarrett, Wading measurements of vertical velocity profiles. Geomorphology 4, 243–247 (1991).
JR Anderson, EE Hardy, JT Roach, RE Witmer A Land Use and Land Cover Classification System for Use with Remote Sensor Data (US Government Printing Office, Washington, DC, 1976).
G Wilkes, et al., Associations among pathogenic bacteria, parasites, and environmental and land use factors in multiple mixed-use watersheds. Water Res 45, 5807–5825 (2011).
SL Martin, PA Soranno, MT Bremigan, KS Cheruvelil, Comparing hydrogeomorphic approaches to lake classification. Environ Manage 48, 957–974 (2011).
HK Bae, BH Olson, K-L Hsu, S Sorooshian, Classification and regression tree (CART) analysis for indicator bacterial concentration prediction for a Californian coastal area. Water Sci Technol 61, 545–553 (2010).
G De’ath, Multivariate regression trees: A new technique for modeling species–environment relationships. Ecology 83, 1105–1117 (2002).
SC Lemon, J Roy, MA Clark, PD Friedmann, W Rakowski, Classification and regression tree analysis in public health: methodological review and comparison with logistic regression. Ann Behav Med 26, 172–181 (2003).
L Breiman, J Friedman, R Olshen, C Stone Classification and Regression Trees (Chapman and Hall/CRC, 1st Ed, New York, 1984).
WN Venables, BD Ripley Modern Applied Statistics With S-Plus (Springer, New York, 2001).
G De’ath, K Fabricius, Classification And Regression Trees: A powerful yet simple technique for ecological data analysis. Ecology 81, 3178–3192 (2000).
F Questier, R Put, D Coomans, B Walczak, Y Vander Heyden, The use of CART and multivariate regression trees for supervised and unsupervised feature selection. Chemom Intell Lab Syst 76, 45–54 (2005).
R Wetzel, G Likens Limnological Analyses (Springer, 3rd Ed, New York, 2000).
SK Hamilton, DA Bruesewitz, GP Horst, DB Weed, O Sarnelle, Biogenic calcite–phosphorus precipitation as a negative feedback to lake eutrophication. Can J Fish Aquat Sci 66, 343–350 (2009).
W Crumpton, T Isenhart, P Mitchell, Nitrate and organic N analyses with second-derivative spectroscopy. Limnol Oceanogr 37, 907–913 (1992).
Clesceri LS, Greenberg AE, Eaton AD, eds (1998) Standard Methods for the Examination of Water and Wastewater (United Book, Baltimore), 20th Ed.

Information & Authors


Published in

Proceedings of the National Academy of Sciences
Vol. 112 | No. 33
August 18, 2015
PubMed: 26240328


Submission history

Published online: August 3, 2015
Published in issue: August 18, 2015


  1. Escherichia coli
  2. Bacteroides thetaiotaomicron
  3. baseflow
  4. reference conditions
  5. septic system




*This Direct Submission article had a prearranged editor.



Department of Fisheries and Wildlife, Michigan State University, East Lansing, MI 48824;
Sherry L. Martin
Department of Geological Sciences, Michigan State University, East Lansing, MI 48824
Anthony D. Kendall
Department of Geological Sciences, Michigan State University, East Lansing, MI 48824
David W. Hyndman
Department of Geological Sciences, Michigan State University, East Lansing, MI 48824
Joan B. Rose
Department of Fisheries and Wildlife, Michigan State University, East Lansing, MI 48824;


To whom correspondence should be addressed. Email: [email protected].
Author contributions: M.P.V., S.L.M., A.D.K., and D.W.H. designed research; M.P.V., S.L.M., and A.D.K. performed research; M.P.V. and A.D.K. contributed new reagents/analytic tools; M.P.V., S.L.M., and A.D.K. analyzed data; and M.P.V., S.L.M., A.D.K., D.W.H., and J.B.R. wrote the paper.

Competing Interests

The authors declare no conflict of interest.
