New Research In
Physical Sciences
Social Sciences
Featured Portals
Articles by Topic
Biological Sciences
Featured Portals
Articles by Topic
- Agricultural Sciences
- Anthropology
- Applied Biological Sciences
- Biochemistry
- Biophysics and Computational Biology
- Cell Biology
- Developmental Biology
- Ecology
- Environmental Sciences
- Evolution
- Genetics
- Immunology and Inflammation
- Medical Sciences
- Microbiology
- Neuroscience
- Pharmacology
- Physiology
- Plant Biology
- Population Biology
- Psychological and Cognitive Sciences
- Sustainability Science
- Systems Biology
Estimating species distribution and abundance in river networks using environmental DNA
Contributed by Andrea Rinaldo, September 26, 2018 (sent for review August 15, 2018; reviewed by Gentile Francesco Ficetola and Christopher Jerde)

Significance
Organisms leave traces of DNA in their environment [environmental DNA (eDNA)], such as cells in mucus or feces. When extracted from water or soil, eDNA can be used to track the presence of a target species or the composition of entire communities. In rivers, eDNA dynamics are modulated by transport and decay. Here, we use hydrologically based models to reconstruct the upstream distribution and abundance of target species throughout a river network from eDNA measurements. We validate our method by estimating the catchment-wide biomass distribution of a sessile invertebrate and its parasite, causing disease in salmonids. This work unlocks the power of eDNA for monitoring biodiversity across broad geographies in a way hitherto unfeasible with traditional survey approaches.
Abstract
All organisms leave traces of DNA in their environment. This environmental DNA (eDNA) is often used to track occurrence patterns of target species. Applications are especially promising in rivers, where eDNA can integrate information about populations upstream. The dispersion of eDNA in rivers is modulated by complex processes of transport and decay through the dendritic river network, and we currently lack a method to extract quantitative information about the location and density of populations contributing to the eDNA signal. Here, we present a general framework to reconstruct the upstream distribution and abundance of a target species across a river network, based on observed eDNA concentrations and hydro-geomorphological features of the network. The model captures well the catchment-wide spatial biomass distribution of two target species: a sessile invertebrate (the bryozoan Fredericella sultana) and its parasite (the myxozoan Tetracapsuloides bryosalmonae). Our method is designed to easily integrate general biological and hydrological data and to enable spatially explicit estimates of the distribution of sessile and mobile species in fluvial ecosystems based on eDNA sampling.
Environmental DNA (eDNA), present as loose fragments, as shed cells (1, 2), or in microscopic organisms (3, 4), can be extracted from matrices such as water or soil and used to track the presence of target species or the composition of entire communities (5, 6). Approaches using eDNA for qualitative species detection have proved their value in management and conservation, improving the measurement of biodiversity in a replicable and consistent manner (7) and facilitating the detection of rare, invasive, or parasitic species (8⇓⇓⇓⇓–13).
Environmental DNA in river water carries a record of the species present upstream, but the interpretation of this signal is a complex issue (14). Once released to the environment, eDNA undergoes selective decay. Nucleic acids incur progressive damage [e.g., due to biological activity, temperature, or pH (15, 16)] during hydrological advection, retention, and resuspension (17, 18). These processes result in alterations that affect eDNA detection in environmental samples. The magnitude of the decay is highly dependent on the nature of the flow regime and the substrate type (19). Furthermore, eDNA has polydisperse properties due to its origin from diverse organic sources (e.g., spores, cells, tissues, feces), which complicates the evaluation of decay rates (20). The eDNA sampled at any point within a dendritic network of sources is the outcome of diffuse eDNA release from points upstream, modified by decay processes during transport that are governed by network connectivity, in which each path to the observation point may be described by different hydro-morphological conditions. As a result, while it is straightforward to link a positive PCR test with the presence of the target species at some (unknown) distance upstream, quantification of species densities and the location of populations is currently impossible because, besides a number of potentially confounding factors affecting eDNA shedding [e.g., animal behavior, movement, physiology, and size (21, 22)], it requires consideration of the effects of the dynamics of eDNA transport along river branches and the deconvolution of the hierarchical aggregation of the various network branches. Here, we establish a generally applicable framework to interpret quantitative eDNA point measurements in rivers and relate them to the spatial distribution of the DNA sources, jointly with estimates of the density distribution of the target species throughout the river basin.
The proposed framework stems from fundamental mass balance relationships. It is intended for use in river networks discretized into “nodes,” i.e., river stretches of suitable length within which hydrological conditions, as well as the target species density and hence its eDNA production, can be considered homogeneous (23). Within such nodes, the basis for the spatially explicit model contrasting measured eDNA concentrations is given by
While not all of the above assumptions are equally valid in the general case (see SI Appendix for a digression on caveats on model assumptions), the framework represents a flexible general theory of spatially explicit eDNA source tracking. Some of these assumptions may be easily relaxed. For instance, the purely convective treatment of the decay of genetic material outlined above may be the subject of more refined formulations of the transport problem. In particular, travel times from source to measurement site can be made explicitly dependent on the hydrodynamic and geomorphological dispersion induced by the hierarchical nature of the network (27).
To derive estimates of biomass distribution, we coupled the general source area model described above with a species distribution model. This combined approach provides a versatile method to capture the influence of the catchment-wide ecological, hydro-morphological, or geological drivers promoting eDNA production (and reflecting species density) within the defined river stretches. Local eDNA production can be expressed by means of the exponential link
We tested the framework with joint field measurements of eDNA concentrations of the myxozoan parasite Tetracapsuloides bryosalmonae and its primary host, the freshwater bryozoan Fredericella sultana, across various locations within the Wigger watershed (Switzerland) (Fig. 1). T. bryosalmonae is the causative agent of proliferative kidney disease (PKD), a high-mortality disease affecting salmonid fish populations. PKD is recognized as one of the leading causes of declines in brown trout populations in Europe, it affects diverse salmonid populations in North America, and it is a major aquaculture disease (29, 30). The water samples used for eDNA detection of both F. sultana and T. bryosalmonae were collected at roughly monthly intervals at 15 sites during 12 mo (one 500-mL sample per sampling occasion and site) (Fig. 1C). T. bryosalmonae eDNA is likely to be largely derived from spores shed into the environment. Parasite spores, released into water by infected bryozoans, infect brown trout through skin and gills and proliferate in the kidney. To complete the life cycle, spores infective to bryozoans are excreted in the urine of infected fish. These two types of spores are genetically indistinguishable yet differentiated in terms of function, which poses further challenges for modeling. The T. bryosalmonae eDNA concentration may thus be a product of the genomic contents of the two types of spore originating from very different transport sources, i.e., from an immobile source (bryozoans) coupled with a mobile source (fish). In this particular case, a comparative analysis of field-measured eDNA for both F. sultana (sessile source of eDNA) and T. bryosalmonae (eDNA that could jointly originate from sessile and mobile hosts) will prove particularly instructive as a demonstration of the potential of the proposed framework.
Environmental DNA sampling of F. sultana and T. bryosalmonae. (A) Map and color-coded digital elevation map of the study region showing the extracted river network and location of the eDNA sampling sites. (B) Location of the study region within Switzerland. (C) Measured eDNA concentrations of F. sultana and T. bryosalmonae at the 15 sampling sites during the period May 2014–May 2015 (LOQ, limit of quantification).
We implemented and calibrated the model as described in Materials and Methods. Note that we explicitly accounted for the possibility that samples with low eDNA concentration may be interpreted as zeros (Fig. 1C) by introducing the nondetection probability
Comparison between the observed cumulative frequencies of measured eDNA concentrations at the 15 sampling sites for F. sultana (Fs; purple) and T. bryosalmonae (Tb; green) and the cumulative distribution function obtained by the model. α values (color coded to match the solid lines) indicate the confidence level at which, according to a two-sample Kolmogorov–Smirnov test, the null hypothesis that the two samples (modeled and observed) come from the same distribution cannot be rejected. Higher values of α indicate a better fit. Tested values for α were 0.05, 0.01, 0.005, 0.001. N/A, not applicable (i.e., at site 15 no positive values of eDNA concentration for Tb were detected). Note that some of the cases where the two distributions are not different at
Correlation between eDNA concentrations, predicted species density, and covariates. (A) Correlation between observed and modeled eDNA concentrations for F. sultana (Fs; purple) and T. bryosalmonae (Tb; green). Each circle corresponds to a sampling site. x and y values, respectively, are taken from the medians of the observed and modeled cumulative distribution functions displayed in Fig. 2. (B) Correlation between medians of predicted F. sultana and T. bryosalmonae eDNA production. Each circle represents one river stretch. The trend line is displayed in red. Pearson’s correlation coefficients r are also reported. (C) Correlations between medians of predicted eDNA production for the two target species and the covariates representing the contributing catchment area and fraction of the upstream catchment covered by moraine. Correlations with other covariates are reported in SI Appendix, Fig. S2B; maps of covariate values are presented in SI Appendix, Fig. S3. (D) Posterior distributions for the decay time τ and values of the β coefficients associated with the covariates reported on the axis labels. The posterior distributions for the other covariates are shown in SI Appendix, Fig. S2A.
Predicted maps of eDNA production for F. sultana (Fig. 4A) identify the southeastern portion of the watershed as a hotspot for bryozoans. This is mainly due to the positive correlation between the presence of moraines and the production of bryozoan eDNA (Fig. 3 C and D). This correlation was uncovered in previous work (23), although it is not clear yet what the underlying causes of this positive association are. The upstream bryozoan reservoir explains the eDNA concentration patterns observed at the downstream sites (Fig. 1C), because the estimated values of the decay time (Fig. 3D) allow for the detection of eDNA material at distances comparable with the maximum length from source to outlet of the river system. The predicted distribution of T. bryosalmonae (Fig. 4B) mirrors that of F. sultana in headwaters and upper sites, while higher values of production are estimated toward the outlet. This is shown by a positive shift in the posterior distribution of the parameter expressing the effect of contributing area (Fig. 3 C and D). The correlation between predicted densities of F. sultana and T. bryosalmonae is strong (Fig. 3B) and suggests that bryozoans release disproportionately more spores compared with fish hosts. Thus, we suggest that a full description of the spatial distribution of PKD-infected fish (23) might be unnecessary to understand the bulk of the distribution of the parasite sources when the eDNA signal is dominated by locally abundant colonies of infected bryozoans like in the case at hand. We argue that the much stronger correlation observed between predicted densities of T. bryosalmonae and contributing area in comparison with that for F. sultana (Fig. 3C) posits that the density of overtly infected (i.e., spore producing) bryozoans tends to increase along downstream directions. In fact, the positive correlation between PKD prevalence and total contributing area in river networks has been shown to be a byproduct of network connectivity (32). The fact that T. bryosalmonae is mostly shed by bryozoans rather than fish seems plausible in this specific host–parasite system, as parasite maturation in fish kidney tubuli is observed relatively rarely, compared with the prolific spore production within large parasite sacs inside the bryozoan host (33). We note further that the spatial match of the bryozoan and fish populations is unlikely to drive this relationship as the biomass of fish is expected to be higher in deeper, more downstream sections (34). Finally, estimated median values of the decay time (Fig. 3D) were 4.0 h for T. bryosalmonae (with a 25–75% range of the posterior distribution of 2.7–7.0 h) and 6.9 h (25–75% range: 5.0–11.1 h) for F. sultana, corresponding to decay lengths of 14 km (25–75% range: 10–25 km) and 25 km (25–75% range: 18–40 km), respectively (obtained by assuming an average flow velocity of 1 ms−1), in agreement with previous findings (19).
Predicted species distributions. (A–D) Maps of predicted production p (A and B) and concentration
Our framework proved capable of interpreting both eDNA data incidentally shed from benthic populations (F. sultana, with likely sources of eDNA from fecal pellets and sloughed cells) and eDNA from spores released into the water (T. bryosalmonae). Although different forms of eDNA may be differently impacted by environmental factors such as temperature and pH, the choice of formulation involving a single parameter expressing the decay time for both species appeared satisfactory for capturing the integrated eDNA transport dynamics at the catchment scale. Further considerations on this aspect are presented in SI Appendix. Another strength of our approach is the possibility of applying adequate parameterizations for p in an explicit manner (as in Eq. 2), to accommodate the nature of the link between the target species density and its biological and environmental filters along hydrologic pathways, such as the environmental conditions or the density of species with which it interacts. Such parameterization provides a simple and versatile means to assimilate field data and to integrate population or species distribution models.
Accurate field validations of the current assumptions are needed to generalize this framework, and this could be achieved by relatively simple experimental designs. For instance, the displacement and decay of genetic material from nonnative known biomasses placed in well-differentiated positions (say, within a catchment where hydrologic and geomorphologic drivers are known) could be key. Subsequent sampling at downstream sites, where eDNA would be contributed by sources at known distances, could then be used to assess the strengths and weaknesses of each assumption underlying the proposed approach.
Tracking the source area and the local biomass density of target species via downstream eDNA measurement is possible, provided that a suitable spatially explicit framework is used to interpret the field data. Key is accounting for the filtering produced by the progressive damage occurring during hydrological transport and harnessing it to recover spatial information on species distributions. The integration of quantitative eDNA measurements, hydro-geomorphological scaling, and ecological models presented here reveals another direction in ecohydrological studies by unlocking the huge potential of remote monitoring using eDNA.
Materials and Methods
eDNA Data Collection.
Stream water samples were collected in 15 locations (Fig. 1) along the river network of the Wigger watershed, Switzerland. For each site, a total of 21 500-mL samples were taken at approximately biweekly (or monthly during December, January, and February) intervals (except site 5, a connected pond which was artificially drained after 12 samples were taken).
Presterilized (10% bleach followed by UV-B treatment) plastic bottles were used to collect water from the river by submerging the bottle with a gloved hand. The samples were transported to the laboratory on ice and filtered using gentle vacuum within the same day onto 5-cm diameter, 0.45-μm pore size individually packaged sterile membrane filters (Merck Millipore). A vacuum pump with a borosilicate glass filtration setup was used and sterilized between each sample in 10% bleach followed by three clean water rinses. Negative controls were created by filtering MilliQ water through a sterile filter at the start and end of each filtration session, as well as once during the filtration (after sample 7). Filter papers were placed in 2-mL bead beating tubes (obtained from the kit described below) and frozen at −80 oC until extraction. Before extraction, filter papers were cut with sterilized scissors to break them up. eDNA was extracted from all filter papers, including controls, using a PowerSoil DNA kit (MO BIO Laboratories) in a dedicated clean laboratory (free of PCR products). The kit includes a bead beating step and a separate inhibitor removal step. The eDNA was eluted in 60 μL of Solution C6 and subsequently preserved at −20 oC. Samples were removed from the freezer only for analysis and remained at room temperature for a maximum of 2 h.
Details on F. sultana and T. bryosalmonae eDNA measurement and characterization of the study area are reported in SI Appendix.
Choice of Covariates and Model Settings.
Details on covariates and model settings are reported in SI Appendix. Chosen covariates were checked for multicollinearity. All variance inflation factors for the five considered covariates were below the rule-of-thumb threshold (35) of 10. The three geological covariates were obtained from the vectorized geological map of Switzerland provided by the Swiss Federal Office of Topography (Swisstopo). Covariates were normalized (i.e., linearly transformed into vectors within the range [−1; 1]). SI Appendix, Fig. S3 shows the values of the covariates plotted on the maps of the river network.
The average velocity
Calibration Algorithm.
As sampled eDNA concentrations for both F. sultana and T. bryosalmonae did not show a clear temporal pattern (Fig. 1C), measured values were considered as realizations of a random variable whose distribution is constant in time. Measured eDNA concentrations
Acknowledgments
The authors thank Prof. Carlo Gaetan (Università Cà Foscari, Venice) for advice on the statistical calculations. The authors acknowledge the support provided by the Swiss National Science Foundation through the Sinergia project CRSII3_147649: “Temperature driven emergence of Proliferative Kidney Disease in salmonid fish - role of ecology, evolution and immunology for aquatic diseases in riverine landscapes.”
Footnotes
- ↵1To whom correspondence may be addressed. Email: andrea.rinaldo{at}epfl.ch or luca.carraro{at}epfl.ch.
Author contributions: L.C., H.H., J.J., E.B., and A.R. designed research; L.C. and H.H. performed research; L.C., J.J., E.B., and A.R. analyzed data; and L.C., H.H., J.J., E.B., and A.R. wrote the paper.
Reviewers: G.F.F., University of Milan; and C.J., University of California, Santa Barbara.
The authors declare no conflict of interest.
Data deposition: Malacosporean mitochondrial COI sequences have been deposited in the GenBank database, (accession nos. MH986597–MH986599). Matlab codes have been deposited on GitHub (available at https://github.com/lucarraro/edna-species-distribution).
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1813843115/-/DCSupplemental.
- Copyright © 2018 the Author(s). Published by PNAS.
This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).
References
- ↵
- ↵
- Thomsen PF,
- Willerslev E
- ↵
- Pace NR
- ↵
- ↵
- ↵
- Kelly RP, et al.
- ↵
- ↵
- ↵
- Jerde CL,
- Mahon AR,
- Chadderton WL,
- Lodge DM
- ↵
- ↵
- Mächler E,
- Deiner K,
- Steinmann P,
- Altermatt F
- ↵
- ↵
- Huver JR,
- Koprivnikar J,
- Johnson PTJ,
- Whyard S
- ↵
- Olds B, et al.
- ↵
- ↵
- Lance RF, et al.
- ↵
- Jerde CL, et al.
- ↵
- Shogren AJ, et al.
- ↵
- ↵
- ↵
- Klymus K,
- Richter C,
- Chapman D,
- Paukert C
- ↵
- Bylemans J, et al.
- ↵
- Carraro L, et al.
- ↵
- Sansom BJ,
- Sassoubre LM
- ↵
- Pfister L, et al.
- ↵
- Rodriguez-Iturbe I,
- Rinaldo A
- ↵
- Pilgrim DH
- ↵
- Rinaldo A,
- Marani A,
- Rigon R
- ↵
- ↵
- ↵
- James F
- ↵
- Carraro L,
- Mari L,
- Gatto M,
- Rinaldo A,
- Bertuzzo E
- ↵
- ↵
- Heggenes J,
- Baglinière JL,
- Cunjak RA
- ↵
- Hastie T,
- Tibshirani R,
- Friedman J
- ↵
Citation Manager Formats
Sign up for Article Alerts
Article Classifications
- Physical Sciences
- Environmental Sciences
- Biological Sciences
- Ecology