Impacts of current and future large dams on the geographic range connectivity of freshwater fish worldwide

Significance Freshwater fish are highly threatened by dams that disrupt the longitudinal connectivity of rivers and may consequently impede fish movements to feeding and spawning grounds. In a comprehensive global analysis covering ∼10,000 freshwater fish species and ∼40,000 existing large dams we identified the most disconnected geographical ranges for species in the United States, Europe, South Africa, India, and China. The completion of near-future plans for ∼3,700 large hydropower dams will greatly increase habitat fragmentation in (sub)tropical river basins, where many livelihoods depend on inland fisheries. Our assessment can support infrastructure planning on multiple scales and assist in setting conservation priorities for species and basins at risk.


Supplementary Methods
Compilation of freshwater fish species occurrence records. We collected point occurrence records from external datasets to complement the IUCN geographical ranges with species not represented in the IUCN data. We used both global and national datasets. The national datasets were chosen mainly to enrich species data for South America, which is scarcely represented within the IUCN database even though it is the most biodiverse hotspot for freshwater fish species. We extracted freshwater fish species from the datasets listed in Table S1 based on freshwater fish species names and associated synonyms provided by fishbase.org, IUCN and Tedesco et al. (1). Table S1 lists the number of species and associated occurrence records extracted from each data source (including synonyms and freshwater fish species with records falling within saltwater areas, i.e., diadromous species). When merging the records from the different datasets, we removed duplicates and resolved synonyms by referencing all species to the names reported by fishbase.org (2). The merged dataset consisted of 2,427,956 occurrence records for 12,233 freshwater fish species (Figure S1). The code used to extract, clean and merge the species occurrence records is freely accessible at https://github.com/vbarbarossa/occ2range4fish.
Deriving fish ranges from occurrence records. We used the point occurrence records to draw species-specific geographical ranges, following the same approach used by IUCN. We first referenced the point occurrence records to the underlying HydroBASINS units (level 8 of aggregation) (3) and then dissolved the corresponding polygons to obtain species-specific geographical ranges. The code used to develop the geographical ranges is available at https://github.com/vbarbarossa/occ2range4fish.
Representability of freshwater fish ranges used in this study. We checked the global coverage of the geographical ranges employed in this study against the most comprehensive list of species by main drainage basin (i.e., with an outlet to the sea or an internal sink), provided by Tedesco et al. (1). To this end, we calculated species richness (SR, i.e., the number of unique species) from our geographical ranges within the main drainage basins as delineated in (1), and compared it with the reference richness reported there (SRref). We then calculated a coverage ratio as SR/SRref*100 [%] for each main drainage basin (Figure S2).
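The coverage ratio can be illustrated with a minimal Python sketch. The function name, the basin identifiers and the species labels below are hypothetical and chosen only for illustration. The sketch assumes that SR is counted from the ranges used in this study and SRref from the reference list, which is one reading of the text above.

```python
def coverage_ratio(ranges_by_basin, reference_by_basin):
    """Return SR / SRref * 100 [%] for every basin in the reference list.

    ranges_by_basin:    basin id -> species found there by the study's ranges
    reference_by_basin: basin id -> species listed there by the reference
    Both inputs are hypothetical stand-ins for the real datasets.
    """
    coverage = {}
    for basin, reference_species in reference_by_basin.items():
        sr_ref = len(set(reference_species))
        if sr_ref == 0:
            continue  # ratio undefined without reference species
        sr = len(set(ranges_by_basin.get(basin, ())))
        coverage[basin] = sr / sr_ref * 100
    return coverage


# Toy example with made-up basins and species labels.
ranges = {"basin_a": {"sp1", "sp2", "sp3"}, "basin_b": {"sp4"}}
reference = {
    "basin_a": {"sp1", "sp2", "sp3", "sp5"},
    "basin_b": {"sp4", "sp6"},
    "basin_c": {"sp7"},
}
coverage = coverage_ratio(ranges, reference)
```

Because SR is counted independently of the reference list in this sketch, the ratio can exceed 100% when the ranges include species that the reference does not list for a basin.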
Lotic/lentic species classification. We classified species as lotic, lentic or both lotic and lentic, using metadata from the IUCN Red List (4). For each species, we retrieved the list of habitat types where the species is known to occur. We classified a species as lotic if its habitat descriptions contained at least one of the words "river", "stream", "creek", "canal", "channel", "delta", "estuaries", and as lentic if they contained at least one of the words "lake", "pool", "bog", "swamp", "pond". Species matching both sets of words were classified as both lotic and lentic. For species not present in the IUCN metadata, we took habitat information from fishbase.org (2) and classified species as lotic and lentic based on the flags "Stream" and "Lakes", respectively, available for each species from the fishbase.org API (2).
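The keyword-based rule can be sketched in Python as follows. The function name, the case-insensitive substring matching and the "unclassified" fallback are assumptions made for this illustration; the habitat strings are made-up examples, not actual IUCN records.

```python
LOTIC_KEYWORDS = ("river", "stream", "creek", "canal", "channel", "delta", "estuaries")
LENTIC_KEYWORDS = ("lake", "pool", "bog", "swamp", "pond")


def classify_habitat(habitat_descriptions):
    """Classify a species from its list of habitat description strings.

    Matching is done on lower-cased substrings (an assumption), so that
    "Rivers" matches "river". Returns "lotic", "lentic", "both", or
    "unclassified" (a hypothetical label for species matching no keyword).
    """
    text = " ".join(habitat_descriptions).lower()
    is_lotic = any(keyword in text for keyword in LOTIC_KEYWORDS)
    is_lentic = any(keyword in text for keyword in LENTIC_KEYWORDS)
    if is_lotic and is_lentic:
        return "both"
    if is_lotic:
        return "lotic"
    if is_lentic:
        return "lentic"
    return "unclassified"


# Made-up habitat descriptions for illustration.
example_lotic = classify_habitat(["Permanent Rivers/Streams/Creeks"])
example_lentic = classify_habitat(["Permanent Freshwater Lakes"])
example_both = classify_habitat(["Permanent Rivers", "Freshwater Lakes"])
```

Note that plain substring matching treats "estuaries" literally, so a description containing only "estuary" would not match under this sketch.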
Hydrological units. We employed the HydroBASINS sub-basin units (Pfafstetter level 12) as the underlying hydro-morphology for calculating the longitudinal connectivity in our analysis (3,5). Henceforth, we refer to the HydroBASINS units as sub-basins and to the set of connected sub-basins that drain to the sea or an internal sink as the main hydrologic basin (Figure S3). We allocated the geographical ranges of each species to the ~1M overlapping sub-basin units so that each sub-basin was assigned a list of species for which it provides habitat. In turn, we identified all sub-basins that provide habitat for each species. HydroBASINS divides the globe into 1,034,083 sub-basins (median area = 135 km², interquartile range = 64 km²) following the Pfafstetter coding scheme (3) and based on the high-resolution 15 arc-second (~500 m) HydroSHEDS hydrography (5). We used HydroBASINS because both the IUCN ranges and the complementary geographical ranges developed in this study are based on HydroBASINS sub-basin units at a coarser level of aggregation (Pfafstetter level 8). With Pfafstetter level 12 we used the highest level of spatial detail available, i.e., the smallest sub-basin units. Each sub-basin carries information on the connectivity to the next downstream sub-basin, which makes it possible to determine the total connected area within a main hydrologic basin. Dams falling within a sub-basin were georeferenced to the downstream boundary of that sub-basin, so that isolated patches were collections of HydroBASINS sub-basin units (Figure S3).

Derivation of the connectivity index equations. The equations proposed by Cote et al. (2009) (6) allow calculating the connectivity index (CI) for non-diadromous (N) and diadromous (D) fish species, assuming barriers are impassable, as follows:
$$CI_{N,s,b} = \frac{\sum_{i=1}^{n} l_{i,s,b}^{2}}{\left(\sum_{i=1}^{n} l_{i,s,b}\right)^{2}} \cdot 100$$ (Eq. S1)
$$CI_{D,s,b} = \frac{l_{1,s,b}}{\sum_{i=1}^{n} l_{i,s,b}} \cdot 100$$ (Eq. S2)
where $l_{i,s,b}$ represents the length of stream segment $i$ isolated due to a dam for a species $s$ within a main basin $b$, and $n$ is the number of isolated segments due to $n - 1$ dams within that basin. Hence, $CI_{s,b}$ expresses the habitat connectivity, with smaller values indicating less connectivity. The equation for diadromous species differs from the one for non-diadromous species because the most downstream dam, which obstructs the passage to and from the marine environment, is likely to have the highest impact (6). In Eq. S2, $l_{1,s,b}$ is the length of the longest river segment that is connected to the ocean. While the measures of Cote et al. (6) can in principle account for different passability of barriers, we assume here that the dams considered in this analysis are impassable.
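As a minimal sketch of the two indices for impassable barriers, the following Python functions compute the connectivity index from a list of isolated segment lengths. The function names and the example lengths are hypothetical; the formulas follow the standard form of the Cote et al. index when every barrier has zero passability, which is the assumption stated in the text.

```python
def ci_non_diadromous(segment_lengths):
    """CI for non-diadromous species: sum of squared length shares, in percent."""
    total = sum(segment_lengths)
    return 100.0 * sum(length * length for length in segment_lengths) / total**2


def ci_diadromous(segment_lengths, ocean_index=0):
    """CI for diadromous species: share of the ocean-connected segment, in percent.

    ocean_index is the position of the segment connected to the ocean
    (a hypothetical convention for this sketch).
    """
    return 100.0 * segment_lengths[ocean_index] / sum(segment_lengths)


# Made-up example: two dams split a river into three segments of
# 60, 30 and 10 length units, the first one connected to the ocean.
lengths = [60, 30, 10]
ci_n = ci_non_diadromous(lengths)  # 100 * (3600 + 900 + 100) / 10000
ci_d = ci_diadromous(lengths)      # 100 * 60 / 100
```

With no dams there is a single segment and both functions return 100, the fully connected case.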
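The species occurrence locations are not reported per stream segment but as geographical ranges occupying a portion of the hydrologic basin, whereas Eq. S1 and S2 were developed for river segments. To make these equations applicable to the areal range data from IUCN, we propose the following conversion between a sub-basin area and the length of the streams in that area, based on Hack's law. According to Hack's law (7), $L \propto A$, i.e., the length of a stream ($L$) is proportional to its drainage area ($A$). Therefore, Eq. S1 and S2 can be rewritten as:
$$CI_{N,s,b} = \frac{\sum_{i=1}^{n} A_{i,s,b}^{2}}{\left(\sum_{i=1}^{n} A_{i,s,b}\right)^{2}} \cdot 100$$ (Eq. S4a)
$$CI_{D,s,b} = \frac{A_{1,s,b}}{\sum_{i=1}^{n} A_{i,s,b}} \cdot 100$$ (Eq. S4c)
where $A_{i,s,b}$ is the area of isolated patch $i$ within the range of species $s$ in main basin $b$, and $A_{1,s,b}$ is the area of the patch connected to the ocean. Figure S3 shows an example application of equations S4a and S4c.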
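The area-based calculation can be sketched in Python on a toy network. The sketch is illustrative only: all names, sub-basin ids and areas are hypothetical, a value of 0 in `next_down` is assumed to mark the outlet sub-basin, and a dam is assumed to block the outflow of the sub-basin that contains it, in line with georeferencing dams to the downstream boundary. Patches are traced through the full network and only habitat sub-basins contribute area, which is a simplification.

```python
from collections import defaultdict


def patch_areas(next_down, area, habitat, dams):
    """Sum habitat area per isolated patch.

    Each patch is keyed by its most downstream sub-basin: either a
    sub-basin containing a dam or the outlet (next_down == 0).
    """
    def patch_root(sub_basin):
        # Walk downstream until the outflow is blocked or the outlet is reached.
        while sub_basin not in dams and next_down[sub_basin] != 0:
            sub_basin = next_down[sub_basin]
        return sub_basin

    patches = defaultdict(float)
    for sub_basin in habitat:
        patches[patch_root(sub_basin)] += area[sub_basin]
    return dict(patches)


def connectivity_indices(next_down, area, habitat, dams):
    """Return (CI non-diadromous, CI diadromous) in percent, based on patch areas."""
    patches = patch_areas(next_down, area, habitat, dams)
    total = sum(patches.values())
    ci_n = 100.0 * sum(a * a for a in patches.values()) / total**2
    # Only a patch rooted at an undammed outlet is connected to the ocean.
    ocean_area = sum(a for root, a in patches.items()
                     if next_down[root] == 0 and root not in dams)
    ci_d = 100.0 * ocean_area / total
    return ci_n, ci_d


# Toy basin: sub-basin 1 is the outlet; 2 and 3 drain into 1; 4 and 5 drain into 2.
next_down = {1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
area = {1: 100.0, 2: 100.0, 3: 100.0, 4: 100.0, 5: 100.0}
habitat = {1, 2, 3, 4, 5}

one_dam = connectivity_indices(next_down, area, habitat, dams={2})
two_dams = connectivity_indices(next_down, area, habitat, dams={2, 4})
```

In this toy basin the second dam, located upstream of the first, lowers the non-diadromous index but leaves the diadromous index unchanged, because the patch connected to the ocean is the same in both configurations.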
National dams' datasets. For the USA, we retrieved dams from the National Inventory of Dams (NID; https://nid.sec.usace.army.mil/), which consists of 91,226 dams that are above 15 feet in height or have a major hazard potential for downstream people. Of those, we excluded 21,044 dams used for purposes such as fire protection, stock, small fish ponds, tailings and debris control, which are likely off-stream and therefore do not directly affect the longitudinal connectivity of the river network. We split the remaining NID dams into large (n = 5,733) and small (n = 64,449) dams based on a dam height threshold of 15 meters. For the greater Mekong area (Mekong-Irrawaddy-Salween main hydrologic basins), we gathered data on the location of 1,007 dams from https://opendevelopmentmekong.net. We selected 773 dams that had latitude-longitude information and were classified as existing or under construction (Status = "OP", "COMM", "UNCON"). We split the greater Mekong dams into large (n = 229) and small (n = 544) based on the same 15-meter height threshold. For Brazil, we retrieved data on 498 large and an additional 1,996 small hydropower dams from https://sigel.aneel.gov.br/Down/.
Figure S1. Spatial distribution of the occurrence records collected from the different datasets listed in Table S1. The point occurrence records were then converted to species-specific geographical ranges to complement the IUCN geographical ranges data with species not listed by IUCN.
Figure S3. For each configuration, the CI is given for species s being either diadromous or non-diadromous. Note that the CI would not change for a diadromous species between the center and the right panel, even though the right panel contains more dams, as the connectivity for diadromous species is controlled by the most downstream dam.
Table S1. Overview of source data used for the compilation of the global dataset of fish species occurrence records. For each source, the number of species along with the total number of records is reported. All datasets are freely accessible and the code used to extract the records is available at https://github.com/vbarbarossa/occ2range4fish.
Table S2. Abbreviations used for the species order names of Figure 5i in the main text. "Other" groups together order names with fewer than 20 species available for our analysis.

ORDER NAME          ABBREVIATION USED
Acipenseriformes    Other
Albuliformes        Other
Amiiformes          Other
Anguilliformes      Angui.
Myctophiformes      Other
Polypteriformes     Other
Pristiformes        Other
Torpediniformes     Other
Zeiformes           Other
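The dam selection described under "National dams' datasets" can be illustrated with a short Python sketch. The record layout (keys `lat`, `lon`, `status`, `height_m`) and the toy records are hypothetical and do not reflect the actual column names of the source datasets. Treating a dam of exactly 15 meters as large is also an assumption, since the text only states the threshold.

```python
LARGE_DAM_HEIGHT_M = 15.0
ACCEPTED_STATUS = {"OP", "COMM", "UNCON"}  # status codes quoted in the text


def select_dams(records):
    """Keep dams that have coordinates and an accepted status."""
    return [
        record for record in records
        if record.get("lat") is not None
        and record.get("lon") is not None
        and record.get("status") in ACCEPTED_STATUS
    ]


def split_by_height(records, threshold_m=LARGE_DAM_HEIGHT_M):
    """Split dams into (large, small) using the height threshold.

    Dams at or above the threshold are treated as large (an assumption).
    """
    large = [r for r in records if r["height_m"] >= threshold_m]
    small = [r for r in records if r["height_m"] < threshold_m]
    return large, small


# Made-up records for illustration only.
dams = [
    {"lat": 15.1, "lon": 105.8, "status": "OP", "height_m": 32.0},
    {"lat": 18.4, "lon": 102.5, "status": "UNCON", "height_m": 9.5},
    {"lat": None, "lon": None, "status": "OP", "height_m": 50.0},
    {"lat": 20.0, "lon": 100.1, "status": "PLANNED", "height_m": 70.0},
]
selected = select_dams(dams)
large_dams, small_dams = split_by_height(selected)
```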