Considering adaptive genetic variation in climate change vulnerability assessment reduces species range loss projections

Significance Forecasts of species vulnerability and extinction risk under future climate change commonly ignore local adaptations despite their importance for determining the potential of populations to respond to future changes. We present an approach to assess the impacts of global climate change on biodiversity that takes into account adaptive genetic variation and evolutionary potential. We show that considering local climatic adaptations reduces range loss projections but increases the potential for competition between species. Our findings suggest that failure to account for within-species variability can result in overestimation of future biodiversity losses. Therefore, it is important to identify the climate-adaptive potential of populations and to increase landscape connectivity between populations to enable the spread of adaptive genetic variation.

In M. escaleari neutral genetic distance was most strongly related to slope (R 2 =0.76). After accounting for the effect of geographic distance, the strongest correlated variable on its own was slope (R 2 =0.383), but the strongest relationship was with the combination of slope and tree cover (R 2 =0.532) (Table S4). Similarly, in M. crypticusgenetic distance was most strongly related to forest cover (variable lcfor100; R 2 =0.791). After accounting for the effect of geographic distance, the strongest correlated variable on its own was forest cover (R 2 =0.316), but the strongest relationship was with the combination of forest cover and slope (R 2 =0.356) (Table S5).

Generating the genomic datasets
Genomic DNA was extracted from all samples using the Qiagen DNeasy Blood and Tissue extraction kit, and was quantified with Qubit® Fluorometer 2.0 and the dsDNA High Sensitivity assay kit (Invitrogen, ThermoFisher Scientific). DNA quality was assessed through visualisation on 1% agarose gels. Samples with low DNA quantity (<2 ng/μL) or quality were excluded. ddRAD library preparation protocol was based on the methodology in Peterson et al. (1), with modifications from Manousaki et al. (2). DNA samples were simultaneously digested by two high fidelity restriction enzymes: SbfI (CCTGCA|GG recognition site), and NlaIII (CATG| recognition site) (New England Biolabs, UK).
Bioinformatics of the high-throughput sequencing data was carried out in STACKS v1.46 (3). We first did a reference-based alignment using the reference genome of two congeners, Myotis davidii and Myotis branditii, to obtain knowledge of the allelic diversity before setting up the parameters for the de novo approach used to call SNPs. We ran STACKS for de novo assembly with the following optimised parameters: a minimum stack depth [m] of 6, a maximum of 2 mismatches allowed in a locus [M]) in an individual and maximum of 1 mismatches between loci when building the catalog [n]). For the 312 samples, belonging to the two species, we got 724,579 unique RAD-tags and 86,051 shared RAD-markers (Allele=2, SNP=1-3, Coverage=10+ samples per marker, Read number >6 per samples). We exported one SNP (randomly selected) per RAD-marker. The mean coverage of each SNP selected was 63 (range 19-248). The SNP data set was processed in PLINK v1.9(4). To improve data robustness, only RAD-markers that were genotyped in at least 75% of the samples and had minor allele frequencies above 0.03 were considered for analysis. We also removed individuals that had more than 50% missing data and close relatives (based on identity-by-state distances, PI HAT >0.5).

Identifying climate-adaptive genotypes and individuals
Outlier tests were performed in Bayescan(5) to identify SNPs potentially under directional selection, or linked with genes under selection (1,000,000 iterations, 50,000 burn-in and 20 pilot runs; false discovery rate was set to 0.05). Outlier tests were performed on the M. escalerai population dataset (18 sampled populations, N=162) and the whole M. crypticus dataset divided into four distinct geographic and genetic population clusters based on Principle Component Analysis, performed in the R package Adegenet 2.0.0 (6). SNPs identified as under directional selection (1237 in M. escalerai and 271 in M. crypticus) were removed from the analyses of population structure and neutral patterns of genetic differentiation.
Missing genotypes were imputed using a sparse non-negative matrix factorisation algorithm (7) as implemented in the R package LEA v2.0.0 using function snmf (8). Snmf imputes genotypes from the estimated ancestry coefficients and ancestral genotype frequencies. Genetic population structure was determined at the individual level based on a combination of ordination methods (Principle Component Analysis, PCA, performed in the R package Adegenet 2.0.0(6) and ancestry coefficient estimations performed using the snmf function(8), a computationally fast algorithm similar to STRUCTURE. Number of population clusters was set at K=1-10, using 10 replicates for each K and the entropy function.
Genotype-Environment Associations analysis. We focused on two climatic variables that are likely to directly affect bat survival and reproductive success. Increased aridity and prolonged droughts around the Mediterranean are predicted to affect insect prey availability during the summer(9) and thus decrease reproductive success in bats(10) that breed during the summer and rely on insect prey. Increased maximum summer temperatures have already been implicated in mass mortality events of bats(11). In order for bats to survive in warmer and more arid conditions they requires physiological adaptations to reduce evaporative water loss(12).
For M. escalerai we ran LFMM(13) with K=3-4, and for M. crypticus with K=2-4. Final K values were determined based on the genomic inflation factor (lambda values closest to 1): K=4 for M. escalerai and K=3 for M. crypticus. We performed five LFMM repetition runs with 10,000 iterations and 5000 burn-in. Z-scores of multiple runs were combined using the median value and p-values were adjusted for expected FDR of 0.05 (following the procedures in Frichot & François(8)). For both species, we ran an RDA on the scaled genotypes and environmental predictors. We assessed the significance of the full RDA and each constrained axis using 999 permutations(14). We then identified outlier loci as those markers with a locus score ±3 standard deviations from the mean score of each of the two constrained axes(15).

Modelling range losses under future climate change
Ecological niche models running procedures. When running biomod2(16), we initially included six commonly used ENM techniques (Maximum Entropy (Maxent v.3.4.1(17)), Classification Tree Analysis (CTA), Generalised Boosting Model (GBM), Flexible Discriminant Analysis (FDA), Random Forest (RF) and Artificial Neural Network (ANN)), but only retained the first two techniques due to poor performance of the remaining models with the smaller sample size datasets. Pseudo-absences were selected from the model extent. We used ENMTools v1. 3(18) to select the number of model features and regularisation values for the Maxent models based on AIC scores (AICc for models with N<100; Table S1 for final model features). Model AUC scores were compared to 100 null models, generated in ENMTools through resampling the altitude layer, to determine whether models performed significantly better than random(19).
Range and niche overlap. Extent of overlap in geographic space (range overlap) was measured in ArcGIS 10.3.1 (ESRI) based on the binary model outputs (binary suitable/unsuitable maps were generated based on the threshold value that minimises the difference between sensitivity and specificity). Overlap in ecological space was calculated using Schoener's D index of niche overlap in ENMTools. Identity tests carried out on 30 simulated datasets were used to determine whether overlap was significantly higher or lower than expected by chance.

Landscape genetics analysis methods
Genetic differentiation between M. escalerai populations (Fst) was computed with the R package diveRsity v1. 9.90(20). Genetic differentiation between individual bats was determined using the Euclidean genetic distance measure in the R package Adegenet v2.0.0(6). Both analyses were carried out on neutral datasets after excluding outlier (Bayescan) and climateadaptive (LFMM+RDA) loci.
Generating landscape resistance layers. The following landscape variables were included in the analysis: habitat suitability based on ENMs; percent tree cover (percent tree canopy cover map(21)); distance to broadleaf forest, conifer forest and all forests (generated from GlobCover2009 map [European space Agency, http://due.esrin.esa.int/page_globcover.php] using the Euclidian distance tool in ArcGIS); land cover-forest (reclassified land cover map into forest=1, and the rest 50-100 [lcfor50/75/100]); land cover type (reclassified GlobCover2009 map based on different costs to all habitats [lc1/lc50/100]); altitude, slope and ruggedness (calculated from SRTM map [https://www2.jpl.nasa.gov/srtm/] in ArcGIS); autumn temperatures and autumn rainfall (calculated from WorldClim mean temperatures and precipitation maps for September-November).

Statistical analysis.
We carried out a two-step analysis, first correlating each variable against genetic distance to select variables to include in the analysis based on the strength of their correlations with genetic distance. We removed variables that were strongly correlated with Euclidean distance or other variables (R 2 > 0.56 [R>0.75]). Second, to partial-out the effect of geographic distance, we correlated the residuals of the correlation between genetic and geographic distance against the selected landscape resistance variables. The best-fit model was selected based on highest R 2 values and significant P values for all variables (P<0.05).

SI References
irrigation on farmland arthropods in southern Britain. J Appl Ecol 37 (5)