Global gene flow releases invasive plants from environmental constraints on genetic diversity

Annabel L. Smith, Trevor R. Hodkinson, Jesus Villellas, Jane A. Catford, Anna Mária Csergő, Simone P. Blomberg, Elizabeth E. Crone, Johan Ehrlén, Maria B. Garcia, Anna-Liisa Laine, Deborah A. Roach, Roberto Salguero-Gómez, Glenda M. Wardle, Dylan Z. Childs, Bret D. Elderd, Alain Finn, Sergi Munné-Bosch, Maude E. A. Baudraz, Judit Bódis, Francis Q. Brearley, Anna Bucharova, Christina M. Caruso, Richard P. Duncan, John M. Dwyer, Ben Gooden, Ronny Groenteman, Liv Norunn Hamre, Aveliina Helm, Ruth Kelly, Lauri Laanisto, Michele Lonati, Joslin L. Moore, Melanie Morales, Siri Lie Olsen, Meelis Pärtel, William K. Petry, Satu Ramula, Pil U. Rasmussen, Simone Ravetto Enri, Anna Roeder, Christiane Roscher, Marjo Saastamoinen, Ayco J. M. Tack, Joachim Paul Töpper, Gregory E. Vose, Elizabeth M. Wandrag, Astrid Wingler, and Yvonne M. Buckley
Edited by Nils Chr. Stenseth, University of Oslo, Oslo, Norway, and approved January 10, 2020 (received for review September 13, 2019)
February 7, 2020
117 (8) 4218-4227


We found that long-distance dispersal and repeated introductions by humans have shaped adaptive potential in a globally distributed invasive species. Some plant species, therefore, do not need strong demographic changes to overcome environmental constraints that exist in the native range; simply mixing genetic stock from multiple populations can provide an adaptive advantage. This work highlights the value of preventing future introduction events for problematic invasive species, even if the species already exists in an area.


When plants establish outside their native range, their ability to adapt to the new environment is influenced by both demography and dispersal. However, the relative importance of these two factors is poorly understood. To quantify the influence of demography and dispersal on patterns of genetic diversity underlying adaptation, we used data from a globally distributed demographic research network comprising 35 native and 18 nonnative populations of Plantago lanceolata. Species-specific simulation experiments showed that dispersal would dilute demographic influences on genetic diversity at local scales. Populations in the native European range had strong spatial genetic structure associated with geographic distance and precipitation seasonality. In contrast, nonnative populations had weaker spatial genetic structure that was not associated with environmental gradients but with higher within-population genetic diversity. Our findings show that dispersal caused by repeated, long-distance, human-mediated introductions has allowed invasive plant populations to overcome environmental constraints on genetic diversity, even without strong demographic changes. The impact of invasive plants may, therefore, increase with repeated introductions, highlighting the need to constrain future introductions of species even if they already exist in an area.
Patterns of genetic diversity across a species’ range arise from a complex interplay between the diversifying effect of demographic variation across landscapes with different selection pressures and the homogenizing effects of dispersal (1–3). On one hand, variability in demographic performance influences genetic diversity through its influence on effective population size (4). Short-lived, highly fecund species generally have higher levels of genetic diversity compared with species that are long lived or have low fecundity (5, 6). On the other hand, dispersal modulates these relationships by facilitating gene flow between populations (7). Gene flow from seed and pollen can increase genetic diversity and reduce genetic differences among populations. While the importance of these forces is widely accepted (8), there is uncertainty about the relative strength of demography and dispersal in shaping genetic structure across global environmental gradients (9, 10).
For invasive species, the situation is even more complex because humans disrupt many of the natural processes that determine genetic diversity (Fig. 1). For example, repeated introductions and long-distance dispersal by humans can release invasive plant species from demographic constraints, such as those imposed by the colonization–competition tradeoff (11). Invasive species might also overcome climatic constraints on phenotypic traits as a result of rapid adaptation to new environments (12) or nonadaptive processes, such as repeated introductions, which can swamp locally adapted phenotypes (13). Thus, emerging evidence suggests that plants in their nonnative range can break ecological “rules” because they are not always constrained by the same biological and climatic forces that operate in their native range.
Fig. 1.
Conceptual diagram showing how demographic performance and dispersal collectively shape genetic diversity in plant populations (+ indicates that a positive relationship is expected). Genetic diversity is influenced through natural pathways (solid lines), such as local environmental conditions that affect demographic performance and effective population size (4). Environmental conditions also affect genetic diversity through dispersal (e.g., by facilitating dispersal vectors or creating dispersal barriers). Dispersal can increase genetic diversity directly by providing a source of new genetic material (outcrossing) or indirectly through immigration and consequent effects on demography. High propagule pressure arising from high fecundity can influence source–sink dynamics (7, 83), increasing rates of dispersal (hence the double arrow between demography and dispersal). Human activity can affect genetic diversity (dashed lines) by altering environmental conditions (e.g., climate change) and by changing dispersal rates and dispersal pathways (e.g., admixture). When this occurs, demographic performance can also be affected (e.g., through enemy release associated with dispersal across biogeographic boundaries), which can cause invasive plants to overcome biotic constraints on life–history (11) and environment–trait relationships (13). Although genetic architecture can influence demography and dispersal, the overall quantity of neutral genetic diversity across the genome is more likely to be the outcome of demographic and dispersal processes (hence the one-sided arrows between these panels).
Some populations of invasive species lose genetic diversity during invasion through founder effects (14), but many have higher genetic diversity outside their native range (15, 16). The mechanisms underlying this phenomenon include admixture (i.e., new genotypes arising from interbreeding among divergent source populations) (17), hybridization (18), rapid mutation (19), and exposure of cryptic genetic variation (20). Such increases in genetic diversity can enhance colonization success (21) and adaptive potential (22) in invasive species. Demographic changes can also improve invasive plant performance (23), which is sometimes associated with release from natural enemies (24). Unfortunately, demographic and genetic aspects of invasion are often analyzed in isolation (25), in part because labor-intensive demographic studies are typically done at one or a few sites, making them severely limited in spatial replication (26). This means that we lack understanding about the relative importance of demographic change and global dispersal on biological invasions (27, 28).
Here, we present a demographically informed analysis of neutral and putatively adaptive genetic diversity in Plantago lanceolata L. (Plantaginaceae), a common forb native to Europe and western Asia, which now has a cosmopolitan distribution (Fig. 2). P. lanceolata established in its nonnative range through long-distance dispersal by humans (29), repeated introductions (30), and cultivation (31)—all processes that can increase genetic diversity and invasion success (15). The overarching aim of the study was to analyze the influences of local demography and global dispersal patterns on genetic diversity in P. lanceolata and determine which of these pathways drives adaptive capacity. This knowledge is necessary to understand how future introduction events will influence the spread of invasive plants. This global analysis of genetic diversity, which integrates field-collected demographic data, was made possible by a spatially distributed demographic research network (PLANTPOPNET).
Fig. 2.
Global genetic structure in P. lanceolata. (A) Colored bars represent the proportion of individual genotypes in each population assigned to one of six genetic clusters identified with fastSTRUCTURE. For clarity, multiple sites were aggregated where overlapping bars had similar assignment probabilities (e.g., southern Ireland, Switzerland). Dark gray points are P. lanceolata records from GBIF/BIEN (84, 85). For each nonnative region, the minimum number of propagules (mean ± SE), overall (Propmin) and relative to sample size (Propmin/N), indicates that multiple introductions would be required to produce observed levels of genetic diversity. The number of non-European alleles indicates that more genetic diversity was present in nonnative regions than could be explained by the native sample. (B) Probability of assignment for 491 individuals to six genetic clusters, with individuals grouped by population within region. Three commercial cultivar lines and two outgroups (P. coronopus and P. major) were included.
In addition to demographic data, we sampled DNA from 491 individuals: outgroups, cultivar lines, and plants from 53 naturally occurring populations across the native European range (n = 35) and the nonnative range (n = 18) in southern Africa, Australasia, and North America (Fig. 2). To address our main aim, three hypotheses were tested.
Hypothesis 1. In the absence of dispersal, increases in survival and fecundity will drive increases in genetic diversity. These effects will be diluted by dispersal between populations.
Hypothesis 2. Patterns of spatial genetic structure among native populations will reflect dispersal limitations across environmental gradients. In the nonnative range, gene flow arising from multiple introductions will disrupt spatial genetic structure observed in the native range.
Hypothesis 3. Environmental influences on within-population genetic diversity will be explained by demographic variation (density, fecundity, and empirical population growth rate). Repeated introductions into the nonnative range and long-distance dispersal by humans will weaken this relationship (Fig. 1).
A genotypic simulation model, parameterized with empirical demographic data from P. lanceolata, was used to test Hypothesis 1. We then coupled field-collected demographic data (density, empirical population growth rate, and fecundity) with single-nucleotide polymorphism (SNP) data (18,166 neutral and 3,024 putatively adaptive SNPs) to test Hypotheses 2 and 3.

Results and Discussion

Hypothesis 1: Dispersal between Populations Will Dilute Demographic Effects on Genetic Diversity.

In two simulated populations unconnected by dispersal, with different rates of juvenile survival (σj = 0.1 and 0.2) and female fecundity (seeds per plant, ΦF = 1 to 100), higher juvenile survival led to greater genetic diversity (Fig. 3A). Above the threshold at which populations went extinct (ΦF = 15), genetic diversity increased sharply until ΦF was ∼25. Above this point, there was little influence of fecundity on genetic diversity (Fig. 3A). Population size at the end of the simulation was larger with higher juvenile survival (Fig. 3B). Thus, variation in female fecundity seems to have less influence than juvenile survival in determining genetic diversity in P. lanceolata. When the two populations were connected by dispersal, differences in heterozygosity persisted until the number of migrants per generation exceeded 50,000 (Fig. 3 C and D). This number is realistic in natural populations since reproductive individuals typically produce a minimum of 20 to 100 seeds, and migration refers to propagules dispersed before the recruitment process. Male fecundity was kept constant in the model as it is very high in P. lanceolata [10,000 to 54,000 pollen grains per anther (32)] and had no influence on genetic diversity.
Fig. 3.
The simulated effect of demography and dispersal on genetic diversity (expected heterozygosity, ±95% CI) in two populations of P. lanceolata. (A) When there was no dispersal between populations, the population with high juvenile survival (σj = 0.2) had greater genetic diversity than the population with low juvenile survival (σj = 0.1). At very low levels of female fecundity ΦF, populations went extinct (†), but ΦF had little influence on genetic diversity above approximately 25 seeds per plant. (B) Variation in σj influenced population size at the end of the simulation. (C) The difference in heterozygosity between the two populations was influenced by dispersal between them (where fecundity was kept constant at 20 seeds per plant). (D) Genetic differences persisted until high levels of dispersal (>50,000 migrants per generation), indicated by the 95% CI crossing zero.
The simulation result supports our prediction (Hypothesis 1) that demography would influence genetic diversity in P. lanceolata when dispersal barriers are present and that dispersal would dilute these effects. The simulation also suggests that juvenile survival is an important parameter controlling heterozygosity. When dispersal barriers are removed, however, gene flow from pollen and seed will swamp local effects of juvenile survival on heterozygosity. We could, therefore, expect demographic effects on genetic diversity to become undetectable at the upper range of pollen and seed movement that occurs in P. lanceolata.
The increases in genetic diversity with juvenile survival (Fig. 3) might not confer an adaptive advantage since they reflect genetic diversity arising from neutral demographic processes. The relevance of this result, however, is that there is enough demographic variability in P. lanceolata to shape neutral genetic structure, an assumption underlying the hypotheses in the rest of the study. Thus, we can expect juvenile survival to be the dominant demographic parameter underlying differences in P. lanceolata genetic diversity when dispersal is limited at local scales. At continental scales, genetic diversity is probably influenced less by juvenile survival when gene flow is high. This might be especially true in the nonnative range where there has been a shorter history of local adaptation (33) and multiple human-mediated introductions (the human activity pathway) (Fig. 1).
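The qualitative logic of Hypothesis 1 can be illustrated with textbook population-genetic approximations. The sketch below is not the genotypic simulation model used in the study; it assumes an idealized Wright-Fisher population, biallelic loci, and symmetric migration, and all function names and allele frequencies are hypothetical. It shows that a smaller effective population loses expected heterozygosity faster through drift, and that exchanging migrants shrinks the heterozygosity gap between two populations.

```python
def expected_heterozygosity(freqs):
    """Mean expected heterozygosity (2pq) across biallelic loci."""
    return sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)


def drift_decay(h0, n_e, generations):
    """Expected heterozygosity after drift in an isolated, idealized
    Wright-Fisher population: H_t = H_0 * (1 - 1 / (2 * N_e)) ** t."""
    return h0 * (1.0 - 1.0 / (2.0 * n_e)) ** generations


def exchange_migrants(freqs_a, freqs_b, m):
    """One generation of symmetric migration: a fraction m of each
    population is replaced by migrants from the other population."""
    new_a = [(1.0 - m) * a + m * b for a, b in zip(freqs_a, freqs_b)]
    new_b = [(1.0 - m) * b + m * a for a, b in zip(freqs_a, freqs_b)]
    return new_a, new_b


def heterozygosity_gap(freqs_a, freqs_b, m):
    """Absolute difference in expected heterozygosity after migration."""
    new_a, new_b = exchange_migrants(freqs_a, freqs_b, m)
    return abs(expected_heterozygosity(new_a) - expected_heterozygosity(new_b))


# Hypothetical allele frequencies at two loci (illustrative values only).
high_survival = [0.5, 0.5]  # larger population, little drift
low_survival = [0.1, 0.9]   # smaller population, alleles drifting toward fixation

for m in (0.0, 0.1, 0.5):
    gap = heterozygosity_gap(high_survival, low_survival, m)
    print(f"migration fraction {m}: heterozygosity gap {gap:.4f}")

# Drift alone: the smaller population retains less heterozygosity.
print(drift_decay(0.5, n_e=50, generations=50))
print(drift_decay(0.5, n_e=500, generations=50))
```

The migration fraction here is a proportion of each population, whereas the study reports dispersal as a number of migrants per generation; converting between the two would require the simulated population sizes.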

Hypothesis 2: Global Gene Flow from Multiple Introductions Will Disrupt Spatial Genetic Structure.

Admixture analysis of P. lanceolata genotypes with fastSTRUCTURE (34) revealed strong genetic structure in the native range and a high degree of admixture in the nonnative range. The number of genetic clusters at Hardy–Weinberg (HW) equilibrium (K) was between K = 6 (model complexity maximizing marginal likelihood) and K = 13 (model components used to explain structure in the data). When K = 6, cultivar lines and outgroups (Plantago coronopus and Plantago major) formed two distinct clusters, and the remaining four clusters were present in the native European range with clear spatial structure (Fig. 2). Greece, Italy, the Islands of the North Atlantic, and Finland made up almost “pure” lines of these four clusters, while other European populations were admixed.
Genotypes of most nonnative populations were admixed, and there was relatively little spatial structure at a global scale (Fig. 2). This was supported by a significantly higher diversity score in the nonnative range (model estimate, SE = 0.34, 0.04) compared with the native range (0.22, 0.03; P = 0.033) (SI Appendix, Fig. S6). Italy and central France were the most similar source material for the dominant genotype in the nonnative populations. Some cultivar stock was identified in the Spanish populations, possibly reflecting the Iberian source of material used to breed cultivars. The cultivars were developed in New Zealand; thus, the presence of cultivar stock in that population might indicate mixing between the naturalized population and pasture plants (Fig. 2). At the upper range of K, further spatial structure was identified in Europe (e.g., at K = 13, Norway was differentiated from Finland), while the nonnative populations still showed admixture of multiple, mostly Mediterranean sources (SI Appendix, Fig. S1). The lack of spatial structure at a global scale was supported by analysis of molecular variance showing that genetic variation between the native and nonnative range was only 2.2%, among individuals within populations was 10.7%, and among populations within ranges was 11.4%. The remaining genetic variation (75.5%) accounted for individual heterozygosity.
The minimum number of colonizing propagules required to produce the observed level of genetic diversity in nonnative regions (Propmin) depended on sample size (r = 0.99) and ranged from 5.35 in New Zealand to 49.95 in North America (Fig. 2). Multiple introductions were, therefore, required to produce observed levels of genetic diversity in the nonnative ranges. Relative to sample size, Propmin ranged from 0.55 to 0.90, indicating that, in each region, more than half the sampled population was required to represent nonnative genetic diversity. Propmin was based on the alleles present in the native range, but there were also a number of non-European alleles in each nonnative region (12 to 159) (Fig. 2). Thus, we either failed to sample the full extent of the source population (despite extensive sampling across Europe), or new genotypes were produced after colonization. The latter explanation can arise through transgressive segregation (35) and is one mechanism by which invasive species adapt quickly to new environments. However, we also detected private alleles within sites in Europe (SI Appendix, Table S1), and therefore, our sample does not represent the full range of genetic diversity in the species.
Genetic structure measured by FST (genetic differentiation between all pairs of populations) was stronger among populations in the native range (mean FST = 0.16) than the nonnative range (mean FST = 0.09). To analyze the influence of environmental gradients on FST, we used three separate generalized dissimilarity models, one for each range type: native range, nonnative range, and the global population (native and nonnative combined). The deviance explained by the native model was 74.3% (bootstrap CI = 68.6, 78.3), and two of six variables fitted in the model had a significant influence on FST (Fig. 4 and SI Appendix, Fig. S2). Genetic distance increased with geographic distance (Fig. 4A), and sites with similar levels of precipitation seasonality were more genetically similar (Fig. 4B) after accounting for other variables in the model (SI Appendix, Fig. S2). No variable significantly affected FST in the nonnative range (deviance explained = 23.1%, bootstrap CI = 9.4, 34.1) or the global population (deviance explained = 10.9%, bootstrap CI = 7.25, 14.33) (SI Appendix, Fig. S2). Geographic distance was included in each model to account for differences in spatial scale. Thus, if environmental influences on gene flow had persisted in the nonnative range, they should have been detectable. Combined with the admixture analysis, these results support our prediction (Hypothesis 2) that multiple introductions from diverse source populations and long-distance dispersal can weaken environment–genetic structure relationships. P. lanceolata reproduces clonally as well as sexually, and this flexible reproductive mode combined with high admixture in the nonnative range suggests fast expansion after colonization. This might allow the species to overcome ecological constraints without the need for local adaptation (36).
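For readers unfamiliar with FST, the following minimal sketch computes a Nei-style pairwise FST from allele frequencies at biallelic SNPs. The estimator used in the study is not specified in this section, so this is an assumption for illustration only: it weights the two populations equally, applies no sample-size correction, and the function name is hypothetical.

```python
def pairwise_fst(freqs_1, freqs_2):
    """Nei-style F_ST between two populations.

    freqs_1 and freqs_2 hold the frequency of one allele at each biallelic
    SNP. F_ST = (H_T - H_S) / H_T, computed as a ratio of sums across loci.
    """
    if len(freqs_1) != len(freqs_2):
        raise ValueError("both populations must be scored at the same loci")
    h_s_total = 0.0  # mean within-population expected heterozygosity
    h_t_total = 0.0  # expected heterozygosity of the pooled population
    for p1, p2 in zip(freqs_1, freqs_2):
        h_s_total += (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
        p_bar = (p1 + p2) / 2
        h_t_total += 2 * p_bar * (1 - p_bar)
    if h_t_total == 0.0:
        return 0.0  # no variation at any locus, so no differentiation
    return (h_t_total - h_s_total) / h_t_total


# Hypothetical frequencies: identical populations versus divergent ones.
print(pairwise_fst([0.5, 0.3], [0.5, 0.3]))
print(pairwise_fst([0.2], [0.8]))
```

Values near 0 indicate populations that share allele frequencies, as expected under high gene flow, and values near 1 indicate populations fixed for different alleles.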
Fig. 4.
Genetic distance (FST) between pairs of P. lanceolata populations in the native European range was explained by two variables: (A) geographic distance and (B) distance in precipitation seasonality (coefficient of variation of annual mean precipitation) between sites. A generalized dissimilarity model indicated that these variables had a significant (adjusted P < 0.001) effect on FST given all other variables in the model (geographic distance, mean temperature, mean precipitation, temperature seasonality, and precipitation seasonality). Deviance explained by the model was 74.3%, and the model splines are shown in SI Appendix, Fig. S2.
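The two predictors in Fig. 4 are pairwise distances between sites. As a rough sketch of how such predictors could be assembled, the code below computes great-circle distance with the haversine formula and the absolute difference in precipitation seasonality. It assumes a spherical Earth, assumes the coefficient of variation is taken over monthly precipitation totals using the population SD, and uses hypothetical site names, coordinates, and rainfall values; it does not reproduce the generalized dissimilarity model itself.

```python
import math
import statistics
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def coefficient_of_variation(values):
    """Coefficient of variation in percent (population SD / mean * 100)."""
    return 100.0 * statistics.pstdev(values) / statistics.mean(values)


# Hypothetical sites with made-up coordinates and monthly precipitation (mm).
sites = {
    "site_a": {"lat": 53.3, "lon": -6.3,
               "precip": [70, 50, 55, 50, 60, 65, 55, 70, 60, 80, 75, 75]},
    "site_b": {"lat": 38.0, "lon": 23.7,
               "precip": [55, 45, 40, 30, 20, 10, 5, 5, 15, 50, 60, 70]},
    "site_c": {"lat": 60.2, "lon": 24.9,
               "precip": [50, 35, 35, 30, 35, 55, 60, 80, 55, 75, 70, 60]},
}


def pairwise_predictor_distances(site_table):
    """Return (site_1, site_2, km, seasonality difference) for every pair."""
    rows = []
    for name_1, name_2 in combinations(sorted(site_table), 2):
        s1, s2 = site_table[name_1], site_table[name_2]
        geo = haversine_km(s1["lat"], s1["lon"], s2["lat"], s2["lon"])
        seas = abs(coefficient_of_variation(s1["precip"])
                   - coefficient_of_variation(s2["precip"]))
        rows.append((name_1, name_2, geo, seas))
    return rows


for row in pairwise_predictor_distances(sites):
    print(row)
```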
In the native range of P. lanceolata, the increase in genetic distance with precipitation seasonality might partially reflect a historic biogeographical pattern (precipitation seasonality was correlated with longitude, r = 0.47). Historical processes occurring along both east–west and north–south axes shape contemporary genetic patterns in European plants. For example, glacial refugia in Iberia, Italy, and the Balkans were reflected in highly divergent lines of Arabidopsis thaliana south of the alpine barrier (37). In our dataset, the Italian population was genetically distinct, while two eastern sites in Romania were highly differentiated and genetically related to Greece (Fig. 2). François et al. (37) also found evidence for an eastern refuge in A. thaliana. Further sampling into the continental Asian range of P. lanceolata would help uncover whether the observed patterns arose from movement with agriculture westward across Europe (38, 39) or postglacial colonizers from the Balkans (40).

Hypothesis 3: Global Gene Flow Will Weaken Demographic Effects on Genetic Diversity within Populations.

We compared a series of linear models, including additive and interactive effects of range (native/nonnative), to address the hypothesis that environmental influences on within-population genetic diversity would differ between the native and nonnative ranges (Dataset S1). Our results offered partial support for Hypothesis 3 because environmental gradients (characterized by mean temperature, temperature seasonality, and mean precipitation) affected population growth rate, fecundity, and neutral and adaptive genetic diversity in native and nonnative ranges of P. lanceolata (Fig. 5 and SI Appendix, Fig. S3). Our expectation, however, that genetic responses to the environment could be explained by demographic variation had little support (SI Appendix, Fig. S3). Demographic variables responded to environmental gradients but did not induce a response in genetic diversity when used as predictor variables. Demographic and genetic parameters within populations were best explained by environmental gradients, and in some cases, there were differences in the responses between native and nonnative ranges.
Fig. 5.
Environmental influences on demography and genetic diversity within populations in the native European (n = 30) and nonnative (n = 14) range of P. lanceolata (model estimates and 95% CIs shown over raw data). First-ranked models are shown for environmental influences on (A) population growth rate, (B) reproductive effort, (C and D) neutral genetic diversity, and (E and F) adaptive genetic diversity. In all models except E, the additive and interactive models both had support from the data (∆AICc < 2) (SI Appendix, Fig. S3 and Dataset S1). For E, the interaction between temperature seasonality (SD of annual mean temperature at each site) and range (native/nonnative) was the only model supported by the data (AICc weight = 0.95).
The top-ranked models for population growth rate (Fig. 5A) and fecundity (Fig. 5B) had additive effects of mean temperature, responding similarly in the native and nonnative ranges. Globally, warmer sites tended to have lower population growth rates and higher fecundity. Increases in fecundity can occur to offset lower survival in stressful environments (41), a phenomenon that has been recorded in other studies of Plantago (42, 43). There was also an additive effect of temperature seasonality on neutral genetic diversity (Fig. 5C), with highly seasonal sites having greater genetic diversity in the native and nonnative ranges. Mean temperature and temperature seasonality were correlated (r = −0.36, P = 0.02) (SI Appendix, Fig. S4). Thus, the observed responses are best thought of as responses to an environmental gradient, with demographic and genetic parameters responding to different aspects of the gradient. High genetic diversity in highly seasonal sites might have been driven by increased fecundity since we found some evidence of a positive relationship between fecundity and genetic diversity (SI Appendix, Fig. S3G and Dataset S1).
Three of the top-ranked models included an interaction between environment and range, showing environmental effects in the native range but not the nonnative range. Both neutral (Fig. 5D) (bootstrap CI = 0.001, 0.010) and adaptive (Fig. 5F) (bootstrap CI = 0.004, 0.021) genetic diversity decreased across a mean precipitation gradient in the native range but not in the nonnative range. Adaptive genetic diversity increased with temperature seasonality but only in the native range (Fig. 5E) (bootstrap CI = −0.021, −0.005). There was also support (a change in Akaike Information Criterion [ΔAICc] < 2) for nonnative populations having a weaker response to environmental gradients in terms of fecundity (SI Appendix, Fig. S3 A and B), population growth rate (SI Appendix, Fig. S3C), and neutral genetic diversity (SI Appendix, Fig. S3D). Taken together, these results suggest that nonnative populations are not constrained by the same environmental forces as their native counterparts.
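The model-ranking quantities referred to above (AICc, ΔAICc, and AICc weights) follow standard formulas, sketched below. The model names and AICc values are hypothetical and are not taken from Dataset S1; the sketch only shows how a ΔAICc < 2 threshold and model weights are derived from a set of candidate models.

```python
import math


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k estimated parameters
    fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def rank_models(aicc_by_model):
    """Return {model: (delta_aicc, akaike_weight)} for a candidate set."""
    best = min(aicc_by_model.values())
    deltas = {name: value - best for name, value in aicc_by_model.items()}
    raw = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(raw.values())
    return {name: (deltas[name], raw[name] / total) for name in aicc_by_model}


# Hypothetical candidate set (values invented for illustration).
candidates = {"additive": 100.0, "interactive": 101.0, "null": 106.0}
ranking = rank_models(candidates)

# Models within 2 AICc units of the best model are treated as supported.
supported = [name for name, (delta, _) in ranking.items() if delta < 2.0]
for name, (delta, weight) in ranking.items():
    print(f"{name}: delta AICc = {delta:.1f}, weight = {weight:.3f}")
print("supported:", supported)
```

Under this convention, a weight of 0.95 for a single model, as reported for Fig. 5E, means that nearly all of the relative support within the candidate set falls on that model.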
Population growth rate and neutral and adaptive genetic diversity were all higher in the nonnative range (Fig. 5 and Dataset S1), suggesting that invasive populations have a greater capacity for colonization and adaptation. Higher population growth rates in nonnative populations were probably driven by increases in survival rather than fecundity since fecundity was lower in the nonnative range (Fig. 5B and Dataset S1). Thus, our simulation experiments and our field data indicated stronger effects of survival than of fecundity on genetic diversity and population growth, respectively.
Increases in genetic diversity can arise when environmental heterogeneity drives population turnover through increases in sexual reproduction, population growth, and survival (6, 44). In our study, however, population growth was affected by mean temperature, not variability in temperature; cooler sites generally had higher rates of population growth across the first two demographic censuses. This is consistent with previous work showing that high mean temperature was associated with mortality in P. lanceolata (42). Thus, we did not find a clear demographic explanation for the effect of temperature seasonality on genetic diversity. Temperature stability might have promoted clonality in P. lanceolata, leading to lower genetic diversity (45). However, rates of sexual and clonal reproduction within species are often inversely related (46), and genetic diversity was unaffected by rates of sexual reproduction in our study. The influence of global variation in clonality on genetic diversity needs further investigation, particularly because clonality combined with sexual reproduction can increase invasion success (36).
Our prediction that environmental effects on genetic diversity could be explained by demographic variation had little support, even in the native range. Except for a weak increase in neutral genetic diversity with density (SI Appendix, Fig. S3F) and fecundity (SI Appendix, Fig. S3G), there was little direct influence of demographic variables on genetic diversity. There are at least two explanations for this general lack of a demographic relationship. First, genetic structure can arise even under frequent dispersal (44). Thus, although we found strong spatial genetic structure in the native range, it is possible that dispersal was high enough to mask any influence of demography on genetic diversity (the natural dispersal pathway) (Fig. 1). Second, the fine scale of demographic sampling within sites (a few m²) might not reflect effective population size (47). This fits with our understanding of abiotic filters operating at all scales, while biotic filters, such as inter- and intraspecific interactions affecting demographic performance, generally operate at localized scales (10, 13). P. lanceolata is also highly genetically variable within and outside its native range. Thus, the low power within sites might have limited our ability to draw conclusions about demographic influences on genetic diversity. Sampling more individuals per site in the future might reveal stronger effects of fecundity, survival, and population growth on genetic diversity.
In summary, genetic diversity in P. lanceolata seems to be shaped predominantly by temperature and precipitation gradients related to gene flow and admixture rather than demographic variation. Our data support the prediction that high dispersal would dilute demographic effects on genetic diversity (Hypothesis 1). Globally, our analyses suggest that genetic diversity in the nonnative range is shaped by admixture from multiple source populations and ongoing introductions, leading to high neutral and adaptive genetic diversity (Hypothesis 2). Our data suggest that invasive populations can establish in a broad range of environments without the need for associated demographic change. Thus, there was little support for the prediction that demographic variation could explain environmental effects on genetic diversity (Hypothesis 3). Our unique global demographic dataset provides evidence that invasive species can overcome ecological rules in their nonnative range (11–13). Reducing long-distance dispersal and further introductions of invasive plants is important, even in areas where they already exist, as this will limit future increases in genetic diversity and the formation of new genotypes that confer an adaptive advantage in new environments.


Study Overview.

P. lanceolata is a short-lived [mean, max = 2.8, 8 y (48)], perennial forb native to Europe. It reproduces sexually and vegetatively, with gynodioecy, self-incompatibility, and protogyny to enhance outcrossing (49). Flowers are wind pollinated, and seeds mature in summer. The species occurs in a wide range of habitats, including seminatural grasslands, roadsides, disturbed sites, abandoned fields, and agricultural land (50). Seeds are dispersed locally by wind, but seed dispersal distances are estimated to be within centimeters or meters of the mother plant (51). Widespread propagule movement by humans (29) and repeated introductions as seed contaminants (30) have led to the global distribution of P. lanceolata. It has been present in Australia since before 1850, in North America since before 1832 (30), and for an unknown time in South Africa (52). It is cultivated as a commercial pasture plant in New Zealand because it grows well in the mild winter and limits soil nitrification (31). The species is classed as invasive in its nonnative range (52) because it reproduces prolifically and spreads over large areas (53). We follow this definition of “invasive” to refer to P. lanceolata and other plant species with this characteristic. We use the term “nonnative” to refer to the geographic range outside of Europe where the species exists.
We used field-collected demographic and DNA data from populations of P. lanceolata to analyze spatial variation in demographic rates and genetic diversity. The demographic data were used to parameterize the simulation part of the study (Hypothesis 1) and to analyze the demographic influence on genetic diversity across global environmental gradients (Hypothesis 3). For the genetic dataset, we sampled 454 individuals from 53 naturally occurring populations in 21 countries across the native European range (35 populations: Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Norway, Romania, Spain, Sweden, Switzerland, United Kingdom) and the nonnative range (18 populations: Australia, Canada, Japan, New Zealand, South Africa, the United States) (Fig. 2). The latitudinal range of sampling, in absolute terms, was 27.5 to 61.4°. Forty-four populations (83%) were established sites in the PLANTPOPNET network undergoing an annual demographic census, while the remaining nine were sampled for DNA only (SI Appendix, Table S1).
We characterized the environment at each site using four variables from BioClim (54) at 30-s resolution: annual mean temperature, annual mean precipitation, temperature seasonality (SD of annual mean temperature), and precipitation seasonality (coefficient of variation in annual mean precipitation). We selected these variables because they were important for morphological variation in P. lanceolata in preliminary analyses, and multicollinearity was not high (variance inflation factor < 3, maximum r between pairs of environmental variables = 0.43 [mean temperature and seasonality in precipitation], and between range [native/nonnative] and environment [mean temperature] = 0.59) (55).

Field Demographic Census and DNA Sampling.

PLANTPOPNET is an ongoing research project that began in 2014, and annual censuses of P. lanceolata populations are planned for the long term. Our analysis used data collected between 2014 and 2017, but not all sites began data collection at the same time (i.e., year 0 varied among sites) (SI Appendix, Table S1). In most populations (61%), year 0 was 2015, and 73% of populations were sampled twice during this study period (number of annual censuses per population = 1 to 3) (SI Appendix, Table S1). At each census site in year 0, a series of adjacent 50 × 50-cm quadrats was established along transects until the quadrats covered 100 individual plants. Researchers established transects where P. lanceolata was present in sufficient numbers for demographic studies, and therefore, density estimates might reflect upper estimates across local populations. Quadrats were permanently marked to enable repeat censuses from year 1 onward. Each plant was individually tagged, and all rosettes on each plant were measured according to a standard protocol (56), which included leaf length, number of flowering stems, inflorescence length, and stage of seed development.
At each site, fresh leaf tissue from seven to nine individuals was collected and placed immediately in silica gel (SI Appendix, Table S1). Sampled individuals were close to (∼5 to 20 m) but outside of census plots and were separated from each other by ∼5 to 10 m. Thus, we avoided damage to permanently marked individuals in the census population, ensured that samples were closely related to the census population, and minimized the chance of sampling clones. We included two samples each from one population of P. coronopus (Spain) and four populations of P. major (Australia × 2, Ireland × 1, Romania × 1) as outgroups. To investigate if naturally occurring populations were influenced by genetic stock from commercial pasture lines, we included nine individuals from each of three cultivar lines derived from P. lanceolata: AgriTonic, Ceres Tonic, and Tonic Plantain. The whole dataset thus included 491 individuals. The data are publicly available.


Samples were genotyped at Diversity Arrays Technology P/L (Canberra, Australia) using double-restriction enzyme complexity reduction and high-throughput sequencing (DArTseq). Total genomic DNA was extracted with a NucleoSpin 96 Plant II Core Kit (MACHEREY-NAGEL) and purified using a Zymo kit (Zymo Research). The enzymes PstI and MseI were chosen following tests of different enzyme combinations for P. lanceolata. DNA samples were processed in digestion/ligation reactions following Kilian et al. (57) but substituting the single PstI adaptor for two adaptors corresponding to restriction enzyme-specific overhangs. The PstI adaptor was modified to include Illumina sequencing primers and variable length barcodes following Elshire et al. (58). Mixed fragments (PstI–MseI) were amplified in 30 rounds of PCR using the following reaction conditions: 94 °C for 1 min and then 30 cycles of 94 °C for 20 s, 58 °C for 30 s, and 72 °C for 45 s followed by 72 °C for 7 min. After PCR, equimolar amounts of amplification products from each sample were bulked and applied to c-Bot (Illumina) bridge PCR followed by single-read sequencing on an Illumina Hiseq2500 for 77 cycles. Raw sequences were processed using DArTseq analytical pipelines (DArTdb) to split samples by barcode and remove poor-quality sequences. Genotypes for codominant SNPs were called de novo (i.e., without a reference genome) from 69-bp sequences using DArTseq proprietary software (DArTsoft). Replicate samples were processed to assess call rate (mean = 79%), reproducibility (mean = 99%), and polymorphic information content (mean = 22%).

SNP Filtering.

Starting with 37,692 SNPs that passed DArTseq quality control, we filtered the data for minimum minor allele frequency (1%), call rate (50%), and reproducibility (98%) using custom R scripts (59). Loci in Hardy-Weinberg (HW) and linkage disequilibrium hold important biological information about population structure, but extreme disequilibrium can indicate genotyping errors, which bias estimates of population structure (60). Within sites, there was limited power to reliably test for patterns of HW and linkage disequilibrium (seven to nine individuals per site). It was not possible to combine samples from multiple populations because we detected strong genetic structure even within countries, which would have produced biologically meaningful patterns of disequilibrium arising from the Wahlund effect (61). Thus, to identify SNPs with consistent patterns of HW disequilibrium, we tested each locus in every population separately using Fisher’s exact tests (62) and used unadjusted P values given the low power within sites. Loci that deviated from HW equilibrium in more than five populations were removed (63). We used the correlation between genotype frequencies (64) to test for linkage disequilibrium between each pair of loci in each population. Following the same rationale as for HW disequilibrium, we removed a locus if it was in a correlated pair (r > 0.75) in more than five populations. To reduce the chance of disequilibrium from physical linkage, we also filtered SNPs that occurred in the same 69-bp sequence as another SNP, keeping the one with the highest call rate. The data comprised 21,190 SNPs after applying these filters.
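The threshold filters described above can be illustrated with a minimal Python sketch. It is not the custom R code used in the study. The genotype coding (0, 1, or 2 copies of the alternate allele, with None for a missing call), the function names, and the use of inclusive thresholds (≥) are assumptions made for illustration.

```python
from typing import List, Optional

# Assumed coding: 0, 1, or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def call_rate(calls: List[Genotype]) -> float:
    """Proportion of individuals with a non-missing genotype at one locus."""
    return sum(g is not None for g in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Frequency of the rarer allele among called diploid genotypes."""
    called = [g for g in calls if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def keep_locus(calls: List[Genotype], reproducibility: float,
               min_maf: float = 0.01, min_call_rate: float = 0.5,
               min_repro: float = 0.98) -> bool:
    """Apply the three thresholds from the text (inclusive bounds are assumed)."""
    return (minor_allele_frequency(calls) >= min_maf
            and call_rate(calls) >= min_call_rate
            and reproducibility >= min_repro)


def fails_in_too_many(flags_by_population: List[bool], max_populations: int = 5) -> bool:
    """True if a locus was flagged (e.g., HW deviation) in more than five populations."""
    return sum(flags_by_population) > max_populations
```

For example, `keep_locus([0, 1, 2, None], reproducibility=0.99)` retains the locus, whereas a monomorphic locus such as `[0, 0, 0, 0]` is dropped by the minor allele frequency threshold.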

Detecting Loci under Putative Selection.

Neutrality was an assumption underlying the population structure models that we used; thus, we investigated if SNPs were putatively under selection using one population-level method (BayeScan) and two individual-level methods (PCAdapt and LFMM). BayeScan uses a Markov chain Monte Carlo algorithm to examine outlier loci against background values of population differentiation (FST) among predefined populations (65). PCAdapt and LFMM both define background population structure as K principal components derived from individual genotypes (66, 67). In PCAdapt, each SNP is regressed against each principal component. LFMM uses the principal components as latent factors in a Gaussian mixed model, where the genotype matrix is modeled as a function of an environmental matrix (67). While BayeScan is suitable for our population-level sampling design, PCAdapt and LFMM are more reliable for species with complex, hierarchical population structure (e.g., multiple divergence events) and are less sensitive to admixed individuals and outliers in the data (68, 69). Thus, we considered outliers identified in any of the three methods to be putatively under selection.
For BayeScan, we set the prior odds at 200 [appropriate for the number of markers in our data (70)], ran the model using default parameters (100,000 iterations with a thinning interval of 10, a burn in of 50,000 and 20 pilot runs of 5,000 iterations), and checked the distribution of the log likelihood across iterations to ensure model convergence (SI Appendix, Fig. S5). For both individual-level methods, we examined scree plots to determine K and used the first 10 components that captured the majority of population structure in the data (SI Appendix, Fig. S5). We defined the LFMM environmental matrix using the four 30-s BioClim variables described above and three additional variables: elevation (meters above sea level measured at the site) and two variables extracted from CliMond (71) at 5-min resolution (annual mean moisture index and seasonality in moisture [coefficient of variation of annual mean moisture]). To control for false discovery rate, we calculated q values from P values and classed SNPs as outliers where q < 0.05 for BayeScan and PCAdapt and q < 0.1 for LFMM (to account for the small number of loci identified with this method) (SI Appendix, Fig. S5). The three analyses identified a total of 3,026 outlier SNPs, and as commonly reported in other studies (69), there was little overlap among methods (SI Appendix, Fig. S5). After filtering the putatively adaptive loci, our final dataset comprised 18,164 neutral SNPs.
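The outlier classification step can be sketched as follows. The text does not state which q-value estimator was used, so this sketch substitutes Benjamini-Hochberg adjusted P values as a stand-in; that substitution, the function names, and the toy SNP identifiers are assumptions and do not reproduce the study's calculation.

```python
from typing import Dict, List, Set


def bh_adjust(pvals: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values (used here as a stand-in for q values)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def outliers(pvals_by_snp: Dict[str, float], threshold: float) -> Set[str]:
    """SNPs whose adjusted value falls below the method-specific threshold."""
    names = list(pvals_by_snp)
    adjusted = bh_adjust([pvals_by_snp[n] for n in names])
    return {n for n, q in zip(names, adjusted) if q < threshold}


def putatively_selected(*outlier_sets: Set[str]) -> Set[str]:
    """A SNP is treated as putatively under selection if any method flags it."""
    return set().union(*outlier_sets)
```

Taking the union across methods mirrors the decision in the text to treat an outlier from any of the three methods as putatively under selection, which is conservative for a neutral panel when the methods overlap little.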

Simulated Genetic Diversity (Hypothesis 1).

We conducted two simulation experiments in MetaPopGen 0.0.4 (72) to determine if realistic levels of variation in P. lanceolata survival and fecundity would influence genetic diversity and whether dispersal would override demographic influences on genetic diversity. Gametes in the model are produced via Mendelian segregation, and mating is random (72). We modeled two distinct populations to examine different rates of juvenile survival and female fecundity. In Experiment 1, the two populations were unconnected by dispersal, while in Experiment 2, they were connected by varying levels of dispersal.
Male fecundity ΦM in P. lanceolata is high [10,000 to 54,000 pollen grains per anther (32)] and had no influence on genetic diversity. Thus, we set ΦM at 10,000 and focused on variation in female fecundity (seeds per plant) ΦF, adult (σa) and juvenile (σj) survival rates, and between-population dispersal δ (number of migrants per generation). In both experiments, each of the two populations i had two age classes x (juvenile xj, adult xa), three genotypes p representing all combinations of two alleles (00, 01, and 11), and a starting size Nxp of 25,000 individuals. The model was not spatially explicit, but we wanted each population to represent a 1-ha site with a density of 15 individuals per m² (based on census data from year 0). Generation time in P. lanceolata is ∼3 y [range = 1 to 3 y (73, 74)]. Thus, we ran the model for 100 time steps to represent population dynamics over 100 to 300 y, accounting approximately for the time that P. lanceolata has been present in its nonnative range. Population sizes reached a steady state within 10 time steps. We estimated juvenile carrying capacity as K = (ΦF × (N × p)) × g, where g is the estimated field germination rate (0.039). We kept K constant across time and populations. MetaPopGen can only simulate one locus at a time, and therefore, we repeated the experiments 300 times to simulate sampling 300 independent loci (following ref. 72).
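The population size and time span used in the simulations can be checked with a short calculation. The sketch assumes that the starting size of 25,000 applies to each combination of age class and genotype; this reading is an assumption, chosen because it reproduces the stated density of 15 individuals per m² on a 1-ha site. Variable names are illustrative.

```python
# Assumption: 25,000 individuals per age class x genotype combination.
AGE_CLASSES = 2          # juvenile, adult
GENOTYPES = 3            # 00, 01, 11
START_PER_CLASS = 25_000

SITE_AREA_M2 = 10_000    # 1 ha expressed in square meters

total_start_size = AGE_CLASSES * GENOTYPES * START_PER_CLASS
density_per_m2 = total_start_size / SITE_AREA_M2

# 100 time steps with a generation time of 1 to 3 y span 100 to 300 y.
TIME_STEPS = 100
years_spanned = (TIME_STEPS * 1, TIME_STEPS * 3)

# One locus per run, repeated to mimic sampling independent loci.
N_LOCI = 300
```

Under this reading, `total_start_size` is 150,000 individuals, which gives the 15 individuals per m² stated in the text.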
In Experiment 1, we tested the influence of ΦF on genetic diversity (1 to 100, based on census data from year 0) and σ (σji1 = 0.1; σai1 = 0.84; σji2 = 0.2; σai2 = 0.71) with no dispersal between populations (δ = 0). Survival rates were based on a total population estimate of 5% alive after 5 y (exp(log(0.05)/5)) (73) and adjusted for commonly reported low survival in juveniles (42). In Experiment 2, we tested the influence of δ (migration rate: 0 to 0.04 = number of migrants: 0 to 60,000) on the difference in genetic diversity between populations. Each population had the same survival rates as Experiment 1, and ΦF was kept constant at 20. The migration rates produce large numbers of migrants because each plant produces 20 “newborns,” and migration occurs before recruitment in the model (72). Thus, δ is influenced by K and will always be higher than recruitment. We summarized expected heterozygosity at the end of each simulation and calculated the mean and 95% CI across the 300 loci. The experiments can be reproduced with the code available at
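Two of the quantities in this paragraph follow from simple arithmetic, sketched below. The annual survival expression is taken directly from the text. The migrant calculation assumes that the migration rate acts on the newborns of 75,000 adults per population (three genotypes × 25,000); that adult count and the function name are assumptions, chosen because they reproduce the stated upper limit of 60,000 migrants.

```python
import math

# Annual survival implied by 5% of the population alive after 5 y.
annual_survival = math.exp(math.log(0.05) / 5)

# Assumption: 3 genotypes x 25,000 adults per population.
ADULTS_PER_POPULATION = 3 * 25_000
FEMALE_FECUNDITY = 20  # newborns per plant in Experiment 2


def migrants_per_generation(delta: float,
                            adults: int = ADULTS_PER_POPULATION,
                            fecundity: int = FEMALE_FECUNDITY) -> float:
    """Migrants when migration acts on newborns before recruitment."""
    return delta * adults * fecundity
```

The implied annual survival is about 0.55, and `migrants_per_generation(0.04)` gives approximately 60,000, which shows why small migration rates translate into large numbers of migrants in this model.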

Population Genetic Structure (Hypothesis 2).

All population structure analyses used our panel of neutral SNPs because the models assume HW and linkage equilibrium. We first conducted an analysis of molecular variance in poppr 2.8.0 (75) to determine how neutral genetic diversity was partitioned across levels: within individuals, among individuals within populations, among populations within ranges, and between the native and nonnative range. To assess genomic relationships and the degree of admixture in the global dataset, we used fastSTRUCTURE (34). This model determines the number of genetic clusters in the data that would maximize HW and linkage equilibrium (K). We investigated K = 1 to K = 20 and assigned each individual to a cluster based on the model complexity that maximized marginal likelihood and the model components used to explain structure in data (34). To quantify the level of admixture for each individual (i) across the most likely K, we calculated a diversity score (76) as
$$DS_i = \frac{C_i}{H_{\max}},$$
where $C_i$ is the cumulative admixture and $H_{\max}$ is a scaling factor ($H_{\max} = K \cdot ((1/K) \cdot \ln(1/K))$), making DS relative to complete evenness for each individual. We used a linear mixed model to evaluate whether there was a difference in DS between the native and nonnative range, with site fitted as a random effect.
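A minimal sketch of the diversity score is given below. The text defines the scaling factor but not the cumulative admixture, so the sketch assumes that the cumulative admixture is the sum over clusters of q × ln(q), where q is the individual's admixture proportion in each cluster, and that the score is the ratio of that sum to the scaling factor. Both are assumptions made so that the score equals 0 for an unadmixed individual and 1 for complete evenness.

```python
import math
from typing import Sequence


def diversity_score(admixture: Sequence[float]) -> float:
    """Admixture diversity of one individual, scaled to complete evenness.

    `admixture` holds the individual's proportions across K > 1 clusters
    (assumed to sum to 1). The form of the cumulative admixture is assumed.
    """
    k = len(admixture)
    if k < 2:
        raise ValueError("the score is only defined for K > 1")
    # Assumed cumulative admixture: sum of q * ln(q), skipping empty clusters.
    cumulative = sum(q * math.log(q) for q in admixture if q > 0)
    # Scaling factor as given in the text: K * ((1/K) * ln(1/K)).
    h_max = k * ((1 / k) * math.log(1 / k))
    return cumulative / h_max
```

With K = 4, an individual assigned entirely to one cluster scores 0, an individual split evenly between two clusters scores 0.5, and an individual split evenly among all four scores 1.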
To determine whether multiple introductions of P. lanceolata had occurred in nonnative regions (Australia, Japan, New Zealand, North America, and South Africa), we estimated the minimum number of propagules required to produce the observed level of genetic diversity in nonnative regions (Propmin) (77). We defined the source population as all of Europe because nonnative individuals were usually composed of admixed genotypes from multiple European populations. For each nonnative region, we calculated the number of alleles not present in Europe and removed these from the reference panel of nonnative alleles. Individuals from the native range were then sampled randomly and cumulatively without replacement. Propmin was the number of individuals sampled at the point when all alleles in the nonnative panel were represented. We repeated the process 1,000 times to obtain a mean and SE. We also calculated the number of unique alleles in each of the 53 sites as a measure of uniqueness.
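The resampling procedure for Propmin can be sketched as follows. This is an illustration of the steps described above, not the code used in the study. Representing each individual as a set of allele labels, the function names, and computing the SE as the SD of the replicates divided by the square root of the number of replicates are all assumptions.

```python
import math
import random
import statistics
from typing import List, Set, Tuple


def prop_min_once(native: List[Set[str]], nonnative_alleles: Set[str],
                  rng: random.Random) -> int:
    """Individuals drawn (without replacement) until all target alleles are seen."""
    source_alleles = set().union(*native)
    # Remove nonnative alleles that are absent from the source population.
    target = nonnative_alleles & source_alleles
    if not target:
        return 0
    order = list(range(len(native)))
    rng.shuffle(order)
    seen: Set[str] = set()
    for n_sampled, idx in enumerate(order, start=1):
        seen |= native[idx] & target
        if seen == target:
            return n_sampled
    return len(order)  # not reached: target is a subset of the source alleles


def prop_min(native: List[Set[str]], nonnative_alleles: Set[str],
             n_reps: int = 1000, seed: int = 1) -> Tuple[float, float]:
    """Mean and (assumed) SE of the propagule estimate across replicates."""
    rng = random.Random(seed)
    draws = [prop_min_once(native, nonnative_alleles, rng) for _ in range(n_reps)]
    return statistics.mean(draws), statistics.stdev(draws) / math.sqrt(n_reps)
```

Because the order of sampling is random, the estimate varies among replicates, which is why the procedure is repeated and summarized rather than run once.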
To assess the influence of environmental gradients on spatial genetic structure, we used generalized dissimilarity models (78, 79). We fitted one model for the native range, a second for the nonnative range, and a third for the global dataset (native and nonnative). We calculated genetic differentiation as FST between all pairs of populations in GENEPOP 4.6 (80). Environmental distances between all pairs of populations i and j were calculated from the four BioClim variables x (x_i − x_j) (79). For each of the three datasets, we fitted geographic distance and all environmental distances as predictor variables in a single model. The importance of each variable, given all other variables, was assessed by comparing the fitted model with 500 models with a permuted environmental matrix (79). Thus, the effect of each environmental variable can be interpreted independently, and differences in spatial scale are accounted for by the geographic distance variable. P values were Bonferroni adjusted across all terms within each model. We used deviance explained to assess goodness of fit of the three models. Given sample size differences between the three datasets, we used a bootstrap estimate from 10,000 replicates of the deviance explained to assess the accuracy of the model fit. We assumed the deviance explained to be accurate if the bootstrap 95% CI did not include zero.
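Two small components of this analysis, the pairwise environmental distances and the Bonferroni adjustment, can be sketched as below. The generalized dissimilarity model itself is not reproduced. Using the absolute difference between sites and the function names are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pairwise_env_distance(values: Sequence[float]) -> Dict[Tuple[int, int], float]:
    """Distance in one environmental variable between all pairs of populations.

    Assumption: distance is the absolute difference between site values.
    """
    return {(i, j): abs(values[i] - values[j])
            for i, j in combinations(range(len(values)), 2)}


def bonferroni(pvals: Sequence[float]) -> List[float]:
    """Bonferroni adjustment across all terms in one model, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]
```

With geographic distance and four environmental variables in one model, the adjustment multiplies each P value by five under this sketch.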

Demographic and Dispersal Effects on Genetic Diversity (Hypothesis 3).

We used linear regression to determine if environmental influences on within-population genetic diversity could be explained by demographic variation and whether this effect would be weakened by mass dispersal into the nonnative range (Hypothesis 3). The observation level for all analyses was the population, and the number of observations was 44 (i.e., all populations with genetic and demographic data) (SI Appendix, Table S1).
Genetic diversity was calculated as allelic richness in hierfstat (81) separately for the neutral (18,166 SNPs) and adaptive (3,024 SNPs) datasets. Allelic richness was highly correlated with expected heterozygosity (He; r = 0.98), and because it was standardized for sample size, it eliminated a weak correlation that we observed between He and sample size. We characterized the environment using the four BioClim variables. For demography, we used three variables that can influence genetic diversity (Table 1): population density (rosettes per m²), fecundity, and empirical population growth rate. For fecundity, we used reproductive effort estimated as the rosette-level inflorescence length × number of flowering stems per m². Empirical population growth rate was calculated as r = log(N_{t+1}/N_t), indicating the strength and direction of change in rosettes per m² in the first 2 y of the study (for 38 of the 44 populations with 2 y of data) (SI Appendix, Table S1). Thus, r reflects the combined influence of fecundity and survival (the variables explored in simulation Experiment 1). We used rosette-level data for all metrics to reduce potential observer bias in assessing clonality, but plant- and rosette-level metrics were highly correlated (r = 0.94). Fecundity was log transformed to address a strongly skewed distribution, and all predictors were standardized prior to analysis ((x − mean(x))/SD(x)).
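The growth rate and predictor standardization can be written out as a short sketch. The formulas follow the text; the function names and the use of the sample SD (n − 1 denominator) are assumptions.

```python
import math
import statistics
from typing import List, Sequence


def growth_rate(density_t: float, density_t1: float) -> float:
    """Empirical population growth rate r = log(N_{t+1} / N_t)."""
    return math.log(density_t1 / density_t)


def standardize(values: Sequence[float]) -> List[float]:
    """Center on the mean and scale by the SD: (x - mean(x)) / SD(x)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD
    return [(x - mean) / sd for x in values]


def standardized_log_fecundity(fecundity: Sequence[float]) -> List[float]:
    """Log transform the skewed fecundity values, then standardize."""
    return standardize([math.log(f) for f in fecundity])
```

A stable population gives r = 0, growth gives r > 0, and decline gives r < 0, so the sign of r indicates the direction of change in rosette density between censuses.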
Table 1. Demographic variables used to analyze population processes that are important to genetic diversity
Demographic variable measured | Used as a proxy for | Relevance to genetic diversity | Formula
Density | Population size | Effective population size | No. of rosettes/m² (N)
Reproductive effort per unit area | Fecundity | Fitness | (Inflorescence length × no. flowering stems)/m²
Empirical population growth rate | Combined effects of survival and fecundity | Fitness | log(N_{t+1}/N_t)
The relevance of demographic variables to genetic diversity is outlined in Fig. 1 and described in detail by Ellegren and Galtier (4).
We tested environmental and demographic effects separately to determine which variables best described variation in genetic diversity. The analysis comprised two stages. First, we analyzed the effect of each environmental variable on genetic diversity. Here, we also modeled the environmental effect on demography (i.e., using the three demographic variables as response terms) to establish a baseline for environmental influences on demographic rates. Second, we examined whether each demographic variable influenced genetic diversity. In both stages, we analyzed environmental and demographic interactions with range (native/nonnative). Because of data limitations (n = 44), it was not possible to fit complex models with multiple interaction terms, and therefore, we modeled each predictor separately.
To determine the importance of each environmental or demographic predictor, we used AICc to compare model fit across five alternative model forms: a null model (no predictor variation), a predictor-only model, a range-only model, an additive model (predictor + range), and an interactive model (predictor × range). We considered a model to have support from the data if it improved the fit over the null model by ∆AICc > 2 (82). Among models that fit better than the null, those within ∆AICc ≤ 2 of each other were considered to have equal support from the data. In these cases, we presented the top-ranked model in the main document and supported models in SI Appendix. To interpret interaction models in light of sample size differences between the native (30) and nonnative (14) ranges (e.g., a strong response in the native range and no response in the nonnative range), we obtained a bootstrap 95% CI from 10,000 bootstrap replicates of the interaction coefficient using the adjusted bootstrap percentile method.
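The model selection rule can be sketched as below. The AICc formula is the standard small-sample correction; the function names, the model labels, and the example AICc values are hypothetical and are not results from the study.

```python
from typing import Dict, List


def aicc(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Small-sample corrected AIC: AIC + 2k(k + 1) / (n - k - 1)."""
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def supported_models(aicc_by_model: Dict[str, float], null_name: str = "null",
                     threshold: float = 2.0) -> List[str]:
    """Models that beat the null by more than 2 AICc units and lie within
    2 units of the best such model, ordered from best to worst."""
    null_value = aicc_by_model[null_name]
    better = {name: value for name, value in aicc_by_model.items()
              if name != null_name and null_value - value > threshold}
    if not better:
        return []
    best = min(better.values())
    return sorted((name for name, value in better.items()
                   if value - best <= threshold), key=better.get)


# Hypothetical AICc values for the five model forms.
example = {"null": 110.0, "predictor": 105.0, "range": 109.0,
           "additive": 104.5, "interactive": 107.5}
```

In this hypothetical example, `supported_models(example)` returns the additive and predictor-only models: both beat the null by more than 2 units and lie within 2 units of each other, so they would be treated as having equal support.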

Data Availability

Data deposition: Data and code have been deposited in Zenodo,


Acknowledgments

Jan van Groenendael helped design the PLANTPOPNET network. Leander Anderegg, Lauchlan Fraser, Jennifer Gremer, Emily Griffoul, Adrian Oprea, Richard Shefferson, and Danielle Sherman provided data. Maeve Harrison assisted with field work. Valuable discussions with Alan Stewart and Andrzej Kilian improved our knowledge of Plantago cultivation and SNP data generation, respectively. This research was supported by a Science Foundation Ireland grant to Y.M.B. (European Research Council Development Programme 15/ERCD/2803). A.L.S. was supported by a Marie Skłodowska-Curie Individual Fellowship (746191) under the European Union Horizon 2020 Programme for Research and Innovation. Additional support came from: Catalan Institution for Research and Advanced Studies (ICREA) (Academia Award to S.M.-B.), Spanish Government (Ministerio de Economía y Competitividad BFU2015-64001-P/MINECO/FEDER to S.M.-B.), Estonian Ministry of Education and Research (Institutional Research Funding IUT20–29 to M.P.), European Regional Development Fund (Centre of Excellence EcolChange to M.P.), New Zealand Ministry of Business, Innovation and Employment (Strategic Science Investment Fund to R.G.), and Academy of Finland (285746 to S.R.).

Supporting Information

Appendix (PDF)
Dataset_S01 (XLSX)


A. L. Smith et al., Dispersal responses override density effects on genetic diversity during post-disturbance succession. Proc. Biol. Sci. 283, 20152934 (2016).
J. Duminil et al., Can population genetic structure be predicted from life-history traits? Am. Nat. 169, 662–672 (2007).
E. M. Leffler et al., Revisiting an old riddle: What determines genetic diversity levels within species? PLoS Biol. 10, e1001388 (2012).
H. Ellegren, N. Galtier, Determinants of genetic diversity. Nat. Rev. Genet. 17, 422–433 (2016).
J. Romiguier et al., Comparative population genomics in animals uncovers the determinants of genetic diversity. Nature 515, 261–263 (2014).
R. Leimu, P. I. A. Mutikainen, J. Koricheva, M. Fischer, How general are positive relationships between plant population size, fitness and genetic variation? J. Ecol. 94, 942–952 (2006).
O. E. Gaggiotti, Population genetic models of source–sink metapopulations. Theor. Popul. Biol. 50, 178–208 (1996).
A. R. Hughes, B. D. Inouye, M. T. J. Johnson, N. Underwood, M. Vellend, Ecological consequences of genetic diversity. Ecol. Lett. 11, 609–623 (2008).
P. Vergeer, R. Rengelink, A. Copal, N. J. Ouborg, The interacting effects of genetic variation, habitat quality and population size on performance of Succisa pratensis. J. Ecol. 91, 18–26 (2003).
M. van Kleunen, O. Bossdorf, W. Dawson, The ecology and evolution of alien plants. Annu. Rev. Ecol. Evol. Syst. 49, 25–47 (2018).
J. A. Catford, M. Bode, D. Tilman, Introduced species that overcome life history tradeoffs can cause native extinctions. Nat. Commun. 9, 2131 (2018).
L. A. van Boheemen, D. Z. Atwater, K. A. Hodgins, Rapid and repeated local adaptation to climate in an invasive plant. New Phytol. 222, 614–627 (2019).
S. B. Endriss, C. Alba, A. P. Norton, P. Pyšek, R. A. Hufbauer, Breakdown of a geographic cline explains high performance of introduced populations of a weedy invader. J. Ecol. 106, 699–713 (2018).
K. M. Dlugosch, I. M. Parker, Invading populations of an ornamental shrub show rapid life history evolution despite genetic bottlenecks. Ecol. Lett. 11, 701–709 (2008).
J. R. U. Wilson, E. E. Dormontt, P. J. Prentis, A. J. Lowe, D. M. Richardson, Something in the way you move: Dispersal pathways affect invasion success. Trends Ecol. Evol. (Amst.) 24, 136–144 (2009).
A. Estoup et al., Is there a genetic paradox of biological invasion? Annu. Rev. Ecol. Evol. Syst. 47, 51–72 (2016).
M. Rius, J. A. Darling, How important is intraspecific genetic admixture to the success of colonising populations? Trends Ecol. Evol. (Amst.) 29, 233–242 (2014).
M. Parepa, M. Fischer, C. Krebs, O. Bossdorf, Hybridization increases invasive knotweed success. Evol. Appl. 7, 413–420 (2014).
M. Exposito-Alonso et al., The rate and potential relevance of new mutations in a colonizing plant lineage. PLoS Genet. 14, e1007155 (2018).
K. M. Dlugosch, S. R. Anderson, J. Braasch, F. A. Cang, H. D. Gillette, The devil is in the details: Genetic variation in introduced populations and its contributions to invasion. Mol. Ecol. 24, 2095–2111 (2015).
K. M. Crawford, K. D. Whitney, Population genetic diversity influences colonization success. Mol. Ecol. 19, 1253–1263 (2010).
A. M. O. Oduor, R. Leimu, M. van Kleunen, Invasive plant species are locally adapted just as frequently and at least as strongly as native plant species. J. Ecol. 104, 957–968 (2016).
J. D. Parker et al., Do invasive species perform better in their new ranges? Ecology 94, 985–994 (2013).
A. Uesugi, A. Kessler, Herbivore exclusion drives the evolution of plant competitiveness via increased allelopathy. New Phytol. 198, 916–924 (2013).
T. M. Arredondo, G. L. Marchini, M. B. Cruzan, Evidence for human-mediated range expansion and gene flow in an invasive grass. Proc. Biol. Sci. 285, 20181125 (2018).
R. Salguero-Gómez et al., The COMPADRE plant matrix database: An open online repository for plant demography. J. Ecol. 103, 202–218 (2015).
S.-L. Li, A. Vasemägi, S. Ramula, Genetic variation facilitates seedling establishment but not population growth rate of a perennial invader. Ann. Bot. 117, 187–194 (2016).
S. R. Keller, P. D. Fields, A. E. Berardi, D. R. Taylor, Recent admixture generates heterozygosity-fitness correlations during the range expansion of an invading species. J. Evol. Biol. 27, 616–627 (2014).
C. Pickering, A. Mount, Do tourists disperse weed seed? A global review of unintentional human-mediated terrestrial seed dispersal on clothing, vehicles and horses. J. Sustain. Tour. 18, 239–256 (2010).
R. N. Mack, M. Erneberg, The United States naturalized flora: Largely the product of deliberate introductions. Ann. Mo. Bot. Gard. 89, 176–189 (2002).
R. H. Skinner, A. V. Stewart, Narrow-leaf plantain (Plantago lanceolata L.) selection for increased freezing tolerance. Crop Sci. 54, 1238–1242 (2014).
R. B. Primack, Evolutionary aspects of wind pollination in the genus Plantago (Plantaginaceae). New Phytol. 81, 449–458 (1978).
A. T. Pahl, J. Kollmann, A. Mayer, S. Haider, No evidence for local adaptation in an invasive alien plant: Field and greenhouse experiments tracing a colonization sequence. Ann. Bot. 112, 1921–1930 (2013).
A. Raj, M. Stephens, J. K. Pritchard, fastSTRUCTURE: Variational inference of population structure in large SNP data sets. Genetics 197, 573–589 (2014).
L. H. Rieseberg, A. Widmer, A. M. Arntz, J. M. Burke, The genetic architecture necessary for transgressive segregation is common in both natural and domesticated populations. Philos. Trans. R. Soc. Lond. B Biol. Sci. 358, 1141–1147 (2003).
E. Fenollosa, D. A. Roach, S. Munné-Bosch, Death and plasticity in clones influence invasion success. Trends Plant Sci. 21, 551–553 (2016).
O. François, M. G. B. Blum, M. Jakobsson, N. A. Rosenberg, Demographic history of European populations of Arabidopsis thaliana. PLoS Genet. 4, e1000075 (2008).
R. Pinhasi, J. Fort, A. J. Ammerman, Tracing the origin and spread of agriculture in Europe. PLoS Biol. 3, e410 (2005).
M. J. Grosvenor et al., Human activity was a major driver of the mid-Holocene vegetation change in southern Cumbria: Implications for the elm decline in the British Isles. J. Quaternary Sci. 32, 934–945 (2017).
Ł. Kajtoch et al., Phylogeographic patterns of steppe species in Eastern Central Europe: A review and the implications for conservation. Biodivers. Conserv. 25, 2309–2339 (2016).
J. Villellas, D. F. Doak, M. B. García, W. F. Morris, Demographic compensation among populations: What is it, how does it arise and what are its implications? Ecol. Lett. 18, 1139–1152 (2015).
D. A. Roach, Age-specific demography in Plantago: Variation among cohorts in a natural plant population. Ecology 84, 749–756 (2003).
J. Villellas, R. Berjano, A. Terrab, M. B. García, Escasa correspondencia entre diversidad genética y demografía en una planta a escala continental. Ecosistemas (Madr.) 28, 4–14 (2019).
C. I. Fraser, I. D. Davies, D. Bryant, J. M. Waters, How disturbance and dispersal influence intraspecific structure. J. Ecol. 106, 1298–1306 (2018).
K. Ohbayashi, Y. Hodoki, N. I Kondo, H. Kunii, M. Shimada, A massive tsunami promoted gene flow and increased genetic diversity in a near threatened plant species. Sci. Rep. 7, 10933 (2017).
M. Vallejo-Marín, M. E. Dorken, S. C. H. Barrett, The ecological and evolutionary consequences of clonality for plant mating. Annu. Rev. Ecol. Evol. Syst. 41, 193–213 (2010).
S. R. Coutts, R. Salguero-Gómez, A. M. Csergő, Y. M. Buckley, Extrapolating demography with climate, proximity and phylogeny: Approach with caution. Ecol. Lett. 19, 1429–1438 (2016).
A. Roeder, F. H. Schweingruber, M. Fischer, C. Roscher, Growth ring analysis of multiple dicotyledonous herb species—A novel community-wide approach. Basic Appl. Ecol. 21, 23–33 (2017).
D. T. Krohne, I. Baker, H. G. Baker, The maintenance of the gynodioecious breeding system in Plantago lanceolata L. Am. Midl. Nat. 103, 269–279 (1980).
G. Sagar, J. Harper, Biological flora of the British Isles. Plantago major L., P. media L. and P. lanceolata. J. Ecol. 52, 189–221 (1964).
S. J. Tonsor, Leptokurtic pollen-flow, non-leptokurtic gene-flow in a wind-pollinated herb, Plantago lanceolata L. Oecologia 67, 442–446 (1985).
K. P. Alston, D. M. Richardson, The roles of habitat features, disturbance, and distance from putative source populations in structuring alien plant invasions at the urban/wildland interface on the Cape Peninsula, South Africa. Biol. Conserv. 132, 183–198 (2006).
D. M. Richardson et al., Naturalization and invasion of alien plants: Concepts and definitions. Divers. Distrib. 6, 93–107 (2000).
T. H. Booth, H. A. Nix, J. R. Busby, M. F. Hutchinson, BIOCLIM: The first species distribution modelling package, its early applications and relevance to most current MAXENT studies. Divers. Distrib. 20, 1–9 (2013).



Published in

Proceedings of the National Academy of Sciences
Vol. 117 | No. 8
February 25, 2020
PubMed: 32034102


Data Availability

Data deposition: Data and code have been deposited in Zenodo,

Submission history

Published online: February 7, 2020
Published in issue: February 25, 2020


Keywords

  1. plant invasion
  2. adaptation
  3. global change
  4. population genetics
  5. demography


Acknowledgments

Jan van Groenendael helped design the PLANTPOPNET network. Leander Anderegg, Lauchlan Fraser, Jennifer Gremer, Emily Griffoul, Adrian Oprea, Richard Shefferson, and Danielle Sherman provided data. Maeve Harrison assisted with field work. Valuable discussions with Alan Stewart and Andrzej Kilian improved our knowledge of Plantago cultivation and SNP data generation, respectively. This research was supported by a Science Foundation Ireland grant to Y.M.B. (European Research Council Development Programme 15/ERCD/2803). A.L.S. was supported by a Marie Skłodowska-Curie Individual Fellowship (746191) under the European Union Horizon 2020 Programme for Research and Innovation. Additional support came from the Catalan Institution for Research and Advanced Studies (ICREA) (Academia Award to S.M.-B.), the Spanish Government (Ministerio de Economía y Competitividad BFU2015-64001-P/MINECO/FEDER to S.M.-B.), the Estonian Ministry of Education and Research (Institutional Research Funding IUT20–29 to M.P.), the European Regional Development Fund (Centre of Excellence EcolChange to M.P.), the New Zealand Ministry of Business, Innovation and Employment (Strategic Science Investment Fund to R.G.), and the Academy of Finland (285746 to S.R.).


This article is a PNAS Direct Submission.



Annabel L. Smith
Zoology, School of Natural Sciences, Trinity College Dublin, Dublin 2, Ireland;
School of Agriculture and Food Science, University of Queensland, Gatton, QLD 4343, Australia;
Present address: School of Agriculture and Food Science, University of Queensland, Gatton, QLD 4343, Australia.
Trevor R. Hodkinson
Botany, School of Natural Sciences, Trinity College Dublin, Dublin 2, Ireland;
Jesus Villellas
Departamento Biogeografía y Cambio Global, Museo Nacional de Ciencias Naturales–Consejo Superior de Investigaciones Científicas (MNCN-CSIC), E-28006 Madrid, Spain;
Jane A. Catford
Department of Geography, King’s College London, WC2B 4BG London, United Kingdom;
Anna Mária Csergő
Zoology, School of Natural Sciences, Trinity College Dublin, Dublin 2, Ireland;
Department of Botany, Faculty of Horticultural Science, Szent István University, 1118 Budapest, Hungary;
Soroksár Botanical Garden, Faculty of Horticultural Science, Szent István University, 1118 Budapest, Hungary;
Simone P. Blomberg
School of Biological Sciences, University of Queensland, Brisbane, QLD 4072, Australia;
Elizabeth E. Crone
Department of Biology, Tufts University, Medford, MA 02145;
Johan Ehrlén
Department of Ecology, Environment and Plant Sciences, Stockholm University, SE-106 91 Stockholm, Sweden;
Maria B. Garcia
Pyrenean Institute of Ecology, CSIC, 50059 Zaragoza, Spain;
Anna-Liisa Laine
Department of Evolutionary Biology and Environmental Studies, University of Zurich, CH-8057 Zurich, Switzerland;
Research Centre for Ecological Change, Faculty of Biological and Environmental Sciences, University of Helsinki, FI-00014 Helsinki, Finland;
Deborah A. Roach
Department of Biology, University of Virginia, Charlottesville, VA 22904;
Roberto Salguero-Gómez
Department of Zoology, University of Oxford, OX1 3SZ Oxford, United Kingdom;
Glenda M. Wardle
School of Life and Environmental Sciences, The University of Sydney, Sydney, NSW 2006, Australia;
Dylan Z. Childs
Department of Animal and Plant Sciences, University of Sheffield, S10 2TN Sheffield, United Kingdom;
Bret D. Elderd
Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803;
Alain Finn
Zoology, School of Natural Sciences, Trinity College Dublin, Dublin 2, Ireland;
Sergi Munné-Bosch
Department of Evolutionary Biology, Ecology and Environmental Sciences, University of Barcelona, 08028 Barcelona, Spain;
Institut de Recerca de la Biodiversitat, University of Barcelona, 08028 Barcelona, Spain;
Maude E. A. Baudraz
Zoology, School of Natural Sciences, Trinity College Dublin, Dublin 2, Ireland;
Judit Bódis
Georgikon Faculty, University of Pannonia, H-8360 Keszthely, Hungary;
Francis Q. Brearley
Department of Natural Sciences, Manchester Metropolitan University, M1 5GD Manchester, United Kingdom;
Anna Bucharova
Plant Evolutionary Ecology, Institute of Evolution and Ecology, University of Tübingen, 72074 Tübingen, Germany;
Ecosystem and Biodiversity Research Group, Institute of Landscape Ecology, University of Münster, 48149 Münster, Germany;
Christina M. Caruso
Department of Integrative Biology, University of Guelph, Guelph, ON N1G 2W1, Canada;
Richard P. Duncan
Institute for Applied Ecology, University of Canberra, Canberra, ACT 2617, Australia;
John M. Dwyer
School of Biological Sciences, University of Queensland, Brisbane, QLD 4072, Australia;
CSIRO Land & Water, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Dutton Park, QLD 4102, Australia;
Ben Gooden
CSIRO Health & Biosecurity, CSIRO, Black Mountain, ACT 2601, Australia;
School of Earth, Atmospheric and Life Sciences, Faculty of Science, Medicine and Health, University of Wollongong, Wollongong, NSW 2522, Australia;
Ronny Groenteman
Manaaki Whenua–Landcare Research, 7608 Lincoln, New Zealand;
Liv Norunn Hamre
Department of Environmental Sciences, Western Norway University of Applied Sciences, N-6856 Sogndal, Norway;
Aveliina Helm
Institute of Ecology and Earth Sciences, University of Tartu, 51005 Tartu, Estonia;
Ruth Kelly
Zoology, School of Natural Sciences, Trinity College Dublin, Dublin 2, Ireland;
Lauri Laanisto
Biodiversity and Nature Tourism, Estonian University of Life Sciences, 51006 Tartu, Estonia;
Michele Lonati
Department of Agricultural, Forest and Food Science, University of Torino, 10015 Grugliasco, Italy;
Joslin L. Moore
School of Biological Sciences, Monash University, Clayton, VIC 3800, Australia;
Melanie Morales
Department of Evolutionary Biology, Ecology and Environmental Sciences, University of Barcelona, 08028 Barcelona, Spain;
Research Group of Plant Biology under Mediterranean Conditions, Faculty of Biology, University of Balearic Islands, 07122 Palma de Mallorca, Spain;
Siri Lie Olsen
Norwegian Institute for Nature Research, N-0349 Oslo, Norway;
Meelis Pärtel
Institute of Ecology and Earth Sciences, University of Tartu, 51005 Tartu, Estonia;
William K. Petry
Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544;
Satu Ramula
Department of Biology, University of Turku, 20014 Turku, Finland;
Pil U. Rasmussen
Department of Ecology, Environment and Plant Sciences, Stockholm University, SE-106 91 Stockholm, Sweden;
The National Research Centre for the Working Environment, 2100 København Ø, Denmark;
Simone Ravetto Enri
Department of Agricultural, Forest and Food Science, University of Torino, 10015 Grugliasco, Italy;
Anna Roeder
Department of Physiological Diversity, Helmholtz Centre for Environmental Research – UFZ, 04103 Leipzig, Germany;
German Centre for Integrative Biodiversity Research Halle-Jena-Leipzig (iDiv), 04318 Leipzig, Germany;
Christiane Roscher
Department of Physiological Diversity, Helmholtz Centre for Environmental Research – UFZ, 04103 Leipzig, Germany;
German Centre for Integrative Biodiversity Research Halle-Jena-Leipzig (iDiv), 04318 Leipzig, Germany;
Marjo Saastamoinen
Helsinki Institute of Life Science, University of Helsinki, 00100 Helsinki, Finland;
Organismal and Evolutionary Research Programme, University of Helsinki, 00014 Helsinki, Finland;
Ayco J. M. Tack
Department of Ecology, Environment and Plant Sciences, Stockholm University, SE-106 91 Stockholm, Sweden;
Joachim Paul Töpper
Norwegian Institute for Nature Research, N-5006 Bergen, Norway;
Gregory E. Vose
Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697;
Elizabeth M. Wandrag
Institute for Applied Ecology, University of Canberra, Canberra, ACT 2617, Australia;
School of Environmental and Rural Science, University of New England, Armidale, NSW 2351, Australia;
Astrid Wingler
School of Biological, Earth & Environmental Sciences and Environmental Research Institute, University College Cork, Cork T23 N73K, Ireland;
Yvonne M. Buckley
Zoology, School of Natural Sciences, Trinity College Dublin, Dublin 2, Ireland.


To whom correspondence may be addressed. Email: [email protected].
Author contributions: A.L.S., T.R.H., J.V., J.A.C., A.M.C., S.P.B., E.E.C., J.E., M.B.G., A.-L.L., D.A.R., R.S.-G., G.M.W., D.Z.C., B.D.E., A.F., S.M.-B., and Y.M.B. designed research; A.L.S., T.R.H., J.V., J.A.C., A.M.C., S.P.B., E.E.C., J.E., M.B.G., A.-L.L., D.A.R., R.S.-G., G.M.W., D.Z.C., B.D.E., A.F., S.M.-B., M.E.A.B., J.B., F.Q.B., A.B., C.M.C., R.P.D., J.M.D., B.G., R.G., L.N.H., A.H., R.K., L.L., M.L., J.L.M., M.M., S.L.O., M.P., W.K.P., S.R., P.U.R., S.R.E., A.R., C.R., M.S., A.J.M.T., J.P.T., G.E.V., E.M.W., A.W., and Y.M.B. performed research; A.L.S., S.P.B., and Y.M.B. analyzed data; and A.L.S., T.R.H., J.V., J.A.C., A.M.C., S.P.B., E.E.C., J.E., M.B.G., A.-L.L., D.A.R., R.S.-G., G.M.W., D.Z.C., B.D.E., A.F., S.M.-B., M.E.A.B., J.B., F.Q.B., A.B., C.M.C., R.P.D., J.M.D., B.G., R.G., L.N.H., A.H., R.K., L.L., M.L., J.L.M., M.M., S.L.O., M.P., W.K.P., S.R., P.U.R., S.R.E., A.R., C.R., M.S., A.J.M.T., J.P.T., G.E.V., E.M.W., A.W., and Y.M.B. wrote the paper.

Competing Interests

The authors declare no competing interest.
