Limits of long-term selection against Neandertal introgression

Significance Since the discovery that all non-Africans inherit 2% of their genomes from Neandertal ancestors, there has been a great interest in understanding the fate and effects of introgressed Neandertal DNA in modern humans. A number of recent studies have claimed that there has been continuous selection against introgressed Neandertal DNA over the last 55,000 years. Here, we show that there has been no long-term genome-wide removal of Neandertal DNA, and that the previous result was due to incorrect assumptions about gene flow between African and non-African populations. Nevertheless, selection did occur following introgression, and its effect was strongest in regulatory regions, suggesting that Neandertals may have differed from humans more in their regulatory than in their protein-coding sequences.

Several studies have suggested that introgressed Neandertal DNA was subjected to negative selection in modern humans. A striking observation in support of this is an apparent monotonic decline in Neandertal ancestry observed in modern humans in Europe over the past 45,000 years. Here, we show that this decline is an artifact likely caused by gene flow between modern human populations, which is not taken into account by statistics previously used to estimate Neandertal ancestry. When we apply a statistic that avoids assumptions about modern human demography by taking advantage of two high-coverage Neandertal genomes, we find no evidence for a change in Neandertal ancestry in Europe over the past 45,000 years. We use whole-genome simulations of selection and introgression to investigate a wide range of model parameters and find that negative selection is not expected to cause a significant long-term decline in genome-wide Neandertal ancestry. Nevertheless, these models recapitulate previously observed signals of selection against Neandertal alleles, in particular the depletion of Neandertal ancestry in conserved genomic regions. Surprisingly, we find that this depletion is strongest in regulatory and conserved noncoding regions and in the most conserved portion of protein-coding sequences.
Neandertal | selection | introgression | modern human | demography

Interbreeding between Neandertals and modern humans ∼55,000 y ago has resulted in all present-day non-Africans inheriting at least 1-2% of their genomes from Neandertal ancestors (1,2). There is significant heterogeneity in the distribution of this Neandertal DNA across the genomes of present-day people (3,4), including a reduction in Neandertal alleles in conserved genomic regions (3). This has been interpreted as evidence that some Neandertal alleles were deleterious for modern humans and were subject to negative selection following introgression (3,5). Several studies have suggested that low effective population sizes (N e) in Neandertals led to decreased efficacy of purifying selection and the accumulation of weakly deleterious variants. Following introgression, these deleterious alleles, along with linked neutral Neandertal alleles, would have been subjected to more efficient purifying selection in the larger modern human population (6,7).
In apparent agreement with this hypothesis, a study of Neandertal ancestry in a set of anatomically modern humans from Upper-Paleolithic Europe used two independent statistics to conclude that the amount of Neandertal DNA in modern human genomes decreased monotonically over the last 45,000 y (Fig. 1A, dashed line) (8). This decline was interpreted as direct evidence for continuous negative selection against Neandertal alleles in modern humans (8-11). However, it was not formally shown that selection on deleterious introgressed variants could produce a decline in Neandertal ancestry of the observed magnitude. Nevertheless, this decrease in Neandertal ancestry, together with the suggestion of a higher burden of deleterious alleles in Neandertals, is now commonly invoked to explain the fate of Neandertal ancestry in modern humans (9-12).
Here, we reexamine estimates of Neandertal ancestry in ancient and present-day modern humans, taking advantage of a second high-coverage Neandertal genome that recently became available (13). This allows us to avoid some key assumptions about modern human demography that were made in previous studies. Our analysis shows that the Neandertal ancestry proportion in Europeans has not decreased significantly over the last 45,000 y. Using simulations of selection and introgression, we show that a model of weak selection against deleterious Neandertal variation also does not predict significant changes in Neandertal ancestry during the time period covered by existing ancient modern human samples. In contrast, these simulations do predict a depletion of Neandertal ancestry around functional genomic regions. We then use our updated Neandertal ancestry estimates to examine the genomic distribution of introgressed Neandertal DNA and find that selection against introgression was strongest in regulatory and conserved noncoding regions compared with protein-coding sequence (CDS), suggesting that regulatory differences between Neandertals and modern humans may have been more extreme than protein-coding differences.

Results
Previous Neandertal Ancestry Estimate. A number of methods have been developed to quantify Neandertal ancestry in modern human genomes (14). Among the most widely used is the f 4 -ratio statistic, which measures the fraction of drift shared with one of two parental lineages to determine the proportion of ancestry, α, contributed by that lineage (Fig. 1 and SI Appendix, Fig. S1) (15,16). Although they have been used to draw inferences about gene flow between archaic and modern human populations, f 4 -ratio statistics are known to be sensitive to violations of the underlying population model (15). Estimating α, the proportion of ancestry in X contributed by a lineage A, requires a sister lineage B to lineage A which does not share drift with X after separation of B from A (SI Appendix, Fig. S1). Fu et al. (8) used an f 4 -ratio statistic to infer the contribution from an archaic lineage by first estimating the proportion of East African ancestry in a non-African individual X, under the assumption that Central and West Africans (B) are an outgroup to the East African lineage (A) and to the modern human ancestry in non-Africans. Defining this East African ancestry proportion as α = f 4 (C. and W. Africans, Chimp; X, Archaics)/f 4 (C. and W. Africans, Chimp; E. Africans, Archaics), the proportion of archaic ancestry was then calculated simply as 1 − α, under the assumption that all ancestry that is not of East African origin must come from an archaic lineage (8). We refer to this statistic as an "indirect f 4 -ratio." Given the sensitivity of the f 4 -ratio method to violations of the underlying population models (15), we explored the validity of assumptions on which this calculation was based. In addition to the topology of the demographic tree, which has recently been shown to be incorrect (17), the indirect f 4 -ratio assumes that the relationship between Africans and West Eurasians has remained constant over time (8). However, our understanding of modern human history and demography has been challenged by new fossil discoveries (18) and the analysis of ancient DNA, with several studies documenting previously unknown migration events in both West Eurasia (19) and Africa (17,20,21). Furthermore, an f 4 statistic sensitive to changes in the relationships between West Eurasians and various African populations [formulated as f 4 (Ust'-Ishim, X; African, Chimp), where X is a West Eurasian individual] shows increasing allele sharing between West Eurasians and Africans over time (SI Appendix, Fig. S2A).
In contrast, f 4 (Ust'-Ishim, Papuan; African, Chimp) is not significantly different from zero (|Z| < 1 when using Dinka, Yoruba, or Mbuti in the third position of the f 4 statistic), demonstrating that this trend is not shared by all non-Africans.
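As an illustration of how an f 4 statistic of the form f 4 (Ust'-Ishim, X; African, Chimp) and its Z-score can be obtained from allele frequencies, the following minimal Python sketch may help. It is not the ADMIXTOOLS implementation: ADMIXTOOLS derives SEs from a block jackknife, whereas the sketch uses a simplified delete-one-block jackknife with equal-sized blocks. All frequencies are randomly generated toy values, and the 0.5 mixing weight and the function names are assumptions made purely for illustration.

```python
import math
import random


def f4(a, b, c, d):
    """f4(A, B; C, D): mean over sites of (a - b) * (c - d) for allele frequencies."""
    return sum((ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d)) / len(a)


def f4_jackknife(a, b, c, d, n_blocks=20):
    """Return (estimate, SE, Z) using a delete-one-block jackknife.

    Simplification (assumption): contiguous blocks of equal size and no
    block weighting.
    """
    size = len(a) // n_blocks
    n = size * n_blocks  # drop trailing sites that do not fill a block
    a, b, c, d = a[:n], b[:n], c[:n], d[:n]
    estimate = f4(a, b, c, d)
    leave_one_out = []
    for k in range(n_blocks):
        lo, hi = k * size, (k + 1) * size
        leave_one_out.append(
            f4(a[:lo] + a[hi:], b[:lo] + b[hi:], c[:lo] + c[hi:], d[:lo] + d[hi:])
        )
    mean_loo = sum(leave_one_out) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((t - mean_loo) ** 2 for t in leave_one_out)
    se = math.sqrt(var)
    return estimate, se, estimate / se


# Toy data (hypothetical): derived-allele frequencies at 5,000 sites.
random.seed(42)
n_sites = 5000
chimp = [0.0] * n_sites  # chimpanzee carries the ancestral allele
ust_ishim = [random.random() for _ in range(n_sites)]
african = [random.random() for _ in range(n_sites)]
x_unrelated = [random.random() for _ in range(n_sites)]
# An individual that shares alleles with Africans (assumed 50% mixing weight).
x_shared = [0.5 * x + 0.5 * f for x, f in zip(x_unrelated, african)]

est_unrelated, se_unrelated, z_unrelated = f4_jackknife(ust_ishim, x_unrelated, african, chimp)
est_shared, se_shared, z_shared = f4_jackknife(ust_ishim, x_shared, african, chimp)
print(f"no extra sharing: f4 = {est_unrelated:.4f}, Z = {z_unrelated:.2f}")
print(f"extra sharing:    f4 = {est_shared:.4f}, Z = {z_shared:.2f}")
```

With the argument order used here, extra allele sharing between X and Africans pushes the statistic in the negative direction; the sign convention depends only on the ordering of the four populations.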
To evaluate the sensitivity of the indirect f 4 -ratio to migration events, we performed neutral simulations of Neandertal, West Eurasian, and African demographic histories (Fig. 2). All simulations included introgression from Neandertals into West Eurasians, and varying levels of migration between Africans and West Eurasians, and between African populations. We find that gene flow from West Eurasians into Africans leads to misestimates of Neandertal ancestry when using the indirect f 4 -ratio statistic, and results in the incorrect inference of a continuous decline in Neandertal ancestry. This decline is not observed in the true simulated Neandertal ancestry (Fig. 2A). The magnitude of this bias depends on the total amount of West Eurasian gene flow into Africa, with larger amounts leading to apparent steeper declines (Fig. 2A). Additionally, gene flow between the two African populations used in the indirect f 4 -ratio calculation leads to overestimation of the true level of Neandertal ancestry (Fig. 2C). Overall, we find that a combination of West Eurasian migration to Africa and gene flow between African populations can produce patterns that are very similar to those observed in the empirical data (Fig. 2D and SI Appendix, Fig. S3A). However, we caution that effective population sizes and the timing of migration also affect these estimates (SI Appendix, Fig. S3), and that there are likely many additional models that match the empirical data.
We note that an independent statistic, using a different set of genomic sites in the same ancient individuals, had been used as a second line of evidence for an ongoing decrease in Neandertal ancestry (8). This statistic, which we refer to as the "admixture array statistic," measures the proportion of Neandertal-like alleles in a given sample at sites where present-day Yoruba individuals carry a nearly fixed allele that differs from homozygous sites in the Altai Neandertal (22). Much like the indirect f 4 -ratio, we find that the admixture array statistic is affected by gene flow from non-Africans into Africans and incorrectly infers a decline in the Neandertal ancestry over time (Fig. 2D).

Fig. 2 legend: "Total migration" is shown, that is, gm, where g is generations of migration, and m is the proportion of the target population composed of migrants in each generation. If present, continuous migration between A 1 and A 2 begins 40 kya and migration between Europe and Africa begins 5 kya. True Neandertal ancestry proportions are shown in black, and closely match the direct f 4 -ratio estimates (mean absolute difference from truth for indirect f 4 -ratio is 2.6%, 0.12%, and 2.8% for A, B, and C, respectively; for direct f 4 -ratio 0.25%, 0.05%, and 0.06%). (D) Simulations of an example demographic model with migration parameters 0.09, 0.0, and 0.1 for E → A, A → E, and A ↔ A, respectively, which approximate the empirical direct and indirect f 4 -ratios (Fig. 1A).
Given the indirect f 4 -ratio's sensitivity to modern human demography, combined with our incomplete understanding of human migrations, we sought to reevaluate the patterns of Neandertal ancestry in modern humans in a more robust manner.
A Robust Statistic to Estimate Neandertal Ancestry. The recent availability of a second high-coverage Neandertal genome allows us to estimate Neandertal ancestry using two Neandertals: an individual from the Altai Mountains, the so-called "Altai Neandertal" (23), and an individual from the Vindija Cave in Croatia, the so-called "Vindija Neandertal" (13). Specifically, we can estimate the proportion of ancestry coming from the Vindija lineage into a modern human (X) using the Altai Neandertal as a second Neandertal in an f 4 -ratio calculated as f 4 (Altai, Chimp; X, African)/f 4 (Altai, Chimp; Vindija, African), which we refer to as a "direct f 4 -ratio" (Fig. 1C and SI Appendix, Fig. S1). Note that unlike the indirect f 4 -ratio described previously, the f 4 -ratio in this formulation does not make assumptions about deep relationships between modern human populations (Fig. 1C and SI Appendix, Fig. S1). Instead, it assumes that any Neandertal population that contributed ancestry to X formed a clade with the Vindija Neandertal. Recent analyses showed that this is the case for all non-African populations studied to date, including the ancient modern humans in this study (13,24). When calculated on the simulations described above, we find that the direct f 4 -ratio is more robust than the indirect f 4 -ratio (Fig. 2). In fact, its temporal trajectory always closely matches the true simulated Neandertal ancestry trajectory, regardless of the specific parameters of gene flow between non-Africans and Africans (Fig. 2). We note that gene flow from West Eurasians into Africans, which introduces introgressed Neandertal alleles into Africa, produces a slight underestimate of Neandertal ancestry in all samples (Fig. 2A).
This is in agreement with empirical direct f 4 -ratio estimates, which vary depending on the African population used in the calculation, with African populations known to carry West Eurasian ancestry (e.g., Mozabite, Saharawi) (17,25) generating the lowest estimates (SI Appendix, Fig. S4). Crucially, when we use the direct f 4 -ratio to estimate the trajectory of Neandertal ancestry in ancient and present-day Europeans, we observe nearly constant levels of Neandertal ancestry over time (Fig. 1A, points and solid line) and find that a null model of zero slope can no longer be rejected (Fig. 1A, P = 0.36, estimated via resampling as described in SI Appendix, section S1).
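The logic of the direct f 4 -ratio can be illustrated with a small Python sketch. It assumes idealized toy allele frequencies in which X is an exact mixture of a Vindija-like lineage (proportion α) and an African-like lineage (proportion 1 − α), with no drift or sampling noise; under that assumption the numerator is exactly α times the denominator. The frequencies, the 90% Altai-Vindija allele concordance, and the function names are hypothetical and are not taken from the admixr or ADMIXTOOLS code.

```python
import random


def f4(a, b, c, d):
    """f4(A, B; C, D): mean over sites of (a - b) * (c - d) for allele frequencies."""
    return sum((ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d)) / len(a)


def direct_f4_ratio(altai, chimp, x, african, vindija):
    """f4(Altai, Chimp; X, African) / f4(Altai, Chimp; Vindija, African)."""
    return f4(altai, chimp, x, african) / f4(altai, chimp, vindija, african)


# Toy data (hypothetical values, for illustration only).
random.seed(1)
n_sites = 5000
chimp = [0.0] * n_sites  # chimpanzee carries the ancestral allele
african = [random.random() for _ in range(n_sites)]
vindija = [random.choice([0.0, 1.0]) for _ in range(n_sites)]
# Altai shares most alleles with Vindija (assumed 90% concordance).
altai = [v if random.random() < 0.9 else 1.0 - v for v in vindija]

alpha_true = 0.02  # assumed Neandertal ancestry proportion in X
x = [alpha_true * v + (1.0 - alpha_true) * f for v, f in zip(vindija, african)]

alpha_hat = direct_f4_ratio(altai, chimp, x, african, vindija)
print(f"true alpha = {alpha_true}, direct f4-ratio estimate = {alpha_hat:.6f}")
```

Because x − african equals α(vindija − african) at every site in this toy setup, the ratio recovers α up to floating-point error. Real data add drift, sampling noise, and the clade assumption discussed above.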
We note that these estimates are based on a relatively small number of individuals, especially for older time points, and that the CIs are wide. For example, we cannot reject a linear decline in Neandertal ancestry of approximately half a percent over the timespan of this dataset (95% CI −0.51% to 0.37%). Additionally, these analyses are performed on SNPs that were ascertained largely in present-day individuals. To examine the effects of such ascertainment, we split the dataset based on the ascertainments used and recalculated the direct and indirect f 4 -ratios on each of the subsets (SI Appendix, Fig. S5). Although the slopes show some variability, in all but one ascertainment subset the direct f 4 -ratio cannot reject a slope of 0, whereas the indirect f 4 -ratio consistently rejects a slope of 0, suggesting that these results are robust to the effects of ascertainment (SI Appendix, Fig. S5). In addition to calculating direct f 4 -ratio estimates, we estimated Neandertal ancestry proportions using the qpAdm method (26) and obtained similar results (null model of zero slope using Neandertal ancestry point estimates cannot be rejected with P = 0.17).
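One plausible way to test a null model of zero slope by resampling is sketched below in Python. This is an assumption-laden illustration, not the procedure of SI Appendix, section S1: each ancestry estimate is redrawn from a normal distribution defined by its point estimate and SE, a least-squares line is fitted to every replicate, and a two-sided P value is taken from the fraction of resampled slopes on either side of zero. The ages, estimates, and SEs are invented placeholder values, not data from this study.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def resampled_slopes(ages, estimates, std_errors, n_resamples=2000, seed=0):
    """Slopes fitted to estimates redrawn from Normal(estimate, SE)."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_resamples):
        redrawn = [rng.gauss(e, s) for e, s in zip(estimates, std_errors)]
        slopes.append(ols_slope(ages, redrawn))
    return slopes


def two_sided_p(slopes):
    """Two-sided P value for slope = 0 from the resampled slope distribution."""
    frac_positive = sum(s > 0 for s in slopes) / len(slopes)
    return min(1.0, 2 * min(frac_positive, 1 - frac_positive))


# Placeholder inputs (hypothetical): sample ages in ky before present,
# ancestry point estimates, and their standard errors.
ages = [45, 37, 30, 15, 8, 0]
estimates = [0.023, 0.021, 0.024, 0.022, 0.023, 0.022]
std_errors = [0.003] * len(ages)

slopes = resampled_slopes(ages, estimates, std_errors)
ordered = sorted(slopes)
ci_low = ordered[int(0.025 * len(ordered))]
ci_high = ordered[int(0.975 * len(ordered)) - 1]
p_value = two_sided_p(slopes)
print(f"95% interval for slope: [{ci_low:.2e}, {ci_high:.2e}], P = {p_value:.2f}")
```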
Our observation that there has been no change in Neandertal ancestry over the past 45,000 y has several implications for our understanding of the fate of Neandertal DNA in modern humans. First, it constrains the timescale during which selection could have significantly affected the average genome-wide Neandertal ancestry in modern humans, an issue addressed below in more detail. Second, a previous analysis of a 40 ky old individual ("Tianyuan") from East Asia applied the indirect f 4 -ratio statistic to estimate his Neandertal ancestry proportion at 5% (27). When we apply the direct f 4 -ratio statistic for this individual, we arrive at a value of ∼2.1% (using Dinka as the African group in the calculation). Third, it has consequences for the so-called "dilution" hypothesis, which suggests that lower levels of Neandertal ancestry in Europeans compared with East Asians can be explained by dilution of Neandertal ancestry in Europeans due to admixture with a hypothetical Basal Eurasian population that carried little to no Neandertal ancestry (19,28). Previous studies have found Basal Eurasian ancestry in all modern and some ancient Europeans [in this study, four ancient individuals show evidence of Basal Eurasian ancestry: Satsurblia (15 kya), Kotias (10 kya), Ranchot88 (10 kya), and Stuttgart (8 kya), SI Appendix, Fig. S6] (8, 19). Our finding that there is no ongoing decline in Neandertal ancestry in Europeans suggests that Neandertal ancestry in Europe has not been diluted in a significant way by gene flow from Basal Eurasians. Specifically, we find no difference in Neandertal ancestry in European individuals with and without Basal Eurasian ancestry (direct f 4 -ratio mean 2.31% vs. 2.38%, respectively; P = 0.36). 
However, given the small number of relevant samples we also cannot exclude that there could be up to 13% less Neandertal ancestry in individuals with Basal Eurasian ancestry, or as much as 6% more Neandertal ancestry in individuals without Basal Eurasian ancestry (95% CI).
In contrast, we do find that present-day Near Easterners carry significantly less Neandertal ancestry than Europeans (direct f 4 -ratio mean 2.03% vs. 2.33%; P = 0.001; SI Appendix, Fig. S7A). Furthermore, present-day populations in the Near East show even stronger signals of admixture with a deeply divergent modern human lineage than observed in the rest of West Eurasians (SI Appendix, Fig. S7B), suggesting that they carry additional ancestry components that are not present in Europe and that could potentially contribute to lower Neandertal ancestry in the Near East. We note, however, that a simple model of admixture from Africa into Near East would be expected to produce a similar f 4 statistics difference between Near East and the rest of West Eurasia and could also explain lower values of Neandertal ancestry in this population.
Long-Term Dynamics of Selection Against Introgressed DNA. Our observation that Neandertal ancestry levels did not significantly decrease from ∼45,000 y ago until today is seemingly at odds with the hypothesis that lower effective population sizes in Neandertals led to an accumulation of deleterious alleles, which were then subjected to negative selection in modern humans (3, 8-10). To investigate the expected long-term dynamics of selection against Neandertal introgression under this hypothesis, we simulated a model of the human genome with empirical distributions of functional regions and selection coefficients, extending a strategy previously applied by Harris and Nielsen (6). We simulated modern human and Neandertal demography, including a low long-term effective population size (N e) in Neandertals (Neandertal N e = 1,000 vs. modern human N e = 10,000) and 10% introgression at 55 kya (2,200 generations ago, assuming generation time of 25 y). To track the changes in Neandertal ancestry following introgression, we placed fixed Neandertal-human differences as neutral markers, both outside regions that accumulated deleterious mutations (to study the effect of negative selection on linked genome-wide neutral Neandertal variation) as well as within regions directly under selection (to track the effect of negative selection itself) (Fig. 3A).
Similar to Harris and Nielsen (6), we observed abrupt removal of Neandertal alleles from the modern human population during the first ∼10 generations after introgression, followed by quick stabilization of Neandertal ancestry levels (Fig. 3B). Compared with empirical estimates of Neandertal ancestry, we find a better fit between these simulations and the direct f 4 -ratio estimate than with the indirect f 4 -ratio estimate, suggesting that our direct Neandertal ancestry estimates are consistent with theoretical expectations of genome-wide selection against introgression (Fig. 3B). Specifically, simulations show −0.004% change in Neandertal ancestry over 45 ky; in the empirical data this slope is not rejected using the direct f 4 -ratio (P = 0.29), but is significantly different from the indirect f 4 -ratio (P < 0.001).
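The qualitative pattern of rapid early removal followed by stabilization can be reproduced with a deliberately simple deterministic recursion, sketched below in Python. This is a toy model and not the SLiM simulation used here. It assumes that fitness declines linearly with an individual's Neandertal ancestry fraction (hypothetical coefficient s), that the mean ancestry therefore changes each generation by −s·V/(1 − s·p), where V is the variance in ancestry among individuals, and that V halves every generation as recombination and segregation break up introgressed genomes. The effect of selection on V is ignored.

```python
def ancestry_trajectory(p0=0.10, s=0.3, generations=55_000 // 25):
    """Toy recursion for mean Neandertal ancestry after a single admixture pulse.

    p0 is the initial admixture proportion (10%, as in the simulations above);
    s is a hypothetical fitness cost of carrying a fully Neandertal genome;
    generations defaults to 55,000 y / 25 y per generation = 2,200.
    """
    p = p0
    variance = p0 * (1 - p0)  # individuals start as fully Neandertal or fully modern
    trajectory = [p]
    for _ in range(generations):
        # Change in the mean under linear selection: -s * Var / mean fitness.
        p -= s * variance / (1 - s * p)
        # Assumed: ancestry variance among individuals halves each generation.
        variance *= 0.5
        trajectory.append(p)
    return trajectory


trajectory = ancestry_trajectory()
for generation in (0, 1, 5, 10, 400, 2200):
    print(f"generation {generation:>4}: ancestry = {trajectory[generation]:.4f}")
```

In this sketch nearly all of the decline happens within the first ten or so generations, because the variance that selection acts on decays geometrically. The final ancestry level depends on the arbitrary choice of s and should not be compared with empirical estimates.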
Because many factors can potentially influence the efficacy of negative selection, and no model fully captures all of these, we next sought to determine whether there is a combination of model parameters that could potentially lead to long-term continuous removal of Neandertal ancestry over time. Surprisingly, we failed to find a model which would produce a significant decline over time, although we tried by: (i) decreasing the long-term Neandertal N e before introgression (making purifying selection in Neandertals even less efficient), (ii) increasing the N e of modern humans after introgression (i.e., increasing the efficacy of selection against introgressed alleles), (iii) artificially increasing the deleteriousness of Neandertal variants after introgression (approximating a "hybrid incompatibility" scenario), (iv) simulating mixtures of dominance coefficients, or by (v) increasing the total amount of functional sequence (thereby increasing the number of accumulated deleterious variants in Neandertals and modern humans) (SI Appendix, Figs. S9-S13). Varying these factors primarily affected the magnitude of the initial removal of introgressed DNA by increasing the number of perfectly linked deleterious mutations in early Neandertal-modern human offspring (decreasing their fitness compared with individuals with less Neandertal ancestry), which in turn influenced the final level of Neandertal ancestry in the population (SI Appendix, Figs. S9-S13).
The depletion of Neandertal ancestry around functional genomic elements in modern human genomes has also been taken as evidence for selection against Neandertal introgressed DNA (3,8). We next examined the genomic distribution of Neandertal markers at different time points in our simulations to determine whether our models can recapitulate these signals. In agreement with empirical results in present-day humans (3), we found a strong negative correlation between the proportion of Neandertal introgression surviving at a locus and distance to the nearest region under selection (Fig. 3C). Furthermore, we found that the strength of this correlation increases over time, with the bulk of these changes occurring between 10 and 400 generations postadmixture [mean Pearson's correlation coefficient ρ = 0.07, 0.79, 0.96 at generations 10, 400, and 2,200, respectively (SI Appendix, Fig. S15)]. We note that this time period predates all existing ancient modern human sequences, frustrating any current comparison with empirical data. However, despite no apparent change in genome-wide Neandertal ancestry proportion over time, we observe a smaller though still significant decrease in linked Neandertal ancestry during the time period for which modern human sequences exist (∼400-2,200 generations post-admixture) (Fig. 3 C and B). Indeed, by looking at the average per-generation changes in frequencies of simulated Neandertal mutations (that is, derivatives of allele frequencies in each generation), we observe the impact of negative selection on linked neutral Neandertal markers until at least ∼700 generations post admixture (Fig. 3D) and find that it closely follows the pattern of introgressed deleterious mutations (Fig. 3D). After this period of gradual removal, selection against linked neutral variation slows down significantly as genome-wide Neandertal ancestry becomes largely unlinked from regions that are under negative selection (Fig. 3D). 
In contrast, the selected variants themselves are still removed, although at increasingly slower rates (Fig. 3D). Due to this slow rate, and the small contribution these alleles make to genome-wide Neandertal ancestry, their continued removal has little impact on the slope of Neandertal ancestry over time.

Neandertal DNA Is Depleted in Regulatory and Conserved Noncoding Sequence. We next sought to leverage the direct f 4 -ratio in analyses of selection against introgression in functional genomic regions. Although previous studies have identified a depletion of Neandertal DNA in genomic regions with a high degree of evolutionary conservation, these studies have relied on maps of introgressed haplotypes (3,29). Such maps may lack power to detect introgressed Neandertal DNA in highly conserved regions, as these regions may contain fewer informative sites carrying Neandertal-modern human differences. Furthermore, previous studies of negative selection against introgressed Neandertal DNA divided the genome into bins based on measures of evolutionary conservation, such as B values (30), which are not easily interpreted in terms of functional significance. To determine whether particular functional classes of genomic sites are differently affected by Neandertal introgression, we partitioned the human genome by functional annotation obtained from Ensembl v91 (31), and by primate conserved regions inferred using phastCons (32). For each annotation category, we estimated the Neandertal ancestry proportion in non-African Simons Genome Diversity Project (SGDP) individuals (excluding Oceanians) using the direct f 4 -ratio (Fig. 4).
In seeming contrast with previous studies (3,8), we observed no significant depletion of Neandertal ancestry in CDS compared with intronic and intergenic regions (referred to as "gap" regions below) (average direct f 4 -ratio ∼1.94% in both; Fig. 4). However, we did identify a striking depletion of Neandertal ancestry in both promoters and phastCons conserved regions (1.15% and 0.95%), with both containing significantly less Neandertal ancestry than gap regions (P = 0.004 and P < 0.0001, estimated via resampling as described in SI Appendix, section S1). We note that 62% of CDS overlaps with phastCons regions (21% of phastCons conserved tracks overlap CDS); indeed, conserved CDS has a lower Neandertal ancestry estimate (1.25%) than overall CDS, although not as low as all phastCons regions (Fig. 4). These results suggest that previously observed depletions in conserved and genic regions may not have been driven primarily by protein-coding differences between Neandertals and modern humans, as was previously assumed, but rather by differences in promoters and other noncoding conserved sequence. This hypothesis is supported by several recent studies of the effects of introgressed Neandertal sequences, including those with signatures of adaptive introgression, which found that surviving functional introgressed haplotypes have their major influence on gene expression regulation (33-37).
We note that the lack of a depletion in CDS does not fit the observations from our simulations (Fig. 3C). Assuming additivity, and a distribution of fitness effects (DFE) derived from the frequency spectra of mutations altering coding sequence (38), these simulations predict a reduction of 5-17% Neandertal ancestry versus nonselected regions, depending on distance from selected regions (Fig. 3C). In addition, the reduction in simulations is much smaller than the empirical depletions of promoter and phastCons regions (40% and 51%, respectively). Together, these discrepancies demonstrate that the actions of selection against Neandertal sequence are not fully captured by the models presented here. Although it is beyond the scope of this work, it may be possible to leverage distributions of Neandertal ancestry in studying the action of selection in noncoding sequence. Challenges associated with such work include the uncertainty of the DFE of mutations affecting noncoding sequence, and their dominance coefficients, potential epistatic effects of regulatory mutations, as well as the fact that a single deleterious mutation can affect a region falling into multiple functional categories at once (SI Appendix, Table S1).
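The relative depletions quoted above follow from the rounded ancestry estimates reported in the text, taking the ∼1.94% estimate for gap regions as the baseline. The short Python check below uses only those rounded values, so small differences from the stated percentages are expected; the dictionary keys are labels chosen for illustration.

```python
# Rounded direct f4-ratio estimates (percent Neandertal ancestry) from the text.
gap_regions = 1.94  # intronic and intergenic baseline
estimates = {
    "promoters": 1.15,
    "phastCons conserved regions": 0.95,
    "conserved CDS": 1.25,
}

# Relative depletion = 1 - (ancestry in category / ancestry in gap regions).
depletion = {name: 1 - value / gap_regions for name, value in estimates.items()}

for name, fraction in depletion.items():
    print(f"{name}: {100 * fraction:.1f}% less Neandertal ancestry than gap regions")
```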

Conclusions
Our reevaluation of Neandertal ancestry in modern human genomes indicates that overall levels of Neandertal ancestry in Europe have not significantly decreased over the past 45,000 y, and that previous observations of continuous Neandertal ancestry decline were likely an artifact of unaccounted-for gene flow increasing allele sharing between West Eurasian and African populations. Nevertheless, we do find evidence of selection against Neandertal DNA in the genome-wide distribution of Neandertal ancestry, with such ancestry depleted in promoter and other noncoding conserved DNA more strongly than in protein-coding sequence, raising the possibility that Neandertals may have differed more from modern humans in their regulatory variants than in their protein-coding sequences, and that regulatory variation may provide a richer template for selection to act upon.
Furthermore, simulations suggest that negative selection against introgression is expected to have the strongest impact on genome-wide Neandertal ancestry during the first few hundred generations, before the time frame for which ancient samples are currently available. The genomes of early modern humans living 55-50 kya, although difficult to obtain, may shed additional light on the process of selection against Neandertal DNA, as well as on early out-of-Africa demography.
Our findings can be extrapolated to other cases where one species or population contributes a fraction of ancestry to another species or population, a frequent occurrence in nature (5, 29, 39-41). Even in cases where the introgressing population carries a high burden of deleterious mutations, negative selection is not expected to result in an extended decrease in the overall genome-wide ancestry contributed by that population. Therefore, any long-term shifts in overall ancestry proportions over time are likely to be the result of forces other than negative selection, for example admixture with one or more other populations.

Materials and Methods
Source Code and Jupyter Notebooks. Complete source code for data processing and simulation pipelines, as well as R and Python Jupyter notebooks with all analyses, can be downloaded from the project repository on GitHub: https://www.github.com/bodkan/nea-over-time.

Fig. 4 legend: Gray dashed line shows mean Neandertal ancestry in conserved phastCons regions. (Bottom) Idealized representation of genomic regions.
Admixture Statistics. All f 4 statistics, f 4 -ratio, and qpAdm statistics were calculated on the merged 2.2 million loci EIGENSTRAT dataset using our R package admixr (available from https://www.github.com/bodkan/admixr) which utilizes the ADMIXTOOLS software suite for all underlying calculations (15).
Estimates of Neandertal Ancestry. Indirect f 4 -ratio estimates (Fig. 1A, dashed line) were calculated as 1 − f 4 (C. and W. Africans, Chimpanzee; X, Archaics)/f 4 (C. and W. Africans, Chimpanzee; E. Africans, Archaics) (SI Appendix, Fig. S1), as described in the original Fu et al. study (8). Direct f 4 -ratio estimates (Fig. 1A, solid line) were calculated as f 4 (Altai, Chimpanzee; X, African)/f 4 (Altai, Chimpanzee; Vindija, African) (SI Appendix, Fig. S1). Neandertal ancestry proportions using qpAdm were estimated assuming a two-source model, with the Vindija Neandertal and Mbuti as potential sources, and Chimpanzee, the Altai Neandertal, and the Denisovan as outgroups. Admixture array-based Neandertal ancestry estimates were calculated as the proportion of alleles in a test individual matching the allele seen in Neandertals. Confidence intervals and P values were calculated using a resampling strategy described in SI Appendix, section S1.
Affinity of Ancient and Present-Day Individuals Toward Africans over Time. We calculated f 4 statistics in the form f 4 (Ust'-Ishim, X; Y, Chimpanzee), which test for changes in the sharing of derived alleles between a series of West Eurasians (X) and population Y with respect to Ust'-Ishim, an ancient hunter-gatherer that predates the split of West and East Eurasians (43) (SI Appendix, Fig. S2). Admixture between X and Y or populations related to X and Y is expected to lead to an increase in the proportion of shared derived alleles.
Testing for the Presence of Basal Eurasian Ancestry. We used the statistic f 4 (West Eurasian W, Han; Ust'-Ishim, Chimpanzee) to look for evidence of Basal Eurasian ancestry in a West Eurasian W (SI Appendix, Fig. S4) (28). This statistic tests whether the data are consistent with a tree in which the W and Han lineages form a clade, which results in an f 4 statistic not significantly different from 0. Significantly negative values are evidence for an affinity between the Ust'-Ishim and Han lineages, which could be explained by W carrying ancestry from a population that diverged from the non-African lineage before the split of Ust'-Ishim.
Neutral Coalescent Simulations. To study the effects of gene flow between non-African and African populations on various admixture statistics, we simulated different scenarios of such gene flow using a neutral coalescent programming library, msprime (44) (SI Appendix, Fig. S8). Depending on the particular analysis ( Fig. 2 and SI Appendix, Fig. S2 and S3), we calculated admixture statistics (f 4 , f 4 -ratio, and admixture array proportions) as described above using SNPs extracted from each simulation run. Detailed description of the simulations can be found in SI Appendix, section S2.
Simulations of Selection. To study the dynamics of selection against Neandertal introgression over time, we used the simulation framework SLiM 2 (45) to build a realistic model of the human genome with empirical distributions of functional and conserved regions and selection coefficients, extending and generalizing a strategy previously applied by Harris and Nielsen (6) (Fig. 3A). First, we simulated a demography of modern humans and Neandertals (low long-term N e) before the introgression, and let the simulated genomes accumulate deleterious mutations. Then we simulated a single pulse of admixture from Neandertals into the non-African population at a rate of 10% and tracked the changes in Neandertal ancestry in an admixed population at fixed neutral Neandertal markers distributed along each Neandertal genome before the introgression. A detailed description of our simulations and analyses of simulated data can be found in SI Appendix, sections S3 and S4.