DNA methylation signatures of early-life adversity are exposure-dependent in wild baboons

Significance The environment animals face when young can affect how they function throughout life. Long-lasting changes in DNA methylation—a chemical mark deposited on DNA that can affect gene activity—have been hypothesized to contribute to early-life effects. But evidence for persistent, early environment-associated differences in DNA methylation is lacking in wild animals. Here, we show that early-life adversity in wild baboons predicts DNA methylation levels in adulthood, especially for animals born in low-resource environments and drought conditions. We also show that some of the changes we observe in DNA methylation have the capacity to influence gene activity levels. Together, our results support the idea that early experiences can become biologically embedded in the genomes of wild animals.

Environmental adversity is a key predictor of morbidity, mortality, and Darwinian fitness in animals. In some cases, these effects are immediate. However, in long-lived species, exposure to adversity can be temporally separated from its outcomes later in life (1), creating lagged associations between environmental experience and trait variation. In humans, for example, adverse childhood experiences predict elevated disease risk and years of lost life many decades later (2, 3). Similarly, in natural baboon, hyena, and bighorn sheep populations, individuals exposed to social, ecological, or physical adversity in early life often survive to adulthood, but on average live shorter adult lives (4–6). Experimental studies in rodents and nonhuman primates show that these lagged effects can reflect causal relationships (7–10). For example, captive rhesus macaques separated from their mothers soon after birth exhibit higher rates of illness and stereotyped behavior later in life, and the effect of maternal separation can spill over to a third generation via its effects on parenting behavior (11, 12). An animal's past environments can therefore shape its phenotype long after those environments change, even if conditions improve (13, 14).
These observations are likely to be explained, at least in part, by the process of "biological embedding," which posits that differences in life experience produce stable, systematically different biological states that have the capacity to influence physiology, fertility, or survival across the life course (15). Multiple mechanisms have been proposed to mediate the embedding process, including changes in neural connectivity, HPA axis signaling, and cell type composition (15, 16). At the molecular level, the majority of research has focused on environmentally responsive changes to the epigenome, especially those mediated by DNA methylation: the covalent addition of methyl groups to DNA, which, in vertebrates, occurs primarily at CpG motifs (15, 17–19). Patterns of DNA methylation are largely laid down in utero and during the first years of life (i.e., during cellular differentiation and tissue formation), and they can be highly sensitive to environmental conditions during this time (20). However, changes in DNA methylation also occur in response to environmental stimuli later in life, including pathogen exposure, metabolic stress, and glucocorticoid signaling (21–24). Because DNA methylation marks can remain stable across cell divisions (25), they provide a plausible route for encoding a memory of past events in the genome. And because DNA methylation can sometimes, although not always, affect downstream gene expression (26–28), such changes could potentially account for trait consequences at the whole organism level.
For DNA methylation to explain lasting effects of environmental experience, at least two requirements must be met. First, variation in DNA methylation must be linked to the environmental exposure of interest, ideally in a manner that excludes confounding by third variable effects. Second, DNA methylation levels must have the capacity to influence downstream phenotypes, most likely through an initial effect on gene expression. Although often assumed in studies of biological embedding, this relationship is not assured: many CpG sites in mammalian genomes are located outside of known regulatory elements or in inactive heterochromatin (18, 27). Additionally, targeted manipulation of DNA methylation levels using epigenome editing or reporter assays shows that methylation-dependent changes to gene regulation are locus-dependent and sometimes undetectable [(28–30), but see also (31)]. For example, in massively parallel reporter assays that test the regulatory capacity of many loci in both an unmethylated and methylated state, only a small fraction of tested regions influence gene regulation in the human genome (29, 32). Further, only some of these regions exhibit significantly altered activity as a function of experimentally manipulating DNA methylation levels (29). Thus, candidate CpG sites involved in biological embedding need to be empirically tested before their capacity to affect downstream traits is assumed (17, 33).
In mammals, including humans, evidence of DNA methylation-mediated embedding in natural populations remains limited. In humans, most work has focused on identifying associations between early-life experience and DNA methylation levels in samples collected in adulthood (34–36). For example, DNA methylation levels in the blood of individuals exposed in utero to the Dutch hunger winter [a period of extreme caloric restriction induced by a German blockade during World War II (37)] differ from those of unexposed individuals near genes involved in growth and metabolism (38). Similarly, people born in rural Gambia during the wet season (a period of relatively high malarial burden and low food availability) exhibit differences in DNA methylation, measured nearly a decade later, compared to those born in the dry season (39). However, large cohort studies that focus on the typical spectrum of variation in developed nations often find relatively few associations between early adversity and DNA methylation, especially after controlling for confounding factors (e.g., smoking behavior) that also vary as a function of early adversity (34–36, 40). Meanwhile, in natural animal populations, studies of biological embedding via DNA methylation remain rare, power-limited, and focused on global rather than site-specific measures of DNA methylation levels (41, 42). For example, higher levels of maternal care and subadult social connectedness predict higher global DNA methylation levels in wild spotted hyenas, but the individual regulatory elements, genes, and pathways that drive this observation are unknown (42, 43). Finally, in both human and nonhuman animal studies, analyses typically stop after identifying putative associations between early life and DNA methylation. Without testing the functional consequences of DNA methylation at early environment-associated sites (e.g., by linking variation in DNA methylation to gene expression through a causal chain), the importance of DNA methylation in biological embedding remains unclear.
To address this gap, we investigated locus-specific associations between DNA methylation and major sources of early-life adversity in a longitudinally studied population of wild baboons living in the Amboseli ecosystem of Kenya (n = 256 individuals; 115 male, 141 female) (44). We combined DNA methylation data on nearly half a million CpG sites genome-wide with five decades of ecological, behavioral, and life history data for individually recognized baboons followed across the life course. Importantly, strong relationships between the early-life environment and physiology, fertility, and survival are well established for this population and for baboons and nonhuman primates more generally (5, 45–49). In Amboseli, female baboons who experience high levels of early-life adversity die at substantially younger ages, on average, than those who experience little to no early adversity (5). These females also have elevated glucocorticoids in adulthood (50) and weaker social bonds (5), and their offspring are less likely to survive to adulthood (45).
In addition to five sources of early adversity that have been extensively studied in the Amboseli baboons [drought in the first year of life, the presence of a close-in-age younger sibling, being born to a low-rank mother, being born into a group with high density, and death of an animal's mother prior to independence (5, 45, 51)], we also investigated associations with habitat quality, a primary driver of resource availability in our population. In particular, large differences in habitat quality differentiate study subjects who were born early in the long-term study period (before the two original study groups shifted their home ranges to a new part of the study site) from those born after the home range shift. This shift was precipitated by a rapid die-off of fever trees (Vachellia xanthophloea), a major source of food and protection from predators, in the pre-shift habitat. The pre- and post-shift habitats are in close geographic proximity (within ~8 to 15 km) and have near-identical patterns of seasonality and rainfall. However, they differ in that the low-quality habitat is within the boundaries of Amboseli National Park, where elephants concentrate for protection from poaching (52) and browse heavily on V. xanthophloea and other vegetation (53, 54), while the high-quality habitat is outside those boundaries and fever trees are more abundant. Female baboons experienced shorter inter-birth intervals, began reaching reproductive maturation earlier, and exhibited improved infant survival rates after the home range shift (55, 56), in support of an improved resource base. We therefore included habitat quality at birth (pre-shift or post-shift: SI Appendix, Fig. S1) as another source of early-life disadvantage.
By integrating our measures of early-life adversity with genomic data on DNA methylation, as well as data on in vivo and in vitro gene expression, we were able to pursue four major goals. First, we tested for a signature of early-life adversity on DNA methylation levels in blood, including how sources of early adversity that differentiate animals within the same group interact with overall habitat quality in early life. To place our results in context, we compared the signature of early adversity to those of dominance rank (i.e., social status) at the time of sampling, an important predictor of gene regulation in the Amboseli baboons and other mammals (57–60). Second, we investigated how the DNA methylation signatures of distinct environmental variables are distributed across the genome and whether they overlap with one another. Importantly, major sources of early-life adversity in the Amboseli baboons are not well-correlated with each other, and early-life experience is also usually uncorrelated, or weakly correlated, with the adult environment (SI Appendix, Fig. S2) (5, 45). These features of our study system enabled us to disentangle the DNA methylation signatures associated with distinct environmental exposures, a perennial challenge in humans (3). Third, we asked whether the signature of habitat quality in early life weakens with temporal distance from early life, as predicted if experiences in adulthood also modify the epigenome. Finally, we coupled experimental in vitro evidence from a massively parallel reporter assay, mSTARR-seq (29), and in vivo evidence from gene expression samples from the same population (57) to investigate whether, when, and how often DNA methylation levels at environment-associated CpG sites are likely to be functionally relevant for gene regulation in blood.

DNA Methylation Levels Are Associated with Environmental Variation in Early Life and Adulthood. To investigate the signature of environmental variation on the baboon DNA methylome, we used reduced-representation bisulfite sequencing [RRBS (61, 62)] to profile DNA methylation in blood for 477,270 CpG sites in the baboon genome, in 256 unique individuals (115 males, 141 females; SI Appendix, Fig. S3). This set of sites is a subset of the 1,590,767 CpG sites captured in the full sequencing dataset, filtered to remove invariant, constitutively hypomethylated, and constitutively hypermethylated sites (SI Appendix). For 37 individuals, we profiled repeated, longitudinally collected samples (2 to 3 samples per individual), for a total of n = 295 samples (SI Appendix, Table S1).
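As an illustration of the site-filtering step described above, the following minimal sketch computes per-sample methylation levels from bisulfite read counts and drops sites that are invariant or constitutively hypo- or hypermethylated. The function names and the thresholds (`low`, `high`, `min_sd`) are hypothetical placeholders chosen for illustration; the actual criteria are those given in the SI Appendix and are not reproduced here.

```python
from statistics import mean, pstdev


def methylation_level(methylated: int, total: int) -> float:
    """Proportion of reads supporting methylation at one CpG site in one sample."""
    if total <= 0:
        raise ValueError("site has no read coverage in this sample")
    return methylated / total


def keep_site(levels, low=0.10, high=0.90, min_sd=0.05):
    """Decide whether a CpG site is variable enough to analyze.

    `levels` holds one methylation proportion per sample. All three
    thresholds are illustrative assumptions, not values from the study.
    """
    avg = mean(levels)
    if avg < low or avg > high:
        # Constitutively hypomethylated or hypermethylated site.
        return False
    # Sites with almost no between-sample spread are treated as invariant.
    return pstdev(levels) >= min_sd


# Toy read counts (methylated, total) for one site across three samples.
counts = [(4, 10), (6, 10), (5, 10)]
levels = [methylation_level(m, t) for m, t in counts]
site_is_kept = keep_site(levels)
```

Filtering of this kind reduces the multiple-testing burden by removing sites that cannot show an association with any predictor.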
For each CpG site separately, we first modeled DNA methylation levels as a function of habitat quality at birth, cumulative early-life adversity, and age and ordinal dominance rank at the time of sampling, using the binomial mixed effects model implemented in MACAU (we refer to this analysis as Model 1; Fig. 1A; see SI Appendix for model details) (63). We quantified habitat quality at birth as a simple binary variable indicating whether each study subject was born before or after the home range shift described above (N = 57 individuals were born in the low-quality habitat). We treated habitat quality at birth separately from cumulative early adversity because of its nature as a strong cohort effect characterized by two distinct time periods, rather than a set of conditions that vary across individuals living at the same time and place (see SI Appendix for a discussion of the resulting unavoidable correlation with time). We considered five sources of early adversity as components of the cumulative early adversity measure: drought, maternal loss, large group size, the presence of a close-in-age younger sibling, and low maternal dominance rank, which collectively predict both reduced survival and reduced offspring survival in this population (5, 45) (see also Materials and Methods). We estimated dominance rank effects for each sex separately (by nesting rank within sex), as male and female ranks depend on different traits (i.e., kinship in females and physical condition in males). Further, the hierarchies for each sex are separately estimated, have sex-specific implications, and have sex-specific associations with gene expression (44, 57, 64–67).
In Model 1, the strongest predictors of DNA methylation in adulthood were habitat quality at birth, male dominance rank at sample collection, and age at sample collection. The relationship between habitat quality at birth and DNA methylation was striking, resulting in 3,296 habitat quality-associated sites (10% FDR; SI Appendix, Table S2A). Consistent with the association between dominance rank and other aspects of gene regulation (57, 67), associations between male dominance rank and DNA methylation were also widespread (n = 3,736 sites, 10% FDR), in contrast to a weaker relationship with female dominance rank (57, 67–69) (n = 4 sites; see SI Appendix for a discussion of this sex difference, which is consistent with previous findings in our population). Age strongly predicted DNA methylation across the genome (n = 169,439 age-associated sites), with a bias, as reported in other studies (68, 70), toward increases in DNA methylation with age in CpG islands (65%) and decreases in DNA methylation with age in most other regions of the genome (79%).
In contrast to these three effects, we observed no significant associations (10% FDR) between DNA methylation and cumulative early adversity.
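Site counts such as those above depend on a false discovery rate (FDR) threshold applied across hundreds of thousands of per-site tests. This section does not state which FDR procedure was used, so the sketch below substitutes the widely used Benjamini-Hochberg adjustment purely as an illustration; the function name and the toy p-values are assumptions.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative stand-in only: the study's own FDR procedure is
    described in its SI Appendix and may differ.
    """
    m = len(pvalues)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy per-site p-values; sites with adjusted p <= 0.10 pass a 10% FDR.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
toy_q = benjamini_hochberg(toy_p)
n_significant = sum(q <= 0.10 for q in toy_q)
```

Under a 10% FDR, roughly one in ten of the sites called significant is expected to be a false positive, which is why the text reports counts of sites rather than treating each site as individually confirmed.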
Our results for Model 1 suggest that habitat quality in early life is particularly important in the lives of baboons and could moderate the association between other sources of early adversity and DNA methylation. To test this possibility, we re-ran our analyses, but in this case tested for the effects of cumulative early adversity experienced in the high-quality habitat and low-quality habitat separately (i.e., by nesting cumulative early adversity within habitat quality; Model 2). To maximize power, we also included individuals for whom early adversity data were available, but dominance rank data were missing because of observational gaps for males. This model not only strengthens the evidence for a main effect of habitat quality (25,509 habitat quality-associated sites; 10% FDR) but also reveals an interaction with cumulative adversity: 2,856 sites are associated with cumulative adversity for baboons born in low-quality habitat (10% FDR), while none are significantly associated with cumulative adversity in baboons born in high-quality habitat (Fig. 1 B and C and SI Appendix, Table S2B). Notably, only 64 of 295 samples derive from low-quality habitat individuals, suggesting that the greater number of associations in low-quality habitat is not driven by greater power. Among the significant sites identified in samples from individuals born in low-quality habitat, the effect sizes for cumulative adversity in the low-quality habitat are uncorrelated with the effect sizes for cumulative adversity in high-quality habitat (P = 0.838) but positively correlated with the effect sizes for habitat quality itself (R = 0.508, P < 1 × 10⁻¹⁰). This result suggests that the signature of cumulative adversity is amplified by exposure to ecologically challenging conditions (and vice versa). Importantly, cumulative adversity scores do not differ between animals born in low-quality and high-quality habitats (Wilcoxon rank-sum test P = 0.843).
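The nesting used in Model 2 can be pictured as splitting one predictor into two columns, one per habitat type, so that a separate adversity effect is estimated within each habitat. The sketch below assumes, for illustration only, that the cumulative score is a simple count of the five exposures; the variable names and this scoring rule are assumptions, and the fitted model itself (the binomial mixed model in MACAU) is not shown.

```python
# Hypothetical column names for the five sources of early adversity.
ADVERSITY_SOURCES = (
    "drought",
    "maternal_loss",
    "large_group",
    "close_in_age_sibling",
    "low_maternal_rank",
)


def cumulative_adversity(exposures: dict) -> int:
    """Count how many of the five exposures an individual experienced.

    Treating the cumulative measure as an unweighted count is an
    assumption made for this sketch.
    """
    return sum(1 for source in ADVERSITY_SOURCES if exposures.get(source, False))


def nest_within_habitat(score: int, born_low_quality: bool):
    """Split the score into (low-quality column, high-quality column).

    Each individual contributes its score to exactly one column and a
    zero to the other, so the two effects are estimated separately.
    """
    return (score, 0) if born_low_quality else (0, score)


# Toy individual: born pre-shift, exposed to drought and maternal loss.
toy = {"drought": True, "maternal_loss": True}
toy_score = cumulative_adversity(toy)
toy_columns = nest_within_habitat(toy_score, born_low_quality=True)
```

With this coding, a nonzero coefficient on only the low-quality column corresponds to the pattern reported in the text: adversity predicts DNA methylation among animals born in low-quality habitat but not among those born in high-quality habitat.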
To investigate whether different components of the cumulative adversity score contribute differently to the early adversity-DNA methylation relationship, we then ran a third model (Model 3) to evaluate each of the five individual sources of early adversity, nested within habitat quality (all other biological and technical covariates remained the same as in Model 2). Among the individual sources of adversity we considered, early-life drought most clearly predicted variation in DNA methylation across the genome, especially for individuals born in the low-quality habitat (25,355 sites at a 10% FDR; Fig. 2 A and B; note that drought also affected 11% of baboons born in the high-quality habitat). We also identified detectable, but less common, signatures of maternal loss (4,893 sites), large group size (3,124 sites), low maternal rank (730 sites), and the presence of a close-in-age younger sibling (619 sites). In contrast, none of the individual sources of early adversity were robust predictors of DNA methylation for individuals born in the high-quality habitat (≤5 sites associated with any individual predictor at 10% FDR; SI Appendix, Table S2C; see SI Appendix, Fig. S4 for comparisons of effect sizes for each predictor in low- versus high-quality habitat).
The Genomic Distribution of Environmental Predictors of DNA Methylation. Our models indicate that some early-life experiences are linked to more pronounced DNA methylation signatures than others. Drought in particular, which is one of the least predictable environmental exposures in Amboseli, is associated with many more CpG sites than maternal loss or group size, the next most common effects. Notably, early-life rainfall and rainfall at the time of sampling are only weakly correlated in our dataset, supporting the idea that our observations capture the signature of early life, not experience at the time of biological sample collection (SI Appendix, Fig. S5). To investigate whether these signatures are unique to specific early-life experiences or reflect a general signature of stress and adversity (perhaps scaled to the magnitude of the stressor), we tested for overlap between the sets of sites linked to each of the five individual-level predictors and to habitat quality based on results from Model 3.
Our results support a generalized rather than an exposure-specific signature (Fig. 2C). Specifically, among sources of early adversity with a substantial number of associated CpG sites (habitat quality, drought, maternal loss, and group size), sites associated with one early-life exposure are 1.04- to 8.6-fold more likely to be associated with a second early-life exposure (P < 1 × 10⁻¹⁰ for 4 of 6 comparisons). Habitat quality and drought (in samples from individuals born in low-quality habitat) show a particularly striking pattern of overlap: 4,038 CpG sites are significantly associated with both predictors (log₂(OR) = 2.23, P < 1 × 10⁻¹⁰), and almost all of these effects (99.8%) are directionally concordant, such that exposure to low habitat quality in early life and exposure to drought predict the same direction of effect. Drought-associated sites were enriched in a broad group of gene sets annotated in the Molecular Signatures Database Hallmark set, including pathways involved in development (e.g., genes expressed in pancreatic β cells), cell proliferation (e.g., targets of MYC signaling), and cellular metabolism (e.g., genes involved in oxidative phosphorylation; all Bonferroni-corrected P < 0.05; no gene sets pass this threshold for habitat quality-associated sites; see SI Appendix, Table S7).
[Figure legend fragment] To avoid calling sites "unique" due to small differences in FDR values, overlaps show sites that are significant at a 10% FDR threshold for at least one predictor variable and ≤20% FDR for the other predictor variable(s).
Comparing these findings to the signature of male dominance rank shows that overlap in sensitivity to the environment is not specific to early-life variables [note that we focused on male rank here for comparison because significant associations with female rank are far less common, consistent with findings from our previous work on rank and gene expression in the Amboseli baboons (57, 67)]. Male rank-associated sites are 11.21 times more likely to be associated with drought than background expectations and 2.43 times more likely to be associated with habitat quality (both P < 1 × 10⁻¹⁰). In these cases, dominance rank effects tend to have directionally opposite effects to habitat quality and drought (log₂(OR) = −4.06 for overlap with habitat quality; the odds ratio could not be estimated for the overlap with drought because there was no overlap in the direction of effects). Consequently, sites that are more highly methylated in high-ranking males also tend to be more highly methylated for baboons of both sexes who were born in poor-quality habitat and exposed to drought within that habitat.
In contrast to male rank-associated patterns of DNA methylation, age effects only modestly overlap with drought effects and habitat quality (log₂(OR) = 0.16 and 0.43, both P < 10⁻¹⁰) and do not overlap with male rank effects at all (log₂(OR) = 0.045, P = 0.35) (SI Appendix, Fig. S6). These results suggest that despite a shared epigenetic signature of at least some types of early and adult experience (with variation in the magnitude of the effect), the signature of age is distinct and the distribution of differentially methylated sites across the genome is not inevitable for any methylation-associated variable. To test this hypothesis further, we investigated how CpG sites related to age versus socioenvironmental variables are distributed across promoters, gene bodies, CpG islands and shores, putative enhancer elements, and unannotated regions. We focused on the four variables with the strongest DNA methylation signatures: age, habitat quality in early life, drought (in the low-quality habitat), and male dominance rank.
Our results highlight two patterns (Fig. 3 A and B). First, compared to the full reference set of sites we analyzed, drought and male dominance rank-associated sites are systematically enriched in functionally important regions of the genome, especially gene bodies (log₂(OR) = 0.25 and 0.72, respectively) and putative enhancer elements (log₂(OR) = 0.52 and 0.99), but depleted in unannotated regions (log₂(OR) = −0.13 and −0.36) of the genome (all P < 1 × 10⁻⁷; SI Appendix, Table S3). Second, and in contrast, age-associated sites are 1.57-fold more likely to occur in unannotated regions of the genome than expected by chance (i.e., in comparison to the background set of sites included in our study), but are depleted in enhancers (log₂(OR) = −0.27) and gene bodies (log₂(OR) = −0.28, all P < 1 × 10⁻¹⁰). Notably, habitat quality-associated sites, which are much more widely distributed in the genome than rank- or drought-associated sites, follow an intermediate pattern: They are less common in unannotated regions than age-associated sites but are not clearly enriched for gene bodies or enhancers.
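The enrichment and overlap statistics reported above are log₂ odds ratios from 2 × 2 contingency tables (for example, associated versus not associated, crossed with inside versus outside an annotation). As a hedged sketch of the arithmetic, the code below computes log₂(OR) and a one-sided Fisher's exact test P value from such a table. The function names and toy counts are illustrative assumptions, and the published P values may come from a two-sided test.

```python
from math import comb, log2


def log2_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """log2 odds ratio for the 2 x 2 table [[a, b], [c, d]].

    a: associated sites inside the annotation
    b: associated sites outside the annotation
    c: background sites inside the annotation
    d: background sites outside the annotation
    Undefined when any cell is zero, as noted in the text for one overlap.
    """
    return log2((a * d) / (b * c))


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test: P(overlap >= a) with margins fixed."""
    row1 = a + b          # all associated sites
    col1 = a + c          # all sites inside the annotation
    n = a + b + c + d     # all sites tested
    total = comb(n, col1)
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for overlaps at least as large as a.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / total


# Toy table, not data from the study.
toy_log2_or = log2_odds_ratio(8, 2, 1, 5)
toy_p = fisher_exact_greater(8, 2, 1, 5)
```

On this scale, a log₂(OR) of 0 means no enrichment, positive values mean enrichment, and negative values mean depletion; for example, log₂(OR) = 2.23 corresponds to an odds ratio of about 4.7.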
To investigate the difference in the functional significance of age- versus socioenvironmental variable-associated sites, we turned to more granular chromatin state annotations available from the Roadmap Epigenomics Consortium, which distinguish between classes of states associated with "active" versus "repressed" (inactive or quiescent) DNA. We observe clear differences in the distributions of age versus socioenvironmental differentially methylated sites based on enrichment within chromatin state annotations (i.e., predictions of the regulatory state of different regions of the genome based on the presence of epigenetic modifications and chromatin accessibility) (71). Specifically, we lifted over chromatin state coordinates for human peripheral blood mononuclear cells to the baboon genome, Panubis1.0 (27, 72). Relative to the background set of sites assayed in our study, sites associated with early-life drought and dominance rank are again enriched in regions of the genome marked for regulatory activity, such as enhancer elements (log₂(OR) = 0.91 and 1.75, respectively, both P < 10⁻¹⁰), and transcriptional activity (log₂(OR) = 0.62 and 1.52, respectively, both P < 10⁻¹⁰), but depleted in repressed and silenced regions such as heterochromatin (log₂(OR) = −0.89 and −1.32, P = 4.4 × 10⁻⁵ and 0.055) and weakly repressed, polycomb-marked DNA (log₂(OR) = −0.13 and −0.34, P = 0.03 and 0.016; Fig. 3 C, Top; SI Appendix, Table S3). Age-associated sites show the opposite pattern (Fig. 3 C, Bottom).

The DNA Methylation Signature of Early-Life Habitat Quality Attenuates Over Time. Although the individuals in our dataset were predominantly adults, individuals exposed to poor habitat quality were sampled at a range of ages (range = 2.5 to 26.3 y). We took advantage of this variation to test whether the signature of early-life adversity attenuates over time, resulting in weaker signatures with longer times from exposure. To do so, we focused on habitat quality, the strongest association with early life we observed in our data. We first built an elastic net model to ask whether early-life exposure to low-quality habitat (a binary variable indicating whether the subject was born before or after the home range shift) is predictable based on DNA methylation levels sampled in adulthood (73).
We found that an elastic net model achieves high accuracy in our sample (AUC = 0.92 based on leave-one-out cross-validation; Fig. 4 A and B). However, among animals born in low-quality habitats, the ability of the model to correctly and confidently predict habitat quality in early life depends on the time elapsed between the habitat shift and blood sample collection (linear model P = 0.0084; Fig. 4C), but not on the cumulative amount of time spent in low-quality habitat (linear model P = 0.279). Animal age does not predict overall habitat quality, so we are not indirectly capturing habitat quality through the age of individuals in our sample (linear model P = 0.80). Specifically, animals who had spent more time in high-quality habitat prior to sampling were less confidently predicted to have been born in low-quality habitat than those who experienced it more recently. This result suggests that, although DNA methylation signatures of early adversity can persist for years in baboons, they also decay over time or are overwritten by later life experience.
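To make the accuracy metric concrete, the sketch below computes the area under the ROC curve (AUC) from held-out prediction scores, as would be produced when each sample is predicted by a model trained on all other samples. The elastic net fit itself is not shown; the function name, labels, and scores are made-up illustrations rather than values from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for born in low-quality habitat, 0 otherwise.
    scores: held-out predicted probabilities of low-quality habitat.
    Equals the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5).
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy leave-one-out predictions (hypothetical values).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to chance and 1.0 to perfect separation. In the toy data, the one low-quality-habitat sample scored at 0.4 is the kind of weakly predicted case that the attenuation analysis relates to time elapsed since the habitat shift.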

Evidence for the Functional Importance of Environment-Associated DNA Methylation Variation. The distribution of environment-associated CpG sites in loci related to transcription and active gene regulation suggests that some subset of these sites has the capacity to causally influence gene expression. To formally test this hypothesis, we performed a massively parallel reporter assay (MPRA), mSTARR-seq, designed to both identify loci capable of regulatory activity in vitro and quantify the effects of differential methylation on the magnitude of this activity (Fig. 5A and SI Appendix, Table S4) (29). mSTARR-seq tests a sequence fragment's ability to drive gene expression in a self-transcribing plasmid, in hundreds of thousands of genomic fragments simultaneously. Because active test regions drive transcription by looping to interact with their own promoter, fragments capable of driving their own transcription are considered to have enhancer-like activity in vitro (74). Since the plasmid backbone is devoid of CpG sites, inserted fragments containing CpG sites in their sequence can be tested in either a fully CpG methylated or fully unmethylated state to investigate whether enhancer activity can be modified by changes in DNA methylation alone.
We performed mSTARR-seq using a mechanically fragmented and restriction enzyme-digested library of baboon DNA fragments, transfected into the human K562 cell line (SI Appendix, Table S4). K562 is a myelogenous leukemia line that shares properties with several types of peripheral blood mononuclear cells and is therefore often used in studies of immune variation. Importantly, mSTARR-seq has also been extensively optimized in K562 cells (29). Following quality control, we were able to test for regulatory activity in 252,463 500-base pair windows across the baboon genome (4.4% of the genome), of which 32,634 contained tested CpG sites in the Amboseli baboon dataset. Among these 32,634 windows, we identified 492 windows (1.5% of those tested, using a 10% FDR threshold; SI Appendix, Table S5) capable of enhancer activity in either an unmethylated state, a fully methylated state, or both (similar to estimates from ref. 29).
As expected, experimentally identified regulatory regions were strongly enriched in predicted strong enhancers (based on ChromHMM annotations: log₂(OR) = 2.50, P < 10⁻¹⁰) and depleted in insulators (log₂(OR) = −2.32; P = 2 × 10⁻⁹) and repressed regions (log₂(OR) = −0.94, P = 5.8 × 10⁻¹⁸; Fig. 5B and SI Appendix, Table S6). Among the 492 regulatory windows overlapping our tested sites, 86% also exhibited methylation-dependent activity, where the capacity to drive transcription differs depending upon whether CpG sites are methylated or not. Of note, 94% of methylation-dependent regions exhibited reduced activity in the methylated state compared to the unmethylated state, consistent with a general role for DNA methylation in repressing gene regulation.
For most discovery thresholds between 5% and 20% FDR, male dominance rank-associated CpG sites are found in mSTARR-seq-identified regulatory windows more often than expected by chance, such that most discovery threshold-predictor variable combinations reach statistical significance (Fig. 5C). For example, 40 dominance rank-associated sites (FDR = 20%; 2.2% of significant sites tested) fall in regions of the genome capable of behaving like enhancer elements [log₂(OR) = 0.59, Fisher's exact test (FET) P = 0.014; note that promoter regions often also exhibit enhancer-like activity in massively parallel reporter assays (29)]. These sites are also enriched (relative to all sites we analyzed within mSTARR-seq analysis windows) in windows where modifying the DNA methylation level of the tested sequence alters its capacity to drive gene expression (log₂(OR) = 0.71, FET P = 0.006). The pattern for drought-associated sites is less clear: while they are not more likely to occur in mSTARR-seq regulatory elements than expected by chance (drought-associated sites at FDR = 20%, n = 196 sites; log₂(OR) = 0.18, FET P = 0.1), they exhibit a modest enrichment for DNA methylation-dependent activity (log₂(OR) = 0.23, FET P = 0.06), and 88% of early drought-associated sites that overlap with mSTARR-seq regulatory windows are in methylation-dependent windows. Age-associated sites again provide a clear contrast, with no evidence of enrichment of such sites among mSTARR-seq-identified regulatory windows (log₂(OR) = 0.065, P = 0.45).
These results suggest that the associations detected in our field-based sample partially reflect targeted, functionally important changes in the response to the environment, some of which are detectable years to decades post-exposure. If so, environmental effects on DNA methylation should also colocalize with environmental effects on gene expression in the Amboseli baboons. To test this possibility, we drew on RNA-seq gene expression data from white blood cells collected from 2013 to 2018, in which several thousand associations between male rank and gene expression have previously been identified (57, 67). (Individuals born in low-quality habitat are not well represented in this dataset because the start of collection for gene expression analysis long postdated the habitat shift, so we were not able to perform a parallel analysis for habitat quality or early-life drought.) Male rank-associated CpG sites fall closer to, and more often within, genes associated with male dominance rank than they do for the background set of tested genes (Kolmogorov-Smirnov test P = 1.81 × 10⁻⁵). Blood-expressed genes that contain a male rank-associated CpG site are also 1.22-fold more likely to exhibit male rank-associated gene expression levels (P = 6.60 × 10⁻⁴), even though the individuals represented in the gene expression dataset and the DNA methylation dataset are largely distinct (34 of 115 males in the DNA methylation dataset were included in the gene expression dataset; 34 of 52 males in the gene expression dataset were included in the DNA methylation dataset). Finally, rank effects on gene expression are negatively correlated with rank effects on DNA methylation for CpG sites in the same gene (SI Appendix, Fig. S7). Thus, if DNA methylation levels are higher in high-ranking males, gene expression levels tend to be lower in high-ranking males, and vice versa (FET for sign: log₂(OR) = −1.25, P = 8.6 × 10⁻⁹). As a result, multiple pathways enriched among rank-associated genes based on gene expression [in comparison to the background set of expressed and analyzed genes (57)] are also enriched among genes linked to rank-associated DNA methylation patterns, including interferon alpha signaling, NFkB signaling, and the inflammatory response (all P < 0.05; SI Appendix, Table S7).

Discussion
Although early-life effects on fitness are documented in many long-lived species, how these effects bridge across time to link the early environment with trait outcomes later in life is not well understood. Here, in support of the biological embedding hypothesis, we find that DNA methylation may serve as a persistent link between some forms of early-life adversity and later-life phenotypes in wild baboons. We also document a shared fingerprint of early-life adversity and male dominance rank (i.e., social status) in adulthood, which is in turn distinct from the much more widespread effects of age. Finally, we leverage in vitro experiments and gene expression data from the same population to show that a subset of environment-associated changes in DNA methylation are functionally relevant to gene regulation.
Our results also highlight that not all sources of early adversity (even ones that have substantial predictive power for fertility and survival, such as maternal loss) are strong predictors of variation in DNA methylation. These results are consistent with several possibilities. First, drought and poor habitat quality, unlike maternal loss, may have more widespread associations with DNA methylation because they can influence fetal development as well as postnatal development, so the contrast between the strength of these signatures may reflect more prolonged or earlier exposure to adversity. Alternatively, our findings may support the idea that types of early-life adversity that involve resource deprivation have stronger links to later-life DNA methylation patterns than those involving threat (75). Indeed, drought in Amboseli, when yearly rainfall is similar to that of desert biomes in the American Southwest, represents a serious source of resource deprivation (46,76). Drought in the impoverished habitat pre-range shift, when infant survival rates were 19% lower than in the post-shift high-quality habitat (56), was likely even more challenging. The fact that we were only able to detect drought-associated sites in animals born in the low habitat quality environment therefore suggests that biological embedding via DNA methylation is most pronounced and/or most consistent under conditions of considerable material deprivation. This result may also account for observations in humans, in which DNA methylation associations with early-life famine have been discovered more often than associations with early-life stressors such as parental loss and poor maternal bonding (35, 38, 39, 77). Alternatively, and not mutually exclusively, weaker predictors of DNA methylation variation in our dataset may simply have larger effects in other, non-blood tissues.
A clear implication of our results is that different sources of early-life adversity can have compounding effects on DNA methylation. Specifically, all individual-level early-life effects we considered were magnified for individuals born into poor early-life habitat. This observation suggests that, as reported in studies of adverse childhood experiences, health, and longevity in humans, the effects of combined early adversity can interact to exceed that expected from additive effects (78). We speculate that such interactions are particularly likely to occur for components of the environment that have similar mechanisms of action. Both drought and low-quality habitat, for instance, are costly because they constrain the baboons' resource base. Hence, they are likely to affect DNA methylation patterns at a shared set of loci and in a common direction. The baboons' behavioral choice to shift home ranges may therefore have had long-term ramifications for population variation in both DNA methylation and gene expression profiles.
Our findings also emphasize the importance of explicitly testing for the functional effects of environment-associated DNA methylation on gene regulation. The conventional model for CpG methylation and gene expression, which proposes that DNA methylation causally alters the expression of nearby genes by altering chromatin accessibility and/or transcription factor binding, does not apply to all CpG sites. Indeed, genomic analyses of the response to stimuli show that changes in DNA methylation often occur downstream of transcription factor binding or changes in gene expression (21), rather than the reverse; moreover, changes in DNA methylation have recently been suggested to be dispensable for the function of many enhancer elements (31). For DNA methylation to mediate biological embedding, however, it must play a functional role. And while our results combine with those of others (79) to show that changes in DNA methylation can indeed precede changes in gene regulation (196 drought-associated CpG sites identified here fall in regulatory regions with methylation-dependent activity in vitro), this pattern is far from universal. For example, in this analysis, roughly 25,000 drought-associated sites either do not fall in regions with enhancer activity in our assay or are in methylation-insensitive regulatory regions. This observation suggests that many early adversity-associated sites may be functionally silent, exert effects on gene regulation but not via enhancer activity, or have tissue- or environment-specific effects invisible in our single-tissue-type assay. In any case, empirically testing for the functional consequences of differential methylation can help prioritize environment-associated CpG sites for future work. Such tests should become a standard component of studies of biological embedding.
Our findings set the stage for several types of follow-up work. First, because our study system is observational, not experimental, we cannot unambiguously resolve whether the signatures of early life we document here are direct consequences of the early environmental factors we studied. Complementary work in captive or lab animals, or research that takes advantage of natural experiments in humans, may help fill this gap. Second, other social or environmental factors may mediate or moderate the relationship between environmental adversity and DNA methylation. For example, previous work has shown that high social status may buffer baboon females from the long-term effects of early-life drought on fertility (46), and that strong social bonds and high social status in adulthood can buffer some negative effects of early adversity on survival (80). Whether social status or other advantages in life (e.g., strong social bonds) buffer the relationship between early adversity and DNA methylation remains to be tested. Third, DNA methylation levels at many CpG sites have a heritable component [mean h² = 0.2 in humans (81)], which our analyses also identify in the Amboseli baboons (this study: mean h² = 0.28 ± 0.2 SD). Whether genetic variants associated with DNA methylation levels (i.e., methylation quantitative trait loci, or meQTL) co-occur or interact with the effects of early adversity is a natural question to address in future work. Finally, although our results suggest that a subset of early adversity-associated sites have the capacity to also influence gene regulation, whether and how these effects influence organism-level physiology, health, and survival remains a puzzle. Investigating the role of differential methylation at such sites in shaping the molecular response to pathogens, nutrient availability, or hormonal signals of stress (as in ref. 82) may help resolve this open question.

Materials and Methods
Study subjects were 256 adult baboons (115 males and 141 females) living in one of the 25 study groups observed by the Amboseli Baboon Research Project (ABRP) between 1979 and 2018 (SI Appendix, Table S1). In all cases, blood samples were obtained via brief anesthetization of each study subject during periodic darting efforts, in which a Telazol-loaded dart was delivered via a handheld blowgun (57,67,83,84). Methylation levels were measured using single- or double-digest reduced representation bisulfite sequencing (RRBS) of DNA extracted from whole blood. Reads were mapped to the Panubis1.0 genome (GCA_008728515.1), and CpG sites with low coverage or that were constitutively hypo/hyper-methylated were removed, leaving 477,270 sites for downstream analyses. The full set of analyzed sites, along with model results for each site, is provided in SI Appendix, Table S2 A-C. Importantly, because RRBS profiles a non-random set of CpG sites in the genome (especially focused on those in CpG-rich regions), all analyses in which we test for enrichment of differentially methylated sites use this set of 477,270 sites as the background reference set.
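The site-filtering step described above (removing CpG sites with low coverage or constitutive hypo/hyper-methylation) can be sketched as follows. This Python snippet is only an illustration: the function name `keep_site` and all three threshold values are placeholders chosen for the example, not the cutoffs used in this study, which are given in SI Appendix.

```python
from statistics import mean, median

# Placeholder thresholds for illustration only; NOT the study's values.
MIN_MEDIAN_COVERAGE = 5   # minimum median read depth across samples
HYPO_CUTOFF = 0.10        # mean methylation at or below this: constitutively hypomethylated
HYPER_CUTOFF = 0.90       # mean methylation at or above this: constitutively hypermethylated


def keep_site(methylated, total,
              min_cov=MIN_MEDIAN_COVERAGE,
              hypo=HYPO_CUTOFF,
              hyper=HYPER_CUTOFF):
    """Decide whether one CpG site is retained for analysis.

    methylated, total: per-sample methylated and total read counts
    at this site, in the same sample order.
    """
    if median(total) < min_cov:
        return False  # too few reads to estimate methylation reliably
    covered = [(m, t) for m, t in zip(methylated, total) if t > 0]
    if not covered:
        return False
    mean_level = mean(m / t for m, t in covered)
    # Drop sites that are (nearly) always unmethylated or always methylated.
    return hypo < mean_level < hyper
```

Sites passing a filter of this kind would then form both the analysis set and the background set for enrichment tests, in line with the point made above about RRBS sampling CpG-rich regions non-randomly.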
Measures of early-life adversity were prospectively and directly observed through longitudinal monitoring of the population. Similar to ref. 5, and following ref. 45, we quantified cumulative early adversity as the sum of exposures to five major sources of environmental adversity in early life: low maternal dominance rank (lowest quartile of ordinal ranks in the population, where higher numbers correspond to lower social status), social group size at birth (highest quartile) as an index of resource competition, drought in the first year of life (<200 mm of total rainfall), the presence of a close-in-age younger sibling [live birth within 1.5 y of the focal individual, approximately the lowest quartile of interbirth intervals separating live births in this population (5)], and maternal loss in the infant and juvenile period (before age 4, the earliest age of maturation in the Amboseli baboons) (76).
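As a concrete illustration of the cumulative adversity score defined above, the sketch below sums five binary exposures for one individual. It is written in Python for illustration and is not the authors' code. The class and field names are hypothetical, the two quartile-based exposures are taken as precomputed flags because their cutoffs depend on the population distribution, and treating "within 1.5 y" as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EarlyLifeRecord:
    """Hypothetical record of one individual's early-life conditions."""
    maternal_rank_lowest_quartile: bool    # mother's ordinal rank in lowest-status quartile
    group_size_highest_quartile: bool      # group size at birth in highest quartile
    rainfall_first_year_mm: float          # total rainfall in the first year of life
    years_to_next_sibling: Optional[float] # None if no younger sibling was born
    age_at_maternal_loss: Optional[float]  # None if the mother was not lost


def cumulative_adversity(rec: EarlyLifeRecord) -> int:
    """Count of adverse exposures (0 to 5) using the thresholds in the text."""
    exposures = [
        rec.maternal_rank_lowest_quartile,
        rec.group_size_highest_quartile,
        rec.rainfall_first_year_mm < 200,  # drought: <200 mm of rainfall
        # Close-in-age sibling: live birth within 1.5 y (inclusive is an assumption).
        rec.years_to_next_sibling is not None and rec.years_to_next_sibling <= 1.5,
        # Maternal loss before age 4.
        rec.age_at_maternal_loss is not None and rec.age_at_maternal_loss < 4,
    ]
    return sum(exposures)
```

The same five booleans can also be kept as separate predictors rather than summed, which corresponds to modeling individual, binary measures of exposure instead of a cumulative count.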
During the 1970s and 1980s, the quality of resources in the baboons' habitat markedly degraded leading up to a shift in home range in the early 1990s (55). We therefore also considered a binary measure of habitat quality at birth, based on the subject's birthdate: individuals born prior to this home range shift were considered to have been born in low-quality habitat, and individuals born after the home range shift were considered to have been born in high-quality habitat. Dominance rank was estimated using ordinal ranks, where the highest status animal is given a value of 1 and individuals lower in the hierarchy have progressively larger values (64). Dominance ranks in Amboseli are determined on a monthly basis from the outcomes of dyadic agonistic interactions observed in the same month. For 98% of individuals, age was based on direct observation of birth events, to within a few days' error (SI Appendix).
We modeled variation in DNA methylation at each CpG site in our analysis set using the binomial mixed-effects model implemented in MACAU (63). We controlled for genetic relatedness between individuals using genotype data derived from low-coverage resequencing data generated for all individuals in our sample in previous work (84) (SI Appendix). We controlled for technical effects (e.g., batch, sequencing depth, bisulfite conversion rate) as additional fixed effects and for kinship/population structure using a random effect. Using a subset of our data, we also confirmed that major differences in cell composition (lymphocyte and monocyte ratios, available from blood smear data) do not significantly predict DNA methylation in our sample and that finer-scale variation (based on flow cytometry data) is not correlated with our key predictor variables (SI Appendix). We did not model an effect of sex because in preliminary analysis, we observed little to no signature of sex in the DNA methylation data, consistent with ref. 85. However, we note that although differentially methylated regions by sex are rare on the autosomes, they are very common on the X chromosome, consistent with expected patterns of X-inactivation (86) (SI Appendix). ChromHMM tracks were based on orthology to annotations in human PBMCs generated by the Roadmap Epigenomics Consortium and converted to baboon genome coordinates using liftOver (27,87) (SI Appendix). Measures of regulatory activity were assayed using mSTARR-seq on baboon DNA fragments following ref. 29 (SI Appendix). Gene expression measures from leukocytes for the same population were generated previously (57,67). For analyses linking CpG sites to genes, we focused on CpG sites falling within gene bodies as annotated by the baboon genome GTF (GCF_008728515.1).
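For readers unfamiliar with this model class, the general form of a binomial mixed-effects model for bisulfite sequencing counts at a single CpG site can be written as below. This is a sketch of the standard formulation as we understand it, not a transcription of the exact specification fitted here (see ref. 63 and SI Appendix); the symbols are introduced for illustration only.

```latex
\begin{aligned}
y_i &\sim \mathrm{Binomial}(r_i, \pi_i) \\
\log\!\left(\frac{\pi_i}{1-\pi_i}\right) &= \mathbf{w}_i^{\top}\boldsymbol{\alpha} + x_i\,\beta + g_i + e_i \\
\mathbf{g} &\sim \mathrm{MVN}\!\left(\mathbf{0},\ \sigma^2 h^2 \mathbf{K}\right) \\
\mathbf{e} &\sim \mathrm{MVN}\!\left(\mathbf{0},\ \sigma^2 (1-h^2)\,\mathbf{I}\right)
\end{aligned}
```

Here $y_i$ and $r_i$ are the methylated and total read counts for individual $i$, $\pi_i$ is the underlying methylation proportion, $\mathbf{w}_i$ holds the intercept and technical covariates with coefficients $\boldsymbol{\alpha}$, $x_i$ is the predictor of interest (for example, cumulative early adversity) with effect $\beta$, $\mathbf{K}$ is the genetic relatedness matrix, and $h^2$ is the proportion of residual variance attributable to relatedness. Under this formulation, the random effect $\mathbf{g}$ is what accounts for kinship and population structure.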
Data, Materials, and Software Availability. Raw FASTA files have been deposited in the NCBI Short Read Archive (PRJNA970398 and PRJNA871297) (88,89). All statistical analyses in this work were performed in R (version 4.1.2) (90), with code available at https://github.com/janderson94/Anderson_et_al_socio-ecological_methylation_predictors (91). Previously published data were used for this work and are available via NCBI as referenced in SI Appendix, Table S1.

Fig. 1. Socioenvironmental predictors of DNA methylation depend on early-life habitat quality. (A) Early-life predictors considered in this study include a cohort-level effect of habitat quality (Left), which is based on the timing of birth (before or after a shift in home range), and five individual-level measures of early-life conditions that vary among individuals living in the same group or at the same time. We modeled individual-level measures based on both a count of cumulative early-life exposures (Models 1 and 2) and using individual, binary measures of exposure (Model 3). Age and dominance rank at the time of sampling were also modeled as controls/comparisons for demography and the social environment at the time of sampling. (B) The absolute value of the standardized cumulative early adversity effect size (i.e., the parameter estimate from Model 2 divided by its SE), estimated for individuals born in high-quality habitat (purple) versus those born in low-quality habitat (peach) for sites passing a 20% FDR in one or both conditions (n = 12,872 CpG sites; Model 2). Solid and dashed lines show the mean and 95% intervals, respectively, for each distribution. (C) The effect size of cumulative adversity is systematically larger for individuals born in low-quality habitat. Main figure shows standardized effect sizes from Model 2, comparing the effect of cumulative early adversity for individuals born in low- versus high-quality habitats, across the same set of sites (n = 12,872). Each line connects the two effect sizes for one CpG site (one effect size estimated from samples of individuals born in high-quality habitat and the second estimated for those born in low-quality habitat). Inset: example CpG site showing the association between cumulative adversity and DNA methylation levels for animals born in low-, but not high-, quality habitat. Dots show mean DNA methylation levels for individuals born in a given habitat quality-cumulative adversity score combination; whiskers show SE.

Fig. 2. Early-life adversity is associated with DNA methylation in adulthood for baboons born in low-quality habitat. (A) Sideways bar graph showing the number of CpG sites associated with each tested predictor (<10% FDR) in Model 3. The x-axis is shown on a log₁₀ scale to accommodate a range of associations that vary by orders of magnitude across predictor variables. Significant counts at 10% and 1% FDR thresholds are available in SI Appendix, Table S2D. (B) Reaction norms for two example CpG sites (Left: chr12_111013997; Right: chr11_430191) that were significantly associated with early-life drought, but only for baboons born in low-quality habitat (peach; 10% FDR). Colored bars indicate SE. (C) UpSet plot of the number of CpG sites associated with habitat quality, each individual source of adversity (within low-quality habitat), and their overlap. Each bar represents the number of sites associated with the source or the combination of sources of adversity that are indicated in the matrix beneath the bar graph. To avoid calling sites "unique" due to small differences in FDR values, overlaps show sites that are significant at a 10% FDR threshold for at least one predictor variable and ≤20% FDR for the other predictor variable(s).

Fig. 3. Genomic distribution of CpG sites associated with age, rank, and early-life adversity. (A) Enrichment of the top four predictors of DNA methylation levels in functional compartments across the genome, relative to the distribution of all 477,270 sites included in this study. Color indicates log₂(odds ratio) from a Fisher's exact test, with the brightest colors indicating the highest and lowest odds ratios. (B) An alternative depiction of these results, showing the proportion of sites tested in each functional compartment (i.e., the background set), relative to the proportion of sites associated with each of the top four predictor variables in that category. Asterisks indicate significant enrichment or depletion at various P value thresholds. Black and blue asterisks indicate enrichment and depletion, respectively. (C) Enrichment of the same four sets of age, rank, or early environment-associated CpG sites, across 15 distinct chromatin states, based on annotation in human peripheral blood mononuclear cells with coordinates lifted over to Panubis1.0. States are ordered roughly by their association with active gene regulation following the chromatin state numbering provided by the Roadmap Epigenomics Consortium (27), from left (active) to right (repressed/quiescent), but are not intended to reflect a linear increase in activity. Opaque dots indicate P < 0.05 for enrichment based on Fisher's exact test.

Fig. 4. Early-life habitat quality can be accurately predicted from DNA methylation, but this signal attenuates over time. (A) Known early-life habitat quality (x-axis) versus predicted early-life habitat quality from an elastic net regularization model (y-axis). More negative values correspond to cases in which the model predicted that the individual was born in high-quality habitat (the post-habitat shift environment); more positive values correspond to cases in which the model predicted that the individual was born in low-quality habitat (the pre-shift environment). (B) Receiver operating characteristic (ROC) curve for early-life habitat quality predictions (AUC = 0.926; dashed line denotes the y = x line). (C) Predicted habitat quality (y-axis) versus the time since habitat shift in days (x-axis) for animals born in low-quality habitat (linear model P = 0.0084). A value of 0 d since habitat shift indicates a sample from an animal still in the low-quality environment.

Fig. 5. CpG sites associated with male dominance rank are enriched in functional regions of the genome based on a high-throughput reporter assay. (A) Workflow for the mSTARR-seq experiment and an example of read pileups at a regulatory window that exhibits methylation-dependent regulatory activity and overlaps a drought-associated CpG site in the observational data from Amboseli. Summed read counts are shown for methylated (blue) and unmethylated (yellow) experimental replicates. In the highlighted methylation-dependent regulatory region, unmethylated treatments drive substantial expression (yellow RNA counts) compared to methylated treatments (blue RNA counts), even though the amount of input DNA (overlapping yellow and blue DNA counts) was near-identical across treatments. (B) Enrichment of regulatory regions from mSTARR-seq across 15 chromatin states lifted over to the baboon genome from human peripheral blood mononuclear cells (27). Regions with empirically identified regulatory activity are enriched in regions orthologous to putative enhancer and promoter regions in human PBMCs, and depleted in states associated with regulatory quiescence/repression. (C) Enrichment statistics for male dominance rank- (blue), drought- (red), and age-associated (gray) CpGs in regions capable of regulatory activity in mSTARR-seq. The x-axis shows the FDR threshold for identifying age-, drought-, or rank-associated CpG sites; the y-axis shows the log₂(OR) for enrichment in mSTARR-seq putative regulatory elements (all identified at FDR = 10%). Opaque points indicate significant FET enrichment (P < 0.05).