The metabolic domestication syndrome of budding yeast

Significance
Metabolic networks evolve through changes in enzyme content and activity states, but the latter aspect remains elusive. This is a major knowledge gap since differences in enzyme kinetic properties and regulation can lead to vastly different metabolic activities despite similar enzyme contents. Here, we profiled metabolite levels across several budding yeast species and populations to delineate the evolutionary dynamics of metabolic states at exceptional phylogenetic resolution. The approach allowed us to uncover a global signature of domestication that evolved convergently in independently domesticated groups of Saccharomyces cerevisiae and likely reflects adaptation to human-made niches. More broadly, our results show that studying metabolic evolution through metabolomics provides largely complementary information that cannot be obtained by studying enzyme content alone.

distinct lineages (populations) in this species (23). Importantly, S. cerevisiae is the dominant species in the fermentation of various beverages and foods and has been domesticated on several independent occasions (19,24). Domestication had a large impact on the sexual life cycle, stress tolerance, and fermentative growth capacity of this species, raising the possibility that it also dramatically altered the functional states of its metabolic network. However, the diversity of metabolite levels is poorly characterized in yeasts due to limited metabolite coverage and narrow representation of natural variation in earlier studies (3,25). Thus, beyond a few specific metabolic traits (19,26), it remains unknown how evolutionary adaptation to human-made environments (domestication) influenced the yeast metabolome.
Here, we studied variation in metabolite levels within and among species by focusing on a diverse strain set representing nine species of the Saccharomycetaceae family and 17 genetically distinct populations of S. cerevisiae. We employed multiple complementary metabolomic platforms to obtain a global picture of variation in central metabolites. Analysis of the data in a phylogenetic context allowed us to address several open questions. First, we examined the relationship between two major aspects of metabolic evolution across multiple timescales: divergence in metabolite levels and divergence in the metabolic network's capacity to produce metabolites. Second, we examined the impact of domestication on the metabolome and tested whether independent domestication events converged on a similar metabolome signature. We found that metabolite levels evolve much more rapidly than, and largely independently of, the network's structural properties and display a global recurrent signature of adaptation to human-made niches.

Metabolomics Reveals Substantial Metabolic Diversity in Budding Yeasts. To study the evolutionary divergence of metabolomes, we focused on 71 yeast strains representing 27 populations of 9 budding yeast species and spanning ~90 My of evolution (Fig. 1 and Dataset S6). The set of isolates covers 17 genetically isolated populations of S. cerevisiae, including both wild and domesticated ones, and captures well the genetic and phenotypic differentiation of this species. Importantly, we only analyzed non-mosaic (clean) S. cerevisiae isolates to allow within-species inference of evolutionary history. Multiple isolates per population were included for S. cerevisiae and Saccharomyces paradoxus, where population structure has been well characterized. Even though some domesticated S. cerevisiae populations contain wild isolates as well (e.g., Asian Fermentation), we took care to represent such populations with only isolates collected from human-associated environments. Overall, the strain set studied here provides a graduated view of metabolome variation across multiple evolutionary time scales. Phylogenomic analysis of the strain set resulted in a phylogenetic tree that agrees well with previous studies (23,27-30) and supports the basal position of wild isolates collected in Taiwan and China within S. cerevisiae (Fig. 1A, see Methods for details of the phylogenetic reconstruction).
We first measured the levels of 19 amino acids across the ~70 strains using a quantitative targeted metabolomics workflow. We primarily focused on amino acids as these are the building blocks of proteins and key intermediates of several biosynthetic pathways and therefore capture the activity of various metabolic processes (13). We collected metabolome data for all strains in the same defined growth environment and same growth phase (exponential) to minimize environmental variation and to reveal evolutionary differences in metabolite concentrations [i.e., common garden design (31)]. We used a synthetic minimal medium that is especially well suited to capture variation in biosynthetic metabolism (13) and a growth temperature (25 °C) that diminishes growth rate heterogeneity across strains (SI Appendix, Fig. S1). To exclude the possibility that metabolome differences are dominated by remaining growth differences among genotypes, we measured optical density (OD) of each culture at the time of sampling and applied a normalization strategy to remove potential OD-dependencies of metabolite levels (32,33) (Dataset S1 and Methods).
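The normalization procedure itself is described in the Methods and the cited references and is not reproduced in this section. As a rough illustration of the idea only, the sketch below assumes a simple linear dependence of a metabolite level on OD and removes it by ordinary least squares, re-centering the residuals on the original mean. The function name, the linear model, and the toy numbers are all assumptions made for illustration, not the authors' workflow.

```python
from statistics import mean

def remove_od_dependence(levels, od):
    """Regress metabolite levels on OD (ordinary least squares) and return
    the levels with the fitted OD trend removed, centred on the original mean.
    Assumption: the OD dependence is linear; the paper's model may differ."""
    x_bar, y_bar = mean(od), mean(levels)
    sxx = sum((x - x_bar) ** 2 for x in od)
    if sxx == 0:  # every culture sampled at the same OD: nothing to remove
        return list(levels)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(od, levels)) / sxx
    return [y - slope * (x - x_bar) for x, y in zip(od, levels)]

# Toy data (invented): the level rises perfectly linearly with OD,
# so the adjusted values should all collapse onto the mean level.
od = [0.4, 0.5, 0.6, 0.7]
levels = [10.0, 12.0, 14.0, 16.0]
adjusted = remove_od_dependence(levels, od)
```

In this toy case the adjusted values carry no remaining OD trend, which is the property the normalization is meant to guarantee before strains are compared.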
We next characterized the extent of variation of amino acid levels across yeast species and populations by calculating the average values for each population (Fig. 1 B and C). We found substantial divergence in amino acid levels, with a 6.6-fold range among the different species when averaged across amino acids. As might be expected, the two phylogenetically most distantly related budding yeast species, Kazachstania saulgeensis and Saccharomyces castelli, show the largest deviations in several amino acids from the rest of the species (Fig. 1B). Statistical analysis confirmed that 18 out of 19 amino acids show significant variation across the 27 populations studied (ANOVA tests, SI Appendix, Table S1). Importantly, a similar conclusion holds even when comparing distinct populations of the same species, S. cerevisiae (17 out of 19 amino acids vary significantly, ANOVA tests, see SI Appendix, Table S1).
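To make the two summary statistics in this paragraph concrete, the following sketch computes a fold range from population averages and a one-way ANOVA F statistic for a single metabolite. It is a minimal illustration on invented numbers; the helper names are hypothetical, and a P value would additionally require the F distribution (for example via `scipy.stats.f_oneway`, which is not used here to keep the sketch dependency-free).

```python
from statistics import mean

def fold_range(population_means):
    """Ratio of the largest to the smallest population average."""
    return max(population_means) / min(population_means)

def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy replicate measurements of one amino acid in three populations (invented).
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = anova_f(groups)
spread = fold_range([mean(g) for g in groups])
```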
The degree of between-species divergence varies significantly among different amino acids (Bartlett's test of homogeneity of variances was applied to the fold change values, P < 2.2e-16, Fig. 1B). Furthermore, amino acids also differ in their degree of divergence when measured among S. cerevisiae populations (Bartlett's test, P < 2.2e-16, Fig. 1C). Ranking of amino acids based on their evolutionary variability suggests that the levels of tyrosine, histidine, arginine, and lysine diverged more extensively than those of tryptophan, isoleucine, and threonine, which appear more conserved (Fig. 1 B and C). Notably, both the between-species and the within-S. cerevisiae comparisons reveal this trend, indicating shared evolutionary forces shaping the metabolome across multiple phylogenetic time scales.
To estimate the phylogenetic diversity of metabolite levels beyond amino acids, we also measured 78 metabolites of primary metabolism, excluding amino acids, using a non-targeted metabolomics platform (Dataset S2 and Methods). Consistent with the amino acid data, we find that the majority of detected metabolites show significant variation between the 27 populations (74 out of 78, see Dataset S7). Together, these patterns reveal a previously hidden diversity of metabolic activities associated with central metabolites.

Decoupled Evolution of Metabolite Levels and Metabolic Network Structure. We next interrogated the relationship between metabolic evolution occurring at two levels: metabolite concentrations and network structure. Alterations in the reaction content of a metabolic network may alter its capacity to produce metabolites, i.e., the maximum theoretical yield of metabolite biosynthesis. Therefore, we used metabolite production capacities, derived from computational models of genome-scale metabolic networks, as measures of how the structure of the entire metabolic network determines its metabolic potentials. Note that metabolite production capacities represent maximum possible biosynthetic yields and therefore can be calculated from the structure and reaction stoichiometries of the metabolic network, without requiring any regulatory or enzyme kinetic information. We leveraged genome-scale metabolic network reconstructions available for 7 yeast species and 54 S. cerevisiae strains in our dataset. These networks have been reconstructed using similar methodology based on genomic information and contain on average ~1,100 genes and ~4,000 associated reactions per strain (8,34). To test whether evolutionary divergence in the overall amino acid synthesis capacity correlates with divergence in the amino acid metabolome, we calculated the profile similarity of computed maximum yields across the 19 amino acids for each pair of yeast species and each pair of S. cerevisiae populations (Methods). We found that the yield profile similarity follows a bimodal distribution both within and between species, with several species pairs and the vast majority of population pairs exhibiting identical amino acid production capacities (SI Appendix, Table S2 and Dataset S8). Remarkably, we found no significant difference in the overall divergence of amino acid levels when comparing species pairs with identical versus dissimilar production capacities (Fig. 2A, P = 0.26, permutation test). In a similar vein, S. cerevisiae populations that differ in their amino acid production capacities are not more diverged in their amino acid levels than those with identical production capacities (Fig. 2B, P = 0.081, permutation test).
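The comparisons above rely on permutation tests. The text does not specify the test statistic, so the sketch below assumes a generic two-sided test on the difference of group means, with group labels shuffled at random; the function name and the toy divergence values are hypothetical.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference of means.
    Returns the share of label shufflings at least as extreme as observed
    (with the usual +1 correction so the P value is never zero)."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # tolerance guards against float noise
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy divergence values (invented) for pairs with identical vs. dissimilar
# production capacities.
p_same = permutation_test([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
p_diff = permutation_test([1, 2, 3, 4, 5, 6], [101, 102, 103, 104, 105, 106])
```

With identical groups every shuffle is at least as extreme as the observed difference of zero, so the P value is 1; with clearly separated groups almost no shuffle reaches the observed difference.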
In part, the lack of association between divergence in synthesis capacity and metabolite levels might be due to the faster evolution of metabolite levels than network structure. Indeed, the large majority of S. cerevisiae strain pairs show identical amino acid production capacities (1,278 out of 1,431 pairs among 54 strains). Remarkably, virtually all (98.6%) such strain pairs show a significant change in at least one amino acid, with a median of 9 significantly altered amino acid levels (Fig. 2C and Dataset S9).
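The pair counts quoted here follow directly from the number of strains, as this short check shows; only the variable names are introduced for illustration.

```python
from math import comb

n_strains = 54
total_pairs = comb(n_strains, 2)   # all unordered strain pairs
identical_pairs = 1278             # pairs with identical production capacities (from the text)
share_identical = identical_pairs / total_pairs
```

The total matches the 1,431 pairs stated in the text, and the pairs with identical capacities make up roughly 89% of them.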
Further support was provided by strain pairs of S. cerevisiae with identical reaction repertoires and, consequently, identical metabolic network structures. Again, we found that such strain pairs have a median difference of 8 amino acid levels (SI Appendix, Fig. S2). Thus, even strains with identical metabolic network structures or metabolite production capacities differ substantially in their metabolite levels. Overall, these findings indicate that the evolution of metabolite levels is decoupled from the metabolic network's structure and occurs at substantially faster rates. This is broadly consistent with the notion that network structure is highly constrained by natural selection, whereas metabolite levels are subject to less stringent selection and accumulate a considerable amount of neutral changes. If so, variation in metabolite levels should largely reflect phylogenetic history. Alternatively, divergence in metabolite levels might represent adaptive differences in metabolic regulation that are not caused by rewired network structures. In the next section, we examine the influence of phylogenetic history and ecological origins on metabolome variation.

Metabolome Variation Is Driven by Both Population History and Domestication in S. cerevisiae. We next interrogated the temporal dynamics of metabolome divergence by focusing on S. cerevisiae, which is represented by 17 genetically differentiated populations in our dataset. Plotting the overall divergence of 19 amino acid levels as a function of phylogenetic distance between pairs of populations revealed no significant correlation between the two (SI Appendix, Fig. S3, r = 0.33, P = 0.33, phylogenetic Mantel test). This pattern may reflect population-specific changes in metabolism induced by major lifestyle differences among distinct S. cerevisiae populations.
Domestication has been identified as the most dramatic life history-changing event during the intra-species evolution of S. cerevisiae that profoundly influenced the biology of the domesticated populations (35,36). We thus hypothesized that domestication also had a major impact on metabolome evolution, overriding the signatures of phylogenetic relatedness among populations. Two lines of evidence support this scenario. First, we found that the overall metabolome profile differs substantially between wild and domesticated populations (Fig. 3 and SI Appendix, Fig. S4). Such a difference may indicate adaptation to human-made niches and is analyzed in more detail below. Second, we found that the overall metabolome divergence correlates with phylogenetic distance when wild and domesticated clades are analyzed separately. Specifically, consistent with earlier studies (23,24), reconstructing the phylogenetic relationship between S. cerevisiae populations revealed two widely separated clades containing predominantly domesticated populations, suggesting at least two independent domestication events (SI Appendix, Fig. S5). Removing these two domesticated clades revealed a strong correlation between metabolome divergence and phylogenetic distance among wild populations (r = 0.74, P = 0.013, phylogenetic Mantel test, Fig. 4A). Similarly, metabolome divergence correlates well with phylogenetic distance among pairs of domesticated populations belonging to the same domesticated clades (r = 0.96, P = 0.006, phylogenetic Mantel test, Fig. 4B). Similar results are obtained when defining overall metabolome divergence based on the global metabolome profile excluding amino acids (r = 0.75, P = 0.03 for wild and r = 0.98, P = 8.5e-4 for domesticated population pairs, phylogenetic Mantel test, SI Appendix, Fig. S6). Thus, metabolome variation largely follows the genetic history of populations in S. cerevisiae, with the exception of major metabolome rewiring associated with domestication events.
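The correlations above come from phylogenetic Mantel tests, whose permutation scheme takes the tree into account. As a simplified stand-in, the sketch below implements the ordinary (non-phylogenetic) Mantel test: it correlates the upper triangles of two distance matrices and builds a null distribution by permuting the population labels of one matrix uniformly at random. All names and the toy matrices are hypothetical, and the uniform permutation is an assumption that differs from the phylogenetically informed variant used in the paper.

```python
import random
from math import sqrt

def upper_triangle(matrix, order):
    """Flatten the above-diagonal entries after relabelling rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mantel(dist_a, dist_b, n_perm=999, seed=0):
    """One-sided standard Mantel test: returns (r, permutation P value)."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    a = upper_triangle(dist_a, identity)
    r_obs = pearson(a, upper_triangle(dist_b, identity))
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # permute labels of the second matrix only
        if pearson(a, upper_triangle(dist_b, perm)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)

# Toy example (invented): "metabolome" distances are exactly twice the
# "phylogenetic" distances of five populations placed on a line.
positions = [0.0, 1.0, 3.0, 6.0, 10.0]
phylo = [[abs(p - q) for q in positions] for p in positions]
metab = [[2.0 * d for d in row] for row in phylo]
r, p = mantel(phylo, metab)
```

Permuting whole rows and columns together, rather than individual matrix entries, preserves the dependence among distances that share a population, which is the reason a Mantel-type test is used instead of a plain correlation test.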
We next asked whether domesticated and wild populations differ in their tempo of metabolome diversification. To estimate the rates of metabolome evolution, we calculated phylogenetic distance-adjusted metabolome divergences between pairs of wild populations as well as between pairs of domesticated populations that come from the same domesticated clades. Remarkably, the phylogenetic distance-adjusted metabolome divergence is ~2.6-fold and 5.7-fold higher among domesticated populations than among wild populations for amino acids and non-amino acid metabolites, respectively, indicating a faster overall tempo of metabolome evolution in domesticated yeasts (Fig. 4C; for non-amino acid metabolites see SI Appendix, Fig. S7). In general, evolutionary diversification is slower when measured over longer evolutionary time intervals (4,37) and our dataset includes some highly diverged wild populations, potentially biasing the above comparison. However, the inferred faster tempo of metabolome evolution in domesticated yeasts is not an artifact of comparing evolutionary rates over very different time scales. First, even without adjusting for phylogenetic distance, domesticated populations show larger metabolome differences than wild populations despite being genetically less diverged from each other (Fig. 4D). Second, after excluding wild population pairs that are phylogenetically more distant than any of the domesticated population pairs, we still observe a 2.2-fold higher phylogenetic distance-adjusted metabolome divergence among domesticated populations than among wild populations (SI Appendix, Fig. S8). The rapid metabolic diversification of domesticated yeasts is well illustrated by several differences in specific metabolite levels between populations coming from the same domesticated clades (Dataset S10). For example, on average, isolates from the African beer population display higher alanine and proline levels than isolates from the Wine population (SI Appendix, Fig. S9).
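The exact definition of the phylogenetic distance-adjusted divergence is given in the Methods rather than in this section. The sketch below assumes the simplest possible reading, namely metabolome divergence divided by phylogenetic distance for each population pair, summarized by the median and compared between groups. The function names, the use of the median, and the toy numbers are assumptions for illustration.

```python
from statistics import median

def adjusted_divergence(pairs):
    """pairs: iterable of (metabolome_divergence, phylogenetic_distance).
    Assumed adjustment: divergence per unit of phylogenetic distance."""
    return [div / dist for div, dist in pairs if dist > 0]

def tempo_ratio(domesticated_pairs, wild_pairs):
    """Median adjusted divergence of domesticated pairs relative to wild pairs."""
    return (median(adjusted_divergence(domesticated_pairs))
            / median(adjusted_divergence(wild_pairs)))

# Toy population pairs (invented values).
wild = [(1.0, 0.5), (2.0, 1.0), (0.5, 0.25)]
domesticated = [(3.0, 0.5), (1.5, 0.25), (6.0, 1.0)]
ratio = tempo_ratio(domesticated, wild)
```

A ratio above 1 in such a comparison corresponds to a faster tempo among domesticated pairs; restricting the wild pairs to those no more distant than the most distant domesticated pair, as done in the text, only requires filtering the input list beforehand.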
Finally, while analyzing the dynamics of metabolome evolution across species would require a larger species set, we note that metabolome differences appear to accumulate gradually with phylogenetic distance among species (SI Appendix, Fig. S10, r = 0.71 and P = 0.011 for amino acids and r = 0.91 and P = 4.6e-4 for non-amino acid metabolites, phylogenetic Mantel tests). As this pattern resembles the intra-species variation patterns without domestication, we conclude that phylogenetic history may generally play an important role in metabolome divergence in budding yeasts.

The Metabolic Domestication Syndrome of S. cerevisiae. The substantial metabolome differences observed between wild and domesticated S. cerevisiae populations are consistent with at least two scenarios. First, evolution in human-made niches might result in a set of universal metabolomic changes, i.e., a metabolomic domestication syndrome, regardless of the genetic makeup of the ancestor and the specific details of the domestication niches. Alternatively, populations from the two major and widely separated domesticated clades might differ substantially both from wild yeasts and from each other in their metabolomes due to differences in the genetic makeup of their ancestors and/or selective forces. To distinguish between these scenarios, we systematically compared the levels of individual metabolites across domesticated and wild populations. Specifically, we performed a series of phylogenetic ANOVAs to test i) whether domesticated populations show recurrent metabolite changes compared to wild populations that cannot be explained by shared ancestry alone, and ii) whether populations from the two distinct domesticated clades differ from each other.
Our analysis revealed pervasive signatures of parallel evolution in the metabolomes of domesticated yeast (i.e., domestication signature). Specifically, 7 out of 19 amino acids and 27 out of 78 non-amino acid metabolites display a significantly increased or decreased level in independently domesticated populations compared to wild populations (Fig. 5A, SI Appendix, Fig. S11, and Dataset S11). For example, histidine displays a particularly strong signature: domesticated populations have, on average, twofold lower histidine levels than wild populations. As S. paradoxus populations exhibit similar histidine levels to wild S. cerevisiae populations, the low histidine levels of domesticated populations are likely to be derived states and have evolved convergently (Fig. 5B).
Several specific amino acids have been implicated in the formation of various aroma compounds (39) and therefore their metabolism might have been shaped by domestication. Consistent with this scenario, four of the seven amino acids (leucine, isoleucine, phenylalanine, and threonine) with a domestication signature also impact aroma production according to a systematic study (39), which is a statistically significant overlap (P = 0.038, Fisher's exact test, Dataset S11).
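An overlap test of this kind can be written as a one-sided Fisher's exact test, i.e., the upper tail of a hypergeometric distribution. In the sketch below, the 19 measured amino acids, the 7 with a domestication signature, and the overlap of 4 are taken from the text; the number of aroma-related amino acids among the 19 is not stated in this section, so the value of 5 used here is a hypothetical placeholder, as is the function name.

```python
from math import comb

def hypergeom_upper_tail(total, marked, drawn, overlap):
    """P(X >= overlap) when `drawn` items are sampled without replacement
    from `total` items of which `marked` carry the property of interest."""
    max_overlap = min(marked, drawn)
    favourable = sum(comb(marked, k) * comb(total - marked, drawn - k)
                     for k in range(overlap, max_overlap + 1))
    return favourable / comb(total, drawn)

n_amino_acids = 19   # measured amino acids (from the text)
n_signature = 7      # amino acids with a domestication signature (from the text)
n_aroma = 5          # ASSUMPTION: hypothetical count of aroma-related amino acids
n_overlap = 4        # amino acids in both sets (from the text)
p_value = hypergeom_upper_tail(n_amino_acids, n_signature, n_aroma, n_overlap)
```

Under this assumed aroma-related count, the one-sided tail probability is about 0.038; a different count would give a different value.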
The set of metabolites displaying parallel changes in domesticated populations covers ~35% of the compounds measured via non-targeted metabolomics and spans several metabolic subsystems, indicating a widespread metabolome signature associated with domestication (SI Appendix, Fig. S11). Notably, this signature also includes several metabolites of the TCA cycle. To confirm the domestication signature of the TCA cycle, we quantified the levels of TCA intermediates using a well-established targeted metabolomics method (40) (Dataset S3 and Methods). Reassuringly, 5 out of the 8 measured TCA intermediates display a domestication signature, with metabolites in the early steps of the TCA cycle having increased levels and succinate and malate having decreased levels (Fig. 5 C and D). This pattern indicates that domestication rewired the TCA cycle, possibly as a consequence of adaptation to better utilize carbon sources that support fermentative growth (36).
In sharp contrast to the widespread metabolome rewiring between wild and domesticated yeasts, we found no significant metabolome differences among populations belonging to the two major phylogenetically separated domesticated clades (Dataset S12). Furthermore, despite a marked difference between the current environments of the French dairy strains and the Wine and Sake populations (i.e., Camembert cheese or raw milk versus alcoholic fermentation), the domestication signature is also present in the French dairy population (Fig. 5 A-C). We note that the domestication signature is also evident when the Asian fermentation population, a clade with a less clearly determined domestication status than the other populations (36), is removed from the analysis (Dataset S11). In addition, further statistical analysis confirmed that the domestication signature is not an artifact of potential growth state differences between strains at the time of sampling (Dataset S11). Overall, these results suggest that independent domestication events converged on a recurrent metabolome signature, supporting the existence of a universal metabolic domestication syndrome in S. cerevisiae.
Last, we asked whether other ecological factors beyond domestication have shaped the yeast metabolome. The S. cerevisiae populations studied here originate from a range of geographical and climatic regions. However, we found no evidence that yeast strains isolated from the same climate zone or broad geographic area show increased metabolome similarities (SI Appendix, Tables S3 and S4).

Discussion
In this paper, we combined metabolomic measurements with phylogenetic comparative analyses to study the evolution of metabolic states in the unicellular model eukaryote budding yeast. By studying variation of central metabolites across a panel of natural isolates representing the major clades of S. cerevisiae and several related species, we reached several general conclusions.
We found that although metabolite levels diverge mostly gradually with evolutionary time, domestication has impacted them profoundly. Domestication had major effects on the genome, life history traits, and nutrient utilization capacity of S. cerevisiae (35,36). These traits are shared between phylogenetically separated and independently domesticated lineages, indicating convergent evolution of domestication-associated traits and hence the existence of a yeast domestication syndrome. Our results demonstrate that a substantial fraction of metabolite levels also evolved convergently in independently domesticated clades, revealing a recurrent metabolic domestication syndrome in yeast. The set of affected metabolites covers various central pathways from amino acid metabolism to pyruvate metabolism and the TCA cycle. For instance, several metabolites of the TCA cycle have altered levels in domesticated populations, possibly as a consequence of adaptation to fermentative growth (36). We also found evidence for the convergent evolution of several amino acid levels across distinct domesticated clades. Since the biosynthesis of several amino acids originates from the TCA cycle, one might expect an overlap between these amino acids and those with domestication signatures. However, visual inspection revealed no clear link between the two sets (SI Appendix, Fig. S13). Consequently, parallel changes in amino acid levels and TCA cycle intermediates cannot simply be explained by their topological proximity in the network. A previous study reported that two of the domestication-associated amino acids (threonine and isoleucine) are better utilized as nitrogen sources by domesticated than wild yeasts (36). This indicates that domestication changed both the intracellular levels and consumption of these amino acids. Most notably, four out of seven domestication-associated amino acids (leucine, isoleucine, phenylalanine, and threonine) impact aroma production (39), suggesting that their evolution may have been driven by human selection on aroma profiles. Indeed, a previous study reported convergent evolution of a specific aroma compound across multiple domesticated lineages (19). Overall, our study expands such case studies and indicates pervasive rewiring of the metabolome upon adaptation to human-made niches. Thus, domestication reprogrammed not only the life cycle of yeast (36) but also its central metabolic pathway activities. This finding has far-reaching implications for our understanding of domestication in this prime model organism.
An important open question is whether analogous signatures of domestication are also apparent at the gene expression level and whether the metabolomic signature is primarily driven by gene regulatory changes. Previous studies of transcriptome diversity across S. cerevisiae isolates have not explicitly sought to identify a recurrent signature of domestication, so there is room for alternative interpretations. On the one hand, it appears that most transcriptomic variation follows the phylogenetic history (41) and is dominated by population-specific transcriptional signatures, including those that are unique to each domestication (42). On the other hand, these studies do not rule out the possibility of a common transcriptional signature of domestication that involves specific gene sets. Indeed, an earlier study reported recurrent gene expression changes in the TCA cycle and fermentation pathways between isolates from the Wine and Sake populations (25). More in-depth analyses are required to unravel the possible gene regulatory changes linked to the metabolic domestication syndrome.
Our findings have implications for the neutral theory of molecular evolution. According to this theory, most variations in nucleotide and protein sequences within and between species are selectively neutral (43). Although the theory was specifically proposed to explain sequence evolution, in principle, it could be also applied to molecular traits that are not perfectly correlated with organismal fitness (44). Indeed, gene expression levels appear to evolve largely neutrally in both yeasts and animals (41,45,46). However, it is largely unclear whether molecular traits that lie closer to fitness, such as metabolite levels, are mainly governed by adaptive or neutral evolution. We found that phylogenetic relatedness explains well the overall divergence of metabolite levels among S. cerevisiae populations after accounting for domestication (Fig. 4). This pattern is broadly consistent with a neutral scenario, where metabolome variations arise largely due to the steady accumulation of neutral or nearly neutral mutations. Notably, similar results have been reported for variation in gene expression and several phenotypic traits in S. cerevisiae (41,47). The neutral scenario is further supported by our observation that, beyond domestication, S. cerevisiae isolates originating from similar environments do not show increased similarity in their metabolomes. In addition, our finding that different amino acids diverge at different rates also fits into the neutral theory. Specifically, we found that metabolites whose levels vary more among species also tend to show larger variations across S. cerevisiae populations (Fig. 1 B and C). This pattern is consistent with the notion that some metabolites are subject to less stringent selective constraints and therefore can accumulate more neutral changes across multiple evolutionary timescales. Recent research into the evolution of metabolite levels in mammals found similar results. It has been shown that evolutionary conservation of metabolite levels varies greatly among metabolites and can be explained by a neutral model (22).
Despite such evidence supporting neutral evolution, a substantial part of the metabolome (>30% of measured metabolites) shows signatures of convergent evolution across independently domesticated lineages, indicating that adaptive evolution has also substantially shaped the metabolome diversity of S. cerevisiae. Adaptive evolution has also been shown to be responsible for metabolite changes associated with wheat domestication (15). Clearly, the relative importance of neutral versus adaptive evolution of metabolite levels in yeasts remains to be formally quantified, for example, through quantitative genetic methods (15). In addition, further study is needed to test whether the faster rate of metabolome divergence among domesticated populations is driven by niche adaptations or enhanced drift owing to population bottlenecks.
We found that metabolite levels evolve faster than the metabolic network's capacity to produce metabolites (i.e., yield). Notably, closely related yeast isolates often show identical or highly similar enzyme reaction repertoires, yet display substantial metabolome differences. More broadly, the evolution of metabolic states appears to be largely uncoupled from divergence in the structure of the network. Evolution at these two levels might differ for several reasons. First, the underlying mutational target sizes differ greatly. While the rewiring of metabolic pathway structure depends on highly specific gene gain and loss events, metabolite levels can be impacted by a broad range of mutations. For example, individual amino acid levels are influenced by the activity of up to ~400 genes in yeast (13). Thus, mutations in many proteins that do not directly interact with a metabolite can also affect its level. Second, metabolic reaction repertoires are likely under strong purifying selection with limited room for selectively neutral changes. Indeed, the vast majority of genes, including enzyme-encoding ones, have a measurable fitness contribution under some environmental conditions (48,49). Thus, phylogenetic variation in enzyme repertoire likely reflects environmental adaptation. In contrast, metabolite levels may readily vary without measurably affecting fitness (20,21) and may therefore evolve more rapidly and independently of adaptive changes in the network's structure. Based on these considerations, we propose that studying metabolic evolution through metabolomics provides largely complementary information that cannot be obtained by studying enzyme content alone. This view extends recent findings that patterns of natural selection on metabolite levels are not easily predictable from selection acting on gene expression (50).
Our study has several limitations. First, the examined strain panel captures divergence between clean (i.e., non-mosaic) S. cerevisiae populations and hence is well suited for phylogenetic comparative analyses. However, our study was not designed to investigate within-population diversity or evolutionary changes in mosaic strains. Second, the metabolome measurements were performed in a single environment that is ideally suited for assessing variation in biosynthetic metabolism (13) and in which all strains exhibit stable growth. This corresponds to the common garden design in comparative metabolomics studies (51). However, in the real world, yeasts experience a plethora of environmental conditions and therefore our study likely underestimates the variety of metabolomic states displayed by different lineages. Importantly, the metabolomic signature of domestication is evident in all human-associated populations regardless of ecological differences. It is therefore unlikely that the domestication signature is an artifact of measuring metabolomes in a single environment to which some domesticated clades are better adapted than others. Nevertheless, we note that measuring phylogenetically diverged species in the same environment can be especially challenging as they might display very different physiology. Indeed, S. castelli displays the lowest growth rate among the examined species (Dataset S1), which may confound its perceived metabolome divergence. Finally, our study only included single strains of several species, which may limit the accurate estimation of metabolome differences between species. In light of the substantial metabolic diversity observed even within wild S. cerevisiae, studying metabolome variation in other yeast species remains an important open task.
In sum, our work uncovers substantial variations in central metabolite levels across budding yeasts and establishes a recurrent metabolomic signature of domestication.

Methods
Strains. S. cerevisiae isolates included in this project were sequenced and analyzed previously (52) and provided by Gianni Liti. Strains were carefully selected to be representative of non-admixed wild and domesticated S. cerevisiae populations with at least three strains when possible. S. paradoxus, Saccharomyces mikatae, and Saccharomyces kudriavzevii isolates were selected from the Saccharomyces Genome Resequencing Project (SGRP) collection and sequenced and analyzed previously (52). Saccharomyces arboricola, Saccharomyces eubayanus, S. castelli, and Kazachstania saulgeensis strains were purchased from CBS (https://wi.knaw.nl/Collection). The Saccharomyces jurei isolate was provided by Daniela Delneri.
Growth Rate Measurements. We performed growth rate measurements at both 25 °C and 30 °C to select a temperature that minimizes growth rate differences across the diverse yeast isolates studied here. Yeast cells were inoculated from frozen (−80 °C) samples on 2% agar solidified synthetic dextrose (SD) medium (0.145% yeast nitrogen base minus amino acids/0.5% ammonium sulfate, and 2% glucose + 2% agar). Cells were cultivated for 48 h at 25 °C and 30 °C. Pre-cultures were inoculated from colonies in 96-well format using a Hamilton Microlab star liquid handling robot equipped with core grippers and a V&P scientific pin-tool with 1.58-mm floating pins (53). Pre-cultures were cultivated in 96-well plates in SD media for 16 h using a Heidolph Titramax 1000 (900 rpm) vibrating plate shaker incubator at both temperatures.
We used a protocol designed to measure growth rate in minimal media (53). Growth was assayed by monitoring the optical density (OD600) of liquid cultures of each strain using 384-well microtiter plates (Corning). The 384-well plates filled with 60 µL medium per well were inoculated for growth curve recording with 1% starter culture. The 384-well plates were incubated at 25 °C with medium linear shaking in Powerwave HT plate readers (BioTek Instruments Inc). Cell growth was followed by recording the OD at 600 nm every 5 min. Four technical replicate measurements were executed on all strains. Growth rate was calculated from time series OD600 data as described elsewhere (47).
Targeted Quantification of Amino Acids. For metabolomics experiments, pre-cultures were cultivated as described for growth rate measurements. We inoculated 4% pre-cultures into 1,200 µL/well of SD media in a 2 mL/well deep well plate and cultivated for 10 h at 25 °C (until the mid-exponential phase) in a Heidolph Titramax 1000 vibrating plate shaker incubator at 900 rpm. Stirring was enhanced by using a 2-mm size borosilicate glass/well. To obtain OD values at the time of inoculation, sampling, and stationary phase, we cultivated a separate plate in parallel and took a sample volume of 100 µL for OD measurement at the time of sample harvest for metabolome measurement. Growth state was calculated by subtracting OD values at inoculation from OD values at sampling and dividing this result by the OD values at stationary phase (Dataset S1). ODs were measured using a Powerwave HT plate reader. The full sample was harvested and extracted as described in ref. 13 with minor modifications. Briefly, cell pellets were extracted with 200 µL of hot (80 °C) ethanol containing isotope-labeled algal extract (5 mg/mL extraction solution). We obtained Algal Lyophilized Cells (U-13C, 98%+) from Cambridge Isotope Laboratories (CLM-2065-PK). After 1 min of vigorous vortexing, we let the mixture incubate without shaking for 5 min and centrifuged for 5 min at maximum rpm. The aqueous phase was transferred to the prepared extraction solution. LC-MS-based quantification was also carried out in the same way as described in ref. 13, but free amino acids of ~0.5 mg 13C-labeled algae/sample extracts were used as internal standard for quantification instead of algal hydrolysate amino acid mixture. Metabolite identification was supported with standard injection and by monitoring 1 quantifier + 1 qualifier ion/metabolite. Metabolic concentration data were normalized using i) probabilistic quotient normalization (PQN) to control for biomass amount and extraction efficiency differences between replicates and ii) linear regression between OD at the time of sampling and metabolite intensities to control for potential intensity differences stemming from growth differences (32).
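The PQN step mentioned above can be illustrated with a minimal Python sketch. It assumes the common formulation of PQN (a per-metabolite median reference profile and a per-sample median quotient as the scaling factor); the function name `pqn_normalize` and the toy intensities are hypothetical, and the sketch is not the pipeline used in this study.

```python
from statistics import median

def pqn_normalize(samples):
    """Probabilistic quotient normalization (sketch).

    samples: list of samples, each a list of intensities in the same
    metabolite order. Returns a normalized copy.
    """
    n_features = len(samples[0])
    # Reference profile: per-metabolite median across all samples.
    reference = [median(s[i] for s in samples) for i in range(n_features)]
    normalized = []
    for s in samples:
        # Quotient of each intensity to the reference; skip zero references.
        quotients = [s[i] / reference[i] for i in range(n_features) if reference[i] > 0]
        # The median quotient estimates the overall dilution of this sample.
        factor = median(quotients)
        normalized.append([v / factor for v in s])
    return normalized

# Toy replicates with the same profile but different biomass amounts.
replicates = [
    [10.0, 20.0, 30.0],
    [20.0, 40.0, 60.0],  # twice the biomass
    [5.0, 10.0, 15.0],   # half the biomass
]
normalized = pqn_normalize(replicates)
```

In this toy case all three replicates collapse onto the same profile, which is the intended effect of correcting for biomass and extraction efficiency differences.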

Preparation of Internal Standards for Amino Acid Quantification.
13C-labeled algae were extracted using a modified Bligh-and-Dyer extraction method (54). First, 700 µL of 4:10 water/MeOH was added to the 250 mg of lyophilized algal cells and vortexed vigorously. This was followed by the addition of 250 µL chloroform and vigorous vortexing for 1 min. After this, 250 µL water was added and the mixture was vortexed vigorously for 1 min, which was followed by the addition of 250 µL chloroform again.
Non-Targeted Metabolomics. For non-targeted metabolomics, 1,200 µL culture/sample was cultivated at 25 °C in the same way as for amino acid quantification. For metabolite extraction, 400 µL of cells/sample were centrifuged at 4 °C for 3 min (13). Cell pellets were extracted using 200 µL of 40:40:20 (v/v/v) methanol/acetonitrile/water (55,56) with 900 rpm shaking for 10 min at 4 °C. After pelleting the cells with 4,500 rpm −20 °C centrifugation for 5 min, the cells were re-extracted with the same method. Combined extracts from both extractions were centrifuged to remove debris. A pooled extract (QC, quality control) was prepared by joining the extracts from each experimental batch separately. Extracts were stored at −80 °C for high resolution mass spectrometry (HR-MS) measurements using polypropylene plates with a V-bottom. Before analysis, extracts were thawed and shaken for 10 min at room temperature at 900 rpm with Titramax 101. Metabolite extracts (2 µL) were injected directly into a Thermo Q-exactive Focus mass spectrometer (Thermo Fisher Corporation) operated in negative mode only, at 70,000 resolution, 2 micro-scans, and 3 × 10^6 AGC, using Dionex 3000 HPLC eluent flow. Flow rate was 150 µL/min, and the mobile phase was acetonitrile:water (70:30, v/v). To ensure measurement stability during analysis, a pooled extract sample (QC) was assessed every 20 samples (57). We used ProFIA, a data preprocessing workflow specifically designed to process high-resolution, high-throughput metabolomics data for flow injection analysis (58). Signal to noise threshold was set to 10 and background intensities were measured at the last 30 s of the 3-min runtime. In one experimental batch, 4 replicates were cultivated and measured.
We assigned putative metabolites to the measured ions using a high-quality genome-scale reconstruction of yeast metabolism (34,59) based on their exact masses and only considering deprotonation. Putative metabolites were kept only if the corresponding ion i) was detected in at least 87.5% of the pooled extract (QC) samples, which were injected 8 times at the beginning of the measurement sequence, ii) showed a CV (coefficient of variation) of peak area lower than 25%, and iii) was detected in at least 90% of the biological samples. Missing values were imputed using the k-nearest neighbors method (60,61). Intensities were normalized as in amino acid quantification, i.e., using PQN normalization and linear regression between OD at the time of sampling and metabolite intensities (32).
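The three retention criteria above can be expressed as a simple filter. The following Python sketch is illustrative only: the function name `keep_feature`, the use of `None` for non-detected ions, and the assumption that the CV is computed over the QC injections are ours and are not specified in the text.

```python
from statistics import mean, stdev

def keep_feature(qc_areas, sample_areas,
                 min_qc_detect=0.875, max_cv=0.25, min_sample_detect=0.90):
    """Return True if an ion passes the three retention criteria (sketch).

    qc_areas / sample_areas: peak areas; None marks a non-detected ion.
    """
    qc_detected = [a for a in qc_areas if a is not None]
    # i) detected in at least 87.5% of the pooled QC injections
    if len(qc_detected) / len(qc_areas) < min_qc_detect:
        return False
    if len(qc_detected) < 2:
        return False  # CV is undefined for fewer than two values
    # ii) coefficient of variation of peak area lower than 25%
    #     (assumption: computed over the QC injections)
    if stdev(qc_detected) / mean(qc_detected) >= max_cv:
        return False
    # iii) detected in at least 90% of the biological samples
    n_detected = sum(a is not None for a in sample_areas)
    return n_detected / len(sample_areas) >= min_sample_detect

# Toy example: 8 QC injections and 20 biological samples.
qc = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 103.0, None]  # 7 of 8 detected
samples = [50.0] * 19 + [None]                               # 19 of 20 detected
good = keep_feature(qc, samples)
sparse = keep_feature([100.0] * 6 + [None, None], samples)   # 6 of 8 QC detections
```

With 8 QC injections, the 87.5% threshold corresponds to allowing at most one missed detection.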
Targeted Quantification of TCA Cycle Intermediates and Pyruvic Acid. We followed an established method (40) with small modifications for the quantification of TCA cycle intermediates and pyruvate. We determined TCA cycle intermediates and pyruvic acid from 1,000 µL culture/sample in deep well plates. Pellets were extracted in 40:40:20 (v/v/v) methanol/acetonitrile/water and dried. Dried extracts were dissolved in 150 µL LC-MS grade water and cleared. Compounds were separated using a Waters HSS T3 column (1.7 μm, 2.1 mm × 100 mm) on a liquid chromatography (Waters ACQUITY Premier) and tandem mass spectrometry (Waters TQS-Micro) system. For a more detailed description, see SI Appendix.
Phylogenetic Analyses. We constructed a phylogenetic tree of the budding yeasts studied here using a phylogenomic approach following ref. 62. We used the genomes of the 71 yeast isolates and two outgroups, Torulaspora delbrueckii CBS1146 and Tetrapisispora phaffii CBS4417. Genome sequences were acquired from different sources (see Dataset S6, column "Sequence source") (23,28,35,52,63–67). To obtain consistent genome annotation across the studied yeast isolates, we annotated all budding yeast genomes, except the outgroups, using the MAKER genome annotation pipeline v2.31.10 (68). For a detailed description, see SI Appendix, Supporting text, Genome Annotation section. Note that for the outgroups, we used the published genome annotations.
To construct the phylogenetic tree, we first translated all nucleotide sequences to protein sequences with the transeq tool from the EMBOSS package (version 6.6.0) (69). Next, we used OrthoFinder (version 2.4.0) (70) to cluster homologous genes across all genomes. The clustering resulted in 11,280 orthogroups in total. Of note, 1,218 orthogroups contained a single gene copy from each genome and were further considered for concatenation-based phylogenetic tree inference. We performed multiple sequence alignment on each of the 1,218 orthogroups using MAFFT (version 7.471) (71). TrimAl (version 1.4.1) (72) was applied with the "-gappyout" option to remove poorly aligned regions. Trimmed sequences shorter than 167 amino acids or shorter than 50% of the length of the total trimmed alignment were removed. We also removed sequences with missing amino acids. Orthogroups containing removed sequences were excluded from further analysis. After filtering, we concatenated the remaining 983 orthogroups to generate a single concatenated sequence for each genome. We built a concatenated maximum likelihood tree with RAxML (AVX version, 8.2.12) (73) using the LG amino acid substitution matrix with the GAMMA model and using one partition for each orthogroup. Rapid bootstrapping was enabled, and the "autoMRE" option was used to determine the sufficient number of bootstrap replicates. Most internal nodes at or above the population level received bootstrap support values higher than 95% (SI Appendix, Fig. S12). The tree with the best likelihood value was collected to represent the phylogenetic relationships between the strains in our dataset (Dataset S4). However, most of our analyses focus on the phylogenetic relationships of yeast populations instead of individual strains. In order to obtain a phylogenetic tree of populations, we selected one representative strain from each population and removed the rest of the strains from the tree (Dataset S5).
Calculation of Metabolome Divergence. We calculated metabolome divergence between strain pairs as the average of the squared differences in metabolite levels for each metabolite (or putative metabolite in the case of the non-targeted dataset) using the following equation:

$$D_{k,l} = \frac{1}{N}\sum_{i=1}^{N}\left(m_{k,i} - m_{l,i}\right)^{2}$$

where $D_{k,l}$ is the metabolome divergence between strains k and l, N is the number of metabolites in the dataset, k and l represent two separate strains, and $m_{k,i}$ and $m_{l,i}$ represent the log-scaled and OD-normalized level of the ith metabolite in strain k and l, respectively. Metabolome divergence between a pair of populations was calculated as the average metabolome divergence of strain pairs where each strain belongs to one of the two populations. Similarly, the metabolome divergence between two species is the average metabolome divergence between strains of the two species.
To calculate phylogenetic distance-adjusted metabolome divergence, we divided the metabolome divergence of each population pair by the phylogenetic distance between the two populations. The phylogenetic distance between two populations is the sum of the lengths of the branches of the phylogenetic tree that connect the two populations to their common ancestor.
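The divergence measures defined in this section can be sketched in a few lines of Python. The function names and toy values below are hypothetical; metabolite levels are assumed to be already log-scaled and OD-normalized, and the phylogenetic distance is assumed to be supplied by the user (e.g., read from the population tree).

```python
from itertools import product

def divergence(m_k, m_l):
    """Mean squared difference in metabolite levels between two strains."""
    return sum((a - b) ** 2 for a, b in zip(m_k, m_l)) / len(m_k)

def population_divergence(pop_a, pop_b):
    """Average strain-pair divergence, one strain taken from each population."""
    pairs = list(product(pop_a, pop_b))
    return sum(divergence(a, b) for a, b in pairs) / len(pairs)

def adjusted_divergence(pop_a, pop_b, phylo_distance):
    """Population-level divergence divided by phylogenetic distance."""
    return population_divergence(pop_a, pop_b) / phylo_distance

# Toy data: two metabolites, two strains in one population, one in the other.
wild = [[0.0, 1.0], [0.0, 3.0]]
domesticated = [[2.0, 1.0]]
d_strain = divergence(wild[0], domesticated[0])          # (4 + 0) / 2
d_pop = population_divergence(wild, domesticated)        # mean of 2.0 and 4.0
d_adj = adjusted_divergence(wild, domesticated, 0.5)     # 3.0 / 0.5
```

The same `divergence` function applies unchanged to amino acid yield profiles, since yield similarity is computed in the same way.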

Metabolic Network Structure and Metabolite Production Capacities.
Metabolic network reconstructions and amino acid production capacity (i.e., maximum theoretical yield of metabolite production) simulations were taken from the literature for seven of the species (S. castelli, S. arboricola, S. cerevisiae, S. eubayanus, S. kudriavzevii, S. mikatae, and S. paradoxus) and all of the S. cerevisiae strains present in our dataset (Dataset S6). Metabolite yield data were retained for the 19 amino acids present in our dataset. Similarity between yield profiles was calculated in the same way as metabolome divergence; see section "Methods/Calculation of metabolome divergence" for more details. Briefly, for each species/population pair, we calculated the average of the squared differences of metabolite yields for each amino acid. Yield profiles were considered identical when yield distance between species/populations was lower than 1e-3, as small yield differences may occur due to the finite precision arithmetic used in yield calculations. Because several species pairs and the majority of population pairs display identical amino acid yield profiles according to this definition, we compared the metabolome divergence between species and population pairs with identical versus different yield profiles (Fig. 2).

Statistical Analyses.
Phylogenetic Mantel test. We applied phylogenetic Mantel tests to calculate correlations between distance matrices, such as metabolome divergence and phylogenetic distance (74). This method uses phylogeny-based permutations to adjust the P-value for the non-independent nature of the values in distance matrices. The strength of the correlation was measured with Pearson's r. The number of permutations was 1e5.
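For orientation, the sketch below shows a standard (non-phylogenetic) Mantel test in Python. It is a deliberate simplification: it permutes taxon labels uniformly at random, whereas the phylogenetic Mantel test used here draws permutations informed by the phylogeny (74), which this sketch does not implement. Function names, the default number of permutations, and the toy matrices are hypothetical.

```python
import random
from math import sqrt

def upper_triangle(matrix, order=None):
    """Upper-triangle entries of a symmetric matrix, optionally relabeled."""
    n = len(matrix)
    order = order if order is not None else list(range(n))
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def mantel_test(dist_a, dist_b, n_perm=999, seed=0):
    """Plain Mantel test: one-sided permutation P-value for Pearson's r."""
    rng = random.Random(seed)
    a = upper_triangle(dist_a)
    observed = pearson_r(a, upper_triangle(dist_b))
    hits = 0
    for _ in range(n_perm):
        order = list(range(len(dist_b)))
        rng.shuffle(order)  # relabel rows and columns of the second matrix together
        if pearson_r(a, upper_triangle(dist_b, order)) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy matrices: the second is an exact multiple of the first.
phylo = [[0, 1, 3, 7], [1, 0, 2, 6], [3, 2, 0, 4], [7, 6, 4, 0]]
metab = [[2 * v for v in row] for row in phylo]
r, p = mantel_test(phylo, metab)
```

Rows and columns are permuted jointly so that each permuted matrix remains a valid distance matrix.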

Permutation test to compare metabolome divergence. To compare metabolome divergence or phylogenetic distance-adjusted metabolome divergence between different groups of population pairs or species pairs, we employed permutation tests. First, we sorted each population (or species) into groups according to our analysis (e.g., domesticated and wild). Next, we shuffled (resampled without replacement) the group associations of the populations (or species). We calculated the average (phylogenetic distance-adjusted) metabolome divergence of population (or species) pairs for each group and used the ratio of these two numbers as our test statistic. We repeated this randomization process 1e5 times to acquire a distribution of our randomized test statistic. Finally, we calculated the probability that the original, unshuffled test statistic (ratio of average metabolome divergences between the two groups) comes from our randomized distribution as the number of randomized ratios higher than or equal to the ratio of the original data plus one, divided by the number of permutations plus one. We reported this probability as the P-value of our permutation test. Note that ratios were calculated in a way that the group in the numerator had higher average metabolome divergence in the original, unshuffled data compared to the group in the denominator.
Phylogenetic ANOVA. To determine parallel changes in metabolite levels in domesticated S. cerevisiae populations compared to wild populations (or populations with different climates or geographic locations), we performed phylogenetic generalized least squares ANOVA tests (75). First, we calculated the average metabolite levels for each population and marked each population as either wild or domesticated. Then, we used the corBrownian function from the ape package (version 5.7-1) (76) to calculate covariance for the phylogenetic tree of the populations. Next, we used the gls function from the nlme package (version 3.1-162) (77) to fit phylogenetic ANOVA with the average metabolite level and domestication status of each population as variables and the corBrownian object as the correlation structure of the model. We repeated the calculations for each metabolite. The analysis was repeated without the Asian Fermentation population. In order to control for growth state, we added the average growth state of each population as a covariate to the phylogenetic ANOVA model using the gls function. For each metabolite, the significance level of the independent effect of domestication on the metabolite level was returned.
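The permutation test described in this section can be sketched in Python as follows. This is a simplified illustration under stated assumptions: each group's statistic is the mean divergence over all within-group population pairs, the divergence lookup `div` and the toy trait values are hypothetical, and the default of 1,000 permutations is chosen for speed rather than matching the 1e5 used in the study.

```python
import random
from itertools import combinations

def mean_pair_divergence(members, div):
    """Average divergence over all pairs of populations within one group."""
    pairs = list(combinations(members, 2))
    return sum(div[frozenset(pair)] for pair in pairs) / len(pairs)

def permutation_test(labels, div, n_perm=1000, seed=0):
    """labels: population -> group name (two groups, each with >= 2 members).
    div: frozenset({pop_a, pop_b}) -> metabolome divergence (assumed positive)."""
    rng = random.Random(seed)
    pops = list(labels)
    groups = sorted(set(labels.values()))

    def group_means(assignment):
        return [mean_pair_divergence([p for p in pops if assignment[p] == g], div)
                for g in groups]

    means = group_means(labels)
    # Numerator is the group with the higher mean in the unshuffled data.
    num, den = (0, 1) if means[0] >= means[1] else (1, 0)
    observed = means[num] / means[den]

    values = [labels[p] for p in pops]
    hits = 0
    for _ in range(n_perm):
        shuffled = values[:]
        rng.shuffle(shuffled)  # reassign group labels without replacement
        perm = group_means(dict(zip(pops, shuffled)))
        if perm[num] / perm[den] >= observed:
            hits += 1
    # P-value with the +1 correction in numerator and denominator.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data: divergence derived from a single hypothetical trait value.
trait = {"w1": 0.0, "w2": 4.0, "w3": 8.0, "d1": 1.0, "d2": 1.5, "d3": 2.0}
labels = {p: ("wild" if p.startswith("w") else "domesticated") for p in trait}
div = {frozenset((a, b)): (trait[a] - trait[b]) ** 2 for a, b in combinations(trait, 2)}
observed_ratio, p_value = permutation_test(labels, div)
```

The +1 terms ensure that the reported P-value can never be exactly zero, which would otherwise overstate the evidence from a finite number of permutations.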
Data, Materials, and Software Availability. Raw LC-MS data reported in this article are accessible via Zenodo (https://zenodo.org/records/10680639) under accession number MTBLS9200 (78). Previously published data were used for this work (we used publicly available genome sequence data for yeast isolates from ref. 23).

Fig. 1. Phylogenetic diversity of amino acid levels in yeasts. (A) Maximum likelihood phylogenetic tree of yeast strains based on the alignment of 983 concatenated single-copy orthologs. Branch lengths represent the average number of substitutions per site. Species are indicated in bold, while strains and populations are indicated in italics. Green and orange dots denote wild and domesticated S. cerevisiae strains, respectively. Strains were considered as domesticated if the entire population is considered domesticated. (B) Metabolite fold change values for nine yeast species for each amino acid. Amino acids are ranked based on their between-species variance of fold change. Species-level fold change values were calculated by using species averages and using the median value of the species averages as a reference point. Colors represent species. (C) Distributions of metabolite fold change values across 17 S. cerevisiae populations, averaged by populations, for each amino acid. Amino acids are ranked based on their between-population variance of fold change. Population-level fold change values were calculated by using population averages and using the median value of the population averages as a reference point.

Fig. 2. Evolutionary divergence in metabolome profile versus metabolite production capacity profile. (A) Amino acid metabolome distance of pairs of yeast species. Species pairs are divided into two groups based on their similarity in amino acid production capacity. Pairs with identical (N = 9) or different (N = 12) production capacities are shown in blue and orange, respectively. The P-value was determined using a permutation test (Methods). (B) Amino acid metabolome distance of S. cerevisiae population pairs. Population pairs are divided into two groups based on their similarity in amino acid production capacity. Pairs with identical (N = 91) or different (N = 45) production capacities are shown in blue and orange, respectively. The P-value was determined using a permutation test. (C) Extent of metabolite variations between S. cerevisiae strain pairs with identical amino acid yields. The distribution of the number of significantly changed amino acid levels is shown for those S. cerevisiae strain pairs that have no difference in their amino acid yields.

Fig. 4. Metabolome divergence as a function of phylogenetic distance among S. cerevisiae population pairs based on amino acid measurements. (A) Pairs of wild S. cerevisiae populations (N = 28). North American oak and Mediterranean oak populations are excluded, as their closest relatives are domesticated populations. The dashed line (blue) indicates the linear regression line. (B) Pairs of domesticated populations that come from the same domesticated clades (N = 9). The dashed line (blue) indicates the linear regression line. (C) Comparison of phylogenetic distance-adjusted metabolome divergence among pairs of domesticated (N = 9) and wild (N = 28) S. cerevisiae populations. The P-value was determined using a permutation test (Methods). (D) Comparison of metabolome divergence among pairs of domesticated (N = 9) and wild (N = 28) S. cerevisiae populations. The P-value was determined using a permutation test.

Fig. 5. The metabolomic domestication syndrome in S. cerevisiae populations. (A) Average levels of amino acids showing domestication signature across S. cerevisiae populations. Wild and domesticated populations are marked as green and orange, respectively. Metabolite levels are studentized for each metabolite for visualization purposes. (B) Traitgram of histidine level changes on the S. cerevisiae phylogenetic tree. Note that branch lengths do not scale with phylogenetic distance for visual clarity. Domesticated, wild S. cerevisiae, and S. paradoxus populations are colored as orange, green, and blue, respectively. (C) Average levels of TCA cycle metabolites showing domestication signature across S. cerevisiae populations. Wild and domesticated populations are marked as green and orange, respectively. Metabolite levels are studentized for each metabolite for visualization purposes. (D) Schematic of the TCA cycle (based on ref. 38). Metabolites with increased or decreased levels in domesticated populations are marked as red or blue, respectively. Metabolites showing no domestication signature and unmeasured metabolites are denoted with filled gray and white labels, respectively.
$$p = \frac{(\text{Number of randomized ratios higher than or equal to the ratio of the original data}) + 1}{(\text{Number of permutations}) + 1}$$