Research Article

Fitness landscape for nucleosome positioning

Donate Weghorn and Michael Lässig
  1. Institute for Theoretical Physics, University of Cologne, 50937 Cologne, Germany

PNAS July 2, 2013 110 (27) 10988-10993; https://doi.org/10.1073/pnas.1210887110
  • For correspondence: mlaessig@uni-koeln.de
  1. Edited by José N. Onuchic, Rice University, Houston, TX, and approved May 21, 2013 (received for review June 27, 2012)


Abstract

Histone–DNA complexes, so-called nucleosomes, are the building blocks of DNA packaging in eukaryotic cells. The histone-binding affinity of a local DNA segment depends on its elastic properties and determines its accessibility within the nucleus, which plays an important role in the regulation of gene expression. Here, we derive a fitness landscape for intergenic DNA segments in yeast as a function of two molecular phenotypes: their elasticity-dependent histone affinity and their coverage with transcription factor binding sites. This landscape reveals substantial selection against nucleosome formation over a wide range of both phenotypes. We use it as the core component of a quantitative evolutionary model for intergenic DNA segments. This model consistently predicts the observed diversity of histone affinities within wild Saccharomyces paradoxus populations, as well as the affinity divergence between neighboring Saccharomyces species. Our analysis establishes histone binding and transcription factor binding as two separable modes of sequence evolution, each of which is a direct target of natural selection.

  • biophysics
  • nucleosome-depleted regions
  • evolution of regulation
  • quantitative traits
  • inference of selection

The positional organization of nucleosomes in eukaryotic cells is of key importance for the overall chromatin structure and, thus, for the regulation of gene expression (1–3). Nucleosomes form through binding of a histone octamer to a DNA sequence segment of average length 146 base pairs (bp), which wraps around the protein complex (4). Histone-bound DNA segments are interspersed with unbound “linker” segments. Particularly prominent features of this pattern are so-called nucleosome-depleted regions (NDRs). These are extended troughs in occupancy at least ∼100 bp long, primarily located in intergenic DNA. Changes in nucleosome positioning affect the accessibility of local DNA segments for binding interactions with transcription factors and lead to observable changes of gene expression in yeast (3, 5).

Explaining two correlated molecular functions (histone binding and transcriptional regulation) in the same sequence segment may be seen as a chicken-and-egg problem (6–9). Is transcription factor binding the primary function, which displaces nucleosomes to sequence segments in which transcription is neutral or deleterious? Or, conversely, does nucleosome positioning constrain transcriptional interactions? Here, we address this problem by a quantitative evolutionary analysis of yeast genomes. We infer a fitness landscape for intergenic sequence segments that measures selection on their regulatory interactions and on local nucleosome formation. We capture these functions by two molecular phenotypes, the regulatory binding site content and the histone binding affinity, which reflect distinct biophysical characteristics of a DNA segment. The fitness landscape resulting from our analysis shows substantial selection acting jointly on transcriptional interactions and on nucleosome formation. Specifically, we find broad selection against histone binding (that is, in favor of nucleosome depletion) in sequence segments ∼100 bp long, although individual nucleotides within these segments are under only weak selection. Our inference of selection on nucleosome positioning is corroborated by an evolutionary analysis within and across yeast species. We model the evolution of sequence segments by mutations, genetic drift, and selection given by our fitness landscape. This model explains the observed intraspecies diversity as well as the cross-species divergence of nucleosome positioning in a quantitative way. At the end of the paper, we discuss the implications of our findings for the functional and evolutionary relationship between nucleosome positioning and transcriptional regulation and, in a broader context, for the inference of selection on correlated molecular functions.

Our evolutionary analysis is based on established biophysical models that relate the histone binding affinity and the regulatory site content of a DNA segment to its nucleotide sequence. Several mechanisms are known to influence the local probability of nucleosome formation (8). Histone-affine DNA has a specific nucleotide composition that facilitates superhelical turns around the cylinder-shaped octamer (10, 11). In contrast, histone-repelling sequence contains homopolymeric adenine segments on one strand paired with thymine segments on the other strand; these A:T tracts confer a high rigidity to the DNA double strand (12, 13). In addition, competition with other DNA-binding proteins (3, 14, 15), as well as active rearrangement through chromatin remodelers (16, 17), may alter histone binding to DNA. All these factors contribute, to different degrees, to the positioning of nucleosomes in vivo (15). Here, we choose one particular biophysical phenotype, the elasticity-mediated histone binding affinity, to map direct selection on nucleosome formation in yeast intergenic regions. Our finding of broad selection in favor of nucleosome depletion is consistent with the known functional role of NDRs. They reflect stable barriers in the histone binding energy landscape, which constrain the positioning of nucleosomes between them (18–21). To infer regulatory binding sites in the yeast genome, we use standard statistical models of the position-dependent binding energy profile for specific transcription factors (22).

Our findings are consistent with previous results on the evolution of nucleosome positioning. About 70% of interspecific nucleosome architecture changes in yeast are caused by cis effects as opposed to trans-acting factors (23), which supports our inference of a local histone binding phenotype. At the level of sequence evolution, it has been shown that linker regions in yeast coding sequence are more conserved than regions of higher nucleosome occupancy (24, 25), in agreement with a previous analysis of chromosome III promoters (15). More specifically, A:T-loss nucleotide changes are reduced in NDRs compared with high-occupancy regions (26), which is consistent with A:T-rich sequence disfavoring nucleosome formation. Similar signatures of selection acting on nucleotide frequencies also have been found in the human lineage (27). It is important to note, however, that observations of sequence conservation do not distinguish the evolutionary signal of direct selection acting on a specific function, in this case nucleosome formation or transcriptional interactions, from selection acting on other, potentially unrelated functions encoded in the same sequence segment. This is why we base our study on biophysically grounded models: The statistics of a biophysical trait associated with a specific function will prove to be less confounded by apparent selection than summary sequence measures. Our inference method can be applied to other quantitative traits with a large sequence target, even if individual nucleotide changes are under only weak selection.

Results

Phenotypes of Histone Binding and of Transcription Factor Binding.

Wrapping DNA around histones necessitates specific elastic deformations of its double strand. We evaluate the energy cost of these deformations using the model of references (20, 28). The local energy cost depends on sequence content, because different nucleotide triplets have different a priori deformations in the unbound state. Given the genomic landscape of energy costs, the resulting mean nucleosome occupancy ω of a given sequence segment is determined by equilibrium thermodynamics. We call this phenotype the histone binding affinity of the segment. Our analysis uses the thermodynamic model and algorithm of references (20, 28) (for details, see Methods). This model successfully predicts the nucleosome positioning observed under in vitro conditions, that is, without the competitive binding of transcription factors (20). As expected, the ensemble average of ω decreases with increasing energy cost and increases with increasing histone density (or equivalently, with the associated chemical potential) (Fig. S1). For our genomic analysis, we use a chemical potential that reproduces the genome-wide occupancy average in vivo of about 80%. With these settings, we take ω as the best computable phenotype to measure the elasticity-mediated histone binding affinity of a given sequence segment. By definition, this phenotype is independent of the regulatory interactions encoded in that segment. We measure these interactions by an independent phenotype, n, given by the number of annotated transcription factor binding sites (Methods).

We can relate these phenotypes to the in vivo nucleosome positioning in Saccharomyces cerevisiae, which was measured in (3). In Fig. 1A, we evaluate the mean in vivo occupancy score for intergenic sequence segments of length 100 bp. We find a strong dependence on both phenotypes: The mean in vivo occupancy is an increasing function of ω and a decreasing function of n. We conclude that DNA rigidity and transcription factor binding jointly contribute to nucleosome depletion in living yeast cells. This motivates our joint analysis of selection on exactly these phenotypes, to which we now turn.

Fig. 1.

In vivo nucleosome occupancy and fitness for yeast intergenic sequence segments. (A) The mean in vivo nucleosome occupancy is plotted against two molecular phenotypes: the elasticity-mediated histone binding affinity, ω, and the number of transcription factor binding sites, n. Occupancy data in S. cerevisiae are taken from (3) and shown for nonoverlapping intergenic sequence segments of length 100 bp. Data points not shown reflect insufficient phenotype counts Graphic. (B) The scaled fitness landscape 2NF(ω, n) inferred from the genomic phenotype distribution (by Eq. 1). This landscape shows that direct selection acts on both phenotypes and establishes sequence-dictated nucleosome positioning as a primary mode of the evolution of intergenic DNA.

Phenotype-Dependent Fitness Landscape.

To infer a map between phenotype and fitness, we compare the genomic distribution of phenotype value pairs, Q(ω, n), with the corresponding distribution P₀(ω, n) evaluated in a suitable null model. To obtain Q(ω, n), we construct a tiling of the yeast genome into nonoverlapping segments of fixed length 100 bp. This procedure is designed to avoid overcounting in longer NDRs and to make the phenotype data comparable between segments (for details, see Methods). The resulting distribution Q(ω, n) for intergenic sequence in S. cerevisiae is shown in Fig. S2A. As a genomic null model, we use uncorrelated random sequence, which implies that nucleotide triplets conferring specific local elasticity properties are scrambled in the null model. The resulting phenotypic null distribution may be approximated as a product, P₀(ω, n) ≈ P₀(ω) P₀(n). We obtain the marginal distribution P₀(ω) using the same tiling procedure as in the actual yeast genome (which ensures that our results are insensitive to its bioinformatic details). This distribution is shown as a black line in Fig. 2. The marginal distribution P₀(n) can even be evaluated analytically, using the information content (or relative entropy) of the binding motifs of individual transcription factors. Details on both components of the null model are given in SI Text. The resulting joint distribution P₀(ω, n) is shown in Fig. S2B.

Fig. 2.

Selection against nucleosome formation. Distribution of histone binding affinity for nonoverlapping intergenic segments of length 100 bp in S. cerevisiae, Q(ω) (purple ●), compared with the analogous distribution from random sequence, P₀(ω) (solid black line). Both distributions are evaluated in bins of width 0.05. The effective scaled fitness landscape for histone binding affinity, 2NF(ω) (red line), is the log-likelihood score of the distributions Q(ω) and P₀(ω).

We now can infer the scaled phenotype-fitness map 2NF(ω, n) as the log-likelihood score of the genomic phenotype distribution, Q(ω, n), and the null distribution, P₀(ω, n) (29, 30):

$$2N F(\omega, n) = \log \frac{Q(\omega, n)}{P_0(\omega, n)} + \text{const.} \qquad [1]$$

All fitness values on the left-hand side are measured in units of 1/(2N), where N is the effective population size. This landscape is defined up to an arbitrary constant, because only fitness differences (selection coefficients) enter the evolution of phenotype frequencies. Our inference of selection involves several assumptions. First, Eq. 1 is valid if nucleosome positioning is at an evolutionary equilibrium of mutations, genetic drift, and selection. This assumption is corroborated by our cross-species analysis described below. Second, the landscape 2NF(ω, n) is inferred from all intergenic sequence segments. The underlying uniformity assumption may be relaxed: If the fraction of segments under selection against histone binding is anywhere above ∼20%, our inference of selection essentially remains unchanged in the regime of reduced affinity, Graphic (SI Text and Fig. S3A). Similarly, our results are insensitive to variations of the tiling length within the length range of functional NDRs, as shown in Fig. S3B.

The scaled fitness landscape 2NF(ω, n) inferred for S. cerevisiae intergenic sequence is shown in Fig. 1B. It reveals substantial selection on both histone binding affinity and transcriptional regulation: We find scaled fitness differences Graphic in our set of intergenic segments. Importantly, the selection on histone binding affinity is a primary effect; that is, the overrepresentation of NDRs in the yeast genome cannot be explained by direct selection on regulatory site content alone. Our finding of substantial direct selection on ω gives an a posteriori justification for our choice of this phenotype. Before we discuss the implications of the inferred fitness landscape, we test its predictions for evolution of sequence-dictated nucleosome positioning within and across species.

Selection Against Nucleosome Formation.

As shown in Fig. 1B, the selection on histone binding affinity does not depend strongly on the regulatory phenotype n. Therefore, it can be evaluated in good approximation from an effective fitness landscape for histone binding affinity, 2NF(ω), which is most convenient for our subsequent evolutionary analysis. This landscape is inferred from the marginal distributions Q(ω) and P₀(ω) by an equilibrium relation analogous to Eq. 1, and is shown in Fig. 2. Again, the function 2NF(ω) is insensitive to the fraction of segments under selection and to the choice of tiling length (SI Text and Fig. S3).
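The equilibrium relation used for this effective landscape can be illustrated with a minimal Python sketch. It estimates a one-dimensional scaled fitness landscape as the logarithm of the ratio between a genomic and a null histogram of affinity values, in bins of width 0.05 as in Fig. 2. The function names, the `min_count` cutoff, and the input lists are hypothetical choices made for illustration; this is a sketch under those assumptions, not the authors' pipeline.

```python
import math

def histogram(values, bin_width=0.05):
    """Count values in [0, 1] on a regular grid; a value of 1.0 goes into the last bin."""
    n_bins = round(1.0 / bin_width)
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v / bin_width), n_bins - 1)] += 1
    return counts

def scaled_fitness(genomic, null, bin_width=0.05, min_count=1):
    """Return [(bin center, log(Q/P0))]; the landscape is defined up to a constant.

    genomic, null: lists of affinity values in [0, 1] (hypothetical inputs).
    Bins with fewer than min_count entries in either sample yield None.
    """
    q_counts = histogram(genomic, bin_width)
    p_counts = histogram(null, bin_width)
    q_total, p_total = sum(q_counts), sum(p_counts)
    landscape = []
    for i, (q, p) in enumerate(zip(q_counts, p_counts)):
        center = (i + 0.5) * bin_width
        if q >= min_count and p >= min_count:
            value = math.log((q / q_total) / (p / p_total))
        else:
            value = None  # too few counts for a stable log ratio
        landscape.append((center, value))
    return landscape
```

Only differences between bins are meaningful here, because the additive constant of the landscape is not fixed by the log ratio.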

The effective fitness landscape shows that selection in favor of nucleosome depletion acts across a broad range of affinity values, beyond what commonly would be considered a nucleosome-free region. This implies that there is predominantly directional selection on affinity changes,

$$2N\,\Delta F \approx -c\,\Delta\omega \qquad [2]$$

with an average proportionality constant Graphic obtained from a linear fit to the function 2NF(ω) in the range Graphic. Affinity changes of Graphic are under substantial selection, i.e., they lead to fitness changes of magnitude Graphic. However, most point mutations confer smaller affinity changes and are only weakly selected. The efficacy of selection on nucleosome formation is not caused by large effects of single mutations, but by the multitude of elasticity-changing mutations in an extended sequence segment.
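The linear fit described in this paragraph can be sketched as an ordinary least-squares slope of the scaled fitness values over a chosen affinity range. The fitting range, the input points, and the function names below are hypothetical, and the slope values used in any example are illustrative rather than the values reported in the paper.

```python
def fit_slope(points, omega_min, omega_max):
    """Least-squares slope of scaled fitness versus affinity on [omega_min, omega_max].

    points: list of (affinity, scaled fitness or None); None entries are skipped.
    """
    pts = [(w, f) for w, f in points
           if f is not None and omega_min <= w <= omega_max]
    if len(pts) < 2:
        raise ValueError("need at least two points in the fitting range")
    n = len(pts)
    mean_w = sum(w for w, _ in pts) / n
    mean_f = sum(f for _, f in pts) / n
    cov = sum((w - mean_w) * (f - mean_f) for w, f in pts)
    var = sum((w - mean_w) ** 2 for w, _ in pts)
    return cov / var

def scaled_selection_coefficient(delta_omega, slope):
    """Linearized scaled fitness change for an affinity change delta_omega."""
    return slope * delta_omega
```

A negative slope corresponds to directional selection against histone binding: an affinity gain lowers fitness in proportion to its size.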

Selection on Affinity Polymorphisms.

We now show that the fitness landscape of Eq. 2 correctly predicts the frequency bias of intergenic single-nucleotide polymorphisms (SNPs) that is related to selection against nucleosome formation. From the Saccharomyces Genome Resequencing Project, we obtained the genomes of 35 Saccharomyces paradoxus isolates and their alignments (Methods). We choose this species for the analysis because it has a simpler population structure than S. cerevisiae (31). We analyze SNPs in nonoverlapping intergenic NDRs with Graphic identified on the S. paradoxus reference genome. To determine the SNP allele frequency as a function of the associated phenotypic effect, we compute the average binding affinity in the two subpopulations carrying either allele. In this way, we obtain a polarized phenotype difference Graphic, where Graphic denotes the larger and Graphic the smaller of the two subpopulation averages. Under selection against histone binding, we expect a decrease in the average frequency of the high-affinity allele, Graphic, with increasing deleterious effect. Fig. 3 shows the data points Graphic and the resulting average frequencies in bins of the affinity difference. These data permit a linear fit of Graphic as a function of Graphic,

Embedded Image [3]

with a proportionality constant Graphic. On the other hand, our fitness landscape predicts the scaled selection coefficient Graphic for each of these SNPs according to Eq. 2. Assuming approximate linkage equilibrium, the classic equilibrium allele frequency distribution Graphic then determines the expected frequency of the deleterious allele, Graphic (32) (Methods). To leading order, we obtain a linear dependence as in Eq. 3 with a predicted value Graphic. This is in good agreement with the observed value for S. paradoxus polymorphisms. Here, we treat the S. paradoxus isolates as a mixed population. Performing this analysis separately for the three major subpopulations in the sample (31), we find that population structure has only a minor influence on the signal of selection (Fig. S4).
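The polarization step described above can be sketched in a few lines of Python. For one biallelic SNP, the sketch averages the affinity over the isolates carrying each allele, takes the absolute difference as the phenotypic effect, and reports the sample frequency of the allele with the higher average affinity. The data layout (one affinity value and one allele flag per isolate) and the function names are assumptions made for illustration.

```python
def polarize_snp(affinities, carries_alt):
    """Return (affinity difference, frequency of the high-affinity allele) for one SNP.

    affinities: predicted affinity of the segment in each isolate (hypothetical input).
    carries_alt: one boolean per isolate, True if it carries the alternate allele.
    """
    alt = [a for a, c in zip(affinities, carries_alt) if c]
    ref = [a for a, c in zip(affinities, carries_alt) if not c]
    if not alt or not ref:
        raise ValueError("site is not polymorphic in this sample")
    mean_alt = sum(alt) / len(alt)
    mean_ref = sum(ref) / len(ref)
    delta_omega = abs(mean_alt - mean_ref)
    freq_alt = len(alt) / len(affinities)
    # Ties (equal means) are assigned to the reference allele, an arbitrary choice.
    freq_high = freq_alt if mean_alt > mean_ref else 1.0 - freq_alt
    return delta_omega, freq_high

def binned_mean_frequency(snps, bin_width=0.05):
    """Average high-affinity allele frequency per bin of the affinity difference."""
    sums = {}
    for delta_omega, freq_high in snps:
        b = int(delta_omega / bin_width)
        total, count = sums.get(b, (0.0, 0))
        sums[b] = (total + freq_high, count + 1)
    return {(b + 0.5) * bin_width: total / count
            for b, (total, count) in sorted(sums.items())}
```

Under selection against histone binding, the binned averages would be expected to decrease with the affinity difference, whereas neutrality predicts no trend.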

Fig. 3.

Selection on SNPs. The data points show the frequency of the high-affinity allele, Graphic, as a function of the phenotypic effect (i.e., the difference Graphic between both alleles) for SNPs in intergenic S. paradoxus NDRs with Graphic (green dots, with size indicating the number of SNPs contributing to the data point). From these data, we evaluated the effect-dependent average frequency Graphic (in Graphic-bins of size 0.05; green dots with error bars, joined by solid green line). Its approximately linear decrease follows Eq. 3 (least-squares fit, dashed green line) and shows that there is weak selection against alleles of higher affinity. The prediction from the fitness landscape Graphic (dashed red line; see text) is in good agreement with the data. The expectation under neutrality is a constant, Graphic (dashed blue line), and is inconsistent with the data.

Our polymorphism analysis establishes a quantitative inference of selection on NDRs on a microevolutionary timescale, despite the fact that individual mutations are under only weak to moderate selection. Importantly, apparent selection acting on sequence traits other than those relevant to nucleosome depletion is generally random with respect to the phenotype polarization. Therefore, the expectation value of the frequency of the deleterious allele as a function of the selection coefficient, Graphic, is affected only to a small extent by sequence conservation, say, due to the presence of transcription factor binding sites.

Conservation of Histone Binding Affinity and Equilibrium.

Our equilibrium theory of nucleosome positioning makes a definite prediction for cross-species evolution: The phenotype distribution Graphic and, hence, the number of NDRs below a given affinity threshold are conserved. Fig. 4A compares the genomic distributions Graphic for S. cerevisiae and S. paradoxus intergenic regions. These distributions indeed are strikingly similar between the two species. We can compare this conservation with simulated neutral evolution of an ensemble of sequence segments with the S. cerevisiae distribution Graphic as the initial condition (Methods and SI Text). Already over the distance between S. cerevisiae and S. paradoxus, the neutrally evolved sequences show a significant decrease in low-affinity counts, which is inconsistent with the data. For example, we obtain a conserved number of about Graphic nonoverlapping intergenic NDRs with length 100 bp and Graphic in the actual S. cerevisiae and S. paradoxus genomes. In contrast, the count of NDRs with the same characteristics drops to about 980 for simulated neutral evolution over the evolutionary distance between S. cerevisiae and S. paradoxus, and to 170 at neutral equilibrium. Similar results are obtained in a three-species comparison of S. cerevisiae, S. paradoxus, and Saccharomyces bayanus.

Fig. 4.

Cross-species evolution of histone binding affinity. (A) Distribution of histone binding affinity, Graphic, for intergenic segments of length 100 bp with Graphic in S. paradoxus (green ●) and in S. cerevisiae (purple ●, same as Fig. 2). These distributions are very similar, which is consistent with evolutionary equilibrium under selection given by the fitness landscape Graphic. In contrast, simulated neutral evolution (blue ●) already leads to a significant reduction of low-affinity counts over the same evolutionary distance, and would approach the neutral equilibrium distribution Graphic (black line, same as Fig. 2) in the long-time limit. (B) Cross-species distribution of affinity pairs Graphic for NDRs in S. cerevisiae and their aligned sequences in S. paradoxus (gray contour areas). The conditional average (green line) and standard deviation (green bars) of Graphic is plotted as a function of Graphic. We compare these data with the conditional distributions Graphic for simulated evolution in the fitness landscape Graphic and under neutrality (average, red and blue lines; standard deviation, red and blue bars). The cross-species data are consistent with evolution under directional selection against nucleosome formation. At the same time, the near-neutral standard deviation shows the variability of cross-species affinity evolution under this fitness model.

The observed cross-species conservation of affinity distribution Graphic and NDR number corroborates the assumption of evolutionary equilibrium underlying our analysis. The equilibrium state is characterized by detailed balance: Between two species, the number of genome segments increasing in affinity above a given threshold equals the number of segments decreasing below the same threshold. As we show below, this turnover describes the occupancy variability of individual NDRs between species.

To test the predictions of our fitness model for the divergence statistics of histone binding affinity, we mapped the set of intergenic NDR segments with Graphic in S. cerevisiae onto their aligned segments in S. paradoxus (Methods and SI Text). Fig. 4B shows the contour lines and binned averages of the resulting scatter plot Graphic. These pairs have lower mean affinity values in S. cerevisiae compared with S. paradoxus. This merely reflects our choice of base species (the opposite effect is observed if the alignment is constructed from a base set of S. paradoxus NDRs).

We can compare the actual process with in silico evolution under selection, using a Wright–Fisher simulation of the S. cerevisiae NDR sequences in the fitness landscape Graphic (for details, see Methods and SI Text). Fig. 4B shows the binned average and standard deviation of the resulting conditional distribution Graphic for cross-species phenotype evolution. We find both quantities to be in quantitative agreement with the observed divergence statistics between S. cerevisiae and S. paradoxus. We conclude that our fitness landscape captures selection in favor of nucleosome depletion also over longer evolutionary times.
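For orientation, the basic mechanics of a Wright–Fisher simulation can be sketched in a much reduced form: a haploid population with a single biallelic site, selection coefficient s, and no mutation. The simulations in the paper evolve full NDR sequences in the inferred landscape (Methods and SI Text), so the model below, its parameter values, and the function name are illustrative assumptions only.

```python
import random

def wright_fisher(n_pop, s, generations, x0=0.5, seed=0):
    """Final frequency of an allele with selection coefficient s after the given generations.

    Haploid Wright-Fisher model without mutation; s = 0 gives neutral drift.
    """
    rng = random.Random(seed)
    x = x0
    for _generation in range(generations):
        # Selection shifts the expected frequency of the allele.
        expected = x * (1.0 + s) / (1.0 + s * x)
        # Genetic drift: binomial resampling of n_pop individuals.
        copies = sum(1 for _individual in range(n_pop) if rng.random() < expected)
        x = copies / n_pop
    return x
```

Running many replicates with s > 0 and with s = 0 gives the two ensembles, evolution under selection and under neutrality, whose averages and standard deviations can then be compared.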

We also can compare the cross-species data to simulations of neutral evolution. Across the whole range of affinity values on S. cerevisiae NDRs, neutral evolution leads to an average affinity gain—i.e., an average loss of NDR function—that is inconsistent with the observed process. At the same time, the standard deviation of the cross-species affinity change is similar to the neutral value; i.e., the fitness landscape does not strongly constrain phenotype variability. This is in accordance with previous findings showing a high variance across loci in the divergence of both NDR occupancy and A:T enrichment (3).

Discussion

We have inferred a phenotype-fitness map Graphic for yeast intergenic sequence segments, which measures selection depending on histone binding affinity and regulatory site content (Fig. 1B). This map offers a quantitative solution to the chicken-and-egg problem posed in the introduction: Can we rank nucleosome positioning and transcriptional regulation with respect to their selective effects on intergenic sequence? As shown in Fig. 1B, fitness has a genuinely two-dimensional phenotype target: there are two chickens. Histone binding and transcription factor binding are separable primary modes of the evolution of intergenic DNA, subject to direct selection of comparable strength. The selection on histone binding spans an extended set of nucleosome-depleted intergenic segments, which have affinity values up to above 50%. This result contrasts with the merely passive role of DNA methylation that has been inferred from cell-type specific variations of the methylation pattern in human and mouse (33, 34).

Direct selection on nucleosome affinity has an important biological consequence. It establishes a set of nucleosome-depleted regions that are earmarked for interactions with transcription factors. The reduced nucleosome affinity not only increases the equilibrium coverage with transcription factors, but also may speed up the search kinetics of factor molecules toward their binding sites. Because these effects are largely independent of the actual coverage with binding sites, they facilitate binding site turnover and the adaptive formation of new sites. At the same time, the directional selection against histone binding given by our fitness landscape does not favor a specific affinity value, which is consistent with the observed cross-species variability of the affinity phenotype. This may suggest a two-tier model of selection on nucleosome-depleted intergenic regions: Elasticity-mediated directional selection broadly reduces nucleosome coverage, whereas balancing selection jointly tunes nucleosome and transcription factor coverage to gene-specific values.

The phenotypes used in this paper, histone binding affinity and regulatory site content, are distilled from the underlying cellular biophysics. A phenotype-based inference of selection is particularly relevant for histone binding, a quantitative trait that has extended (>100 bp) sequence targets with small phenotypic effects of individual mutations. Only by mapping nucleotide changes onto an affinity phenotype can we infer substantial aggregate selection against nucleosome formation. However, given the complexity of the molecular machinery of transcriptional regulation and chromatin organization, our analysis in terms of just two phenotypes is necessarily incomplete. For example, histone binding in vivo is expected to depend on additional sequence features besides our elasticity-mediated binding phenotype (10). Integrating additional phenotypes into the inference of selection leads to a higher-dimensional fitness landscape, which can be analyzed for its principal directions of selection. The projection on the two phenotypes used in this paper likely will lead to an underestimate but will not generate a spurious signal of selection. A more comprehensive analysis can also address fitness interactions or interference selection; our results suggest an avenue to infer these effects by a phenotype-based approach.

From a broader perspective, this paper is a case study analyzing quantitative traits that are encoded in overlapping sequence and represent coupled molecular functions. This scenario is at some distance from idealized models of population genetics and quantitative genetics but probably is typical—at least in the densely packed genomes of prokaryotes and unicellular eukaryotes. We have shown that a joint phenotype-fitness map can disentangle selective effects on such functions, i.e., distinguish direct from apparent selection. We expect this method to be applicable to a broader class of complex molecular functions, for which we can measure or infer at least some key phenotypes.

Methods

Histone Binding Affinity.

The biophysical model for histone binding underlying our analysis follows (20, 28). This model defines a histone-binding free energy landscape Graphic as a function of the 5′ genomic coordinate r of a nucleosome. The free energy of a DNA sequence segment Graphic is given by

Embedded Image

where Graphic denote trinucleotide subsegments; Graphic are the roll, twist, and tilt deformations in the nucleosome state, Graphic are the intrinsic deformations in the unbound state (35), Graphic denotes the corresponding elastic constants, and we use a core binding length Graphic bp (28). The statistics of nucleosome positioning is then given by standard equilibrium thermodynamics. It may be derived from the grand canonical partition function

Embedded Image

with the no-overlap constraint Graphic. The partition function depends on the temperature via Graphic and on the chemical potential η, which are adjusted to in vivo conditions. This determines the expected single-nucleotide nucleosome occupancies (36),

Embedded Image

and the expected mean occupancy

Embedded Image

over sequence segments of length Graphic. The dependence of ω on local binding energies and on the chemical potential is shown in Fig. S1.
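The equilibrium statistics of non-overlapping nucleosomes on a one-dimensional energy landscape can be computed by a standard forward-backward recursion over the grand canonical partition function. The Python sketch below assumes a Boltzmann weight exp[β(η − E(r))] for a nucleosome starting at position r and a fixed footprint; these conventions, the function names, and the absence of log-space arithmetic (needed for long sequences) are assumptions of this sketch, not details taken from references (20, 28).

```python
import math

def nucleosome_occupancy(energies, footprint, mu, beta=1.0):
    """Start probabilities and per-base occupancy for non-overlapping nucleosomes.

    energies[r]: binding free energy of a nucleosome whose 5' end is at position r;
    each nucleosome covers `footprint` consecutive bases; mu is the chemical potential.
    """
    n_starts = len(energies)
    length = n_starts + footprint - 1
    # Boltzmann weight of one nucleosome at each start position.
    weights = [math.exp(beta * (mu - e)) for e in energies]

    # forward[i]: partition sum over configurations confined to bases 0..i-1.
    forward = [1.0] + [0.0] * length
    for i in range(1, length + 1):
        forward[i] = forward[i - 1]
        if i >= footprint:
            forward[i] += forward[i - footprint] * weights[i - footprint]

    # backward[i]: partition sum over configurations confined to bases i..length-1.
    backward = [0.0] * length + [1.0]
    for i in range(length - 1, -1, -1):
        backward[i] = backward[i + 1]
        if i + footprint <= length:
            backward[i] += weights[i] * backward[i + footprint]

    z_total = forward[length]
    p_start = [forward[r] * weights[r] * backward[r + footprint] / z_total
               for r in range(n_starts)]

    # A base is covered if a nucleosome starts at most footprint - 1 bases upstream.
    occupancy = []
    for pos in range(length):
        first = max(pos - footprint + 1, 0)
        last = min(pos, n_starts - 1)
        occupancy.append(sum(p_start[first:last + 1]))
    return p_start, occupancy

def mean_occupancy(occupancy, start, segment_length):
    """Mean occupancy over one segment, i.e., the affinity phenotype of that segment."""
    window = occupancy[start:start + segment_length]
    return sum(window) / len(window)
```

In this sketch, raising the chemical potential increases the mean occupancy and raising the local energies decreases it, in line with the qualitative dependence described for Fig. S1.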

Data Analysis.

We used genomic sequences and their alignments from the University of California, Santa Cruz (UCSC) Genome Browser (sacCer3) for the interspecific analysis of S. cerevisiae and S. paradoxus. Up to a threshold, insertions and deletions were corrected to exclude alignment uncertainties. This procedure did not affect our cross-species analysis (for details, see SI Text and Fig. S5). The resulting total sequence length was [value] bp, with [value] bp in intergenic regions (37). The second dataset, obtained from the Saccharomyces Genome Resequencing Project, contains aligned genomes of 35 S. paradoxus strains, including SNPs. This dataset has a well-separable substructure (31). To control for demographic effects, we partitioned this dataset into three groups (European, Far Eastern, and American). We obtained annotated transcription factor binding sites on S. cerevisiae from the SwissRegulon Portal (February 2012) (22). Only nonoverlapping binding sites with a posterior probability >0.5 were used. To identify low-occupancy regions predicted by our affinity model, we constructed a tiling of the genome into nonoverlapping segments of fixed length [value] bp, using a dynamic programming algorithm with an upper bound of 0.95 on the predicted mean nucleosome occupancy ω in each individual segment. Experimental in vivo nucleosome occupancy scores for S. cerevisiae were obtained from the Gene Expression Omnibus database (accession series GSE22211) (3) and processed to reduce the effects of measurement uncertainties (SI Text).
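The tiling step can be sketched as follows. The text specifies only fixed-length, nonoverlapping segments, a dynamic programming algorithm, and the bound of 0.95 on mean occupancy; the objective used here (placing as many qualifying segments as possible) and the function name are assumptions of this example, and the segment length is left as a free parameter.

```python
def tile_low_occupancy(occ, seg_len, max_mean=0.95):
    """Choose nonoverlapping windows of length seg_len with mean occupancy
    <= max_mean, maximizing the number of windows (assumed objective)."""
    n = len(occ)
    # Prefix sums give each window mean in constant time.
    prefix = [0.0]
    for v in occ:
        prefix.append(prefix[-1] + v)

    best = [0] * (n + 1)       # best[i]: max windows within occ[:i]
    take = [False] * (n + 1)   # take[i]: a window ends exactly at i
    for i in range(1, n + 1):
        best[i] = best[i - 1]
        if i >= seg_len:
            mean = (prefix[i] - prefix[i - seg_len]) / seg_len
            if mean <= max_mean and best[i - seg_len] + 1 > best[i]:
                best[i] = best[i - seg_len] + 1
                take[i] = True

    # Trace back the chosen windows as half-open intervals (start, end).
    segments, i = [], n
    while i > 0:
        if take[i]:
            segments.append((i - seg_len, i))
            i -= seg_len
        else:
            i -= 1
    segments.reverse()
    return segments

if __name__ == "__main__":
    # Hypothetical occupancy profile with one depleted stretch.
    profile = [1.0] * 4 + [0.1] * 6 + [1.0] * 4
    print(tile_low_occupancy(profile, seg_len=3, max_mean=0.5))
```

Other objectives, such as minimizing the summed occupancy of the chosen segments, fit the same recursion by changing the quantity stored in `best`.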

Polymorphism Statistics.

To predict the expected deleterious allele frequency given by the fitness landscape, we use the equilibrium allele frequency spectrum for a two-allele locus, [formula], where [symbol] is the scaled selection coefficient, [symbol] is the scaled neutral mutation rate, and [symbol] is a normalization factor. From this distribution, we determine the allele frequency spectrum for polymorphic loci, [formula], in a set of m isolates by binomial sampling [formula]. This distribution produces an average frequency of the deleterious allele, [formula], with a proportionality constant [formula] (for [formula]).
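The sampling step can be illustrated numerically. The sketch below assumes the standard two-allele equilibrium form Q(x) ∝ [x(1 − x)]^(μ−1) exp(σx), with σ the scaled selection coefficient of the tracked allele (negative if deleterious) and μ the scaled mutation rate; this parametrization, the function names, and the example values are assumptions and may differ from the conventions used in the paper. Conditioning on polymorphic samples removes the normalization factor, so only relative weights are needed.

```python
import math

def polymorphic_spectrum(sigma, mu, m, grid=2000):
    """P(k copies of the tracked allele among m isolates | 0 < k < m).

    Integrates binomial sampling against the assumed equilibrium density
    [x(1-x)]**(mu-1) * exp(sigma*x) with a midpoint rule.
    """
    xs = [(i + 0.5) / grid for i in range(grid)]
    weights = []
    for k in range(1, m):
        total = 0.0
        for x in xs:
            # Density times x**k * (1-x)**(m-k), exponents combined.
            total += (x ** (k + mu - 1)
                      * (1 - x) ** (m - k + mu - 1)
                      * math.exp(sigma * x))
        weights.append(math.comb(m, k) * total / grid)
    norm = sum(weights)
    return {k: wk / norm for k, wk in zip(range(1, m), weights)}

def mean_frequency(spectrum, m):
    """Average sample frequency k/m of the tracked allele."""
    return sum(k / m * p for k, p in spectrum.items())

if __name__ == "__main__":
    m = 10  # hypothetical number of isolates
    for sigma in (0.0, -2.0, -5.0):
        spec = polymorphic_spectrum(sigma, mu=0.01, m=m)
        print(sigma, round(mean_frequency(spec, m), 3))
```

In the neutral case the spectrum is symmetric and the mean frequency is 1/2; negative σ shifts the weight toward low frequencies of the tracked allele.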

Modeling Sequence Evolution.

We use a Wright–Fisher simulation for a population of NDR sequences evolving under mutations, genetic drift, and selection given by the fitness landscape [symbol]. The evolutionary time for simulation of the cross-species evolution is chosen so that the average sequence divergence in the set of predicted NDRs equals the observed value of 13%. Simulations of neutral evolution use the same model, but without selection. More details are given in SI Text and Fig. S6.
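A generic Wright–Fisher loop of this kind can be sketched as follows. This is not the simulation code used in this paper: the fitness function below is a hypothetical stand-in for the fitness landscape, reproduction weights are taken as exp(fitness) by assumption, and the population size, mutation rate, and sequence are illustrative.

```python
import math
import random

ALPHABET = "ACGT"

def mutate(seq, u, rng):
    """Replace each base by a different one with probability u."""
    out = []
    for base in seq:
        if rng.random() < u:
            base = rng.choice([b for b in ALPHABET if b != base])
        out.append(base)
    return "".join(out)

def wright_fisher(population, fitness, u, generations, rng):
    """Evolve a fixed-size haploid population: mutation, then resampling
    with weights exp(fitness) (selection plus genetic drift)."""
    pop = list(population)
    for _ in range(generations):
        pop = [mutate(s, u, rng) for s in pop]
        weights = [math.exp(fitness(s)) for s in pop]
        pop = rng.choices(pop, weights=weights, k=len(pop))
    return pop

def divergence(a, b):
    """Fraction of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def neutral_fitness(seq):
    """Neutral evolution: same model without selection."""
    return 0.0

def toy_fitness(seq):
    """Hypothetical landscape favoring A/T content (illustration only)."""
    return 0.05 * (seq.count("A") + seq.count("T"))

if __name__ == "__main__":
    rng = random.Random(0)
    ancestor = "ACGTACGTACGTACGTACGT"
    final = wright_fisher([ancestor] * 50, toy_fitness, 0.002, 200, rng)
    mean_div = sum(divergence(ancestor, s) for s in final) / len(final)
    print(round(mean_div, 3))
```

To match a target divergence such as the 13% quoted above, the number of generations can be increased until the mean divergence from the ancestor reaches that value.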

Acknowledgments

We thank Alain Arneodo for providing the sequence-based algorithm to compute histone binding energies and nucleosome occupancy (28) and Stephan Schiffels for kindly making his Wright–Fisher evolution model algorithm available. We also are grateful for stimulating discussions with Ville Mustonen. This work was supported by Deutsche Forschungsgemeinschaft Grant SFB 680, by German Federal Ministry of Education and Research Grant 0315893-Sybacol, and in part by the National Science Foundation (NSF) under Grant NSF PHY05-51164 during a visit to the Kavli Institute for Theoretical Physics (Santa Barbara, CA).

Footnotes

  • ¹To whom correspondence should be addressed. E-mail: mlaessig@uni-koeln.de.
  • Author contributions: D.W. and M.L. performed research, analyzed data, and wrote the paper.
  • The authors declare no conflict of interest.
  • This article is a PNAS Direct Submission.
  • This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1210887110/-/DCSupplemental.

References

  1. Lee W, et al. (2007) A high-resolution atlas of nucleosome occupancy in yeast. Nat Genet 39(10):1235–1244.
  2. Bai L, Morozov AV (2010) Gene regulation by nucleosome positioning. Trends Genet 26(11):476–483.
  3. Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ (2010) The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol 8(7):e1000414.
  4. Luger K, Mäder AW, Richmond RK, Sargent DF, Richmond TJ (1997) Crystal structure of the nucleosome core particle at 2.8 Å resolution. Nature 389(6648):251–260.
  5. Field Y, et al. (2009) Gene expression divergence in yeast is coupled to evolution of DNA-encoded nucleosome organization. Nat Genet 41(4):438–445.
  6. Shivaswamy S, et al. (2008) Dynamic remodeling of individual nucleosomes across a eukaryotic genome in response to transcriptional perturbation. PLoS Biol 6(3):e65.
  7. Jiang C, Pugh BF (2009) Nucleosome positioning and gene regulation: Advances through genomics. Nat Rev Genet 10(3):161–172.
  8. Radman-Livaja M, Rando OJ (2010) Nucleosome positioning: How is it established, and why does it matter? Dev Biol 339(2):258–266.
  9. Swamy KBS, Chu W-Y, Wang C-Y, Tsai H-K, Wang D (2011) Evidence of association between nucleosome occupancy and the evolution of transcription factor binding sites in yeast. BMC Evol Biol 11:150.
  10. Segal E, et al. (2006) A genomic code for nucleosome positioning. Nature 442(7104):772–778.
  11. Thåström A, et al. (1999) Sequence motifs and free energies of selected natural and non-natural nucleosome positioning DNA sequences. J Mol Biol 288(2):213–229.
  12. Widom J (2001) Role of DNA sequence in nucleosome stability and dynamics. Q Rev Biophys 34(3):269–324.
  13. Segal E, Widom J (2009) Poly(dA:dT) tracts: Major determinants of nucleosome organization. Curr Opin Struct Biol 19(1):65–71.
  14. Schones DE, et al. (2008) Dynamic regulation of nucleosome positioning in the human genome. Cell 132(5):887–898.
  15. Yuan G-C, et al. (2005) Genome-scale identification of nucleosome positions in S. cerevisiae. Science 309(5734):626–630.
  16. Cairns BR (2009) The logic of chromatin architecture and remodelling at promoters. Nature 461(7261):193–198.
  17. Whitehouse I, Rando OJ, Delrow J, Tsukiyama T (2007) Chromatin remodelling at promoters suppresses antisense transcription. Nature 450(7172):1031–1035.
  18. Kornberg RD, Stryer L (1988) Statistical distributions of nucleosomes: Nonrandom locations by a stochastic mechanism. Nucleic Acids Res 16(14A):6677–6690.
  19. Mavrich TN, et al. (2008) A barrier nucleosome model for statistical positioning of nucleosomes throughout the yeast genome. Genome Res 18(7):1073–1083.
  20. Milani P, et al. (2009) Nucleosome positioning by genomic excluding-energy barriers. Proc Natl Acad Sci USA 106(52):22257–22262.
  21. Möbius W, Gerland U (2010) Quantitative test of the barrier nucleosome model for statistical positioning of nucleosomes up- and downstream of transcription start sites. PLoS Comput Biol 6(8):e1000891.
  22. van Nimwegen E (2007) Finding regulatory elements and regulatory motifs: A general probabilistic framework. BMC Bioinformatics 8(Suppl 6):S4.
  23. Tirosh I, Sigal N, Barkai N (2010) Divergence of nucleosome positioning between two closely related yeast species: Genetic basis and functional consequences. Mol Syst Biol 6:365.
  24. Warnecke T, Batada NN, Hurst LD (2008) The impact of the nucleosome code on protein-coding sequence evolution in yeast. PLoS Genet 4(11):e1000250.
  25. Washietl S, Machné R, Goldman N (2008) Evolutionary footprints of nucleosome positions in yeast. Trends Genet 24(12):583–587.
  26. Kenigsberg E, Bar A, Segal E, Tanay A (2010) Widespread compensatory evolution conserves DNA-encoded nucleosome organization in yeast. PLoS Comput Biol 6(12):e1001039.
  27. Prendergast JG, Semple CA (2011) Widespread signatures of recent selection linked to nucleosome positioning in the human lineage. Genome Res 21(11):1777–1787.
  28. Vaillant C, Audit B, Arneodo A (2007) Experiments confirm the influence of genome long-range correlations on nucleosome positioning. Phys Rev Lett 99(21):218103.
  29. Berg J, Willmann S, Lässig M (2004) Adaptive evolution of transcription factor binding sites. BMC Evol Biol 4:42.
  30. Mustonen V, Kinney J, Callan CG Jr., Lässig M (2008) Energy-dependent fitness: A quantitative model for the evolution of yeast transcription factor binding sites. Proc Natl Acad Sci USA 105(34):12376–12381.
  31. Liti G, et al. (2009) Population genomics of domestic and wild yeasts. Nature 458(7236):337–341.
  32. Wright S (1937) The distribution of gene frequencies in populations. Proc Natl Acad Sci USA 23(6):307–320.
  33. Thurman RE, et al. (2012) The accessible chromatin landscape of the human genome. Nature 489(7414):75–82.
  34. Stadler M, et al. (2011) DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature 480(7378):490–495.
  35. Goodsell DS, Dickerson RE (1994) Bending and curvature calculations in B-DNA. Nucleic Acids Res 22(24):5497–5503.
  36. Percus JK (1976) Equilibrium state of a classical fluid of hard rods in an external field. J Stat Phys 15(6):505–511.
  37. Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES (2003) Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423(6937):241–254.

Copyright © 2021 National Academy of Sciences. Online ISSN 1091-6490