Skip to main content

Main menu

  • Home
  • Articles
    • Current
    • Special Feature Articles - Most Recent
    • Special Features
    • Colloquia
    • Collected Articles
    • PNAS Classics
    • List of Issues
  • Front Matter
    • Front Matter Portal
    • Journal Club
  • News
    • For the Press
    • This Week In PNAS
    • PNAS in the News
  • Podcasts
  • Authors
    • Information for Authors
    • Editorial and Journal Policies
    • Submission Procedures
    • Fees and Licenses
  • Submit
  • Submit
  • About
    • Editorial Board
    • PNAS Staff
    • FAQ
    • Accessibility Statement
    • Rights and Permissions
    • Site Map
  • Contact
  • Journal Club
  • Subscribe
    • Subscription Rates
    • Subscriptions FAQ
    • Open Access
    • Recommend PNAS to Your Librarian

User menu

  • Log in
  • My Cart

Search

  • Advanced search
Home
Home
  • Log in
  • My Cart

Advanced Search

  • Home
  • Articles
    • Current
    • Special Feature Articles - Most Recent
    • Special Features
    • Colloquia
    • Collected Articles
    • PNAS Classics
    • List of Issues
  • Front Matter
    • Front Matter Portal
    • Journal Club
  • News
    • For the Press
    • This Week In PNAS
    • PNAS in the News
  • Podcasts
  • Authors
    • Information for Authors
    • Editorial and Journal Policies
    • Submission Procedures
    • Fees and Licenses
  • Submit
Research Article

Stem cell differentiation as a many-body problem

Bin Zhang and Peter G. Wolynes
  1. Departments of aChemistry and
  2. cPhysics and Astronomy, and
  3. bCenter for Theoretical Biological Physics, Rice University, Houston, TX 77005

See allHide authors and affiliations

PNAS July 15, 2014 111 (28) 10185-10190; first published June 19, 2014; https://doi.org/10.1073/pnas.1408561111
Bin Zhang
Departments of aChemistry and
bCenter for Theoretical Biological Physics, Rice University, Houston, TX 77005
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Peter G. Wolynes
Departments of aChemistry and
bCenter for Theoretical Biological Physics, Rice University, Houston, TX 77005
cPhysics and Astronomy, and
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • For correspondence: pwolynes@rice.edu
  1. Contributed by Peter G. Wolynes, May 9, 2014 (sent for review March 25, 2014)

  • Article
  • Figures & SI
  • Info & Metrics
  • PDF
Loading

Significance

A molecular understanding of stem cell differentiation requires the study of gene regulatory network dynamics that includes the statistics of synthesizing transcription factors, their degradation, and their binding to the DNA. Brute force simulation for complex large realistic networks can be computationally challenging. Here we develop a sound approximation method built upon theories originating from quantum statistical mechanics to study the network for embryonic stem cell development. Mechanistic insight is provided on the role of the master regulator Nanog in safeguarding stem cell pluripotency. We also demonstrate the significant influence of DNA-binding kinetics, an aspect that often has been overlooked, on the transition rate between gene network steady states.

Abstract

Stem cell differentiation has been viewed as coming from transitions between attractors on an epigenetic landscape that governs the dynamics of a regulatory network involving many genes. Rigorous definition of such a landscape is made possible by the realization that gene regulation is stochastic, owing to the small copy number of the transcription factors that regulate gene expression and because of the single-molecule nature of the gene itself. We develop an approximation that allows the quantitative construction of the epigenetic landscape for large realistic model networks. Applying this approach to the network for embryonic stem cell development explains many experimental observations, including the heterogeneous distribution of the transcription factor Nanog and its role in safeguarding the stem cell pluripotency, which can be understood by finding stable steady-state attractors and the most probable transition paths between those attractors. We also demonstrate that the switching rate between attractors can be significantly influenced by the gene expression noise arising from the fluctuations of DNA occupancy when binding to a specific DNA site is slow.

  • gene network
  • most probable path
  • master equation

Understanding the underlying mechanisms of the differentiation of stem cells into many distinct cell types has long been a goal of developmental biology (1) and regenerative medicine (2). The epigenetic landscape, originally proposed by Waddington, has proved to be a useful metaphor for visualizing cellular dynamics (3). However, is it more than a metaphor? In this view, cell phenotypes are identified as attractors with well-defined patterns of robust gene expression, and differentiation occurs through transitions from the stem cell attractor to other attractors for the differentiated cells. It has become clear that for simple models, taking into account stochastic effects allows a well-defined landscape to be constructed (4⇓–6). This generalized potential landscape governs much of the gene regulatory network dynamics, such that corrections to such a landscape picture can also be defined and formalized (7, 8). It remains a challenge to construct such landscapes for realistically large network models.

Describing the stochastic dynamics of gene networks must include the statistics of synthesizing transcription factors, their degradation, and their binding to genes on the DNA. These all play crucial roles in shaping the epigenetic landscape (9). Due to the complexity of the assembly of the machinery involved for protein synthesis, it has often been assumed that protein translation is much slower than the diffusion-limited DNA-binding process. Under this view, the ensemble of DNA occupancies, i.e., the set of transcription factors bound at a gene’s regulatory elements, can be assumed to have achieved a quasi-equilibrium so that the expression dynamics of this gene can then be described as a birth–death process governed by an averaged protein production rate that depends nonlinearly on transcription factor concentrations (10). Although perhaps valid for some networks in prokaryotic cells, this assumption on timescales is probably not completely adequate for eukaryotic systems because the processes of chromatin decondensation and unwrapping of DNA from nucleosomes, both of which are required for proteins to bind, take time. The high-level architecture of chromatin can severely limit the access of transcription factors to the DNA, slowing down DNA-binding kinetics (11, 12) and resulting in gene expression noise deviating from Poisson statistics (13). Even when the transcription factors function as pioneers that can directly bind with the chromatin sites occupied by the nucleosome, slow DNA binding (14, 15) is still a good approximation to describe the effect of the progressive change of the chromatin structure and histone modification induced by the pioneer factors on gene regulation (16). As a result, DNA binding must be treated on equal footing together with protein synthesis and degradation to fully understand eukaryotic gene regulation (14⇓⇓⇓–18).

By increasing the dimensionality of the problem, investigating the effects arising from slow DNA-binding kinetics on gene network dynamics becomes computationally challenging. Although being in some aspects conceptually transparent, where one can directly simulate the chemical reactions involved in a gene network using a Monte Carlo algorithm (19), such a procedure quickly becomes inefficient as the system complexity increases and is even impractical for studying rare, transient switching events between steady states that occur on long timescales. In recognizing the close analogy between gene networks and magnetic systems, Sasai and Wolynes suggested that analytical approaches originating from quantum statistical mechanics could be used to study the epigenetic landscape of networks and discussed the steady states of a very stylized model network with many attractors (4). Although their approach was efficient in identifying steady-state solutions, it remained to show how this approach can characterize transitions between different attractors. In the adiabatic limit where DNA binding is fast, analytical approaches to the transition process based on large deviation theory have proved successful in studying noise-induced transitions (7, 20⇓–22). Here we show how to find equivalent approaches for large networks where DNA binding must be treated explicitly.

In this paper, we generalize a kinetic model originally proposed by Sasai and Wolynes (4) to explicitly model DNA binding along with protein synthesis and degradation in large gene networks with multifactorial and complex switches. An approximation method is further developed that allows the construction of the network’s epigenetic landscape via the determination of not only steady-state solutions, but also most probable transition paths and stochastic switching rates between steady states. When applied to a model of the core transcriptional regulatory network of embryonic stem cells, these approximations turn out to explain the observed fluctuations in the expression of the master pluripotent regulator Nanog and the important role it has in safeguarding stem cell pluripotency against differentiation. We also demonstrate the striking effect that DNA-binding kinetics have on the switching rate between steady states, making clear the importance of explicitly modeling fluctuations of DNA occupancy in studying the fate of gene networks.

Theoretical Approaches for Gene Networks

Viewed as a molecular process, gene regulation is complicated, with many actors as well as attractors, and spans a large spatial and temporal domain. Several approximations are needed to make tractable mathematical models. We use an admittedly simplified model but one that contains many of the crucial features of the real embryonic stem cell network.

Constructing Schematic Kinetic Models for Gene Networks.

Following Sasai and Wolynes (4), we model gene regulation at a level that includes explicit binding and unbinding of transcription factors to the DNA, protein translation, and protein degradation. Both the mRNA intermediary and the serial nature of macromolecular synthesis are neglected in the present model but are likely important. Fig. 1A illustrates how we can diagram a mutual repression motif between genes Cdx2 and Oct4. The gene Cdx2 can be transcribed to proteins at a rate g, which will change depending on whether the transcription factors produced by the Oct4 gene are bound at the Cdx2 gene’s regulatory element. Here, we assume Oct4 proteins will interact with the Cdx2 regulatory element in the form of a homodimer and that they will bind with the regulatory element at a rate h and unbind with a rate f. Similarly, the Cdx2 proteins can also interact with the Oct4 gene’s regulatory element and regulate the production rate of Oct4 proteins. Proteins in our model are also degraded with a rate k. We do not explicitly model nucleosome dynamics and covalent chromatin modifications such as DNA methylation and histone acetylation and methylation and assume these affect gene expression only via regulating the kinetics of transcription factor binding to the DNA. Accounting for such epigenetic modifications may be necessary to fully appreciate the complexity of eukaryotic gene regulation, as emphasized by Sasai et al. (16). Simplified as arrows, this mutual repression motif can be represented as the diagram shown in Fig. 1A, Inset. Including interactions with other genes, a relatively realistic network for the embryonic stem cell can be diagrammed as in Fig. 1B. We emphasize that every link in Fig. 1B includes the series of reactions illustrated in Fig. 1A.

Fig. 1.
  • Download figure
  • Open in new tab
  • Download powerpoint
Fig. 1.

A schematic gene regulatory network for the embryonic stem cell. (A) Schematics for the detailed chemical reactions included in modeling the mutual repression between genes Oct4 and Cdx2. Inset simplifies the reactions into a diagram that is useful for illustrating the topology of complex networks. (B) Topology of the stem cell gene network. Each node represents either a transcription factor (single word) or a protein complex (two words) involved in early stem cell differentiation. Arrows that end at transcription factors represent transcriptional regulations with the mechanism shown in A. The green color is used for activation and the red color indicates repression. Arrows targeting protein complexes emerge from the monomers involved in forming the complex. Gene markers found in cells being in pluripotent states are colored in red and orange, and those found for differentiated state phenotypes are colored in blue.

Obviously many quantitative molecular details of even this simplified scheme are still unknown. In building a kinetic model for the gene network, further approximations are thus necessary and worthy of clarification. The resulting model still can thus be deemed only “stylized.” First, we assume all of the transcription factors, except the Oct4–Sox2 protein complex, will interact with the DNA sequence in the form of homodimers. Although many transcription factors are indeed found to function as dimers (23), it is unlikely that this is true for all of the proteins included in our current network. Second, many genes in the network are regulated by a multiplicity of transcription factors. Again, for simplicity and generality, we assume all of the different transcription factors occupy unique, nonoverlapping regulatory elements. Finally, because many of the kinetic parameters are not available at this point, all of the genes in the network are assumed to function with the same set of rates. We see then this is something like an “Ising model” for gene networks. Despite these simplifications, the significance of having such a specific yet stylized model cannot be overemphasized. As shown below, many concrete conclusions that appear insensitive to these simplifications can be drawn due to the robustness of the underlying gene network.

Master Equation for Network Dynamics.

As shown in ref. 4, to fully specify a gene’s state, knowledge of both the occupancy of the DNA site and the concentration of the proteins is required. For a gene that is regulated by only one transcription factor, the joint probability of the DNA occupancy and the number of proteins n can be conveniently written as a two-component column vector P(n,t)=(P1(n,t),P2(n,t))T, with T as the transpose operator. The subscripts 1 and 2 refer to the two DNA occupancy states encoding the gene’s regulatory element being either empty or bound with the transcription factor, respectively. When the gene is regulated via a multiplicity of transcription factors as is the case for most realistic large networks, describing the DNA occupancy states also requires more dimensions. Due to the assumed independent binding at the gene’s regulatory elements, there will be a total of N=2NTF numbers of DNA occupancy states for a gene regulated by NTF different transcription factors. Generalizing from the two-state case, the joint probability for the state of such a gene can be represented with an N-component column vector P(n,t)=(P1(n,t),P2(n,t),⋯,PN(n,t))T. For a network consisting of M genes, we use the self-consistent proteomic field approximation (24) to construct the probability function of the entire network as a simple product for its component individual genes P(n1,⋯,nM;t)=P1(n1,t)⊗P2(n2,t)⋯⊗PM(nM,t).

Including the chemical reactions illustrated in Fig. 1A, the self-consistent field regulatory dynamics for a given gene m can be described via the following master equation:∂∂tPm(n,t)=G{Pm(n−1,t)−Pm(n,t)}+ K{(n+1)Pm(n+1,t)−nPm(n,t)}+ WPm(n,t).[1]G and K are diagonal matrices with the diagonal elements Gjj and Kjj corresponding to the protein synthesis and degradation rate at the DNA occupancy state j. W is the transition rate matrix for exchanging probability among DNA occupancy states. Detailed definitions for the rate matrices are provided in SI Text.

Steady States and Most Probable Transition Paths.

We follow the approach of Doi (25) and Peliti (26) to rewrite the master equation in an operator form. Defining creation and annihilation operators such that a†|n〉=|n+1〉 and a|n〉=n|n−1〉 and the state vector |ψ(t)〉=∑nP(n,t)|n〉, the master equation is written as∂t|ψ(t)〉=Ω|ψ(t)〉.[2]Ω is thus a non-Hermitian “Hamiltonian” operator defined in terms of a† and a, whose explicit form is provided in SI Text.

From Eq. 2, the transition probability P(nf,τ|ni,0) of finding the protein concentrations nf=(nf1,⋯,nfM) at a final time t=τ from ni=(ni1,⋯,niM) at t=0 can be written as P(nf,τ|ni,0)=〈nf|exp(Ωτ)|ni〉. Next, following Zhang et al. (27), using a resolution of the identity generalized from the coherent state basis set, the transition probability can be further represented in a path integral form involving an action that follows from the master equationP(nf,τ|ni,0)∝∫∏mDxm∏mDcm×exp(−∫dt∑m[pmq˙m−ℋ(qm,pm)]),[3]where qm=(c1x1,⋯,cNxN,(cN−c1)/2,⋯,(cN−cN−1)/2) and pm=(p1x,⋯,pNx,p1c,⋯,pN−1c). Here cj refers to the probability of finding the gene m in the DNA occupancy state j with an average protein number xj and pjc and pjx are the corresponding conjugate momenta. The separability of the Hamiltonian ℋ(qm,pm) over the set of genes is a consequence of the self-consistent proteomic field approximation.

The steady states of the master equation can be found as attractors of the following deterministic dynamics that would extremize the action:dqmdt=∂ℋ∂pm|pm=0.[4]From the steady-state solutions, the average level of protein expression xm and the probability distribution Pm(n) for the gene m can be calculated asxm=∑j=1Ncjxj, Pm(n)=∑j=1Ncj(xj)ne−xjn!.[5]

The most probable transition paths between steady states are determined from the Hamiltonian equations (28)dqmdt=∂ℋ∂pm, dpmdt=−∂ℋ∂qm,[6]subject to the two-ended boundary conditions qm(0)=qmst0 and qm(τ)=qmst1. qmst0 and qmst1 are the two steady-state solutions of interest. Many numerical schemes have been proposed to solve such two-sided first-order equations (21, 29, 30), and here we use the geometric minimum action method (gMAM) (29) because of its robustness and computational efficiency. From the most probable transition path, we can estimate the transition rate k∝exp[−S] (31), with the transition action S defined asS=∫dt∑mpmq˙m.[7]A demonstration of the accuracy of the developed approximation method in predicting the transition rate between steady states is provided in Fig. S1.

To highlight the effects of DNA-binding kinetics, i.e., the rate of fluctuations of DNA occupancy, on the epigenetic landscape, it is often convenient to compare results obtained in the adiabatic limit, in which DNA binding is considered to be fast, with the full results obtained directly from Eqs. 4 and 6. In the adiabatic limit, the state probabilities ci are approximated with an equilibrium partition, and simplified versions of Eqs. 4 and 6 can be derived as shown in SI Text.

Cell Types as Steady States of Gene Networks

Realistic models of the core transcriptional regulatory network for the embryonic stem cell have already been constructed by others (16, 32, 33). We illustrate the network used in this study in Fig. 1B. This core model consists of eight unique transcription factors and their genes interconnected via transcriptional activation (green) and repression (red) inputs. The triad of master pluripotency regulators, Oct4, Sox2, and Nanog, is highlighted in red. These regulators maintain stem cell pluripotency by activating a set of genes crucial for self-renewal and pluripotency (orange) while simultaneously suppressing differentiation genes (blue). Using the kinetic parameters given in SI Text, we determine the steady-state solutions of such a network and compare them with the gene expression patterns of known cell phenotypes.

Fig. 2A presents the steady-state gene expression profiles calculated in the adiabatic limit and labels these profiles using abbreviations of corresponding cell types, with SC standing for stem cell, PE for primitive endoderm, TE for trophectoderm, and DC for differentiated cell type. For the two steady states SC1 and SC2 found in the bottom of Fig. 2A, proteins for the pluripotency gene markers Oct4, Sox2, and Klf4 are expressed in high amounts, thus defining the corresponding phenotypes as stem cells. Compared with the steady-state SC2, SC1 has a much higher level of Nanog gene expression. The prediction of two distinct levels of Nanog gene expression in stem cell steady states agrees with the experimentally observed heterogeneous distribution of Nanog proteins from a population of stem cells (34⇓–36). In the remaining three steady states, the pluripotency gene markers are turned off, allowing the expression of differentiation genes such as Gata6 and Cdx2. These steady states thus have expression profiles characteristic of differentiated cells. In particular, consistent with primitive endoderm cells, the Gata6 gene is turned on in the steady-state PE. And consistent with trophectoderm cells, the Cdx2 protein is highly expressed in the steady-state TE. Finally, the network also supports the steady-state DC, in which both Gata6 and Cdx2 proteins are produced in large amounts. A similar expression pattern to this predicted steady state has been observed in Oct4 knockdown embryonic stem cells (37). Stability of the steady-state DC might be compromised in the presence of cell–cell communication, which is known to lead to mutual repression between Gata6 and Cdx2 genes (16). We note gene expression profiles consistent with the five predicted steady states in Fig. 2A have been observed in previous studies from the analysis of embryonic stem cell networks (15, 16, 32).

Fig. 2.
  • Download figure
  • Open in new tab
  • Download powerpoint
Fig. 2.

Steady-state solutions for the embryonic stem cell network. (A) Gene expression levels in the steady-state solutions identified for the network when its dynamics are modeled in the adiabatic limit. (B) Dependence of the number of steady-state solutions, Nss, on the DNA-unbinding rate f. (C) The probability distribution of Nanog expression at different DNA-unbinding rates.

The observed five steady-state gene expression profiles are consistent with the network topology shown in Fig. 1B. The stem cell network contains three self-activating genes Nanog, Cdx2, and Gata6. The self-activation of Nanog arises indirectly via the intermediate gene Pbx1. Because it is known that bistability can arise in certain regimes for self-activating motifs (38), a total of three of these motifs will in principle give rise to N=23=8 steady states if they are independent. However, mutual antagonism exists between the Nanog (Oct4) and the Gata6 (Cdx2) genes in the network. These counteracting interactions lead to the silencing of the differentiation genes when the Nanog gene is on and render inaccessible the three potential steady states having either only one of the differentiation genes on or both of them on. Therefore, the total number of steady states is reduced from a potential eight to five.

The steady-state solutions shown in Fig. 2A are obtained from calculations performed in the adiabatic limit, and the DNA occupancy states are assumed to achieve a conditional equilibrium. This assumption is, however, not always valid for eukaryotic cells because of slow chromatin dynamics (12, 14⇓–16). To investigate the effect of DNA-binding kinetics on the epigenetic landscape, we determined the number of steady-state solutions for the stem cell network over a wide range of DNA-unbinding rates. The results are presented in Fig. 2B.

Fig. 2B demonstrates that from the weakly adiabatic regime (f∼10) to the strongly adiabatic regime (f∼104), the network exhibits five steady states. Fig. S2 further confirms that these steady states are similar to the completely adiabatic solutions, and the average steady-state gene expressions are quantitatively comparable. As the unbinding rate slows down, however, the two stem cell steady states collapse together (f=2.0). From the probability distribution of gene expression shown in Fig. 2C, it is clear that the heterogeneous distribution for Nanog proteins is preserved for this collapsed situation. When the unbinding rate is still further decreased, the differentiated state disappears altogether, and only one stem cell state remains (f=1.0). Finally, in the limit of extremely slow DNA kinetics, the regulatory network fails to function at all and all of the genes are minimally expressed at a basal translation rate. The sensitivity of the number of steady states to DNA-unbinding rate argues for the importance of explicit modeling of DNA kinetics when studying epigenetic landscapes.

Most Probable Transition Pathways

The excellent agreement between the steady-state gene expression levels shown in Fig. 2A and the gene expression levels of cell phenotypes known in the laboratory strongly suggests that cell types indeed do correspond to attractors on an epigenetic landscape. We next set out to test whether one can understand aspects of stem cell differentiation by studying the stochastic transition between attractors. In particular, we focus on studying the differentiation of stem cells into primitive endoderm cells and the role of the Nanog gene in regulating this differentiation process. We first determine the most probable stochastic transition path from the stem cell steady-state SC1 to the primitive endoderm steady-state PE in the adiabatic limit; then we discuss the effect of DNA-binding kinetics on the mechanism of this transition in the next section. Details regarding these calculations are provided in the SI Text and Tables S1–S3.

Fig. 3A presents the changing gene expression patterns along the most probable transition pathway from the steady-state SC1 to the steady-state PE. Starting from SC1, the pathway first transits to the steady-state SC2. During the transition from SC1 to SC2, the Nanog gene switches from an on state to an off state, accompanied by a coherent down-regulation in the expression of the Pbx1 gene. Little change is observed for the expression of other genes. As the pathway exits from the steady-state SC2, the expression level of the differentiation gene Gata6 starts to increase along with its antagonizing gene Nanog being turned off. Accumulation of Gata6 proteins leads to the inhibition of pluripotency genes such as Oct4 and Sox2, while in the meantime the decreased expression of pluripotency genes also diminishes their repressive effect on the Gata6 gene. This positive reinforcement leads to the final stable expression of the Gata6 gene in the steady-state PE.

Fig. 3.
  • Download figure
  • Open in new tab
  • Download powerpoint
Fig. 3.

The switching mechanism between steady states inferred from the most probable transition pathway. (A) Number of protein molecules along the most probable transition path from the stem cell steady-state SC1 to the primitive endoderm steady-state PE. The x axis indicates the progression along the transition and the y axis refers to different transcription factors. (B) The most probable differentiation pathway as in A plotted through the number of Nanog and Gata6 protein molecules. The steady-state solutions are shown as red circles. (C) Contributions to the action from individual genes along the most probable differentiation pathway from the steady-state SC1 to SC2 (Upper) and from the steady-state SC2 to PE (Lower).

The changing gene expression patterns shown in Fig. 3A suggest that the transition from the steady-state SC1 to PE occurs in two steps mediated by the intermediate-state SC2. This two-step transition mechanism becomes more apparent when the differentiation pathway is plotted using the expression levels of the Nanog and Gata6 genes, as shown in Fig. 3B. The three red circles correspond to gene expression levels in each of the three steady states, respectively. The transition pathway clearly transits through the steady-state SC2 before committing to the steady-state PE.

To further quantify the role of individual genes along the transition pathway, we decompose the transition action defined in Eq. 7 and use Sm=∫dtpmq˙m to assess the contributions coming from each given gene m. This additive decomposition is possible owing to the self-consistent proteomic field approximation. Fig. 3C, Upper presents the change of the transition actions for various genes along the transition path from the steady-state SC1 to SC2. The two genes Nanog and Pbx1 contribute most to this transition, which is consistent with the dramatic change in the expression of the two genes and minimal change of the others shown in Fig. 3A. Fig. 3C, Lower further characterizes the transition from the steady-state SC2 to PE. Major contributions again originate from genes with significant expression changes. The transition action for the Gata6 gene, however, emerges first along the pathway, indicating its important role in initiating the transition. We note in Fig. 3C, Upper and Lower the transition actions increase only up to a point and then plateau afterward. This behavior arises because the most probable transition path connecting two steady states travels through a saddle point. The transition action will increase only along the path from the locally probable starting steady state to the less probable saddle point. The path segment from the saddle point to the end steady state coincides with the zero-momentum deterministic path and makes no contribution to the transition action (28).

The stochastic transition path shown in Fig. 3 provides a mechanistic understanding of the important role of the Nanog gene in safeguarding stem cell pluripotency. The calculations suggest that for a stem cell to differentiate, the network must transit through the intermediate steady-state SC2 by turning off the Nanog gene. Because the Nanog off state acts as a bridge that connects the stem cell attractor SC1 to the differentiated cell attractor PE, it is reasonable to expect that stem cells constructed with low expression of Nanog will be more likely to differentiate compared with those constructed having high expression of Nanog, as has been observed in various experiments (34, 35). A similar analysis on the transition from the steady-state SC1 to TE, as shown in Fig. S3, leads to consistent conclusions. We note transition through an intermediate state with low Nanog expression along the differentiation pathway has also been observed in previous studies (8, 14⇓–16, 39).

Effect of DNA-Binding Kinetics on Stochastic Switching

To quantify the effect of DNA-binding kinetics on the stochastic transition mechanism and transition rate, we calculate the most probable transition path between the steady-state SC1 and PE for various DNA-unbinding rates.

Fig. 4A compares the average expression of the two genes Nanog and Gata6 along the most probable transition pathways calculated at several different DNA-unbinding rates. The blue line is the path calculated in the adiabatic limit and is the same as the transition path shown in Fig. 3B. The yellow and red lines are obtained at the DNA-unbinding rates f=104 and f=10, respectively. Despite the three orders of magnitude change in the DNA-unbinding rate, there is little difference in the paths for gene expression along any of the three pathways.

Fig. 4.
  • Download figure
  • Open in new tab
  • Download powerpoint
Fig. 4.

The effect of DNA-binding kinetics on the switching pathway and switching rate between steady states. (A) Comparison of the most probable differentiation pathways from the steady-state SC1 to PE calculated at several different DNA-unbinding rates. The blue line is the same as the adiabatic result also shown in Fig. 3B. (B) Dependence of the transition action on the DNA-unbinding rate f. The black solid line is a numerical fit to the calculated actions (red circles), using the expression a/(1/f+b), with a and b as fitting parameters. The dashed line indicates the action value obtained in the adiabatic limit.

On the other hand, Fig. 4B further compares the magnitudes of the transition actions along the most probable pathways calculated at different DNA-unbinding rates. The dashed line indicates the action value in the adiabatic limit. The black solid line is a numerical fit to the transition actions (red circles), using the expression a/(1/f+b), in which f is the DNA-unbinding rate and a and b are two fitting parameters. As expected, the transition action approaches the adiabatic value for large DNA-unbinding rates (f>104). Unlike the average gene expression paths shown in Fig. 4A that exhibit little dependence on the DNA-unbinding rate, the transition action itself changes by nearly two orders of magnitude as the DNA-unbinding rate decreases from 104 to 10. As shown in Fig. S4, transition actions between other steady states exhibit similar strong dependences on DNA-binding kinetics. Because the transition rate is exponentially proportional to the transition action, these results suggest that noise-driven switching between attractors can occur only in reasonable timescales in the weakly adiabatic regime.

The results shown in Fig. 4 A and B can be understood from the “churning mechanism” proposed in ref. 38. In the weakly to strongly adiabatic regime, the DNA occupancy responds quickly to the change of the proteomic atmosphere and reaches a local steady state before the protein number changes by a large amount. The mean expression of a given gene is mostly determined using an effective synthesis rate averaged over the ensemble of DNA occupancy states and is relatively insensitive to the DNA-binding kinetics as shown in Fig. 4A. Unlike the mean, the fluctuations of gene expression, however, strongly depend on the DNA-binding kinetics. This fluctuation in gene expression can be quantified via a local diffusion rate in protein number D. In the adiabatic limit, the local diffusion DBD comes purely from the stochastic birth and death of proteins and is independent of the DNA-binding kinetics. As the DNA-binding rate slows down in the weakly adiabatic regime, protein number diffusion arising from the fluctuation of DNA occupancy states becomes more and more significant. Indeed, the binding and unbinding of transcription factors onto the DNA will churn the protein number like a turbulent surf, as suggested in ref. 38. This additional diffusion in protein number caused by the churning mechanism Dchurn scales as 1/f. The total diffusion rate is therefore a sum of the two effects D=DBD+Dchurn. Finally, because the transition barrier scales as 1/D, the strong dependence of the transition action on the DNA-binding kinetics shown in Fig. 4B is understood.

Discussion

The strong dependence of transition rates between steady states on the DNA-binding kinetics suggests a potential mechanism for the heterogeneous distribution of gene expression levels in embryonic cells (40), adult stem cells (41) and cancer cells (42). Many kinetic studies suggest that the stem cell gene network may function in a weakly adiabatic regime (16, 43), which according to Fig. S4 will allow fast switching between different steady-state gene expression profiles. Therefore, the experimentally measured gene expression levels from a population of cells may in fact represent, instead of gene expression from a single attractor, an ensemble mixture of different steady-state gene expressions, which naturally leads to a heterogeneous distribution of expression levels. In that respect, it is particularly encouraging to note that stem cells can be classified into distinct clusters, in different ones of which the gene expression profiles bear close resemblance to those found for restricted lineages (41, 44). The fast switching between attractors in the weakly adiabatic regime may be beneficial to allow developmental plasticity of stem cells. However, this tendency of switching can also be detrimental and can compromise the phenotypic robustness of differentiated cells. We therefore expect that cell fate may be fixed by a transition of gene network dynamics from the weakly to the strongly adiabatic regime during cell differentiation to suppress the switching probability, possibly via covalent modifications of the chromatin such as methylation and acetylation.

Acknowledgments

We thank Dr. Davit Potoyan, Dr. Masaki Sasai, Dr. Aleksandra Walczak, and Dr. Jin Wang for critical reading of the manuscript. B.Z. acknowledges help from Dr. Matthias Heymann on the implementation of gMAM for finding the most probable path. We acknowledge financial support by the D. R. Bullard-Welch Chair at Rice University and by the Center for Theoretical Biological Physics sponsored by the National Science Foundation (Grant PHY-0822283).

Footnotes

  • ↵1To whom correspondence should be addressed. E-mail: pwolynes{at}rice.edu.
  • Author contributions: B.Z. and P.G.W. designed research, performed research, analyzed data, and wrote the paper.

  • The authors declare no conflict of interest.

  • This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1408561111/-/DCSupplemental.

Freely available online through the PNAS open access option.

References

  1. ↵
    1. Davidson E
    (2006) The Regulatory Genome: Gene Regulatory Networks in Development and Evolution (Academic, San Diego).
  2. ↵
    1. Cherry AB,
    2. Daley GQ
    (2012) Reprogramming cellular identity for regenerative medicine. Cell 148(6):1110–1122.
    OpenUrlCrossRefPubMed
  3. ↵
    1. Waddington CH
    (1957) The Strategy of the Genes (Allen & Unwin, London).
  4. ↵
    1. Sasai M,
    2. Wolynes PG
    (2003) Stochastic gene expression as a many-body problem. Proc Natl Acad Sci USA 100(5):2374–2379.
    OpenUrlAbstract/FREE Full Text
  5. ↵
    1. Wang J,
    2. Xu L,
    3. Wang E
    (2008) Potential landscape and flux framework of nonequilibrium networks: Robustness, dissipation, and coherence of biochemical oscillations. Proc Natl Acad Sci USA 105(34):12271–12276.
    OpenUrlAbstract/FREE Full Text
  6. ↵
    1. Lv C,
    2. Li X,
    3. Li F,
    4. Li T
    (2014) Constructing the energy landscape for genetic switching system driven by intrinsic noise. PLoS ONE 9(2):e88167.
    OpenUrlCrossRefPubMed
  7. ↵
    1. Wang J,
    2. Zhang K,
    3. Wang E
    (2010) Kinetic paths, time scale, and underlying landscapes: A path integral framework to study global natures of nonequilibrium systems and networks. J Chem Phys 133(12):125103.
    OpenUrlCrossRefPubMed
  8. ↵
    1. Wang J,
    2. Zhang K,
    3. Xu L,
    4. Wang E
    (2011) Quantifying the Waddington landscape and biological paths for development and differentiation. Proc Natl Acad Sci USA 108(20):8257–8262.
    OpenUrlAbstract/FREE Full Text
  9. ↵
    1. Kaern M,
    2. Elston TC,
    3. Blake WJ,
    4. Collins JJ
    (2005) Stochasticity in gene expression: From theories to phenotypes. Nat Rev Genet 6(6):451–464.
    OpenUrlCrossRefPubMed
  10. ↵
    1. Segal E,
    2. Widom J
    (2009) From DNA sequence to transcriptional behaviour: A quantitative approach. Nat Rev Genet 10(7):443–456.
    OpenUrlCrossRefPubMed
  11. ↵
    1. Mariani L,
    2. et al.
    (2010) Short-term memory in gene induction reveals the regulatory principle behind stochastic IL-4 expression. Mol Syst Biol 6:359.
    OpenUrlPubMed
  12. ↵
    1. Miller-Jensen K,
    2. Dey SS,
    3. Schaffer DV,
    4. Arkin AP
    (2011) Varying virulence: Epigenetic control of expression noise and disease processes. Trends Biotechnol 29(10):517–525.
    OpenUrlCrossRefPubMed
  13. ↵
    1. Raj A,
    2. van Oudenaarden A
    (2009) Single-molecule approaches to stochastic gene expression. Annu Rev Biophys 38:255–270.
    OpenUrlCrossRefPubMed
  14. ↵
    1. Feng H,
    2. Wang J
    (2012) A new mechanism of stem cell differentiation through slow binding/unbinding of regulators to genes. Sci Rep 2:550.
    OpenUrlPubMed
  15. ↵
    1. Li C,
    2. Wang J
    (2013) Quantifying Waddington landscapes and paths of non-adiabatic cell fate decisions for differentiation, reprogramming and transdifferentiation. J R Soc Interface 10(89):20130787.
    OpenUrlAbstract/FREE Full Text
  16. ↵
    1. Sasai M,
    2. Kawabata Y,
    3. Makishi K,
    4. Itoh K,
    5. Terada TP
    (2013) Time scales in epigenetic dynamics and phenotypic heterogeneity of embryonic stem cells. PLoS Comput Biol 9(12):e1003380.
    OpenUrlCrossRefPubMed
  17. ↵
    1. Feng H,
    2. Han B,
    3. Wang J
    (2012) Landscape and global stability of nonadiabatic and adiabatic oscillations in a gene network. Biophys J 102(5):1001–1010.
    OpenUrlCrossRefPubMed
  18. ↵
    1. Potoyan DA,
    2. Wolynes PG
    (2014) On the dephasing of genetic oscillators. Proc Natl Acad Sci USA 111(6):2391–2396.
    OpenUrlAbstract/FREE Full Text
  19. ↵
    1. Gillespie DT
    (1977) Exact stochastic simulation of coupled chemical reactions. J Phys Chem 81:2340–2361.
    OpenUrlCrossRef
  20. ↵
    1. Aurell E,
    2. Sneppen K
    (2002) Epigenetics as a first exit problem. Phys Rev Lett 88(4):048101.
    OpenUrlCrossRefPubMed
  21. ↵
    1. Roma DM,
    2. O’Flanagan RA,
    3. Ruckenstein AE,
    4. Sengupta AM,
    5. Mukhopadhyay R
    (2005) Optimal path to epigenetic switching. Phys Rev E Stat Nonlin Soft Matter Phys 71(1 Pt 1):011902.
    OpenUrlCrossRefPubMed
  22. ↵
    1. Wang P,
    2. et al.
    (2014) Epigenetic state network approach for describing cell phenotypic transitions. Interface Focus 4(3):20130068.
  23. ↵
    1. Chapman K,
    2. Higgins S
    (2001) Regulation of Gene Expression (Portland Press, London).
  24. ↵
    1. Walczak AM,
    2. Sasai M,
    3. Wolynes PG
    (2005) Self-consistent proteomic field theory of stochastic gene switches. Biophys J 88(2):828–850.
    OpenUrlCrossRefPubMed
  25. ↵
    1. Doi M
    (1976) Second quantization representation for classical many-particle system. J Phys A: Math Gen 9:1465–1477.
    OpenUrlCrossRef
  26. ↵
    1. Peliti L
    (1985) Path integral approach to birth-death processes on a lattice. J Phys 46:1469–1483.
    OpenUrlCrossRef
  27. ↵
    1. Zhang K,
    2. Sasai M,
    3. Wang J
    (2013) Eddy current and coupled landscapes for nonadiabatic and nonequilibrium complex system dynamics. Proc Natl Acad Sci USA 110(37):14930–14935.
    OpenUrlAbstract/FREE Full Text
  28. ↵
    1. Dykman MI,
    2. Mori E,
    3. Ross J,
    4. Hunt PM
    (1994) Large fluctuations and optimal paths in chemical-kinetics. J Chem Phys 100:5735–5750.
    OpenUrlCrossRef
  29. ↵
    1. Heymann M,
    2. Vanden-Eijnden E
    (2008) The geometric minimum action method: A least action principle on the space of curves. Commun Pure Appl Math 61:1052–1117.
    OpenUrlCrossRef
  30. ↵
    1. Lindley BS,
    2. Schwartz IB
    (2013) An iterative action minimizing method for computing optimal paths in stochastic dynamical systems. Physica D 255:22–30.
    OpenUrlCrossRef
  31. ↵
    1. Caroli B,
    2. Caroli C,
    3. Roulet B
    (1981) Diffusion in a bistable potential: The functional integral approach. J Stat Phys 26(1):83–111.
    OpenUrlCrossRef
  32. ↵
    1. Chickarmane V,
    2. Peterson C
    (2008) A computational model for understanding stem cell, trophectoderm and endoderm lineage determination. PLoS ONE 3(10):e3478.
    OpenUrlCrossRefPubMed
  33. ↵
    1. Chang R,
    2. Shoemaker R,
    3. Wang W
    (2011) Systematic search for recipes to generate induced pluripotent stem cells. PLoS Comput Biol 7(12):e1002300.
    OpenUrlCrossRefPubMed
  34. ↵
    1. Chambers I,
    2. et al.
    (2007) Nanog safeguards pluripotency and mediates germline development. Nature 450(7173):1230–1234.
    OpenUrlCrossRefPubMed
  35. ↵
    1. Kalmar T,
    2. et al.
    (2009) Regulated fluctuations in nanog expression mediate cell fate decisions in embryonic stem cells. PLoS Biol 7(7):e1000149.
    OpenUrlCrossRefPubMed
  36. ↵
    1. Miyanari Y,
    2. Torres-Padilla ME
    (2012) Control of ground-state pluripotency by allelic regulation of Nanog. Nature 483(7390):470–473.
    OpenUrlCrossRefPubMed
  37. ↵
    1. Hay DC,
    2. Sutherland L,
    3. Clark J,
    4. Burdon T
    (2004) Oct-4 knockdown induces similar patterns of endoderm and trophoblast differentiation markers in human and mouse embryonic stem cells. Stem Cells 22(2):225–235.
    OpenUrlCrossRefPubMed
  38. ↵
    1. Walczak AM,
    2. Onuchic JN,
    3. Wolynes PG
    (2005) Absolute rate theories of epigenetic stability. Proc Natl Acad Sci USA 102(52):18926–18931.
    OpenUrlAbstract/FREE Full Text
  39. ↵
    1. Li C,
    2. Wang J
    (2013) Quantifying cell fate decisions for differentiation and reprogramming of a human stem cell network: Landscape and biological paths. PLoS Comput Biol 9(8):e1003165.
    OpenUrlCrossRefPubMed
  40. ↵
    1. Martinez Arias A,
    2. Brickman JM
    (2011) Gene expression heterogeneities in embryonic stem cell populations: Origin and function. Curr Opin Cell Biol 23(6):650–656.
    OpenUrlCrossRefPubMed
  41. ↵
    1. Moignard V,
    2. et al.
    (2013) Characterization of transcriptional networks in blood stem and progenitor cells using high-throughput single-cell gene expression analysis. Nat Cell Biol 15(4):363–372.
    OpenUrlCrossRefPubMed
  42. ↵
    1. Gupta PB,
    2. et al.
    (2011) Stochastic state transitions give rise to phenotypic equilibrium in populations of cancer cells. Cell 146(4):633–644.
    OpenUrlCrossRefPubMed
  43. ↵
    1. Teles J,
    2. et al.
    (2013) Transcriptional regulation of lineage commitment—a stochastic model of cell fate decisions. PLoS Comput Biol 9(8):e1003197.
    OpenUrlCrossRefPubMed
  44. ↵
    1. Chang HH,
    2. Hemberg M,
    3. Barahona M,
    4. Ingber DE,
    5. Huang S
    (2008) Transcriptome-wide noise controls lineage choice in mammalian progenitor cells. Nature 453(7194):544–547.
    OpenUrlCrossRefPubMed
PreviousNext
Back to top
Article Alerts
Email Article

Thank you for your interest in spreading the word on PNAS.

NOTE: We only request your email address so that the person you are recommending the page to knows that you wanted them to see it, and that it is not junk mail. We do not capture any email address.

Enter multiple addresses on separate lines or separate them with commas.
Stem cell differentiation as a many-body problem
(Your Name) has sent you a message from PNAS
(Your Name) thought you would like to see the PNAS web site.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Citation Tools
Stem cell differentiation as a many-body problem
Bin Zhang, Peter G. Wolynes
Proceedings of the National Academy of Sciences Jul 2014, 111 (28) 10185-10190; DOI: 10.1073/pnas.1408561111

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
Request Permissions
Share
Stem cell differentiation as a many-body problem
Bin Zhang, Peter G. Wolynes
Proceedings of the National Academy of Sciences Jul 2014, 111 (28) 10185-10190; DOI: 10.1073/pnas.1408561111
del.icio.us logo Digg logo Reddit logo Twitter logo CiteULike logo Facebook logo Google logo Mendeley logo
  • Tweet Widget
  • Facebook Like
  • Mendeley logo Mendeley

Article Classifications

  • Biological Sciences
  • Biophysics and Computational Biology
  • Physical Sciences
  • Chemistry
Proceedings of the National Academy of Sciences: 111 (28)
Table of Contents

Submit

Sign up for Article Alerts

Jump to section

  • Article
    • Abstract
    • Theoretical Approaches for Gene Networks
    • Cell Types as Steady States of Gene Networks
    • Most Probable Transition Pathways
    • Effect of DNA-Binding Kinetics on Stochastic Switching
    • Discussion
    • Acknowledgments
    • Footnotes
    • References
  • Figures & SI
  • Info & Metrics
  • PDF

You May Also be Interested in

Smoke emanates from Japan’s Fukushima nuclear power plant a few days after tsunami damage
Core Concept: Muography offers a new way to see inside a multitude of objects
Muons penetrate much further than X-rays, they do essentially zero damage, and they are provided for free by the cosmos.
Image credit: Science Source/Digital Globe.
Water from a faucet fills a glass.
News Feature: How “forever chemicals” might impair the immune system
Researchers are exploring whether these ubiquitous fluorinated molecules might worsen infections or hamper vaccine effectiveness.
Image credit: Shutterstock/Dmitry Naumov.
Venus flytrap captures a fly.
Journal Club: Venus flytrap mechanism could shed light on how plants sense touch
One protein seems to play a key role in touch sensitivity for flytraps and other meat-eating plants.
Image credit: Shutterstock/Kuttelvaserova Stuchelova.
Illustration of groups of people chatting
Exploring the length of human conversations
Adam Mastroianni and Daniel Gilbert explore why conversations almost never end when people want them to.
Listen
Past PodcastsSubscribe
Panda bear hanging in a tree
How horse manure helps giant pandas tolerate cold
A study finds that giant pandas roll in horse manure to increase their cold tolerance.
Image credit: Fuwen Wei.

Similar Articles

Site Logo
Powered by HighWire
  • Submit Manuscript
  • Twitter
  • Facebook
  • RSS Feeds
  • Email Alerts

Articles

  • Current Issue
  • Special Feature Articles – Most Recent
  • List of Issues

PNAS Portals

  • Anthropology
  • Chemistry
  • Classics
  • Front Matter
  • Physics
  • Sustainability Science
  • Teaching Resources

Information

  • Authors
  • Editorial Board
  • Reviewers
  • Subscribers
  • Librarians
  • Press
  • Cozzarelli Prize
  • Site Map
  • PNAS Updates
  • FAQs
  • Accessibility Statement
  • Rights & Permissions
  • About
  • Contact

Feedback    Privacy/Legal

Copyright © 2021 National Academy of Sciences. Online ISSN 1091-6490