# Intrinsic noise, dissipation cost, and robustness of cellular networks: The underlying energy landscape of MAPK signal transduction

See allHide authors and affiliations

Edited by José N. Onuchic, University of California at San Diego, La Jolla, CA, and approved January 31, 2008 (received for review September 13, 2007)

## Abstract

We develop a probabilistic method for analyzing global features of a cellular network under intrinsic statistical fluctuations, which is important when there are finite numbers of molecules. By making a self-consistent mean field approximation of splitting the variables in order to reduce the large number of degrees of freedom, which is reasonable for a not very strongly interacting network, we discovered that the underlying energy landscape of the mitogen-activated protein kinases (MAPKs) signal transduction network (with experimentally measured or inferred parameters such as chemical reaction rate coefficients in the network) is funneled toward a global minimum characterized by the nonequilibrium steady-state fixed point of the system at the end of the signal transduction process. For this system, we also show that the energy landscape is robust against intrinsic fluctuations and random perturbation to the inherent chemical reaction rates. The ratio of the slope versus the roughness of the energy landscape becomes a quantitative measure of robustness and stability of the network. Furthermore, we quantify the dissipation cost of this nonequilibrium system through entropy production, caused by the nonequilibrium flux in the system. We found that a lower dissipation cost corresponds to a more robust network. This least dissipation property might provide a design principle for robust and functional networks. Finally, we find the possibility of bistable and oscillatory-like solutions, which are important for cell fate decisions, upon perturbations. The method described here can be used in a variety of biological networks.

The ultimate goal of biology is to understand the function of specific systems. At the cell level, the function of the system is realized through the network of interactions between molecules. Unraveling the underlying mechanisms and global principles of the cellular network is the current challenge in systems biology.

There has been some experimental investigation of these networks. The experimental progress has been made through large-scale genetic mutations and screenings, gene expressions, yeast two-hybrids, pull-down, and immunoprecipitation (1⇓⇓–4). These experiments find that cellular networks, in general, are quite robust and perform their biological functions in the midst of environmental perturbations. Synthetic biology is aiming to follow in the footsteps of electronic design by piecing together small modular components to build a much larger network and, by doing this, study the network from a bottom-up perspective (5). This approach is just at the beginning stage and seems to have a promising future.

On the theoretical side, the topological structures of the networks have been investigated recently (6). The scale-free properties and hierarchical architectures for the networks have been shown (7⇓–9). The highly connected nodes in the network, called hubs, might play an important role in the robustness of the network. From the engineering perspective, efforts have been made to understand the network from control perspectives with robust yet fragile natures (10). There have been a number of studies attempting to determine why networks are robust in their biological function among perturbations (11⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓–24).

The most common way of modeling these systems is through chemical rate equations. From this, deterministic trajectories can be found, and from these trajectories inference can be drawn to the possible biological function. These trajectories can only span a small piece of the whole phase space. To determine more of the phase space and global features of this space under a set of parameters, one would need to examine all of the initial conditions, a hopeless task in most systems. To understand the network's global nature under different external conditions, one must change the inherent parameters, thus further complicating the task. Because this will generally introduce a high dimensionality into the phase space, the issues of stability, robustness, and other global properties become very hard to address and examine.

A major problem is that the typical chemical rate equations ignore inherent statistical fluctuations in the number of particles, *N*. This is not a problem when *N* ≫ 1 because fluctuations go as
^{3} or less. The intrinsic statistical fluctuations, which are negligibly small in the bulk, become very important. Thus, it is necessary to move from a deterministic model to a statistical model. The study of the influence under external noise from the cellular environments for the MAPK system was carried out in an earlier article (19). Here, we will focus on the intrinsic noise from number fluctuation.

We can begin this statistical description by noticing that the chemical rate equations are only approximations up to the average concentration level. In the cell, there are statistical fluctuations around this concentration, coming from internal and external sources. The internal noise comes from the finite number of molecules, as we mentioned above, whereas the external noise comes from the highly dynamical and inhomogeneous environments of the interior of the cell (25⇓⇓⇓⇓–30). Both of these sources of noise are important in determining properties of the network.

Although there are huge number of states in the phase space that are hard to explore globally, the probability of each state is not uniformly distributed. It might be that most of the states only have small probabilities, but only a finite number of states have larger probabilities. The states with higher probabilities are biologically more relevant and should be the focus rather than the states with lower weights. Thus, an approach to determining the probability of each state will allow us to identify the important states for biological function and furthermore study their stability and robustness.

Noisy conditions thus play a very important role in these chemical reaction networks and are much more realistic than the average mean concentrations. To describe the system under noisy conditions, we will define a potential energy function that is derived from the steady-state probability of the network. After this landscape is determined, the probability of each state is known and we can begin to analyze global features (19, 22, 23). This is reminiscent of equilibrium systems, where one can analyze the phase diagram and other global features. Once the network problem is put into terms of the generalized potential energy function or potential energy landscape, the issue of the global stability or robustness is much easier to address. The purpose of this article was to study the global robustness or stability against intrinsic fluctuations and random perturbation to the inherent chemical reaction rates directly from the properties of the potential energy landscape of the network. We will focus on the network landscape under the intrinsic statistical fluctuations due to the finite number of molecules in the cell.

Because the cell is an open system, the network is a far-from-equilibrium system. The dynamics is not only determined by the underlying landscape but also by nonzero flux due to the lack of detailed balance. This can lead to a more complete description with potential and the flux for the network system in analogy to the electronic circuit with voltage and current. As an open nonequilibrium system, there will be heat dissipation and entropy production in analogy from the electric circuit with voltage and current generating heat losses (23, 31, 32). This entropy production can be used to characterize the global features of the system.

To explore the nature of the underlying potential energy landscape of the cellular networks, we will study the relatively simple yet important MAPK signal transduction network (Fig. 1). Mitogen-activated protein kinases (MAPKs) belong to a family of serine/threonine protein kinases widely conserved among eukaryotes and are involved in many cellular programs such as proliferation, differentiation, movement, and cell death. MAPK signaling cascades are organized hierarchically into the three-tiered modules. MAPKs are phosphorylated and activated by MAPK kinases (MAPKKs), which in turn are phosphorylated and activated by MAPKK kinases (MAPKKKs). The MAPKKK is in turn activated by interaction with a family of small GTPases and/or other protein kinases connecting the MAPK module to the cell-surface receptor or external stimuli (2, 3) (Fig. 1). We will examine the MAPK network by uncovering the underlying energy landscape under intrinsic statistical fluctuations through the master equations of the system.

## Results and Discussions

Once the probability distribution function, *P*(**x**, *t*), for the state variable **x** (**x** represents the protein concentrations in the MAPK signal transduction network case) is solved, the corresponding steady-state probability *P*_{SS}(**x**) can be obtained by taking the long time limit. The potential energy function *U*(**x**) (14, 18⇓⇓⇓⇓–23, 33) can be related to steady-state probability:
with the partition function *Z* = ∫*d ^{d}x* exp{−

*U*(

*x*)}. From this, we can recognize

*U*as a potential energy function for the network system. Once the potential energy landscape is determined, we can examine the global properties of the protein cellular networks.

The potential energy is a multidimensional function in concentration vector **x** space with each component of which representing the concentration of each type of proteins. For certain configurations of concentrations, the network adopts a certain potential energy (or the corresponding probability). The dimensionality of the configurational state space is huge. We are interested, first of all, in the most probable configuration that corresponds to the lowest energy state. We found that the lowest energy state or the most probable configuration is the one at the end stage (ground state) of the MAPK signal transduction, which corresponds to the fixed-point steady-state solution of the averaged chemical rate equations for the MAPK network. However, this distribution is 22-dimensional and thus very difficult to visualize or analyze directly. Because of this, we will have to consider lower dimensional projections of this free energy distribution.

First, let us consider the 0th-dimension projection, the histogram (Fig. 2*A*). The minimum of this projection corresponds to the steady state of the system, which also corresponds to the traditional solution of the chemical rate equations. This correspondence comes about because all reactions only involve single reactants. Fig. 2*B* shows the potential energy spectrum for our system. Notice that the global minimum of the potential energy is significantly separated from the rest of the spectrum.

However, we need a quantifiable measure for this projection. To arrive at this, we will consider a dimensionless quantity, which we call the robustness ratio of the distribution, Λ = δ*U*/Δ*U*, where δ*U* = 〈*U*〉 − *U*_{min}, and Δ*U* is the half spread of the histogram. The δ*U* is a measure of the forcing and bias toward the global minimum of the potential energy, whereas Δ*U* is a measure of the roughness and possibility of being locally trapped in the potential energy landscape. When Λ is significantly larger than 1, the bias toward the minimum is much stronger than the probability of local trapping; thus, the global minimum is well separated and distinct from the rest of the network potential energy spectrum. The robustness ratio for this network is 2.45. This signifies that the MAPK network is robust under intrinsic statistical fluctuations, which is not surprising because the network is required by evolutionary concerns to be robust.

Now, let us consider one-dimensional projections. The two one-dimensional coordinates we will consider are the RMS distance from global minimum, that is
*Q* = (*x*_{min}·*x*)/|*x*_{min}||*x*|, where *x*_{min} is the minimum energy state of the system. Because this inner product is defined in the normal way for *R ^{n}*, the

*Q*value is equivalent to cos θ, where θ is the angle between these two state phase space vectors. Thus, a value of

*Q*= 0 describes orthogonal vectors with no overlap and

*Q*= 1 describes parallel vectors with complete overlap. From Fig. 2

*C*, we see a downhill trend toward

*Q*= 1, implying a tendency to align with the global minimum of the potential energy landscape. This tendency shows that there does exist a funnel in the potential energy landscape. Another coordinate, the RMS distance (RMSD), shows the overall phase space distance separation of the two states. Fig. 2

*D*shows a similar downhill slope and overall funneled landscape toward the global minima for the RMSD projection.

It is also important to examine the parameter space. To do this, we varied the reaction rates of the system. Specifically, the reaction rates were taken from a probability distribution with a mean of the unperturbed rate, *c*_{μ}, and a standard deviation given by σ = *l*_{p} × *c*_{μ}, where *l*_{p} is the level of perturbation applied to the system. From this, we examine what happens to the system at a different location in the parameter space.

Fig. 3*A* shows the robustness ratio Λ of the MAPK network versus the energy of the ground state. There is a monotonic relationship between the ground state energy and the robustness ratio Λ. When Λ is larger (smaller), the landscape is more (less) robust, and the network is more (less) stable with ground state dominating (less significant). Therefore, Λ is indeed a robustness measure for the network. We observe that the system is stable under most of the perturbations through changing the rates. There are rare events that cause big changes that could significantly destabilize the system and lower both Λ and the probability of the ground state.

Fig. 3*B* shows the robustness ratio Λ versus the perturbation level *l*_{p}. We see again that with small variances of perturbations of the chemical reaction rates, the network is with a large Λ characterizing the funneled landscape and stability of the system. With larger variances of perturbations of chemical reaction rates, the networks typically have smaller Λ and smaller probability of ground state compared with the biological one. This means that large perturbative states are less stable than the biological one. The biological functioning network is quite different from the random ones in terms of the underlying energy landscape and stability. This shows that as the variance of the chemical rate increases, the likelihood of these rare events that cause disruption of the system becomes greater.

Fig. 3*C* shows the standard deviation in the robustness ratio Λ versus the perturbation level *l*_{p}. We observe that the standard deviation of Λ increases with the amount of perturbation. This indicates that not only does the higher perturbation make degradation of the system more likely, but that these events become possibly more severe. This greater severity causes lower Λ to occur and thus increases the variance in the robustness ratio.

We can also examine a measure of the characteristic kinetic time τ it takes to reach steady state. We will do this by considering the same initial conditions while perturbing the rates at different variance levels of the system. From Fig. 4*A*, the average τ does increase with an increase in the perturbation level. This is also matched by an increase in the energy spread of the system, 〈Δ*U*〉 (Fig. 4*B*). This increase in 〈Δ*U*〉 indicates that there is more local trapping in the system, which indicates a longer time being “stuck” in the local roughness of the system and thus the increase in τ.

The MAPK system is open and reaches a nonequilibrium state. The steady-state solution of the system is not an equilibrium state. This come about because the flux (*F*_{ijsteady-state} = −*T*_{ij}*P*_{isteady-state} + *T*_{ji}*P*_{jsteady-state}), where *T _{ij}* is the transition probability from state

*i*to state

*j*, of the system at steady state is not zero. In an equilibrium state, the flux would reach zero. Therefore, to describe a nonequilibrium network system, we need the steady-state probability or the underlying landscape as well as the flux. Because of this flux, the nonequilibrium steady state will dissipate energy, resulting in heat loss. This dissipation will be equal to the entropy production rate (23, 31, 32) in steady state. The entropy production rate for the whole network (

*Ṡ*) includes the contribution from the system (

*Ṡ*

_{sys}) (equal to zero in steady state) and the dissipation from environments (

*Ṡ*

_{dis}): Because this entropy production is a feature of the global properties of the network, we can use it to analyze global features of the network, such as stability and robustness.

In Fig. 5*A*, we plotted the entropy production (per unit time) or the dissipation cost of the network in steady state, *Ṡ*, versus Λ for certain variance of the chemical reaction rates. We can see the entropy production rate decreases as Λ increases. Thus, we can suggest that the most robust systems produce the least amount of entropy. The fact that robustness is linked with the entropy production rate may reflect the fact that fewer fluctuations and perturbations in rates lead to a more robust and stable network—and more energy saving—and therefore fewer costs in the mean time. This might provide us with a design principle of optimizing the connections of the network with minimum dissipation cost for the network.

In Fig. 5*B*, we can see that the entropy production rate increases as the variances of the inherent chemical reaction rates increase. This implies that the lesser the variation in the underlying chemical reactions rates, the more robust the network and the less entropy production or heat loss for the network. This can be very important for the network design. This implies that nature might evolve such that the network is robust against internal (intrinsic) and environmental perturbations, and performs specific biological functions with minimum dissipation cost. In our study, this is also the equivalent of optimizing the robustness or stability of the network.

For most of the perturbations, we observed the strong funnel unchanging and thus a monostable solution. However, under some of the perturbations there did arise bistable or oscillatory solutions. For the bistable solutions, there was generally a strong minima and a weak minima, with the weak minima being at a higher energy and the weak minima occupying a much smaller region in phase space. (See Fig. 6 for a two-dimensional projection of a bistable state.) We speculate that this bistability is formed by the different rates being such values as to form a feed-forward-like connection between the middle of the cascade and the end. Under the set of perturbations that we ran, we found that monostability occurred 84% of the time, whereas bistability and oscillatory-like solutions occurred 12% and 4% of the time. Most of the perturbations that cause the change to the global features are related to sections of the network that connect different subsections of the network. Notice also that that these important nodes, which connect different subsections, are also the hubs of the network. We must be careful in calling some of these solutions oscillatory; because we are working in a high-dimensional space, it is possible that these oscillatory-looking solutions are actually strange attractors. This is why we classify the solutions as oscillatory-like. Thus, under most perturbations, the system is in a strong robust format. The possible bistability is important toward making cell fate decisions (34). This bistability is especially important because it occurs without imposing an artificial feedback loop on the system.

## Conclusions

We used the experimentally inferred rate parameters to prove that the network is funneled in configurational space of protein concentrations toward the ground nonequilibrium steady-state fixed point under the intrinsic statistical fluctuations. As already mentioned, the parameter space of the cellular network is huge, in fact exponential (*M ^{N}*, where

*M*is the number of specific type of proteins and

*N*is the number of different protein types). It is impossible to search all of the parameter space to explore the global stability or robustness with dynamical equations. Our study provides a self-consistent mean field method to reduce the dimensionality to polynomial (

*M*×

*N*) so that large networks can be studied.

We show that natural evolution might only select certain parameter space with the funneled underlying energy landscape. The other part of the parameter space that generates the rough potential landscape cannot guarantee the global robustness and therefore is not able to appropriately perform the specific biological function required for efficient transformation of the information signals. They are more likely to phase out from evolution. The funneled landscape, therefore, may be a realization of the Darwinian principle of natural selection at the cellular network level for efficient transformation of the information (signal transduction). As we see, the funneled landscape provides an optimal criterion to select the suitable parameter subspace of cellular networks, guarantee the robustness, cost the least dissipations, and perform specific biological functions. This will lead to an algorithm for the network connections that is potentially useful for network design.

Under perturbations, the system can be made into a bistable system. It is also important to note that this is done without directly introducing a feed-back or feed-forward interaction into the network. This bistability will be important in the determination of the cell fate decisions important for this network.

It is worth pointing out that the approach described here is general and can be applied to many cellular networks such as signaling transduction networks (2), metabolic networks (35), cell cycle networks (11), and gene regulatory networks (12, 15, 21, 22).

## Methods

Most theoretical models of chemical networks involve the use of chemical rate equations. As mentioned before, there are problems with this approach.

We will first start with a description of the MAPK cascade (see Fig. 1). In the first step, the activation of MKKK is catalyzed by an enzyme E1. In the second step, the reverse is catalyzed by an enzyme E2. The third and fourth steps represent the two-collision, nonprocessive mechanism for the double phosphorylation of MKK; these steps are catalyzed by MKKK-P, whereas their reverses are catalyzed by a phosphatase KKPase. Likewise, the seventh and eighth steps represent the double phosphorylation of MAPK catalyzed by the active MKK-PP, and the reverse steps are catalyzed by a phosphatase KPase.

Based on the Michaelis–Menten enzyme kinetic equation, one can derive a set of differential equations that describe the average rate process for concentrations in the cascade (2, 3). The associated chemical reaction rate coefficients are measured or inferred from the experiments in *Xenopus* oocyte (2, 3).

The probabilistic evolution of this network due to the intrinsic fluctuations from a finite number of molecules follows the master equation in high dimensional protein concentration space. Thus, the basic equations of model become *dP*(*x _{i}*)/

*dt*= Σ

_{μ}

*a*

_{μ}(

*x*− Δ

_{i}*x*

_{i,μ})

*P*(

*x*− Δ

_{i}*x*

_{i,μ}) −

*a*

_{μ}(

*x*)

_{i}*P*(

*x*), where

_{i}*a*

_{μ}is the reaction rate for reaction μ, and Δ

*x*

_{i,μ}is the change of

*x*during the reaction. There are some basic assumptions with this model, primarily that the system is made of Markov processes.

_{i}This is still very difficult if there are many different types of particles, since the number of equations of this form will grow as *n*_{1} × *n*_{2} × *n*_{3} × … We will be making a self-consistent mean field approximation (12, 21, 22), that is *P*(*x _{i}*,

*x*

_{j}, …) = Π

_{i}

*P*

_{i}(

*x*). This reduces the number of equations to

_{i}*n*

_{1}+

*n*

_{2}+ …

Because the individual probability distributions are now independent, we can describe each *P*_{i}(*x _{i}*) by a series of moments: 〈

*x*〉, 〈

_{i}*x*

_{i}

^{2}〉, …, 〈

*x*〉, … Thus, we can convert from a series of differential equations describing the probability to a series of moment differential equations. For example, we can calculate the first moment to infer the approximate Poisson distribution, we can calculate the second moment to infer the approximate Gaussian distribution for individual probability

_{i}^{n}*P*(X

_{i}). The moment equations are listed in supporting information (SI)

*Appendix*. We notice that these moment equations are themselves closed (they do not depend on higher order moments) for MAPK, so they can be solved exactly. We did so numerically up to second order moment to infer the underlying distribution. We only present the results in the main text.

## Acknowledgments

This work was partially supported by a National Science Foundation Career Award, the American Chemical Society Petroleum Research Fund, and the National Science Foundation of China.

## Footnotes

- ↵
^{‡}To whom correspondence should be addressed. E-mail: jin.wang.1{at}stonybrook.edu

Author contributions: J.W. designed research; S.L., B.H., and J.W. performed research; J.W. contributed new reagents/analytic tools; S.L., B.H., and J.W. analyzed data; and S.L. and J.W. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/cgi/content/full/0708708105/DC1.

- Received September 13, 2007.

- © 2008 by The National Academy of Sciences of the USA

## References

- ↵.
- Davidson EH,
- et al.

- ↵.
- Huang CY,
- Ferrell JE Jr

- ↵
- ↵.
- Ideker T,
- et al.

- ↵
- ↵
- ↵
- ↵.
- Maslov S,
- Sneppen K

- ↵.
- Milo R,
- et al.

- ↵
- ↵
- ↵.
- Sasai M,
- Wolynes PG

- ↵.
- Li F,
- Long T,
- Lu Y,
- Ouyang Q,
- Tang C

- ↵
- ↵
- ↵
- ↵.
- Qian H,
- Reluga TC

- ↵.
- Hornos JEM,
- et al.

- ↵.
- Wang J,
- Huang B,
- Xia XF,
- Sun ZR

- ↵.
- Wang J,
- Huang B,
- Xia XF,
- Sun ZR

- ↵.
- Kim K,
- Lepzelter D,
- Wang J

- ↵.
- Kim K,
- Wang J

- ↵
- ↵
- ↵.
- McAdams HH,
- Arkin A

- ↵
- ↵.
- Swain PS,
- Elowitz MB,
- Siggia ED

- ↵.
- Thattai M,
- van Oudenaarden A

- ↵.
- Vilar JMG,
- Guet CC,
- Leibler S

- ↵
- ↵
- ↵.
- Qian H

- ↵.
- Van Kampen NG

- ↵
- ↵