Statistics of cellular signal transduction as a race to the nucleus by multiple random walkers in compartment/phosphorylation space
- Ting Lu*,†,
- Tongye Shen†,‡,
- Chenghang Zong†,‡,
- Jeff Hasty§,¶, and
- Peter G. Wolynes*,‡,†,‖
- Departments of *Physics,
- ‡Chemistry and Biochemistry, and
- §Bioengineering,
- ¶Institute for Nonlinear Science, and
- †Center for Theoretical Biological Physics, University of California at San Diego, La Jolla, CA 92093
-
Contributed by Peter G. Wolynes, September 8, 2006
Abstract
Cellular signal transduction often involves a reaction network of phosphorylation and transport events arranged with a ladder topology. If we keep track of the location of the phosphate groups describing an abstract state space, a simple model of signal transduction involving enzymes can be mapped on to a problem of how multiple biased random walkers compete to reach their target in the nucleus yielding a signal. Here, the first passage time probability and the survival probability for multiple walkers can be used to characterize the response of the network. The statistics of the first passage through the network has an asymmetric distribution with a long tail arising from the hierarchical structure of the network. This distribution implies a significant difference between the mean and the most probable signal transduction time. The response patterns for various external inputs generated by our model agree with recent experiments. In addition, the model predicts that there is an optimal phosphorylation enzyme concentration for rapid signal transduction.
Achieving a quantitative understanding of the reaction networks that transduce cellular signals is one of the major challenges in biology. Signaling networks are found in a diverse set of organisms, ranging from prokaryotes to eukaryotes, and provide mechanisms for fundamental processes such as gene-regulatory control and cellular communication. Qualitative descriptions of the biomolecular components and mechanisms of cellular signaling have greatly improved our understanding of how cells function and have given insights into how to intervene therapeutically when such signals are miscommunicated. Experimental advances now allow quantitative studies of signal transduction and thereby inspire theoretical treatments. Many networks of nonlinear reactions exhibit interesting behavioral features as ultrasensitivity, adaption, robustness, and discrete “all-or-none” response, which have been quantitatively explored (1–8).
A commonly occurring network topology is the reaction ladder network. This network may be viewed as a generalization of multiple-site phosphorylation/dephosphorylation cascades, such as the pathway governing nuclear factor activation of T cells (NFAT), which regulates the response of T cells to antigen signaling (9–13). To stimulate T cells, NFAT must be transported to the nucleus. This transition occurs in response to a conformational change that exposes a nuclear localization sequence (NLS), which is normally buried in the protein interior in the inactive conformation and thus makes the NFAT inaccessible to transport by importin. The NLS becomes exposed in response to the progressive dephosphorylation of specific serine residues in its regulatory domain. This dephosphorylation occurs in response to an increase of intracellular calcium ions that activate calcineurin, which then dephosphorylates the masking residues. Once a sufficient number of sites have been dephosphorylated, conformational changes expose the NLS so that it can now be transported into the nucleus by the importin. This is not a one-way process. Inside the nucleus, the NFAT may be progressively rephosphorylated by kinases and subsequently exported to the cytoplasm by the exportin Crm1 (10, 11). To summarize this network generically, the NFAT can exist in a variety of phosphorylation states and at various locations within the cell. Transitions between these phosphorylation/compartmentalization states can be described as a network of reactions consisting of two groups of species Ci and Ui, where the Ci species reside in the cytoplasm and the Ui species reside in the nucleus (Fig. 1). On each side of the cytonuclear barrier, there are M + 1 species having different levels of phosphorylation. This network topology can generally be interpreted as representing either processive or distributive mechanisms of phosphorylation (14, 15). If the subscript i labels a specific order of phosphorylation, e.g., phosphorylation of residue A, followed by residue B, followed by residue C, …, the network describes the processive (de)phosphorylation; if i represents for the number of (de)phosphorylated residues, the network describes the distributive phosphorylation mechanism, e.g., one residue is first phosphorylated, followed by two residues, then three residues, etc. Of course, the rates connecting i and i + 1 will be different for the two mechanisms, but the network topology remains the same. In the following, we study the processive phosphorylation of residues.
The reaction ladder network. The C and U represent two distinct compartmental locations of the signaling molecules, say, cytoplasmic and nuclear regions. The subscript i (i = 0, 1, 2, …, M) indicates the dephosphorylation states. The C̄i and Ĉi (Ūi and Ûi) are the signaling protein–enzyme complex forms. The subscripts f, b, and c represent forward, backward, and catalyzed rates of each reaction. The k + i and k − i are the transport rates for transitions between Ci and Ui.
The rates of both phosphorylation (by kinases) and dephosphorlyation (by phosphatases) are naturally modeled as Michaelis–Menten reactions with rates that depend on the availability of enzymes. Although dephosphorylations dominantly occur in the cytoplasm and phosphorylations dominantly occur in the nucleus in the case of NFAT signaling (9–11), in our model we allow them both to occur in either environment with different rates. The protein phosphorylation state by affecting the conformation of that protein determines how easily it is translocated into or out of the nucleus. In our model k + i represents the reaction rate from Ci to Ui and k − i is the rate of going from Ui to Ci. An individual NFAT molecule thus makes random walks through its prosphorylation/location space according to these microscopic reaction rates.
The NFAT signaling network shares its topology with many other networks in the cell. It simplifies to a typical two-state system if there is only a single phosphorylation state; the network resembles an enzymatic futile cycle when there are two phosphorylation states but no compartmental transport (16); and the network is also similar to the Monod–Wyman–Changeux model (17). The reaction ladder network thus presents a paradigm for the interplay of spatial heterogeneity and posttranscriptional modification in the flow of biological information. The modification reactions, conformational changes and intercompartmental transports are intrinsically stochastic events. On the ladder network, each signaling molecule follows a path through the network, causing transcription to occur at random times. When the modifying enzymes are abundant, the network is effectively linear and each molecule walks through location/phophorylation space independently. When enzymes are limited in number, the network becomes nonlinear and the walks of different signaling molecules interact by competing for enzymes. Ultimately, it takes only one NFAT to turn on its target gene. Thus we can say, somewhat anthropomorphically, that the individual NFAT's are competing in a race to the DNA. We thus have a problem of determining the statistics of mean first passage for multiple walkers.
The statistical problem of calculating the mean first passage time (FPT) for random walkers has a distinguished history (18–22). Here we will find the first passage time and the survival probability in terms of a dynamic probability distribution. We will then show that the exact solution of this problem of multiple random walkers having the same goal does indeed agree with the results of Monte Carlo simulation of the network when enzymes are unlimited. The mean FPT distribution is found to be asymmetric and has a long tail. The solution also shows there is an optimal forward reaction rate yielding the most rapid arrival of a viable signaling protein to the target.
Distribution of FPTs and Survival Probability
Different phosphorylation and dephosphorylation processes are catalyzed by specific enzymes (23, 24). For simplicity of treatment, we assume a universal kinase and a universal phosphatase, Kc and Fc, in the cytoplasm and another set, Ku and Fu, in the nucleus. Also for simplicity we assume only apo proteins can be transported between the cytoplasm and the nucleus. The enzymes (Kc, Fc, Ku, and Fu) as well as the signaling protein phosphatase complexes (C̄i or Ūi) and the signaling protein kinase complexes (Ĉi or Ûi) cannot be transported.
Suppose that there are ci, c̄i, and ĉi proteins in the Ci, C̄i, and Ĉi states and ui, ūi, and ûi proteins in the Ui, Ūi, and Ûi states respectively (i = 0, 1, 2, …, M). Then, the numbers of proteins in each state is described by a 6M + 6 dimensional vector n⃗ ≡ (ĉ 0, c 0, c̄ 0, û 0, u 0, ū 0; …; ĉM, cM, ūM, ûM, uM, ūM), where ĉ 0, ū 0, û M, and c̄ M are zeros due to the boundaries of the network. The numbers of the enzymes in the system are defined by a four dimensional vector E⃗ = (Fc, Kc, Fu, Ku). We define a state |Ψ(t)〉 as |Ψ(t)〉 ≡ Σn⃗,E⃗ P(n⃗, E⃗, t)|n⃗, E⃗〉, where P(n⃗, E⃗, t) is the probability having n⃗ and E⃗ numbers of proteins and enzymes in the network. The master equation describing the network can then be written as ∂/∂t|Ψ(t)〉 = W|Ψ(t)〉, where W is the transition rate matrix whose dimension depends on the total numbers of enzymes and substrates.
Generally solving this master equation represents a challenging many-body problem. However, when the numbers of enzymes in the cytoplasm and the nucleus are very large compared with the total number of signaling proteins, as often happens in real biological systems, the phosphorylation and dephosphorylation processes that lead to the transitions of the signaling molecules are uncorrelated. Each protein can then be modeled as an independent random walker. Our assumption of an enzyme-saturated situation makes the mathematics of the network relative simple and the problem of multiple but independent random walkers can be solved exactly. This exact solution allows several interesting properties of the network to be explored.
A key aspect characterizing signaling pathways is the time to achieve a response after receiving an upstream signal, i.e.,
the typical delay time between a stimulus and the corresponding response. This is a stochastic quantity. The response occurs
when one of the random walkers successfully binds to the DNA. To quantify this, we may consider the FPT for a random walker
starting from the initial position r⃗i, arriving at the final position r⃗f for the first time. F(r⃗i, r⃗f, t) is the probability distribution of such a random walker, initially in r⃗i, whose FPT of reaching the final position r⃗f is time t. F(r⃗i, r⃗f, t) is related to the occupancy probability P(r⃗i, r⃗f, t), which is the probability that a particle is found at the position r⃗f at time t irrespective of when it arrived. This relation is
Both the FPT probability F(r⃗i, r⃗f, τ) and occupancy probability P(r⃗f, r⃗f, t) are normalized through ∫ dtF(r⃗i, r⃗f, t) = 1 and ∫ dr⃗fP(r⃗i, r⃗f, t) = 1. P
(s)(r⃗, r⃗, t) is the occupancy probability with the identical initial and final position, i.e., the chance of a particle staying at and
returning to the same position r⃗ after time t. In terms of Laplace transforms, the above equation can then be rewritten as
The survival probability is the probability that, up to time t, the random walker still has never reached the target position r⃗f, which is represented as S(r⃗i, r⃗f, t). By the definition, the survival probability S is
The above FPT probability and survival probability are formulated for a single particle; however, it is straightforward to
expand this to the multiparticle case if there is no interaction between random walkers. In the case of large number of enzymes,
multiple particles move independently and thus the probability for having all N particles can be obtained by multiplying the survival probabilities for each single particle, i.e.
where F(r⃗i,r⃗f, t; 1) and S(r⃗i,r⃗f, t; 1) are the FPT and the survival probability for a single particle. The probability of having exactly z of total N particles in the position r⃗f at time t irrespective of their arrivals is
One defines the accumulated FPT probability Fac(r⃗i, r⃗f, t; N, z) as the probability that at time t
z of the total N particles have all arrived at the destination for the first time by time t. The expression is
We may also define the simultaneous FPT probability Fsi(r⃗i, r⃗f, t; N, z), which is the probability that at time t
z of the total N particles simultaneously arrived at the destination for the first time. The corresponding expression is the same as that
for single particle case, i.e.,
Results
It is easy to assign different rates to each step and carry out the calculations. For simplicity, we will first assume uniform forward reaction rates αf as well as uniform backward rates αb and catalyzed rates αc for all phosphorylation and dephosphorylation events. We also assume that the transportation rates k + i increase evenly and k − i decrease evenly with the increase of the number of unphosphorylated sites i, i.e., k + i = k + M(γ+)i−M, k − i = k − 0(γ−)−i (i = 0, 1, 2, …, M). This assumption captures the empirical observation that fully dephospharylated NFAT is much easier to transport from the cytoplasm into the nucleus than phospharylated NFAT. In the nucleus, the fully phospharylated NFAT is most easily transported to the cytoplasm (23).
Comparison of the Exact Solution with Simulation.
Fig. 2 illustrates three trajectories taken from a Monte Carlo simulation of a signaling protein traveling from the initial fully phospharylated state C 0 in the cytoplasm to the fully dephosphorylated state UM in the nucleus (25, 26). The red diamonds in Fig. 2 indicate that the protein is found in the cytoplasm regardless of its specific form (Ci, C̄i, or Ĉi), whereas the green dots indicate that the protein is found in the nucleus regardless of its specific form (Ui, Ūi, or Ûi). Transitions between red and green sites indicate a transversal across the cytoplasm–nucleus barrier, whereas up and down transitions indicate phosphorylation and dephosphorylation events.
Three typical trajectories of a random walker traveling from C 0 to UM are plotted as time vs. site number, i.e., the state of dephospharylation. The red diamonds label a protein in the cytoplasmic form (Ci, C̄i or Ĉi), whereas the green dots label the nuclear form (Ui, Ūi, or Ûi). The parameters are chosen as follows: the phosphorylation site number M = 5, the forward reaction rates α′f = βf = β′f = αf = 0.2, the backward rates α′b = βb = β′b = αb = 1.0, the catalyzed rates α′c = βc = β′c = αc = 1.0, the transport rates k + M = k− 0 = 0.2, and the ratio of transport rates γ+ = γ− = 2.718.
In Fig. 3, we compare the mean FPT probability and survival probability from the exact solution with those from the Monte Carlo simulations. Fig. 3 Left shows several distributions of the FPT probability computed from the exact solutions (solid lines) and those computed from the stochastic simulations (broken lines) with the same parameters; Fig. 3 Right shows the corresponding survival probabilities from the exact solutions (red broken lines) and those from simulations (crosses). These simulation results agree very well with the exact solutions.
Comparisons of the probabilities of the FPT and survival probabilities. (Left) Probability vs. the FPT. (Right) The survival probability vs. time. Parameters are the same as those in Fig. 2.
An exact solution can be found not only for the steady state distribution but also for the dynamics away from the steady states. Table 1 shows the occupancy probability distribution at time t = 50 s when the network is far from the steady state. The difference between the exact solution and simulation is ≈2%, which is essentially the sampling error. Fig. 4 further explores the network dynamics. For a network with size M = 5, the walker initially resides in the C 0 state (t = 0 s) but propagates to other states by time t = 10 s and t = 50 s. Eventually the probability reaches a steady profile. The last time shown, t = 1,500 s, is much later than the mean FPT, 180 s.
Probability distribution of a random walker at time t = 50
Dynamic propagation of the probability distribution. In the site axis, the C(i) and U(i) (i = 0, 1, …, 5), defined as C(i) = Ci + C̄i + Ĉi and U(i) = Ui + Ūi + Ûi, are collected terms representing the probabilities of a protein at the i phosphorylation state in the C and U group, respectively. Four time slices t = 0, 10, 50, and 1,500 are chosen to show the dynamic propagation. The parameters are chosen the same as those in Fig. 2.
We can also compare the model's predications with laboratory experiments carried by Dolmetsch et al. (27). These experiments measure differential NFAT activation as a function of the amplitude and duration of a calcium stimulus. Their work uncovered three different response patterns of the nuclear fraction of total NFATs that result from different stimulus (spike followed by plateau, a single spike and a low-level plateau). In Fig. 5, stimuli similar to experimentally used inputs (Fig. 5 a) are entrained to our model and are also shown to result in three response patterns (Fig. 5 b). The predicted patterns agree well with those seen in the experimental studies (Fig. 5 b Inset).
Comparisons of experiments with the model. (a) The concentration of phosphatases in cytoplasm is as the input with three different patterns: spike followed by plateau (solid line), a single spike (dotted line), and a low-level plateau (broken line), which are the same as experimental stimuli (27). (b) Fractions of proteins in nuclear forms predicted from the model and those from experiments. Different lines correspond to different stimuli. The curves in the inset are experimental results from Dolmetsch et al. (27). Parameters are chosen as: the phosphorylation site number M = 5, the forward reaction rates βf = 1.0, α′f = β′f = 0.1, the backward rates αb = βb = 1.0, α′b = β′b = 0.1, the catalyzed rates αc = βc = 10.0, α′c = β′c = 10.0, the transport rates k + M = 0.6, k − 0 = 0.1, and the ratio of transport rates γ+ = γ− = 2.718.
An Optimal Forward Reaction Rate Favors the Passage.
Many studies have highlighted the efficiency, sensitivity, and robustness of signal transduction networks (1, 4, 28). In this regard, the ladder network exhibits an interesting property, the existence of an optimal value for the forward reaction rates. Fig. 6 shows the mean, the most probable, and the root-mean-square of the FPT of a signaling protein from the C 0 state to the U M. Clearly, there is an optimal forward reaction rate for the passage: The optimum occurs at αf ≃ 1 for the mean FPT, but the optimal values of αf for the most probable and the root-mean-square passage times are ≈2.
An optimal forward reaction rate αf favors efficient passage. The horizontal axis is the forward reaction rate swept from 0.05 to 50. The three curves represent the mean FPT, the root mean square FPT, and the most probable FPT according to the forward rate αf. Other parameters are fixed as the phosphorylation site number M = 5, the forward reaction rates α′f = βf = β′f = αf, the backward rates α′b = βb = β′b = αb = 1.0, the catalyzed rates α′c = βc = β′c = αc = 1.0, the transport rates k + M = k − 0 = 0.2, and the ratio of transport rates γ+ = γ− = 2.718.
The existence of this optimum may seem to conflict with the intuition that the higher enzyme concentration is, the shorter the passage time is. When the forward reaction rates are very slow, the forward reactions do indeed constitute a bottleneck. However, with increasing forward reaction rates, the walkers will be found more and more often in the signaling protein–enzyme complex forms; this helps signaling proteins move toward dephosphorylated states. Eventually, when the forward reaction rate is too large compared with the transportation rates k + and k −, the signaling proteins will then spend most of their time in the transport incompetent complex forms (C̄i, Ĉi, Ūi, or Ûi) rather than in the apo forms required for transport. This ultimately leads to slower transport between the cytoplasm and the nucleus.
This intriguing phenomenon could also be qualitatively explained as follows. The passage time of a signaling protein from
the fully phosphorylated state in the cytoplasm (C
0) to the fully dephosphorylated state in the nuclear (UM) consists of two parts: the time for dephosphorylation (Td) and the time for transport (Tt). The typical time from site A to its neighbor B can be estimated by adding the inverse rates using the “one-step-forward”
approximation. Assuming there are total l neighbors Ci, i = 1, …, l (including B = Cj) that A can directly jump to with rate ki, the typical time is ((kj/Σki) × kj)−1. For each dephosphorylation step from state i to i + 1, this estimate yields a time (2αf + k
+
i)/αf
2 + (αb+αc)/αc
2. The total dephosphorylation time is Td = M × [(2αf + k̄
+)/αf
2 + (αb+αc)/αc
2], where k̄
+ is the average of k
+
i values. The transport time from Ci to Ui can be approximated as (2αf + k
+
i)/k
+i
2 based on similar approximations. Because the transport could happen in any Ci state, the mean transportation time requires averaging over all i, which results in a time Tt = (2αf + k̄
+)/k̄
+
2. The total mean passage time can therefore be approximated as
The existence of an optimal forward reaction rate requires the existence of a solution to the equation ∂T/∂αf = 0 has solution(s). As a result, we must have the equation αf
3 − Mk̄
+
2αf − Mk̄
+
3 = 0. With the parameters shown in Fig. 6, the optimal forward rate is estimated to be 0.49, which is comparable with the exact solution (≃1) as shown in Fig. 6.
Asymmetry of the Reaction Network Affects the Passage.
Although the ladder network is symmetrical in topology, the reaction rates in the cytoplasm and nucleus are quite different owing to different availabilities of appropriate enzymes. To illustrate the effect of the network's asymmetry on signal transduction, we employ two parameters to probe this: the phosphorylation asymmetry parameter and the transport asymmetry parameter. The phosphorylation asymmetry parameter is defined as θ = a′f/af = a′b/ab = a′c/ac = b′f/bf = b′b/bb = b′c/bc, where θ ≤ 1, which characterizes the preference for undergoing dephosphorylation compared with the phosphorylation processes in the different environments. The other parameter, transport asymmetry parameter φ, is defined as φ = af/bf = ab/bb = ac/bc = a′f/b′f = a′b/b′b = a′c/b′c, which characterizes the relative activities of reactions in the cytoplasm compared with the corresponding reactions in the nucleus.
Fig. 7 illustrates the effects of these asymmetries on the network behavior. Fig. 7 Left shows that increasing the phosphorylation asymmetry slows down signaling; Fig. 7 Right shows that increasing the transport asymmetry speeds up signaling.
Effect of the asymmetry of the reaction networks to the passage time. (Left) A plot of characteristic times as functions of the parallel asymmetry parameter θ. The other parameters are fixed at the phosphorylation site number M = 5, the forward reaction rates αf = βf = 0.2, the backward rates αb = βb = 1.0, the catalyzed rates αc = βc = 1.0, the transport rates k + M = k − 0 = 0.2, and the ratio of transport rates γ+ = γ− = 2.718. (Right) The plot of characteristic times as functions of rung asymmetry parameter φ. The other parameters are fixed at M = 5, α′f = β′f = 0.2, α′b = β′b = 1.0, α′c = β′c = 1.0, k + M = k − 0 = 0.2, γ+ = γ− = 2.718.
The FPT Distribution Has a Long Tail.
The probability distribution of the FPT has a long tail due to the network's hierarchical structure. In the limits of either large or small forward reaction rates, the probability distribution is very flat and the tail can be extremely long.
With the long-tail distribution, the mean FPT can be greatly different from the most probable FPT. In Fig. 3, the most probable FPTs for αf = 0.5, 0.2, 0.1 are 40, 60, and 100 s, respectively. However, the ratio of the mean FPT and the most probable FPT ranges from on the order of 1 to the order of 3 with the chosen set of parameters. Fig. 6 shows that the ratio of the most probable FPT and mean FPT is large even on a logarithm scale.
Conclusions and Discussions
In this paper, we studied a general signal transduction network-reaction ladder network that models multiple-site phosphorylation and cytonuclear transport. As often happens for real networks, the enzymes are assumed to be abundant, and so each signaling protein independently wanders through its various states of phosphorylation and location in compartments. Except for the last binding step, the network is effectively linear and therefore can be solved exactly. This exact solution is confirmed by using the Monte Carlo simulation. Even this simple network exhibits several interesting stochastic features. It exhibits a long tail of the probability distribution for the signaling time and there is an optimal forward reaction rate that favors speedy signaling. This optimum suggests there may be an optimal amounts of enzymes for efficient signal transduction.
When available enzymes are not abundant, the random walkers on the network become correlated because they must share the limited resources of available enzymes. Nonlinearity then starts to play an important role throughout the process not just in the acquiring last step of the final target. In this situation when an enzyme is bound, several phosphorylation or dephosphorylation events will occur before unbinding, similar to the processive aspect of transcription (29). It will be interesting to study such scenarios and to study the effects of oscillatory upstream signaling molecules on the network. These will help us further understanding the phenomenology and quantitative design criteria of effective signal transduction mechanisms.
Acknowledgments
This work is supported by the Center for Theoretical Biological Physics through National Science Foundation Grants PHY0216576 and PHY0225630 and National Institutes of Health Grant GM69811-01.
Footnotes
- ‖To whom correspondence should be addressed. E-mail: pwolynes{at}ucsd.edu
-
Author contributions: T.L., T.S., J.H., and P.G.W. designed research; T.L., T.S., and C.Z. performed research; T.L., T.S., and C.Z. analyzed data; and T.L., T.S., C.Z., J.H., and P.G.W. wrote the paper.
-
The authors declare no conflict of interest.
- Abbreviations:
- NFAT,
- nuclear factor activation of T cells;
- NLS,
- nuclear localization sequence;
- FPT,
- first passage time
- © 2006 by The National Academy of Sciences of the USA











