New Research In
Physical Sciences
Social Sciences
Featured Portals
Articles by Topic
Biological Sciences
Featured Portals
Articles by Topic
 Agricultural Sciences
 Anthropology
 Applied Biological Sciences
 Biochemistry
 Biophysics and Computational Biology
 Cell Biology
 Developmental Biology
 Ecology
 Environmental Sciences
 Evolution
 Genetics
 Immunology and Inflammation
 Medical Sciences
 Microbiology
 Neuroscience
 Pharmacology
 Physiology
 Plant Biology
 Population Biology
 Psychological and Cognitive Sciences
 Sustainability Science
 Systems Biology
Using independent opentoclosed transitions to simplify aggregated Markov models of ion channel gating kinetics

Edited by Harry L. Swinney, University of Texas, Austin, TX, and approved March 17, 2005 (received for review December 8, 2004)
This article has a correction. Please see:
Abstract
Deducing plausible reaction schemes from singlechannel current traces is timeconsuming and difficult. The goal is to find the simplest scheme that fits the data, but there are many ways to connect even a small number of states (>2 million schemes with four open and four closed states). Many schemes make identical predictions. An exhaustive search over model space does not address the many equivalent schemes that will result. We have found a canonical form that can express all reaction schemes for binary channels. This form has the minimal number of rate constants for any rank (number of independent openclosed transitions), unlike other canonical forms such as the well established “uncoupled” scheme. Because all of the interconductance transitions in the new form are independent, we refer to it as the manifest interconductance rank (MIR) form. In the case of four open and four closed states, there are four MIR form schemes, corresponding to ranks 14. For many models proposed in the literature for specific ion channels, the equivalent MIR form has dramatically fewer links than the uncoupled form. By using the MIR form we prove that all rank 1 topologies with a given number of open and closed states make identical predictions in steady state, thus narrowing the search space for simple models. Moreover, we prove that fitting to canonical form preserves detailed balance. We also propose an efficient hierarchical algorithm for searching for the simplest possible model consistent with a given data set.
Ion channels participate in many cell signaling pathways. They allow spatial and temporal control of calcium ions that regulate processes as diverse as fertilization, memory, muscle contraction (1), and immune response (2, 3). Abnormal channel function is implicated in pathologies such as cystic fibrosis (4), ataxia (5), and overresponse to anesthesia (6).
Patchclamp recordings (7) reveal that ion channels open and close spontaneously and that the kinetics can be complex. This complexity can be ascribed to multiple open and closed states, corresponding to different channel conformations. Changes in conformation may occur because of the intrinsic dynamics of the protein or in response to external events such as ligand binding. The pathways and transition rates between the various conformation states constitute an intrinsic component of the cell's signalprocessing apparatus. Therefore, the study of gating kinetics can shed light on both normal and abnormal cell signaling.
To analyze a given channel, discrete Markov models (816) are fit to patchclamp recordings of its current. However, it is known that different models can make identical predictions of steadystate gating statistics (17, 18), so that attempts to find the bestfitting model can uncover many models that all fit equally well. We introduce a class of models that avoids this problem; when restricting to these models, the mapping between model parameters and gating statistics is onetoone. These models allow for a physical interpretation of the interconductance “rank,” a quantity that can be extracted directly from the data, and that corresponds to the number of independent routes between open and closed states. Furthermore, these models allow optimal fitting of data with a minimum number of parameters.
A Markov model for the dynamics of a channel consists of a reaction scheme or topology of allowed transitions between states, together with rate constants for these transitions. For the typical case of binary channels studied here, all open states have the same current, so the states can each be labeled by “O” for open or “C” for closed. The indistinguishability of the closed (or open) states from one another causes ambiguity in fitting models to data. Kienker (17) showed that the two topologies CCO and COC are “equivalent”: for every set of rates for one there is a set of rates for the other such that both have identical steadystate behavior.
The space of topologies grows rapidly with the number of states. Our goal is to provide tools and strategies for navigating this space, allowing researchers to focus on a small, tractable set. The fitting of rate constants then can be carried out with existing ion channel analysis software, such as qub (19), that is capable of handling any desired topology.
Methods
Here, we discuss the established concepts used in this work. Beyond these concepts, we allude to the mathematical techniques used after each result. Detailed proofs are contained in the supporting information, which is published on the PNAS web site. These proofs mainly depend on linear algebra, especially Kienker transformations (17), which are briefly reviewed there as well.
Most of the results presented here have exceptions which are, literally, infinitely uncommon. These exceptions happen when certain relationships among the rate constants cause degeneracies, such as two exactly identical decay rates in the opentime distribution. In the statements of the theorems below, those exceptions are indicated by the mathematical terms “almost all” or “almost every,” meaning “with infinitely rare exceptions.” Throughout, the term “topology” means the reaction scheme, usually displayed in a state transition diagram, without regard to reaction rates. “Model” means a Markov model with a particular topology and a particular set of reaction rates.
Equivalence and Canonical Forms. There are two related notions of equivalence: equivalent models and equivalent topologies. “Equivalent models” have identical steadystate dwelltime distributions. “Equivalent topologies” are capable of modeling the same range of steadystate behaviors, and therefore almost every model with one topology is equivalent to some model with the other (as in Definition 1 in the supporting information). Equivalent topologies will have the same number N_{o} of open states and the same number N_{c} of closed states, and in fact they must have the same number of “independent” OC links (defined below). Beyond these constraints, it has so far been nontrivial to say whether two topologies are equivalent. Steadystate experiments can never distinguish among equivalent topologies.
Any collection of equivalent models may be viewed as ways in which different dynamics can generate the same steadystate statistics. Any prescription for expressing any set of equivalent entities in a unique way is called a “canonical form.” Representing a model in canonical form means finding an equivalent model with canonical topology, which should always be feasible for a good canonical form. The choice of canonical forms has previously been limited to just two, as discussed below, for which it was known how to find canonical models equivalent to almost any model.
Equivalent models are equivalent to a single model in canonical form, and this fact can be used to test whether two models are equivalent. Equivalent models will fit data equally well, regardless of the fitting technique (e.g., maximum likelihood or least squares). To obtain the best possible fit to steadystate data, it suffices to fit the data to models in canonical topologies, because models in other topologies are always equivalent to canonical models.
Known Canonical Forms. We will compare the canonical form introduced below to the canonical form described by Bauer et al. (12) and Kienker (17), which we refer to as BauerKienker uncoupled (BKU) form. In BKU form, every open state is connected to every closed state and vice versa, but there are no opentoopen or closedtoclosed connections. This form has 2N_{o}N_{c} rate constants.
Physical models must have positive reaction rates. Negative reaction rates can lead to negative probabilities for individual states and consequently are difficult to interpret. Kienker noted (17) that models equivalent to a physical model can have negative rates. For example, it can be shown that any fourstate model in BKU form must have negative rates to be equivalent to any physical model with topology COOC (see COOC Has Negative Rates in BKU Form in the supporting information). This result implies that negative rates should be allowed when fitting data to a canonical form. If the resulting canonical model is nonphysical, it can still be viewed as a parsimonious representation of the data using a generalized Markov process (20). Below, we show that negative rates can sometimes be avoided.
If the principle of detailed balance (DB) is not obeyed, BKU form models may require rates that are complex numbers to achieve the best fit. A generally applicable canonical form given by Larget (18) is similar to BKU form. Both have the same number of parameters, and both minimize the number of intraconductance reactions. In Larget's form, complex numbers are never required, unlike in BKU form. Larget's form always has negative rates, whereas the complex rates in BKU form can be removed by a slight change in the form that adds no more parameters (see Real and Complex Rates in the supporting information).
DwellTime Distributions. The dwelltime distribution f_{o} (t_{o} ) is the probability density that the time interval for which the channel is open has duration t_{o} . Likewise, f_{oc} (t_{o}, t_{c} ) is the probability density consecutive open and closed intervals have durations t_{o} and t_{c} . Colquhoun et al. (8, 9) derived expressions for these distributions in terms of the “generator matrix,” Q_{ij} , which is the matrix of transition rates between the various states. Fredkin et al. (10, 11) showed that f_{oc} and f_{co} are sufficient to fully characterize the steadystate data, under certain technical conditions that are almost always satisfied. All of the dwelltime distributions are sums of decaying exponentials. For example,
where λ _{i} and ω _{j} are the open and closed decay constants, respectively, and α _{ij} is a matrix of amplitudes.
Identifiability. If different models make identical predictions about a particular class of experiments, the models are not “identifiable” with respect to those experiments. A topology is identifiable provided any changes to its rate constants result in changes to the steadystate dwelltime distributions. Conversely, topologies that have more parameters than there are degrees of freedom in the data are not identifiable. In this case, there is a continuum of rate constants for the topology that all yield identical statistics, and thus values for these rates cannot be uniquely determined from steadystate data, regardless of the quantity or quality of data. A trivial example of a nonidentifiable topology is one with two closed states and no open states. Obviously, patchclamp current traces could not be used to infer the transition rates between the two states.
It has long been known (10) that there are at most 2N_{o}N_{c} degrees of freedom in steadystate data. Therefore, topologies with more than N_{o}N_{c} links are not identifiable. The bound on the number of degrees of freedom can be improved (11) to 2R(N_{o} + N_{c}  R), where R is the interconductance rank defined below. Topologies with more than this many rates are not identifiable; topologies with fewer rates may or may not be identifiable. The problem of determining which topologies are identifiable remains unsolved (10).
Interconductance Rank. The canonical form that we introduce in this work makes explicit use of R, the interconductance rank. Before giving the physical interpretation of rank, we review the mathematical definition. The rank of a model is defined as the matrix rank (number of independent rows or columns) of the matrix Q_{oc} , which is the block of Q_{ij}
representing open to closed transitions. We assume the ranks of Q_{oc} and Q_{co} are equal, which is generally true for reversible reactions. These ranks also equal the matrix rank of α _{ij} , which can be estimated directly from the data without reference to a model (11). Therefore, equivalent models must have the same interconductance rank. The rank of a topology is defined as the maximal rank of any model with that topology, and this rank equals the number of “independent” OC links, where different independent links do not link to the same state; this equality is proved in Rank and Independent Reactions in the supporting information.
If a model is fit to data, the rank of the resulting model will equal either the rank of the data (α _{ij} ) or the rank of the topology of the model, whichever is smaller. For a model to have plausibly generated the data, the two ranks should be equal. If the topology used for the fitted model has a larger rank, then there will be linear dependencies among the rates that cannot be justified biochemically, unless there is a symmetry that can explain them.
The ranks of a few illustrative topologies are shown in Fig. 1. The rank of the topology in Fig. 1a is 1. The topologies shown in Fig. 1 bd all have rank 2. Note that in Fig. 1d there are three Os connected to Cs and three Cs connected to Os, but the rank of the topology is only 2. Fredkin and Rice (11) noted that the lesser of the number of Os connected to Cs and the number of Cs connected to Os is an upper bound on a topology's rank. Fig. 1d shows that this bound is not tight; note that no three of the OC links are independent. Incidentally, this example is a subtopology (or special case) of BKU form, because it has no OO or CC links.
It can be shown (Theorem 8 in the supporting information) that when rank 1 data are fit to BKU form, the resulting number of nonzero rate parameters is 2 N_{o}N_{c} , far more than the 2(N_{o} + N_{c}  1) necessary when R = 1. In this case the rank of the topology is the lesser of N_{o} and N_{c} , whereas the rank of the model is 1. If N_{o} and N_{c} are both >1, then it is generally not plausible that rank 1 data were generated by a BKU form model; the model may fit the data, but there is no explanation for the linear dependencies of the rates. Thus, BKU form often obscures the rank, uses more links than necessary, and usually does not lead to plausible models.
Results
Manifest Interconductance Rank (MIR) Form. MIR form is defined by the following requirements: (i) all interconductance links are independent, and (ii) all intraconductance links involve at least one state with an interconductance link. The MIR form is illustrated in Figs. 2b and 3b . Fig. 2b shows how states can be arranged into four columns. The two central columns contain the same number of states, with Cs in one and Os in the other. These states are linked in horizontal pairs to form the R interconductance OC links. The remaining Cs and Os are in the first and last columns, respectively. Every C in the second column is connected to every other C, and Cs in the first column are not connected to each other. Connections for the Os follows the same pattern. The total number of reactions in MIR form is R(N_{o} + N_{c}  R), which gives the Fredkin bound of 2R(N_{o} +N_{c}  R) independent rate constants that can be recovered from steadystate data. The MIR form can be constructed for any rank up to the maximum possible rank (the lesser of N_{o} and N_{c} ). Because all interconductance links are independent, the rank of the topology equals the number of OC links. The maximumrank MIR form has all lower rank MIR forms as subtopologies, making the fullrank MIR form as powerful as BKU form. A comparison of the two forms expressed as matrices of rate parameters is shown schematically in Fig. 3, in a case where the rank is not full.
There are two steps required to show that MIR form is canonical. The first step is to show that for almost any model, there is an equivalent model in canonical form and to give a prescription for how to find that model.
Theorem. Almost every model can be expressed in MIR form.
Mathematically, the MIR form corresponds to “diagonalizing” the interconductance transition matrices Q_{oc} and Q_{co} . This transformation is almost always possible. The nondegeneracy conditions that comprise the exceptions are comparable to ones required for BKU form (see Manifest Interconductance Rank Form in the supporting information). The required diagonalization is performed by first diagonalizing the products Q_{co}Q_{oc} and Q_{oc}Q_{co} . The mathematics of this procedure have been exploited previously in other fields such as canonical correlation analysis (21) and partial leastsquares (22).
The topologies in Fig. 2 both have N_{o} = 5, N_{c} = 6. The rank of the MIR form topology in Fig. 2b is given by R = 3. The rank of the BKU form topology is fixed by R = min(N_{o}, N_{c} ), which is 5 in this case. In BKU form (Fig. 2a ), there are 30 reversible transitions, whereas in MIR form (Fig. 2b ), there are only 24. If one were to fit data generated by a rank3 topology to BKU form, one could recover as many as 60 reaction rates (30 reactions), but these parameters would not all be independent. If one were to fit the same data to MIR form, one would recover at most 48 parameters (24 reactions), and these parameters would almost always be independent.
The second part of showing that MIR form is canonical is proving that its topology is identifiable. If this condition were not true, then expression of a model in MIR form would not be unique. We state our result as follows.
Theorem. MIR and BKU forms are identifiable for almost all parameters.
Proofs for the two forms are presented as Theorems 1 and 2 in the supporting information. Both are related to the fact that there is only one normalized similarity transformation, neglecting permutations, that can diagonalize a nondegenerate matrix.
Although generically MIR form has fewer links than BKU form, there are special topologies that have fewer links when expressed in BKU form. One example is the topology in Fig. 1d . This topology can be constructed by removing four links from the general BKU form, leaving the five links shown. Equivalent models in MIR form have eight links. Conversely, starting from MIR form and removing links results in topologies requiring more links when expressed in BKU form, regardless of rank. Thus, if models in nature tend to resemble simplified special cases of one form or the other, fitting to that form may result in simpler models, as the following examples will show.
Many channel models in the literature conform or nearly conform to MIR form. In Table 1, we count the number of MIR and BKUcompatible reactions in four publications by different authors. Compatible reactions are those that are in the equivalent model in canonical form. A topology with all reactions compatible is already in the given canonical form. We shall see that when only a few reactions are not compatible, the canonical version of the model may have only a few more links than the original model.
Of the 20 models discussed in these articles, 7 are in MIR form, and none are in BKU form. Also, models in the literature tend to have substantially more MIRcompatible than BKUcompatible links. Fig. 4 illustrates two of the cases where only one link was not compatible with MIR form.
Fig. 4 shows both proposed topologies, together with the MIR and BKU form topologies that can model the same data. These topologies are found by the transformation techniques described in Manifest Interconductance Rank Form and Kienker Transformations in the supporting information, and they will give the same results for almost any rate constants. In Fig. 4a , the resulting MIR form requires only two more links than the original topology to fit the same data. BKU form requires an additional five links as shown. Similarly, Fig. 4b shows a proposed topology with 14 links, only 1 of which is not MIRform compatible. The MIR form topology has 15 links. The BKU form topology has 36 links.
In both examples in Fig. 4, when either canonical form is used, the number of links will be less than the R(N_{o} + N_{c}  R) bound, because both topologies are sparsely linked to begin with. However, using MIR form results in significantly fewer links because these models are topologically close to being simplified special cases of MIR form.
The tendency for models in the literature to resemble MIR form more than BKU form may indicate a tacit belief among researchers that nonindependent OC links are rare. This belief implies that usually, when the channel opens from a particular closed state, it arrives in a particular open state.
Canonical Forms and Detailed Balance. At thermodynamic equilibrium, chemical reaction kinetics are governed by the principle of microscopic reversibility, or DB. DB implies no net circulation around reaction cycles at steady state. One proposed mechanism for voltagegated ion channels explicitly violates DB and consumes free energy (23); note that at thermodynamic equilibrium, the membrane potential would be zero. Several authors have discussed tests for DB violations in ion channel data (2427). We have proven (see Theorem 5 in the supporting information) the following.
Theorem. If a model satisfies DB, then the equivalent models in both BKU and MIR form satisfy DB.
However, note that for any model satisfying DB, there are many models equivalent to it that violate DB. Our result implies that if violations of DB are required to fit data using canonical form, then there is no model obeying DB that can fit the data as well.
Rank 1 Topologies. Topologies with only one independent interconductance transition, implying a “gateway” state between open and closed states, are rank 1 topologies. As stated previously, negative rate constants may be required to find a canonical form model that is equivalent to some physical models. Rank 1 topologies do not suffer from this complication.
Theorem. Nonnegativity of rates is preserved when a rank 1 model satisfying DB is expressed in either MIR or BKU forms.
The proof (see Theorem 7 in the supporting information) relies on the special structure of these canonical forms in the rank 1 case. The proof is easy compared with the related result (28) that amplitudes α _{ij} in the dwelltime distribution f_{oc} (see Eq. 1) are positive if the open and closed dwell times are uncorrelated and DB is satisfied. Uncorrelated open and closed dwell times imply that f_{oc} (t_{o}, t_{c} ) = f_{o} (t_{o} ) f_{c} (t_{c} ), which is true for all rank 1 models (10). This fact is intuitive; if there is only a single gateway state linking the Os and Cs in a Markov model, there is no way to generate correlated open and closed times, because an individual Markov state has no memory. Note that the theorem does not apply to rank 2 topologies, such as the COOC example described earlier that does have negative rates in BKU form (but is already in MIR form).
The above result is not the only simplification that applies in the rank 1 case. We have proven for fixed N_{o} and N_{c} the following.
Theorem. All rank 1 topologies are equivalent.
This result is somewhat surprising. It does not hold for rank 2 (see None of COOC, OCCO, and COCO are equivalent to each other in the supporting information). The rank 1 equivalence proof uses induction on the number of open and closed states (see Theorem 6 in the supporting information). The proof also can be adapted to show that given a rank 1 model with nonnegative rates obeying DB, every rank 1 topology has an equivalent model with nonnegative rates. This means that the “all rank 1 topologies are equivalent” result holds when restricting to physical models if DB is satisfied. It also implies that the presence of a negative rate in a rank 1 model indicates DB violation in any equivalent physical model. This is another result that does not hold for rank 2.
The MIR rank 1 form has only N_{o} + N_{c}  1 bidirectional reactions as shown in Fig. 5. The BKU rank 1 form has N_{o}N_{c} bidirectional reactions, none of which can have a zero rate (see Theorem 8 in the supporting information). Thus, BKU form maximally obscures the rank; a rank 1 model will have a topology with the largest possible rank for given N_{o} and N_{c} .
Discussion
The four main results presented here reduce the amount of work needed to find simple models that can account for steadystate channel data. Fitting to MIR canonical form produces simpler models than in BKU form. If a fit to canonical form violates DB, then there is no equivalent model in which DB is satisfied. All rank 1 topologies are equivalent, so they all will fit steadystate data equally well. For rank 1, if DB is satisfied, fits to canonical form can be restricted to positive rates. Thus, in the rank 1 case, the statistical test of whether DB is violated is reduced to testing whether any rates are significantly negative. The rank 1 equivalence and positivity results explain reports in the literature that best fits of data to various rank 1 topologies all yield the same likelihood (16).
Canonical form reaction rates that yield the best fit to the data can be complex numbers if DB is violated. To ensure that the best fit has been obtained, either Larget's canonical form should be fit (real rates suffice) or MIR or BKU form can be fit with complex eigenvalues (see Real and Complex Rates in the supporting information). In any event, negative reaction rates are necessary to ensure the best fit, except in the rank 1 DB case.
Fredkin et al. (10) suggested a procedure for estimating rank from covariance functions of the data. MIR form allows rank to be estimated by using the statistical framework of nested models. Here, nested means that every MIR topology contains the lowerrank MIR topologies as special cases. For example, the rank 3 MIR topology in Fig. 2b can be converted to rank 2 by setting the rates for six bidirectional reactions to zero: the links connecting the bottommost C and O states to different columns.
MIR form has the fewest reactions that a canonical form can have. A form with fewer would have less parameters than Fredkin and Rice's calculation of the generic number of parameters in the data (11). In contrast, the well known BKU “uncoupled” form has the most reactions a canonical form can have. A form with more rates would not be identifiable (10).
We have seen that models in the literature are closer to MIR than to BKU form. If data are generated by a process with fewer than R(N_{o} + N_{c}  R) reactions, then fitting to a topology similar to the true topology can yield a model that also has fewer reactions. For example, the topology in Fig. 4b can be fit by using 15 reactions in MIR form but would require 36 reactions in BKU form.
The reason that MIRlike models are so much more prevalent in the literature than BKUlike models may be because different O and C states often represent different states of ligand binding that persist when the protein opens or closes. Such an O state is expected to link to only one C state and vice versa. In contrast, many O states will be linked to each other (as will C states), but there are no such links in BKU form.
Recent determination of atomic resolution structures of some ion channels have made possible molecular dynamics simulations of the opening or closing of these channels. These simulations are far too short (tens of nanoseconds) to compare with current traces, which describe events on the millisecond scale and longer. However, it could be possible to use numerical techniques to identify quasistable conformations of an ion channel that are candidates for the multiple open and closed states evidenced by patchclamp experiments. To the extent that some of the relevant coordinates are preserved when the channel opens and closes, MIR form should allow simplification of the model, beyond any guaranteed simplification owing to the rank, as was seen in Table 1 and Fig. 4. If a ligand can be bound with the channel open or closed, then the ligandbinding coordinate would constitute an example of such preservation.
Atomic structures also can inform the choice of N_{o} and N_{c} , based on the number of predicted ligandbinding sites, for example. Symmetries observed in the structure should also be reflected in the model, although the implications for the canonical forms are not clear. Molecular dynamics simulations also might shed light on the number of independent pathways between open and closed states, which should then be reflected in the rank R of the model.
Exhaustive Search
The brute force approach to exhaustive search is to search over all topologies. The optimization criterion should take into account the quality of the fit and the simplicity of the model; an established one that has been applied in this setting is the Bayesian Information Criterion (BIC) (15). The brute force approach has the advantage that only positive rates are needed to find the greatest likelihood. It has the disadvantage that there are >10 trillion topologies for the case of 6 open and 6 closed states.
A more sophisticated exhaustive strategy would be to search over only one model in every class of equivalent models. This strategy avoids the problem of sorting through many models with exactly equal likelihoods. This approach still leaves a very large search space and is currently infeasible because it requires enumeration of the equivalence classes, which is unsolved. Because this approach relies on equivalence, negative rates must be allowed to guarantee that the highest likelihood is found. Nonequilibrium dynamical experiments might in principle be used to help choose among multiple models in the bestfitting equivalence class.
This strategy is very effective for rank 1 topologies because there is only one equivalence class for fixed N_{o} and N_{c} . The identifiable rank 1 topologies all have the same number of links, N_{o} + N_{c}  1, and thus the same BIC score. Moreover, they all have positive rates if DB is satisfied. Thus, there is no point in fitting to more than one rank 1 topology.
Whatever the search strategy, there is no point in searching nonidentifiable topologies, because they will always fit the data equally well in a continuum of ways and no better than the special case topologies that are identifiable. In general, it is not known which are the identifiable topologies, but we do know they can have no more than R(N_{o} + N_{c}  R) reactions, where R is the rank of the topology. Some of these topologies can be shown, using methods to be discussed elsewhere, to be nonidentifiable and may be omitted. Beyond rank 1, the space of such topologies is still exponentially large and contains many equivalences.
Hierarchical Search
We propose an alternative strategy. When fitting to a topology that contains the best topology as a subtopology, this fact should be indicated by very small rate constants for the links that should be removed. This observation suggests the following hierarchical search method:

Ascertain the rank, R, by successive fits to MIR form for the various ranks. (N_{o} and N_{c} are assumed known from the dwelltime distributions.)

Fit to all topologies with R(N_{o} + N_{c}  R) links.

For fits that yield links with small rates, remove all combinations of those links and reevaluate the simplicity criterion (such as BIC) for the resulting topologies.
Again, for the sake of efficiency, one can omit remaining nonidentifiable topologies, to the extent they are known. We emphasize that the hierarchical search only helps to find simpler models; the fit will not improve beyond the MIR fits in step 1. In fact, we believe that the quality of all fits in step two will be identical, according to the following proposition.
Conjecture. All identifiable rank R topologies with R(N_{o} + N_{c}  R) links are equivalent.
The number of equations to be solved to put a canonical form model into such a topology equals the number of unknowns. Thus, the conjecture is true if these equations are all mutually consistent. Although the conjecture does not address which topologies are identifiable, the nonidentifiable topologies will have a lower, or at best equal, likelihood score because some of their parameters have no effect on the fit.
Example. COOC.
Here, we discuss the hierarchical search strategy and the conjecture for the case N_{o} = N_{c} = 2. There are four rank 2 topologies with four links, as shown in Fig. 6. These four topologies are identifiable, and each is equivalent to the others (see Identifiability and Equivalence of all 4link Topologies with N_{o} = N_{c} = R = 2 in the supporting information). Moreover, they contain (as subtopologies) all identifiable topologies of all ranks (R = 1 and R = 2) with N_{o} = N_{c} = 2. All topologies with more links, or with four links but lower rank, are not identifiable. Consider the hypothetical search in which the true topology is COOC. From the onedimensional dwelltime distributions, we ascertain that N_{o} = N_{c} = 2. We fit the MIR form rank 2 topology (Fig. 6b ) and the MIR form rank 1 topology, CCOO, and find that the quality of fit is significantly better for the R = 2 MIR form. By using a fitting algorithm that allows negative rates, we fit the remaining topologies in Fig. 6, obtaining the same likelihood for all. The topologies shown in Fig. 6 a and d yield unphysical models with negative rates (see COOC Has Negative Rates in BKU Form in the supporting information). (If we had used fitting algorithm with positivity constraints, the two topologies a and d would have significantly lower likelihood than the other fourlink topologies.) The MIR form topology (Fig. 6b ) has a near zero rate for the CC reaction. The topology shown in Fig. 6c has a near zero rate for the diagonal CO link. Thus, fits to both topologies Fig. 6 b and c indicate COOC as a candidate topology. We now fit to COOC to find that COOC has the best BIC score, because it has six reaction rates compared with eight.
All of the fourlink topologies have identical BIC scores because they are all equivalent. As we fit longer records, two things happen: the BIC score for the COOC reaction improves relative to the 4link candidates, and simultaneously the two physical candidates both become closer to COOC. Thus, we recover COOC.
The benefits of this approach increase as the number of states increases, although it too becomes impractical fairly quickly. For N_{o} = 2 and N_{c} = 3, there are 22 rank 2 topologies with six links, at least two of which are not identifiable. For N_{o} = N_{c} = 3, there are 116 rank 3 topologies with 9 links, and for N_{o} = N_{c} = 4 there are 47,677 rank 4 topologies with 16 links. In the latter case, there are >2.5 million connected topologies, so restricting to the 16link topologies reduces the computational burden by a factor of >50. Unfortunately, we estimate that the search space for the hierarchical strategy contains 200 million topologies for N_{o} = N_{c} = 5 and 10 trillion for N_{o} = N_{c} = 6. The hierarchical strategy makes it practical to search up to N_{o} = N_{c} = 4.
The first step is always to estimate N_{o} and N_{c} . We have shown that the rank R is also of fundamental importance. All three quantities can in principle be extracted from current traces without reference to any specific topology. Other biochemical data also might help to constrain these values. MIR form allows one to make use of the rank and is closer than BKU form to most proposed topologies in the literature we examined. The number of topologies very close to MIR form, which will have few or no nonindependent OC links, is manageable. An effective strategy thus might be to start with MIR form and greedily move one link at a time, keeping the rank fixed.
The results presented here can be extended. In their current form they provide a number of useful tools for the identification of binary channel gating kinetics.
Note Added in Proof. It has come to our attention that Flomenbom et al. (32) give additional arguments and discussion relevant to the rank 1 equivalence theorem presented here.
Acknowledgments
J.E.P. thanks Andy Fraser for many useful conversations. We also thank Silvina Ponce Dawson, Bernardo Pando, Marty Golubitsky, Daniel DonOn Mak, and Werner Horsthemke for useful discussions. David Torney calculated the numbers of topologies mentioned in the discussion. We benefited from the qub course at State University of New York at Buffalo and from conversations with Fred Sachs, Feng Qin, and Anthony Auerbach and their students. We thank Frauke Rininsland and Catherine Macken for suggestions on the manuscript. This work was supported by National Institutes of Health Grant 1R01GM6583001 and Los Alamos National Laboratory LaboratoryDirected Research and Development Grant X1E8 under Department of Energy Contract W7405ENG36.
Footnotes

↵ † To whom correspondence should be addressed at: MS K710, Los Alamos National Laboratory, Los Alamos, NM 87545. Email: pearson{at}lanl.gov.

↵ * W.J.B., J.Y., and J.E.P. made significant contributions to this work.

This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: BIC, Bayesian Information Criterion; BKU, BauerKienker uncoupled; DB, detailed balance; MIR, manifest interconductance rank.
References
 ↵
 ↵
 ↵
 ↵
 ↵
 ↵
 ↵
 ↵
 ↵

↵
Fredkin, D. R., Montal, M. & Rice, J. A. (1985) Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, eds. Le Cam, L. M. & Olshen, R. A. (Wadsworth, Belmont, CA), pp. 269289.
 ↵
 ↵

Grosman, C., Salamone, F. N., Sine, S. M. & Auerbach, A. (2000) J. Gen. Physiol. 116 , 327339. pmid:10962011

↵
Rosales, R. A., Fill, M. & Escobar, A. L. (2004) J. Gen. Physiol. 123 , 533553. pmid:15111644
 ↵
 ↵
 ↵
 ↵

↵
Balasubramanian, V. (1993) Equivalence and Reduction of Hidden Markov Models (MIT, Cambridge, MA), Technical Report AITR1370.
 ↵

↵
Wold, H. (1985) Partial Least Squares, eds. Kotz, S. & Johnson, N. L. (Wiley, New York), Vol. 6, pp. 581591.
 ↵
 ↵
 ↵
 ↵

Rothberg, B. S. & Magleby, K. L. (1999) J. Gen. Physiol. 114 , 93124. pmid:10398695
 ↵
 ↵

↵
Flomenbom, O., Klafter, J. & Szabo, A. (2005) Biophys. J., in press.
Citation Manager Formats
More Articles of This Classification
Physical Sciences
Applied Mathematics
Biological Sciences
Related Content
Cited by...
 Estimating kinetic mechanisms with prior knowledge II: Behavioral constraints and numerical tests
 Estimating kinetic mechanisms with prior knowledge I: Linear parameter constraints
 Modelling modal gating of ion channels with hierarchical Markov models
 Determination of parameter identifiability in nonlinear biophysical models: A Bayesian approach
 A datadriven model of a modal gated ion channel: The inositol 1,4,5trisphosphate receptor in insect Sf9 cells
 Mode Switching Is the Major Mechanism of Ligand Regulation of InsP3 Receptor Calcium Release Channels
 Utilizing the information content in twostate trajectories