# Molecular circuits for dynamic noise filtering

Edited by Charles S. Peskin, New York University, New York, NY, and approved March 10, 2016 (received for review August 30, 2015)

## Significance

Signaling modules in living cells function and respond in a surprisingly robust way, in spite of their very noisy environment. In contrast, this is not the case for most synthetic circuits in living cells, which are usually designed and tuned in silico under specific assumptions that do not hold in the real environment of the cell. To recover the robust behavior, synthetic circuits need a means to make inference about their dynamic environment, which in turn can be used to effectively counteract it. Here we show how such inference can be implemented as biochemical modules and illustrate how this can guide the design of adaptive synthetic circuits.

## Abstract

The invention of the Kalman filter is a crowning achievement of filtering theory—one that has revolutionized technology in countless ways. By dealing effectively with noise, the Kalman filter has enabled various applications in positioning, navigation, control, and telecommunications. In the emerging field of synthetic biology, noise and context dependency are among the key challenges facing the successful implementation of reliable, complex, and scalable synthetic circuits. Although substantial further advancement in the field may very well rely on effectively addressing these issues, a principled protocol to deal with noise—as provided by the Kalman filter—remains completely missing. Here we develop an optimal filtering theory that is suitable for noisy biochemical networks. We show how the resulting filters can be implemented at the molecular level and provide various simulations related to estimation, system identification, and noise cancellation problems. We demonstrate our approach in vitro using DNA strand displacement cascades as well as in vivo using flow cytometry measurements of a light-inducible circuit in *Escherichia coli*.

Recent developments in synthetic biology have enabled a revolution of biomolecular engineering (1, 2), prompting numerous applications in therapeutics (3–5), biocomputing (6–8), and plant engineering (9), for instance. However, a variety of practical limitations have to be addressed before the field can achieve its full promise. Above all, engineered circuits often exhibit a substantial mismatch between in silico predictions and in vivo behavior (10). Such mismatch is largely attributed to so-called context dependencies, causing individual cells to behave differently depending on their intracellular environment (11). The latter can be understood as the congregation of environmental factors that affect the target circuit, such as the ribosomal abundance or the cell cycle stage. Variations of those factors across cells and over time, also termed “extrinsic noise” (12), can impair a circuit’s functionality in an unpredicted way and cause total functional failure.

Because extrinsic noise arises outside a circuit, it can be handled in a more systematic fashion than intrinsic molecular noise (13), which is ultimately dictated by biophysical principles. Intuitively, if an extrinsic perturbation is present, one could in principle apply a second perturbation that steers the network into the opposite direction such that the two competing effects cancel. This idea is akin to conventional noise cancellation techniques encountered in communication systems, where a target signal

In quantitative biology, optimal filtering and related concepts have been used in the literature, either to reconstruct biochemical processes from experimental data in silico (16) or to analyze whether existing biochemical networks can act as optimal filters that process intracellular and extracellular signals (17–20). Along those lines, it has been shown theoretically that the suppression of noise is fundamentally bounded by the precision at which the noise can be estimated from the past (21). This underpins the high potential of implementing biochemical estimators that achieve such a bound. Nevertheless, the use of optimal filters for the de novo engineering of synthetic networks remains nonexistent.

Conventional statistical methods aim at inferring molecular signals or parameters based on experimental data, which have been recorded beforehand through dedicated technical devices such as flow cytometers or microscopes. In other words, the inference itself is performed in silico by the observing entity. The goal of this work is to move the observing entity inside a cell or any biochemical network of interest. To this end, we reinterpret such estimators as dynamical systems themselves and map their specification to a list of biochemical reactions.

## Theoretical Results

### Biochemical Signals and Sensors.

Assume a synthetic circuit requires knowledge about the abundance of a biochemical signal *ρ* and *ϕ* (Fig. 1*A*); we will be concerned with more general scenarios later in this manuscript.

Although

Implicitly, the sensor reaction carries information about the unknown *B*). A complete sensor trajectory consists of the random reaction times

### Optimal Signal Estimation.

The theory of optimal filtering (23) provides a mathematical framework for estimating dynamic signals *A* can be shown to satisfy*SI Appendix*, section S.1). Note that Eq. **1** can be informally rewritten as an ordinary differential equation driven by a sum of Dirac pulses *k* stemming from the *k*th reaction firing of the sensor. Between two consecutive firing times

However, two tractable filters can be derived under the assumption of either weakly or highly informative measurements. We refer to the first one as the Poisson filter, and it is based on the assumption that *SI Appendix*, section S.1.3), the filter reduces to a single differential equation*SI Appendix*, section S.1.4). This filter requires a second differential equation corresponding to *SI Appendix*, Eq. **20**).
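The behavior of a Poisson-type filter can be explored numerically. The following is a minimal sketch, not the authors' implementation. Because the filter equation itself is not reproduced in this text, the sketch assumes a birth–death signal with birth rate *ρ* and death rate *ϕ*, a sensor that fires at a rate proportional to the signal (the rate constant `c` is an assumed symbol), and the filter form obtained when the conditional variance is set equal to the conditional mean: the estimate relaxes toward *ρ*/(*ϕ* + c) between sensor firings and jumps by one at each firing. All function and variable names are hypothetical.

```python
import math
import random


def simulate_poisson_filter(rho, phi, c, t_end, seed=0):
    """Simulate a birth-death signal Z, its sensor firings, and an estimate M.

    Assumptions (not taken from the text):
      * the sensor fires at rate c * Z(t);
      * between firings, dM/dt = rho - (phi + c) * M;
      * each firing increases M by one.
    Returns the time-averaged squared error of M and of the constant
    estimate rho / phi, which ignores the sensor.
    """
    rng = random.Random(seed)
    mean = rho / phi
    z = round(mean)      # signal copy number, started near its stationary mean
    m = mean             # filter estimate
    k = phi + c          # relaxation rate of the estimate between firings
    m_inf = rho / k      # value the estimate relaxes toward between firings
    t = 0.0
    se_filter = 0.0
    se_const = 0.0
    while t < t_end:
        birth, death, sense = rho, phi * z, c * z
        total = birth + death + sense
        wait = rng.expovariate(total)
        dt = min(wait, t_end - t)
        # Exact integral of (z - m(s))^2 while m relaxes exponentially.
        a = z - m_inf
        b = m - m_inf
        decay = math.exp(-k * dt)
        se_filter += (a * a * dt
                      - 2.0 * a * b * (1.0 - decay) / k
                      + b * b * (1.0 - decay * decay) / (2.0 * k))
        se_const += (z - mean) ** 2 * dt
        m = m_inf + b * decay
        t += dt
        if wait > dt:
            break  # the time horizon was reached before the next reaction
        u = rng.random() * total
        if u < birth:
            z += 1
        elif u < birth + death:
            z -= 1
        else:
            m += 1.0  # sensor firing: unit jump in the estimate
    return se_filter / t_end, se_const / t_end
```

Calling `simulate_poisson_filter(rho=10.0, phi=0.1, c=1.0, t_end=2000.0)` returns the two time-averaged squared errors. Under the stated assumptions, the filter's error should come out below that of the constant estimate, which makes no use of the sensor.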

Taking a systems perspective, those filters are now understood as kinetic models driven by an external input *ρ* and *n* to ensure a deterministic, high–copy-number regime (*SI Appendix*, section S.1.5). Overall, the Poisson filter can be implemented through only three reactions:*n*, the abundance of the species *t* is a discrete random variable which we denote by *n*, the large copy number limit will be approached, and the filter reactions will become virtually deterministic. Note that this will not be the case for the sensor reaction, whose rate depends on the abundance of *n*. This gives us a way to realize the solution of Eq. **2** using chemical reactions. More precisely, for any *t* in the interval *n*, the reactions in Eq. **3** give us the molecular filter realization we seek because for such values,

In the case of the gamma filter, challenges arise because the filter equations are incompatible with biophysical rate laws. We addressed this problem by applying a suitable variable transform *SI Appendix*, section S.1.5. Example trajectories of both filters for different values of *SI Appendix*, Fig. S.1.

To quantitatively check the accuracy of the filters, we compared their empirical MSE to the MMSE obtained through numerical integration of the Kushner–Stratonovich equation. The results reflect the conditions of *SI Appendix*, Fig. S.2).

Optimal filters can typically tolerate a substantial degree of model mismatch. This has great practical relevance because the dynamic noise model is sometimes only poorly characterized. In the considered example, for instance, precise knowledge of the parameters *ρ* and *ϕ* might not be available. We performed additional simulations and found both filters to be largely robust with respect to parameter mismatch (*SI Appendix*, Fig. S.2). Although the Poisson filter performs generally worse than the gamma filter for little or no mismatch, it is surprisingly robust also in case of substantial parameter variations.

### Ensemble Averaging.

When a filter is scaled by *n*, a single sensor reaction has to produce *n* molecules at once. Although this could be achieved, for instance, using DNA hybridization cascades, challenges arise in cellular systems where the degree to which stoichiometries can be engineered is rather limited. Ensemble averaging offers a viable and attractive alternative to filter scaling. The idea here is to achieve determinism by running *n* independent copies *n* could correspond to the number of identical plasmid replicates within a cell. Similarly, one could use a population of *n* isogenic cells to sense extracellular signals such as the abundance of a certain chemical in the media. Although intrinsic fluctuations will affect the individual instance *n* is sufficiently large. This has important practical implications. First, the resulting one-molecule sensors *SI Appendix*, section S.2, that for large *n*, the ensemble Poisson filter converges to the differential equation

When comparing the ensemble Poisson filter to the Poisson filter in terms of the MSE, one finds that for any *ρ*, *ϕ*, and *SI Appendix*, section S.2.1). For large *n*, this difference can be seen by multiplying Eq. **4** with **2**: the two equations coincide except that the term stemming from the sensor
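The effect of ensemble size on the error can be made concrete with a short moment calculation. This is a sketch under simplifying assumptions that are not taken from the text: the signal $Z$ is a birth–death process with rates $\rho$ and $\phi$; each of the $n$ replicates carries a sensor that fires at rate $cZ$ (the symbol $c$ is assumed), with counting processes $Y_1,\dots,Y_n$; the averaged estimate $\bar M$ obeys $d\bar M = [\rho - (\phi + c)\bar M]\,dt + \tfrac{1}{n}\sum_{i} dY_i$; and intrinsic noise in the filter reactions themselves is ignored. With the error $e = Z - \bar M$, the first two moments satisfy

$$
\frac{d}{dt}\,\mathbb{E}[e] = -(\phi + c)\,\mathbb{E}[e],
\qquad
\frac{d}{dt}\,\mathbb{E}[e^2] = -2(\phi + c)\,\mathbb{E}[e^2] + \rho + \phi\,\mathbb{E}[Z] + \frac{c}{n}\,\mathbb{E}[Z].
$$

At stationarity, $\mathbb{E}[Z] = \rho/\phi$, which gives

$$
\mathbb{E}[e^2] = \frac{\rho}{\phi}\cdot\frac{2\phi + c/n}{2(\phi + c)}.
$$

In this simplified picture, only the sensor contribution $c/n$ shrinks with the ensemble size, so the error decreases monotonically in $n$ for any positive $\rho$, $\phi$, and $c$, and approaches $\rho/(\phi + c)$ as $n$ grows.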

### Extensions to More General Cases.

The filter variants described above are suitable in cases where the sensor is attached directly to the unknown signal *A*), and *A*, then **5** is general, and our previously derived filters and other known estimators can be framed as special cases of it. For instance, if the components **5** can be understood as a continuous-time variant of the recursive least squares algorithm (26).

Although estimator **5** may not be directly realizable, an estimator that satisfies a close approximation of it is always achievable using molecular circuits (*SI Appendix*, section S.3). This will be shown in *Applications* and *Discussion*.
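For readers unfamiliar with the recursive least squares algorithm mentioned above, the following sketch shows the textbook discrete-time version. It is not the continuous-time estimator **5** and not a molecular implementation; the function names, the initial value `p0`, and the `forgetting` parameter are illustrative assumptions.

```python
def rls_update(theta, P, x, y, forgetting=1.0):
    """One recursive-least-squares step for the model y ~ dot(x, theta).

    theta: list of d parameter estimates
    P:     d x d symmetric matrix (list of lists) tracking estimate uncertainty
    x:     regressor of length d
    y:     scalar observation
    Returns the updated (theta, P).
    """
    d = len(theta)
    Px = [sum(P[i][j] * x[j] for j in range(d)) for i in range(d)]
    denom = forgetting + sum(x[i] * Px[i] for i in range(d))
    gain = [Px[i] / denom for i in range(d)]
    err = y - sum(x[i] * theta[i] for i in range(d))  # prediction error
    theta_new = [theta[i] + gain[i] * err for i in range(d)]
    # P <- (P - gain * x^T P) / forgetting; x^T P equals Px^T since P is symmetric.
    P_new = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(d)]
             for i in range(d)]
    return theta_new, P_new


def fit_rls(samples, d, p0=1000.0, forgetting=1.0):
    """Run RLS over (x, y) pairs, starting from theta = 0 and P = p0 * I."""
    theta = [0.0] * d
    P = [[p0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for x, y in samples:
        theta, P = rls_update(theta, P, x, y, forgetting)
    return theta
```

Each step corrects the current estimate in proportion to the prediction error, which is the same structure as a filter driven by the mismatch between an observation and its predicted value. Choosing `forgetting` slightly below one discounts old data, so the estimate can follow slowly changing parameters.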

## Applications

In the following sections, we demonstrate practical applications of our filtering circuits using two simulation studies as well as experimental data recorded in vitro and in vivo.

### Adaptive System Identification.

We use a multivariate filter to solve a combined state and parameter estimation problem that is associated with a biochemical system identification task. In particular, we consider a birth–death process *B*) as shown in *SI Appendix*, section S.3.1. Our simulations demonstrate that the filter is able to accurately identify both *C*) after a short transient. Note that this filter is able to readapt to spontaneous changes in the birth rate *λ* (*SI Appendix*, section S.3.1). We used this filter to identify a birth–death process, whose birth rate is controlled by a stochastic bistable switch. Fig. 2*D* shows that if *λ* and

### Cancellation of Extrinsic Noise.

In the following we show how the newly developed filters could guide the design of noise-insensitive circuits. Although a more detailed view on this topic is provided in *SI Appendix*, section S.4, we illustrate the concept by means of an example that is representative of what is typically encountered in vivo. In particular, we consider a microRNA circuit that is deployed to mammalian cells through transient transfection. The goal of the circuit is to stably express RNA *t* as *A*). Overall, we obtain a transcription rate *B*).

The goal is to construct an estimator *A*). Intuitively, the two effects are expected to compensate for each other such that the overall effect of *SI Appendix*, section S.4.

In the considered scenario, the ensemble Poisson filter is particularly suitable to serve as an estimator because it can exploit the availability of multiple gene replicates per cell to improve its accuracy. The gene *M* corresponding to this filter is present twice on the plasmid, once attached to a constitutive promoter *pMc* and once attached to a *M* is likely to be affected by contextual factors. We therefore accounted for an unintended dependency of this gene on *SI Appendix*, section S.4.4, and Fig. 3 for more details). It turns out that if the sensor rate *C* and *D*). An additional case study showing noise cancellation in a bistable switch is provided in *SI Appendix*, section S.4.5.

### In Vitro Implementation of the Poisson Filter.

As a proof of principle, we forward-engineered and tested a DNA-based filtering circuit in vitro as DNA strand displacement (DSD) cascades (6, 7, 27–29). Strand displacement is a competitive hybridization reaction where an incoming single-stranded DNA molecule binds to a complementary strand, in the process displacing an incumbent strand. This elementary mechanism allows one to directly synthesize arbitrary chemical reaction networks (30). Furthermore, because individual DSD reactions can be described by conventional bimolecular rate laws at a remarkably high precision (31), they provide a higher degree of quantitative control compared with cellular systems.
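To illustrate what a bimolecular rate law implies for a single displacement step, the sketch below integrates the generic reaction invader + gate → output + waste with rate k·[invader]·[gate]. The species names, the fixed-step Runge–Kutta scheme, and any numerical values used with it are illustrative assumptions and do not describe the circuit built in this work.

```python
def simulate_displacement(invader0, gate0, k, t_end, steps=1000):
    """Integrate invader + gate -> output + waste under a bimolecular rate law.

    Concentrations are in M, k in 1/(M*s), and t_end in s.
    The state is the output concentration x formed so far; by conservation,
    invader = invader0 - x and gate = gate0 - x.
    Returns (invader, gate, output) at t_end.
    """
    def rate(x):
        return k * (invader0 - x) * (gate0 - x)

    h = t_end / steps
    x = 0.0
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1 = rate(x)
        k2 = rate(x + 0.5 * h * k1)
        k3 = rate(x + 0.5 * h * k2)
        k4 = rate(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return invader0 - x, gate0 - x, x
```

For equal initial concentrations a0, the rate law has the closed-form solution a(t) = a0 / (1 + k·a0·t), which provides a convenient check: with a0 = 100 nM and an assumed k of 1e5 per molar per second, half of the invader has reacted after 100 s.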

To enable a comparison of the molecular filter with its mathematical counterpart and the true value of *A*). In particular, the concentration of *A*). In reference to Eq. **2** this would correspond to a scaling factor of

The reaction network from Eq. **2** was mapped to a DSD circuit (*SI Appendix*, section S.5 and Fig. S.10) under the join-fork paradigm (6, 32) and quantified experimentally using calibrated fluorescence measurements (*SI Appendix*, section S.6.1). An initial perturbation experiment was performed to check the circuit’s sensitivity with respect to small changes in **M** and to compute initial estimates of kinetic parameters (*SI Appendix*, section S.6.1.4). Based on those estimates, we designed three time course experiments. The corresponding fluorescence trajectories show that the in vitro filter resembles the ideal mathematical model at a remarkably high precision in all three scenarios (Fig. 4 *C*–*E*).

### Ensemble Filtering in *Escherichia coli*.

Engineering biochemical circuits and their properties in living organisms is associated with substantial additional challenges compared with cell-free systems. It is therefore important to show that the desired circuit characteristics are attainable using cellular mechanisms. In the following, we demonstrate experimentally that a simple genetic circuit in *Escherichia coli* can function as an optimal filter.

To check whether our circuit is indeed able to estimate a noise signal

We used an optogenetic circuit encoded in plasmid pJT119b (33), which expresses a fluorescent protein (GFP) at a basal rate through a weak constitutive promoter (34). This rate can be enhanced through a second promoter that is inducible by green light (Fig. 5*A*). Due to this particular promoter configuration and the fact that plasmids are present in multiple copies per cell [*ρ*, *ϕ*. Indeed, based on Eq. **4**, there always exist a *ρ* and *ϕ* for which the optogenetic circuit functions as an optimal filter as long as the degradation rate of *α*, *β*, and *τ* that account for reporter maturation and degradation. From the inferred transcription dynamics, we can get the parameters *ρ*, *ϕ*, and **4** (see also Fig. 5 *A* and *SI Appendix*, section S.7.5 and Fig. S.16). The inferred filter circuit parameters allow us to assess the performance that the filter would achieve in terms of its MSE (*SI Appendix*, Fig. S.16).

Finally, we tested the function of our circuit as an estimator of *SI Appendix*, section S.7.7). We found that the corresponding experimental fluorescence measurements are in very good agreement with the response predicted by the inferred filter model (Fig. 5*B*), indicating that the corresponding transcriptional output *SI Appendix*, sections S.7.5–S.7.7, for more details).

Circuits like the one above could serve as modules for estimating dynamic transcription factor abundances from transcribed RNAs. The estimator could be optimized to specific *ρ*, *ϕ*) by tuning the strengths of the promoters: the constitutive expression rate should be designed to be close to *ρ*, whereas the induced transcription rate should be close to *k* is the degradation rate constant of *ρ* and *ϕ* increases with the sensor rate

## Discussion

Our results illustrate that a seemingly complex filtering operation may be realizable through very simple biochemical mechanisms. This simplicity allowed us to showcase our filtering approach in vitro using DNA strand displacement cascades but also in vivo using a light-inducible gene expression circuit in *E. coli*.

A key strength of model-based filtering techniques is that the assumed model dynamics of *SI Appendix*, Fig. S.2 and Figs. 2*D* and 3*D*). The latter property appears particularly relevant for synthetic biology where the true dynamics of a signal

We found that the proposed ensemble filter variants are favorable over normal ones when replicates of identical circuits are easy to accomplish (e.g., through multiple plasmids). By exploiting this parallelism, they lead to a damping of the sensor noise that is inversely proportional to the ensemble size *n*, and as a consequence, ensemble filters achieve a reduced MSE for all

Most importantly, however, the ensemble concept entails a general recipe for building optimal estimators of arbitrary biochemical signals, even if they are nonlinear and multiple steps away from the sensor. In particular, they can be realized from *n* replications of the signal of interest *SI Appendix*, section S.3). The individual replicates serve as stochastic simulations of *n*-sample Monte Carlo average. As a striking implication, the moment closure problem is bypassed, facilitating applications also to nonlinear

Our simulation studies suggest several potential applications of optimal filters to biomolecular estimation, system identification, and the design of context-independent circuits. In contrast to trial-and-error approaches, the circuits are derived in a principled fashion under an MMSE criterion.

We believe that the ability to perform statistical computations in situ will be crucial for devising robust synthetic networks. Those will allow circuits to sense, estimate, and adapt to their environment, facilitating context-aware designs. We envision many potential applications, ranging from adaptive therapeutics to self-reporting cells that estimate and display inaccessible parameters and states.

## Materials and Methods

Detailed information about mathematical derivations, simulations, and experimental procedures can be found in *SI Appendix*. In *SI Appendix*, section S.1, we provide discussions around the optimal filtering framework for biochemical networks. The ensemble and multivariate filters are described in *SI Appendix*, sections S.2 and S.3, respectively. *SI Appendix*, section S.4, introduces a mathematical framework for noise cancellation and contains details about the corresponding simulations. Rational design and experimental methods related to the DNA-based filtering circuit are provided in *SI Appendix*, sections S.5 and S.6. Experimental methods related to the bacterial circuit are described in *SI Appendix*, section S.7.

## Acknowledgments

The authors are grateful to Yuan Chen and Sundipta Rao for providing technical assistance to the first author while he was performing the experiments. This project was financed with a grant from the Swiss SystemsX.ch initiative, evaluated by the Swiss National Science Foundation. G.S. was supported by National Science Foundation Grant CCF-1317653.

## Footnotes

^{1}To whom correspondence should be addressed. Email: mustafa.khammash@bsse.ethz.ch.

Author contributions: C.Z. and M.K. designed research; C.Z. performed research; C.Z. performed in vitro experiments; G.S. supervised in vitro experimental work; M.R. performed optogenetic experiments in *Escherichia coli*; G.S., M.R., and M.K. contributed new reagents/analytic tools; C.Z. analyzed data; M.K. supervised research; and C.Z. and M.K. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1517109113/-/DCSupplemental.

## References

- [4] Xie Z, Wroblewska L, Prochazka L, Weiss R, Benenson Y
- [5] Ruder WC, Lu T, Collins JJ
- [7] Qian L, Winfree E
- [12] Elowitz MB, Levine AJ, Siggia ED, Swain PS
- [13] McAdams HH, Arkin A
- [14] Kay SM
- [15] Kalman RE
- [17] Hinczewski M, Thirumalai D
- [23] Bain A, Crisan D
- [25] Opitz D, Maclin R
- [29] Phillips A, Cardelli L
- [30] Soloveichik D, Seelig G, Winfree E

## Article Classifications

- Biological Sciences
- Biophysics and Computational Biology

- Physical Sciences
- Applied Mathematics