# A cellular solution to an information-processing problem

Edited^{†} by Albert Libchaber, The Rockefeller University, New York, NY, and approved July 21, 2014 (received for review April 14, 2014)

## Significance

Cell-surface signaling receptors are organized into different architectures that have been arrived at multiple times in diverse contexts. To understand the trade-offs that lead to these architectures, we pose the generic information-processing problem of identifying the optimal strategy for distributed mobile noisy sensors to faithfully “read” an incoming signal that varies in space–time. This involves balancing two opposing requirements: clustering noisy sensors to reduce statistical error and spreading sensors to enhance spatial coverage, resulting in a phase transition that explains the frequent reemergence of a set of architectures. Our results extend to a variety of engineering and communication applications that involve mobile and distributed sensing, and suggest that biology might offer solutions to hard optimization problems that arise in these applications.

## Abstract

Signaling receptors on the cell surface are mobile and have evolved to efficiently sense and process mechanical or chemical information. We pose the problem of identifying the optimal strategy for placing a collection of distributed and mobile sensors to faithfully estimate a signal that varies in space and time. The optimal strategy has to balance two opposing objectives: the need to locally assemble sensors to reduce estimation noise and the need to spread them to reduce spatial error. This results in a phase transition in the space of strategies as a function of sensor density and efficiency. We show that these optimal strategies have been arrived at multiple times in diverse cell biology contexts, including the stationary lattice architecture of receptors on the bacterial cell surface and the active clustering of cell-surface signaling receptors in metazoan cells.

The molecular characteristics of signaling receptors and their spatiotemporal organization have evolved to optimize different facets of information processing at the cell surface. A canonical information-processing problem involves designing strategies for a collection of distributed, noisy, mobile sensors to faithfully estimate a signal or function that varies in space and time (1). This problem appears naturally in many contexts, biological and nonbiological: (*i*) chemoattractant protein sensors on the bacterial cell surface (2, 3); (*ii*) galectin-glycoprotein assemblies designed for effective immune response on the surface of metazoan cells (4, 5); (*iii*) ligand-activated signaling protein receptors on the surface of eukaryotic cells (6–10); (*iv*) coclustering of integrin receptors to faithfully read and discriminate the rigidity and chemistry of a substrate (11); (*v*) clustering of E-cadherin receptors for effective adherence at cell–cell junctions (12); and even (*vi*) radio frequency (RF) sensor networks monitoring the environment or mobile targets (13). In the signal-processing community, this problem is known as data fusion or, more generally, information fusion (14, 15); however, typical applications do not consider mobile sensors.

In this paper we show how biology has, on multiple occasions, arrived at a solution to this optimization problem. The optimal solution needs to balance two opposing objectives, the need to locally assemble sensors to reduce estimation noise and the need to spread them out for broader spatial coverage. We show that in the space of strategies, this leads to a phase transition as a function of sensor density, sensor characteristics, and function properties. At very low sensor density, the optimal design corresponds to freely diffusing sensors. For sensor density above a threshold, there are two different optimal solutions as a function of a dimensionless parameter constructed from the sensor advection velocity and the correlation length and time of the incident signal. One optimal solution is that the sensors are static and located on a regular lattice grid. This is the strategy used in bacteria, such as *Escherichia coli*, to organize their chemoattractant receptors in a regular lattice array (3, 16), and in metazoan cells, where galectin-glycoproteins are organized in a lattice on the cell surface to effect an optimal immune response (4, 5). To realize this strategy, the cell needs to provide a rigid cortical scaffold that holds the receptors in place. Another optimal solution is to make the receptors mobile in such a way that a fraction of them form multiparticle nanoclusters, which then break up and reform randomly, the rest being uniformly distributed. Recent studies on the steady-state distribution of several cell-surface proteins reveal a stereotypical distribution of a fixed fraction of monomers and dynamic nanoclusters (6–9), and our information theoretic perspective could provide a general explanation for this. To realize this dynamic strategy, the cell surface needed to be relieved of the constraints imposed by the rigid scaffold and to be more regulatable. This strategy change needed the innovation of motor proteins and dynamic actin filaments, a regulated actomyosin machinery fueled by ATP, and a coupling of components of the cell surface to this cortical dynamic actin (17).

## Coordinated Signal Estimation Problem

Consider a collection of *N*_{p} mobile sensors in a finite 2D space of size *L* × *L*, such as protein receptors on a cell membrane (Fig. 1*A*), with sensor density ρ = *N*_{p}/*L*^{2}. The external signal monitored by the sensors (ligand field in Fig. 1*A*) is a continuous function *f*(*x*, *t*) of space and time, with ξ(τ) denoting the length (time) scale of signal variation.

We assume that the sensors are equipped with an internal clock that allows them to sample the signal values at precise instants in time, with sampling interval *t*_{m}. This can arise, for instance, from a minimum information-processing cycle time, which involves binding of the receptor to a ligand and the resulting conformational change, signal transmission downstream, and resetting. New signals cannot be processed during this cycle time. We will also implicitly assume that the location of each sensor is known.

The sensors are inherently noisy; the output of the sensor is the function *f*(*x*, *t*) corrupted by additive noise, i.e., the output from a sensor at space–time location (*y*, *u*) is *f*(*y*, *u*) + *σ*_{p}*ζ*, where ζ is a random noise variable and *σ*_{p} sets the noise amplitude. If there are *s*(*y*, *u*) sensors at location *y* at time *u*, then the signal read at location (*y*, *u*) combines the outputs of all of these sensors.

An illustrative analogy to keep in mind is that the sensors are “imaging” the function (Fig. 1*C*), with the feature that the sensors that are taking the image are possibly noisy. To overcome the noise, the sensors would need to cluster, however at the expense of less coverage of the entire image plane. We would like to investigate how mobile sensors can mitigate some of the loss due to clustering.
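The benefit of clustering can be illustrated with a small numerical sketch. It assumes, for illustration only, that the noise terms of colocated sensors are independent standard normal variables and that a cluster reports the plain average of its reads; the function name `clustered_read_std` and all parameter values are hypothetical. Under these assumptions the spread of the cluster read falls roughly as 1/√*s* with the number of colocated sensors *s*.

```python
import random
import statistics


def clustered_read_std(num_sensors, sigma_p=1.0, trials=5000, seed=0):
    """Empirical spread of the averaged read of `num_sensors` colocated sensors.

    Assumption (not from the paper): each sensor returns f + sigma_p * zeta with
    independent standard normal zeta, and the cluster read is the plain average.
    """
    rng = random.Random(seed)
    signal_value = 0.5  # arbitrary constant standing in for f(y, u)
    cluster_reads = []
    for _ in range(trials):
        reads = [signal_value + sigma_p * rng.gauss(0.0, 1.0)
                 for _ in range(num_sensors)]
        cluster_reads.append(sum(reads) / num_sensors)
    return statistics.pstdev(cluster_reads)


if __name__ == "__main__":
    for s in (1, 4, 16):
        # Compare the empirical spread with the 1/sqrt(s) expectation.
        print(s, round(clustered_read_std(s), 3), round(1.0 / s ** 0.5, 3))
```

The price of this noise reduction is that the same *s* sensors now report on a single location rather than on *s* different ones, which is the coverage loss described above.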

Our goal here is to characterize the placement strategy of the mobile sensors that minimizes the expected distortion between the signal *f* and its estimate; this minimization problem is the focus of the paper.

## Idealized Model

Here we consider optimal sensor organization when there are no physical constraints on sensor transport. This idealized model allows us to make accurate analytical predictions (*SI Appendix*) that serve as a useful guide to the more realistic stochastic model. The input signal *f* is taken from the class of functions with spatial correlation length ξ that are piecewise constant in time with period τ (*SI Appendix*).

We compare the performance of two different signal acquisition architectures, stationary and mobile. In the mobile architecture sensors move with velocity *v*, and can move in a coordinated fashion. (In the *SI Appendix* we consider another form of mobile architecture where the sensor movement approximates diffusion–advection transport.) We show that the optimal architecture has a phase transition from stationary to mobile architecture as a function of the sensor density ρ, sensor velocity *v*, sensor sampling time *t*_{m}, and the correlation length and time ξ and τ.

The noisy sensors sample the function every *t*_{m} s. Because there are *τ*_{m} = τ/*t*_{m} sampling instants in each signal period, the statistical error in the function estimate at a sampling location *y* depends on *τ*_{m}, on the number of sensors *s*(*y*) at location *y*, and on γ ≥ −1, a parameter that controls sensor interaction. At locations *x* where there are no sensors, we construct an approximation from the measurement at the location *y* that minimizes the error ε(*x*|*y*). The error ε(*x*|*y*) has two components: the first is the spatial error associated with function decorrelation over the correlation length ξ, and the second is the statistical error associated with sensor noise. We minimize max_{x}{ε(*x*|*y*)} at fixed sensor density ρ, subject to the constraint that *s*(*y*) is a nonnegative integer. In some cases, in addition to the distribution *s*(*y*), the density ρ itself could be a decision variable, although over a different time scale. In this case, the optimization problem takes the form min_{ρ}{ε*(ρ) + *κρ*}, where ε*(ρ) is the optimal error as a function of the density ρ, and κ is the cost of producing more sensors.

In the limit of large *L*, an optimal solution is to tessellate space with identical Voronoi cells (*SI Appendix*). Because the signal decorrelation error depends on the distance ∥*x* − *y*∥ from a measurement location *y*, it follows that the ideal Voronoi tessellation corresponds to the hexagonal packing with regular hexagons of radius *r*, where *r* is a function of the correlation length ξ, the density ρ, and the sensor correlation γ.
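The geometry of such a tessellation fixes how the cell size relates to the number of sensors per measurement location. The relations below are a short worked sketch, assuming that *r* denotes the circumradius (center-to-vertex distance) of each hexagon and that all sensors of a cell are colocated at its center *y*; the symbols A_hex, n_foci, and s̄ are introduced here for illustration and are not taken from the paper.

```latex
\begin{align}
A_{\mathrm{hex}} &= \frac{3\sqrt{3}}{2}\, r^{2}
  && \text{(area of a regular hexagon of circumradius } r\text{)} \\
n_{\mathrm{foci}} &= \frac{1}{A_{\mathrm{hex}}} = \frac{2}{3\sqrt{3}\, r^{2}}
  && \text{(measurement locations per unit area)} \\
\bar{s} &= \rho\, A_{\mathrm{hex}} = \frac{3\sqrt{3}}{2}\, \rho\, r^{2}
  && \text{(mean number of sensors per location)} \\
\max_{x \in \mathrm{cell}} \lVert x - y \rVert &= r
  && \text{(worst-case distance, attained at the vertices)}
\end{align}
```

Increasing *r* therefore places more sensors at each location, which lowers the statistical error, while increasing the largest distance over which the signal must be extrapolated, which raises the spatial error.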

The stationary mechanism clearly minimizes the statistical error, because it maximizes the number of independent measurements at each sampling location *y*. However, it is possible that mobile sensors, which are able to sample the function at more locations, can significantly reduce the spatial error, although at the cost of increasing the statistical error.
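The competition between the two error components can be made concrete with a toy calculation for the stationary architecture. The functional forms used here are illustrative assumptions and are not the expressions derived in *SI Appendix*: the spatial error is taken to be 1 − exp(−*r*/ξ) at the edge of a hexagonal cell of circumradius *r*, and the statistical error to be σ_p²/(τ_m *s*) for *s* colocated sensors. The names `toy_error` and `best_cluster_size` are hypothetical.

```python
import math


def toy_error(sensors_per_focus, density, xi, sigma_p=1.0, tau_m=1.0):
    """Toy worst-case error when sensors are grouped into equal-size foci.

    Assumed forms (illustrative only):
      spatial error     = 1 - exp(-r / xi), evaluated at the cell edge
      statistical error = sigma_p**2 / (tau_m * sensors_per_focus)
    """
    focus_density = density / sensors_per_focus
    # Circumradius of a regular hexagon whose area equals 1 / focus_density.
    radius = math.sqrt(2.0 / (3.0 * math.sqrt(3.0) * focus_density))
    spatial = 1.0 - math.exp(-radius / xi)
    statistical = sigma_p ** 2 / (tau_m * sensors_per_focus)
    return spatial + statistical


def best_cluster_size(density, xi, max_size=200):
    """Integer number of sensors per focus that minimizes the toy error."""
    return min(range(1, max_size + 1),
               key=lambda s: toy_error(s, density, xi))


if __name__ == "__main__":
    for xi in (0.5, 5.0):
        s_opt = best_cluster_size(density=10.0, xi=xi)
        print(xi, s_opt, round(toy_error(s_opt, 10.0, xi), 3))
```

In this toy model a longer correlation length favors larger clusters, because spreading the sensors buys little additional spatial accuracy when the signal varies slowly in space.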

In the perfect mobile architecture, the sensors move with a maximum velocity *v* and can be organized at specified locations in a coordinated fashion. Note that in this idealized model we assume that the sensor measurements are synchronous. A more refined model would allow for sensors to be asynchronous and establish that clustering in time is optimal in some parameter regimes. The optimal trade-off between spatial and temporal sampling is governed by parameters that include the sensor correlation γ and the scaled sampling time. In *SI Appendix* we compute the optimal error.

## Realistic Stochastic Model

The Voronoi centers, i.e., the foci in the idealized model, are a purely geometrical construct. These foci can be physically realized by having focusing regions or signaling platforms (SP) that colocate sensors at their cores. We imbue the dynamics of both sensors and SPs with physical realism, while allowing for stochasticity. Although this generalization is not amenable to simple analytic treatment, we make considerable progress using Monte Carlo simulations.

In addition to having a fixed density ρ of sensors that diffuse with a diffusion coefficient *D*, there is a fixed uniform density *n*_{sp} = *N*_{sp}/*L*^{2} of SPs of size *R*. The SPs capture sensors with an activation rate *p*_{a} and advect them to their cores with a speed *v*; we define an active Péclet number *Pe* = *vR*/*D* (21), which measures the relative contribution of advection to diffusion. The SPs are allowed to break up and reform randomly at a new location with a lifetime taken from an exponential distribution with mean *τ*_{a}. Each active sensor becomes inactive with probability 1 − *p*_{a}. We will call this the active clustering strategy. In eukaryotic cells, such SPs may be formed from the active restructuring of the actin cortex adjoining the cell membrane, which drives nanoclustering of passive molecules (17, 21). Alternatively, a fraction of sensors could switch to being SPs and start drawing in other sensors in their vicinity, either by inducing local restructuring of the actin cortex (active molecules; refs. 17, 21) or by elaborate [possibly multivalent (22)] protein–protein interactions. This is the most general physically realizable dynamical setting; the stationary and mobile strategies explored in the idealized model can be obtained as limiting cases of this: the stationary lattice corresponds to permanently fixing the SPs at regular spatial positions, and the mobile architecture corresponds to coordinated movement of SPs. Passive diffusion corresponds to setting *n*_{sp} = 0.
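The dynamics described above can be sketched as a small Monte Carlo loop. This is a simplified illustration and not the simulation used in the paper: it assumes a periodic box, treats *p*_{a} as the per-step probability that a sensor inside an SP is advected toward the core rather than diffusing, and measures clustering as the fraction of sensors within *R*/4 of a core. The function names and all default parameter values are hypothetical.

```python
import math
import random


def peclet_number(speed, radius, diff):
    """Active Peclet number Pe = v R / D."""
    return speed * radius / diff


def simulate_active_clustering(n_sensors=150, n_sp=10, box=20.0, radius=2.0,
                               diff=0.125, speed=1.0, p_a=0.78, tau_a=5.0,
                               dt=0.05, steps=300, seed=1):
    """Toy 2D simulation; returns the fraction of sensors near an SP core."""
    rng = random.Random(seed)
    sensors = [[rng.uniform(0, box), rng.uniform(0, box)]
               for _ in range(n_sensors)]
    # Each platform is [x, y, remaining lifetime]; lifetimes have mean tau_a.
    platforms = [[rng.uniform(0, box), rng.uniform(0, box),
                  rng.expovariate(1.0 / tau_a)] for _ in range(n_sp)]
    step_std = math.sqrt(2.0 * diff * dt)  # diffusive step per coordinate

    def offset_to_nearest(pos):
        """Periodic displacement and distance to the nearest SP core."""
        best = (0.0, 0.0, math.inf)
        for cx, cy, _ in platforms:
            dx = (cx - pos[0] + box / 2) % box - box / 2
            dy = (cy - pos[1] + box / 2) % box - box / 2
            dist = math.hypot(dx, dy)
            if dist < best[2]:
                best = (dx, dy, dist)
        return best

    for _ in range(steps):
        # Expired platforms break up and reform at a new random location.
        for sp in platforms:
            sp[2] -= dt
            if sp[2] <= 0.0:
                sp[0], sp[1] = rng.uniform(0, box), rng.uniform(0, box)
                sp[2] = rng.expovariate(1.0 / tau_a)
        for pos in sensors:
            dx, dy, dist = offset_to_nearest(pos)
            if 0.0 < dist < radius and rng.random() < p_a:
                # Active sensor: advected toward the core at speed v.
                move = min(speed * dt, dist)
                pos[0] += move * dx / dist
                pos[1] += move * dy / dist
            else:
                # Inactive or free sensor: passive diffusion.
                pos[0] += rng.gauss(0.0, step_std)
                pos[1] += rng.gauss(0.0, step_std)
            pos[0] %= box
            pos[1] %= box

    near_core = sum(1 for pos in sensors
                    if offset_to_nearest(pos)[2] < radius / 4)
    return near_core / n_sensors


if __name__ == "__main__":
    print("Pe =", peclet_number(1.0, 2.0, 0.125))
    print("clustered fraction:", simulate_active_clustering())
    print("passive diffusion:", simulate_active_clustering(n_sp=0))
```

Setting `n_sp=0` recovers passive diffusion in this sketch, mirroring the limiting case noted above; fixing the platform positions and giving them very long lifetimes would approximate the stationary strategy.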

The signal *f*(**x**, *t*) to be estimated is drawn from a Gaussian random field with a fixed variance, spatial correlation length ξ, and correlation time τ.

The sensors make measurements at intervals with mean *t*_{m}; thus, on average, a sensor makes τ/*t*_{m} measurements within the correlation time τ. The signal *f*(*x*, *t*) is estimated from these measurements, and the mean error of the estimate is used to compare strategies.

### Phase Diagram.

We define two dimensionless parameters: η, which increases as the sensor density ρ decreases, and θ = (*Pe* τ)/ξ, which explores the trade-off between the advection velocity and the correlation length ξ (made dimensionless by setting *D* = 1/8; *SI Appendix*). Based on an earlier study (21), we find that the performance is maximized when the mean SP lifetime *τ*_{a} = τ, the signal correlation time, reflecting a kind of active resonance condition. In the cellular context, this suggests the possibility that the SP lifetime has evolved to match the signal correlation time. All other parameters not included in η and θ were taken to be fixed (see *SI Appendix* for default values; varying these merely shifts the phase boundaries and does not affect the qualitative features of the phase diagram).

As in the idealized model, the optimal strategy in the stochastic model also displays a phase transition with three distinct phases (Fig. 2*B*). On decreasing sensor density (i.e., increasing η) at fixed θ, the optimal architecture changes from the stationary lattice phase to an active clustering phase, just as in the idealized model. A further decrease in sensor density results in a reentrant stationary lattice phase, because the remodeling dynamics of SPs does not give an advantage unless the Péclet number (or θ) is high. Finally, at still lower density, it becomes optimal to move the sensors: at low θ the sensor movement is diffusive, and it crosses over to active advection–diffusion when θ is high.

### Robustness of the Optimal Solution.

Fig. 3*A* plots the fraction of sensors in clusters in the active clustering phase as a function of the activation rate *p*_{a}. The maximum in this plot occurs at *p*_{a} = 0.78, corresponding to 38% focused in active clusters and 62% freely diffusing on the surface; this maximum fraction is very weakly sensitive to the overall sensor density ρ. Plotting the estimation error versus *p*_{a} for different values of sensor characteristics, we find, quite remarkably, that the minimum error is obtained when *p*_{a} ∼ 0.78 (Fig. 3 *B* and *C*, and *SI Appendix*, Figs. S4 and S5), i.e., this optimal cluster fraction is robust and fairly independent of sensor parameters, such as the sampling frequency *τ*_{m}, sensor density ρ, density of SPs *n*_{sp}, and the sensor correlation γ. We also ran simulations where the probability of error was minimized with respect to both *p*_{a} and density of SPs, *n*_{sp}, and found that in this case both the fraction in clusters (∼ 38%) and the number in a cluster (= 6 ± 2) are robust to variations in the other sensor parameters. This result is in contrast to the idealized model, where the optimal *p*_{a} = 1. It does, however, show a dependence on the active Péclet number, *Pe*, and SP remodeling time *τ*_{a} (Fig. 4 *A* and *B*).

## Biology Solves an Optimization Problem

In this paper we show that the optimal solution to the coordinated signal estimation problem encountered by a collection of mobile sensors is determined by the trade-off between the spatial decorrelation and statistical noise in the sensors. This generic estimation problem appears naturally in a variety of engineering situations such as in communication networks and signal processing, but as discussed in the Introduction, it is the biological context that we wish to highlight here.

It is quite remarkable that this generic signal estimation problem exhibits sharp phase transitions, and that every phase has a realization in a specific cell biology context. For instance, the stationary lattice architecture is optimal when *ρξ*^{2} ≫ 1, i.e., either the sensor density or the spatial correlation length of the signal is large, a condition that is met by the chemotactic receptors on the bacterial cell surface, such as *E. coli* (3); reassembly does not give an advantage. Indeed, following the initial proposal (3), there have been spectacular demonstrations of the hexagonal lattice arrangement of chemotactic receptors in a variety of bacterial species (16). Eukaryotic cells have also independently, and at multiple times, arrived at this strategy in contexts where both the space and time correlations ξ and τ are large. Aggregates of galectin-glycan make use of multivalent interactions to organize themselves in a large lattice array on the surface of metazoan cells. This architecture allows them to optimize the dual requirements of high affinity to ligands and large spatial coverage (4, 5). E-cadherin proteins form a two-dimensional array of microclusters at the junctions of cells, so as to establish reliable cell–cell contact (12).

What is even more remarkable is how the information theory perspective brings out the active clustering phase that optimizes the estimation error at low densities. This organizational strategy is realized in a vast variety of signaling systems at the cell surface, such as GPI-anchored proteins (GPI-AP) (7, 8), Ras-signaling proteins (6, 9), glycoproteins, integrin receptors (11, 23), etc., despite the diversity both in their structural forms and in their network of interacting partners. This would argue for a broader conceptual principle underlying the choice of this strategy, such as the one described here. Our study shows that the optimal solution in the active clustering phase has a fixed activation probability *p*_{a} ∼ 0.78 or, alternatively, a fixed fraction of proteins in active clusters ∼ 38%, the rest being diffusing monomers. Allowing the density of SPs to vary leads, in addition, to an optimal number ∼ 6 ± 2 of proteins within a cluster. We find that this optimal fraction is robust over a wide range of parameters such as the density of sensors and sensor characteristics. This not only coincides with the cell-surface distribution of GPI-anchored proteins (7, 8) and Ras-signaling proteins (9), but is also consistent with the finding that the fraction of proteins in nanoclusters is maintained over a large variation in cellular expression levels. On the other hand, the optimal solution is relatively sensitive to changes in *Pe* and SP remodeling time *τ*_{a}; therefore, in order for the cell to provide a reliably stable response over variations in temperature, these parameters should be constant across temperature in the physiological range (21). This is consistent with studies on the remodeling dynamics of GPI-anchored protein receptors in mammalian cells, which showed that the fragmentation–aggregation dynamics of GPI-AP nanoclusters was relatively uniform across 24−40 °C (8, 17).

The model predicts that poor (noisier) sensors are more likely to be clustered and low-noise sensors are more likely to be diffusing freely on the cell surface. Broad-spectrum sensors that bind to a large number of ligands are likely to have lower binding affinity with any ligand; as a consequence they will be noisier and will form clusters. On the other hand, specific sensors are likely to be less noisy and therefore diffuse as monomers. The insight that colocating sensors leads to a decrease in the impact of noise, or equivalently, improves information, when the external signal does not change rapidly, may also have implications for organizational decision making. In environments where the information is slowly changing over time and the individuals have access to a very noisy version of the information, the organization should rely on forming relatively large ad hoc teams to counter the lack of information; the teams have to be ad hoc to ensure a noise-averaging effect. However, when the external environment changes rapidly, one has to make do with smaller teams. Similar considerations might operate at a larger scale, in the collective reading of external space–time dependent cues in swarms of organisms (24).

Coming back to the cell surface, the strategies that we explored are special cases of a general active composite cell-surface model, wherein the constituent molecules interact with each other and with the active dynamical cytoskeleton juxtaposed to the cell membrane (17). The signal estimation problem posed here can then be viewed as active mechanics of cellular information processing. It is appealing that one might look to biology for insights into solutions of hard optimization problems, arrived at as a result of evolution within an information niche (25).

## Acknowledgments

We thank Satyajit Mayor and Mukund Thattai for useful discussions. We thank Joseph Mathew for help in preparing the schematic figures. G.I. thanks the National Center for Biological Sciences for hospitality. M.R. acknowledges a grant from the Simons Foundation.

## Footnotes

^{1}To whom correspondence should be addressed. Email: madan{at}ncbs.res.in.

Author contributions: G.I. and M.R. designed research, performed research, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

^{†}This Direct Submission article had a prearranged editor.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1406608111/-/DCSupplemental.

## References

- ↵
- Cover TM,
- Thomas JA

*(2nd ed.).*(Wiley, Hoboken, NJ). - ↵
- ↵
- Duke TA,
- Bray D

- ↵
- Lajoie P,
- Goetz JG,
- Dennis JW,
- Nabi IR

- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- Suzuki KG,
- et al.

- ↵
- Bakker GJ,
- et al.

- ↵
- ↵
- Leoncini M,
- Resta G,
- Santi P

- ↵
- Varshney P,
- Burrus C

- ↵
- Bleiholder J,
- Naumann F

- ↵
- Briegel A,
- et al.

- ↵
- ↵
- ↵
- ↵
- Tlusty T

- ↵
- Chaudhuri A,
- Bhattacharya B,
- Gowrishankar K,
- Mayor S,
- Rao M

- ↵
- ↵
- van Zanten TS,
- et al.

- ↵
- Bialek W,
- et al.

- ↵