Inferring Epigenetic Dynamics from Kin Correlations

Populations of isogenic embryonic stem cells or clonal bacteria often exhibit extensive phenotypic heterogeneity which arises from stochastic intrinsic dynamics of cells. The internal state of the cell can be transmitted epigenetically in cell division, leading to correlations in the phenotypic states of cells related by descent. Therefore, a phenotypic snapshot of a collection of cells with known genealogical structure, contains information on phenotypic dynamics. Here we use a model of phenotypic dynamics on a genealogical tree to define an inference method which allows to extract an approximate probabilistic description of phenotypic dynamics based on measured correlations as a function of the degree of kinship. The approach is tested and validated on the example of Pyoverdine dynamics in P. aeruginosa colonies. Interestingly, we find that correlations among pairs and triples of distant relatives have a simple but non-trivial structure indicating that observed phenotypic dynamics on the genealogical tree is approximately conformal - a symmetry characteristic of critical behavior in physical systems. Proposed inference method is sufficiently general to be applied in any system where lineage information is available.


INTRODUCTION
Collectives of nominally isogenic cells, be it a clonal colony of bacteria or a developing multicellular organism, are known to exhibit a great deal of phenotypic diversity and time dependent physiological variability.While often transient and reversible, phenotypic states of cells can persist on the time scale of the cell cycle and be transmitted from mother to daughter cells.This epigenetic inheritance has been a subject of much recent research and is known to involve a multitude of different molecular mechanisms [1][2][3], from transcription factor transmission to DNA methylation [4,5].Stable phenotypic differentiation is at the heart of any animal and plant developmental program [6,7].The role and extent of phenotypic variability in microbial populations is less well understood, but is coming into focus with the spread of single cellresolved live imaging [8,9] and other single-cell phenotyping methods [10].Phenotypic variability within a colony implements the intuitively plausible bet-hedging strategies of survival [11][12][13][14][15], such as persistence [16], sporulation [17] or competence [18].More generally, phenotypic variability may be implementing interesting "separation of labor" -type cooperative behavior within colonies [19], although evolutionary stability of such strategies remains a subject of much theoretical debate [20][21][22].Phenotypic variation can originate from precisely controlled patternforming interactions either from global or local intercellular signaling, as is the case in animal and plant development.For microbes, intracellular stochasticity is seen as playing a leading role in driving transitions between physiologically significant phenotypic states [23][24][25].It is an open problem to understand the extent to which the phenotypic diversity in a bacterial system is driven by cell-autonomous stochastic processes as opposed to the interaction with their neighbors, which could take a form of a feedback through local nutrient availability, secreted factors [26], or direct contact signals [27].
As an example, we consider P. aeruginosa, a common bacteria that like all others requires iron for metabolism, DNA synthesis, and various other enzymatic activities.To absorb iron from its naturally occurring mineral phase, P. aeruginosa produces and releases iron-chelating molecules called siderophores [28,29].Pyoverdine (Pvd) is a type of siderophore that is particularly suited for experimental analysis because it is naturally fluorescent [30].Pvd concentration varies significantly from one cell to another [31], which is largely due to the fact that Pvd is trafficking between cells that either sip or secrete them [29].Moreover, Pvd concentration along lineages has a correlation time of the order of two to three cell cycles [31].The feeder/recipient phenotypes are epigenetically passed on for a few generations before switching -a recent observation ( [31]) which changes the landscape of the discourse on common goods, cooperation and cheating.
Dynamics of stochastic phenotypes can be followed through multiple generations using fluorescent time-lapse microscopy and single-cell tracking [8,9].However, the number of distinct fluorescent reporters in a single cell is inherently limited by their spectral overlap.Alternatively, phenotypic heterogeneity can be measured with relative ease using destructive or fixed-cell methods (such as immuno-staining [32] and fluorescence in situ hybridization (FISH) [10]) that only provide static snapshots.Destructive measurements can however be supplemented with lineage information (kinship) that can be collected using phase time-lapse microscopy and singlecell tracking, which is less intrusive than fluorescent microscopy.We ask: how much can one say about dynamics from a static snapshot of heterogeneity and the knowledge of the relatedness of the individuals in a population?Below we shall take a constructive approach to this question, demonstrating that by adopting a certain plausible and quite general probabilistic description of phenotypic dynamics along lineages, it is indeed possible to infer the dynamics from static snapshots.We shall test the method on the example of Pvd dynamics in P. aeruginosa, comparing the inference to direct dynamical measurements.
Thus our goal here is to provide a tool for the study of epigenetic dynamics within proliferating collectives of cells.We shall focus on the cell-autonomous dynamics and mother-to-daughter transmission and relate the statistical description of phenotypic dynamics along any one lineage to the observable correlations between phenotypic states in a snapshot of cells at any given time, which, as we shall see, explicitly depend on the degree of kinship of the cells.Below, after framing our approach as an inference problem (Section 1 of Results), we shall define a class of models parameterizing phenotypic dynamics on lineages (Section 2 of Results) and explicitly calculate the form of "kin correlations" from which the underlying dynamics is to be inferred.In Results Section 3, we shall apply the approach to the data on siderophore production in P. aeruginosa colonies [31], which will allow us to compare the inference results with the direct measurement of time dependent phenotypes of all cells within the colony, allowing us to validate our approach.Last section of Results will address the question of kinship and spatial correlations within a bacterial colony.In the Discussion section we shall address other possible applications of the approach.

Inference problem for phenotypic dynamics
Consider a growing population of asexual individuals.At every generation, each individual gives rise to two daughters that with some probability inherit the phenotypic traits of their parent.This growing population is naturally represented as a tree (see Fig. 1): the most current population corresponds to the leaves of the tree, while the branches represent its history back to the "founder" cell at the root.Phenotypic dynamics unfolds along the lineage linking any one leaf to the root and correlations between "kin" arise from the fact that close relatives share more of their history.We shall assume that phenotypic dynamics is stochastic with some well defined probabilistic rule (e.g.some Markovian random process), so that the state of the "leaves" at any given time is a realization of the random process.Kin correlations are then defined as correlations between the phenotypic states of pairs, triples or, in general, m-tuples of leaves with the same degree of relatedness.More specifically we characterize kin correlations by the joint distribution describing the probability of different cells to be found simultaneously in certain states.For example, for the pair distribution, G mn (u) is defined in a given population (i.e. the single realization of the dynamics) as the fraction of all pairs of cells with the common ancestor u generations in the past that are in states m and n, G (2)  mn (u) = where s i is the state of leaf i. ||i − j|| is the genealogical distance, or the level of "kinship", between leaves i, j which is defined as the number of generations to their most recent common ancestor.N 2 is the normalization constant corresponding to the number of all pairs at genealogical distance u.Provided that the sample is large enough, averaging over observed pairs approximates the average over realizations of the stochastic process.In practice, we average over multiple observed trees (realizations of the random process) to ensure that the correlation functions fully sample the dynamics.However, because of the possible correlations this joint probability may not be equal to the product of probabilities to observe n and m on their own p n p m where p n = δ n,sj {s} and ... {s} represents averaging over the stochastic dynamics of {s}.Similarly, the 3rd order correlator takes the form, where u is the number of generations to the common ancestor of the more closely related pair and v is the further number of generations back to the common ancestor of all three nodes (see Fig. 1), and N 3 is the total number of such triplets.How much can such correlation distribution functions, defined by a snapshot of a population with known genealogy, tell us about the dynamics that unfolded on the tree?
It will be useful to also define "connected correlators" (3) which are defined so that they go to zero in the limit of large separation when joint probability factorizes.These connected correlators explicitly quantify the extent of pairwise and 3rd order correlation between the nodes on the boundary of the genealogical tree.

The minimal model of stochastic phenotype propagation
Let us begin with the simplest possible model.Assume that stochastic dynamics can be approximated by a Markov process, which means that probability to transition from state m at time t to a state n at each gener-ation depends only on the two states involved: i.e. the transition probability is some T 1 (m|n) and the probability for a cell to start in state m and end up in state n time u generations later is given by the product of the u transition matrices.We can now calculate the joint distribution for a kin pair descending from a common ancestor in state l, u-generations back where T u (m|k) is the transition probability for a particular lineage starting in state k and ending up in state m u-generations later ( m T u (m|k) = 1).This transition probability is defined by composing transition probabilities for a single generation T u (m|k) = l T 1 (m|l)T u−1 (l|k) and single generation "propagator" The 3rd order correlator can be written down in a similar way: We assume that phenotypic states effectively form a chain with transitions occurring only between neighboring states (more generally, any graph without loops would behave the same way).In this case, stochastic dynamics satisfies "detailed-balance" [33,34] (see supporting information) so that T 1 (m|n) = p The equilibrium distribution corresponds to the largest eigenvalue λ 0 = 1 and corresponding φ n : it is convenient to subtract out this component defining the "propagator" describing approach to equilibrium This is useful because connected correlators g To take full advantage of the ensuing simplifications we define correlators in the φ α -basis Ĝ( 2) The "minimal model" of phenotypic dynamics yields a remarkably simple form for the kin correlators: e.g.pair correlator Ĝ( 2) and similarly for the three-point correlator expressed in φ α -basis in analogy with Eq.( 9) where we have defined the "structure constants": Note that since φ m it follows from the orthonormality of eigenvectors m φ α m φ β m = δ αβ that C αβ0 = δ αβ .It is easy to verify that connected correlators ĝ(2) αβ and ĝ(3) αβγ are non-zero only for α, β, γ ≥ 1 and are also given by Eqs.(10) and (11).We emphasize that C αβγ is determined by φ α , the eigenstates of the two-point correlator.The two-point correlators fully determine the three-point correlation functions.
In fact, it can be shown (see SI) that all of the higher order correlators can be expressed completely in terms of λ α and C αβγ which puts a strong and readily testable constraint on predicted correlators: low order correlators can be used to define model parameters, and higher order correlators can be used to test the model.Actually, as we shall show next, already the simple diagonal form of the expression for Ĝ(2 (u) is a non-trivial consequence of assumed dynamics that must be tested.
Before we proceed with the necessary generalization of the model we note that the remarkable structure of the "minimal model" may be recognized from conformal field theory [34,35] (and can be made more familiar by defining u = log|x − y| as a logarithm of a distance between a pair of leaves and logλ α = ∆ α as a scaling dimension).This is not an accident, but rather a consequence of the tree topology and detailed-balance dynamics.The assumption of detailed-balance of the dynamics makes forward and reverse time directions indistinguishable: there is no "arrow of time" associated with lineage dynamics and the tree is effectively unrooted.At each vertex, the three edges can be detached, swapped, and reattached.One can also arbitrarily designate any of the nodes as the root; performing bulk translations.Swapping branches and translating the root manifest as conformal transformations on the boundary [34,36].Since these transformations do not change the relative distance of two bulk nodes, the correlations functions on the boundary must be conformally invariant, which accounts for their strongly constrained form [35].
Kin correlations in the Pyoverdine dynamics in P. aeruginosa In the experiments of Julou et al, [31], the fluorescence of free Pvd in each bacterium was measured using time-lapse fluorescent microscopy, while the growth of the colony was followed with phase microscopy, providing the genealogical tree.For the analysis below (see Methods), we used only the final snapshots of Pvd distribution for 9 colonies, each with 2 9 cells.Each snapshot gives us Pvd concentrations in individual cells corresponding to the leaves of a genealogical tree 9 generations deep.These concentrations were binned to three equally likely states, denoted from 1 to 3, defining respectively low, medium, and high concentration states.(Connecting to the general formulation presented above, we note that in analyzing the data we can choose our freedom to define "bins" to set p n uniform.) It is plausible to think of Pvd dynamics in the colony as a stochastic process on a tree subject to interactions that correspond to local exchange of Pvd.We begin by comparing the observed pairwise kin correlations to the prediction of our "minimal model" given by Eq. 10.To that end we construct correlation matrices for pairs of leaves conditioned by their relatedness, u, and diagonalize them.Fig. 2A depicts the eigenvalues of the two-point correlation matrices mn as a function of relatedness u.The eigenvalues are taken to the power of 1/2u to remove the trivial distance dependence (λ 2u α → λ α ); for the minimal model considered above, this scaling will result in eigenvalues that are independent of u (see Eq.10).The observed values, however, are significantly different from constant (see Supporting Information for the p-values), suggesting either a presence of interaction or a deviation from the simple Markovian or detailed balance form of stochastic dynamics However, the observed eigenvalues deviate most at u = 1 and then asymptote to a constant value with increasing u, suggesting the "minimal model" may still provide a good description of correlations among distant relatives.To test that we examined 3rd order correlators, for which the minimal model predicts Eq.( 11): an expression defined entirely in terms of the 2nd order correlators, without any additional parameters.(As noted above, this relation is the consequence of the hidden conformal symmetry of the process.)Fig. 2B,C depict the three-point correlation functions of the data.The eigenvectors of G (2) at u = 5 were used as a naive approximation of φ α m .λ α were approximated as eigenvalues of G (2) (u = 5) taken to the power of 1/10.Within statistical error, Eq.( 11), computed using G (2) at u = 5, seems to correctly predict the G (3) at distant boundary points; however, the deviation increases as closer points on the boundary are considered.At distance u = v = 1 the predicted 3rd order correlation function is significantly The absolute value denotes the norm of matrix ∆C (methods), which is normalized by the standard deviation of the finite-size fluctuations expected in the structure constants (Methods).As expected, as the separation distance of the points on the boundary increases, the correlation functions more closely resembles those of the minimal model.The largest deviation (at u = v = 1) has a statistical significance of two standard deviations.
different from the experimental observation, which is not surprising, given the already noted deviations in observed pair correlations.
Yet the fact the "conformal relation" Eq.( 11) correctly predicts the three-point correlation function for sufficiently distant relatives demonstrates that the simple minimal model already provides a reasonable approximation for the long-time dynamics, which is quite remarkable.We shall next demonstrate that the deviations at short times can be accounted for by existence of interaction between sisters.

Effective interactions between siblings
We now generalize our minimal model to allow for interactions between siblings, which can in effect be captured in the form of the mother-daughter transmission function Γ(k 1 , k 2 |n).The two point correlator is now, (13) Γ(k 1 , k 2 |l) describes possible correlation in the states of the two daughters as they "inherit" from the mother.(Because the unconditioned effect of the mother daughter transition is subsumed in T 1 (m|n), we have Finally, p l is the probability of the ancestor to be in state l.
Similarly, the 3rd order correlator is modified to: Without any simplifying assumptions on Γ(k 1 , k 2 |n), we have a more general expression for the pair correlator: Ĝ( 2) with bαβ = l,m,n pn √ p l pm Γ(lm|n)φ α l φ β m .Thanks to the consistency condition (and the fact that φ (0) m = p 1/2 m ) we have bα0 = δ α0 , so that the interaction "mixes" only the (decaying) α ≥ 1 eigenmodes of T .
In the SI we show that φ α m and λ u α are still recovered as the large u asymptotic eigenfunctions and eigenvalues of G lm (u) for distant kin, we can obtain bαβ from the observed sister correlations bαβ = l,m p lm (1).We can then, by diagonalizing λ u−1 α λ u−1 β bαβ , calculate finite u corrections to φ α m and λ α and use these to get a corrected estimate for bαβ , defining an iterative process by which we fit interaction correction to the observed pair correlators.
Pair correlators however do not fully determine the Γ(l, m|n) interaction and we next consider the 3rd order functions.Rewriting Eq. ( 14) in terms of the eigenvectors φ α l of the large u (conformal) limit we find which reduces to a multiple of the symmetric structure constants λ α λ β C αβδ in the absence of interaction, when Γ(l, m|k) = T 1 (l|k)T 1 (m|k).We also observe that Γ(α, β|0) = bαβ , is already determined by the analysis of pair correlations.Because λ δ decreases with increasing δ one can approximate by truncating the sum over δ and proceed to define Γ(α, β|δ) by least-square fitting the (overdetermined, on account of u, v dependence) linear system relating it ĝ(3) αβγ (u, v).In practice, with limited data, we retain only the leading correction term (δ = 1), which as we demonstrate below can already provide a satisfactory approximation.

Testing interaction inference on simulated data
Above inference algorithm was applied to simulated trees with random sibling interactions Γ (see Methods for details).The empirical 2nd and 3rd order correlators were measured by counting occurrences of pairs and triplets of phenotypic states as a function of relatedness using Eqs.( 1) and (2).Eigenvectors and eigenvalues of T 1 (φ α m and λ α ) were calculated using the two-point correlator at the largest distance u = 6.As discussed above, the minimal model is accurate even with interactions at large separation distances.We then calculated a "naive" prediction from the minimal model of what the correlation functions ( G0 ) should be at other distances and higher orders using Eq. ( 10)- (12).
We used three parameters to fit the deviations using the interactive form of the 2nd order correlator (corresponding to the unique non-vanishing terms in the matrix bαβ ).Another three parameters were fit to the 3rd order correlators with the series in Eq.( 16) terminated at the leading order δ = 1 (SI).Correlators at all distances u (u, v) were fit simultaneously to determine the free parameters in bαβ (truncated Γ1 (α, β|δ)) that minimized the least-square difference between the observed and predicted ĝ( 2) αβγ (u, v)).Inferred transition matrix Γ was then computed from Γ(α, β|δ) using Eq.(17).Fig. 3 shows the reduction in deviation of the predicted correlators from observed correlators as interactions are introduced into the minimal model.Fitting the three-point correlators clearly improves the inference of the transition matrix Γ (Fig. 3D).
Inferring the interactions in P. aeruginosa We now return to P. aeruginosa and attempt to infer the form of the interactions from the observed kin correlations of Pvd.Following the above recipe, we have fit the free parameters in bαβ to the observed two-point correlation functions, correctly capturing the deviations in the eigenvalues of G (2) matrices with u (Fig. 2A, colored curves).The inferred switching rates T (k|s) = m Γ(m, k|s) are consistent with the switching rates that have been measured by observing the phenotypic states of parents and daughters in the bulk of the tree (Fig. 4A).The apparent decrease in the probability of conserving the parental phenotype in the bulk dynamics is due to the ambiguity in determining the parent phenotype; Pvd concentration can fluctuate significantly during a cell cycle.
The inferred Γ at this order -the limited nature of data does not allow us to fit higher order correlators-contains a clear signature of interactions.The probability of one daughter cell having a low Pvd concentration while the other has a high Pvd concentration is significantly reduced compared to the non-interacting case (Fig. 4B).This is consistent with nearest neighbor exchange of Pvd, which reduces sharp gradients (1-3 states) between neighboring cells, in particular siblings.Fig. 4B also shows that the change in likelihood of occurrence of certain sibling pairs is independent of the state of the parent.
Moreover, from the calculated decrease in the likelihood of observing 1-3 siblings pairs and the time-scale for division ( 40 min), we can crudely estimate the exchange rate between neighbors.Define c = c − c neigh , the difference between Pvd concentration of a cell and its neighbor.Exchange decreases c over time, dc /dt = −γc .If exchange was infinitely fast (or occurred with probability 1 at each generation) we would never observe 1-3 pairs.Our inferred interaction indicates that 1-3 pair occurs with 1/2 the frequency expected in the absence of interactions.At each generation, probability of exchange is roughly 1/2.At each exchange, ∆c ∼ c and ∆t ∼ 40 × 2 min, yielding the crude order of magnitude estimate γ ∼ 0.01/min.This prediction is consistent with the value calculated from the direct observation of Pvd dynamics following individual cells [31].

Spatial interactions
In this section, we address the spatial nature of Pvd exchange.First, we argue why a model that only includes interactions between sisters can be used to infer interactions that take place between all neighboring bacteria in the colony.Next, we try to estimate the expected spatial correlations in Pvd concentration of cells in a colony.To do so, we restore the interactions inferred from siblings to all neighboring pairs of bacteria.Local interactions in P. aeruginosa colonies are believed to be due to the exchange of Pvd.While exchange of Pvd between neighbors can correlate concentrations found in sister cells, exchange is not limited to siblings and occurs between all adjacent bacteria regardless of the degree of relatedness.Nevertheless, we shall argue that the effect of local exchange on the distribution of Pvd on the genealogical tree can be effectively represented through interactions between siblings.Fig. 5 shows the relationship between spatial distance and the degree of relatedness in the colonies followed in the experiments.Each bacterium has on average 7 neighbors, defined as cells located within 1.5 cells widths.Although it is more likely to find the sister as one of cell's neighbors (with probability 0.4 compared to 0.03 for any particular seventh-cousin), the neighborhood is dominated by distant cousins.This is because the num-ber of cousins grows exponentially for each additional generation back to the common ancestor.Thus, exchange with near neighbors is dominated by the exchange with distant relatives, which effectively averages over the whole distribution without contributing to kin correlations.In the limit of a well-mixed population, where neighbors are random nodes from the current generation, local exchange would contribute exactly nothing to kin correlations: any interaction that is not systematically coupled to the topology of the tree is irrelevant.Bacteria on the plate, however are not that well mixed: the sister cell is systematically a neighbor and couples the exchange interaction to the topology of the tree.Although close relatives are also overrepresented among near neighbors, we found that to a good approximation to account for local Pvd exchange it suffices to introduce interactions between sisters.
In the absence of direct spatial interactions, it is possible to map kin correlations to spatial correlations -the probability of observing a pair of bacteria in states m and n at separation distance d, where p(d|u) is the probability of observing a relative of lineage distance u at separation distance d.This distribution is determined empirically by tracking the growth of the colony and is depicted in Fig. 5B. 2 u−1 is the number of relatives at lineage distance u.
In P. aeruginosa colonies, however, direct spatial interactions exist.Local exchange of Pvd implies that kin correlations do not capture all the spatial correlations.We must reintroduce interactions between neighbors that were averaged out when we computed the kin correlators on the lineage tree.A simple way to do so is using the following observation: progenies of distant ancestors that by chance remain nearest neighbors of closer ancestors are in effect more highly correlated than would be expected from degree of relatedness alone.This is because nearest neighbors are more likely to be in the same phenotypic state.Using this observation (see Methods) and the empirical measurement of the probability of finding a relative at lineage distance u as a nearest neighbor (Fig. 5C inset), we can estimate G (2) mn (d) without using any fitting parameters.Fig. 5D shows that the prediction is in good agreement with the observed spatial correlations.

DISCUSSION
In this study we have systematically related phenotypic correlation as a function of kinship, or kin correlations, to the underlying epigenetic dynamics.Introducing a rather general class of models we were able to formulate a method for inferring dynamical parameters from static measurements on cell populations supplemented by the lineage information.This method was then applied to the data on the dynamics of Pyoverdine in P. aeruginosa colonies with the result validated by the comparison to the direct measurements of Pvd dynamics along cell lineages.
Our analysis was based on the "minimal model" of epigenetic dynamics which assumed 1) independent transmission of phenotype from mother cell to its two daughters; 2) detailed balance property of stochastic transitions between phenotypic states.The former assumption was subsequently relaxed, replaced by a general probabilistic model of epigenetic transmission that allowed to parameterize interaction between sister cells.The profound advantage of our "minimal model" starting point is the highly constrained form of the correlations that it entails.Because of its hidden conformal symmetry [34] all multipoint correlators can be expressed in terms of the pair-correlators, making the validity of the model easy to test and deviations easy to uncover.Remarkably, our minimal model provides a good description of correlations among distant P. aeruginosa in as observed in the experiments of Jilou et al [31], which we found to be approximately conformal.
While the proposed approach relates phenotypic heterogeneity of the population to epigenetic dynamics it has a number of obvious limitations.Virtually by definition it is blind to phenotypic dynamics occurring on the time scale shorter than cell cycle and is not transmitted from mother to daughter.Such fluctuations do not contribute to kin correlations, furthermore, they would tend to mask the epigenetically heritable phenotypic variation.Another limitation was evident in our analysis of Pvd dynamics.Our focus on epigenetic dynamics along lineages does not allow for easy incorporation of information on spatial proximity.As a result, instead of directly estimating the interactions due to local exchange of Pvd, we estimated the effect of this interaction on kin correlation which comes about because siblings are more likely to be exchanging with each other, than with anyone else.Hence, our inference yields effective interaction, the origin of which must be examined to be properly interpreted.Other limitations of the present approach, such as discretization of the phenotypic state space state and discretization of time (corresponding to synchronously dividing population) are less fundamental.The model can be generalized to relax these assumptions if warranted by the system under consideration and the extent of available data.
Our example of inferring Pvd dynamics was meant as a proof of principle.Dynamics of the naturally fluorescent Pvd can be directly observed using time-lapse fluorescent microscopy.Dynamic reporters in general, however, require non-trivial genome engineering, and at best are limited to a few spectrally distinct fluorophores.By contrast, measurements such as fluorescence in situ hybridization (FISH) and immuno-staining do not have these limitations but only provide static snap shots [10,32].High throughput technologies can simultaneously measure numerous biomarkers in large populations at a single-cell resolution [10,[37][38][39][40][41][42][43], resulting in a snap-shot of a high-dimensional phenotypic space.Our approach is ideally suited for such applications.Phase time-lapse microscopy [8,9] and potentially single-cell sequencing [44] can probe relatedness between individuals.Combining the two, above method can elucidate the underlying dynamics.
FIG. 4. Inferring the form of interactions in P. aeruginosa.A) The inferred switching rates (probability per generation) between the three Pvd states (low, medium, and high) along a single lineage, T (k|s) = m Γ(m, k|s).Γ is the inferred transition matrix by fitting the interacting model to the two-point correlation functions of the observed trees up to second-cousins.On the right, the transition rates are deduced from direct observations of Pvd states of parents and daughters in the bulk of the tree.The phenotypic state seems less likely to be conserved from direct measurements in the bulk.The discrepancy, however, is due to the ambiguity in determining the state of the parent.B) The change in likelihood of observing sibling pair 1-3 from a particular parent state s due to interactions.This is the ratio of the joint-distribution of the sibling states Γ over a separable distribution constructed from the marginal distributions T .The separable distribution corresponds to independent lineages with no interactions.Occurrence of 1-3 siblings is significantly smaller with interactions.The change in likelihood is also independent of the state of the parent s.This is consistent with exchange interactions that decrease Pvd differentials between neighboring cells regardless of the lineage history.The inferred interactions is consistent with what is directly measured using the phenotypic states in the bulk of the tree (right).Transitions from parent state 2 to daughter states 1-3 are rare and their change in likelihood was not statistically significant in our limited data set.FIG. 5. Spatial proximity as a function of relatedness.A) Neighbors are defined to be within 1.5 cell widths of a given bacterium.The red bacterium above has 6 neighbors shaded in blue.The average number of neighbors is 7. B) Distribution of pair-wise spatial distances (in units of average bacterium width) between all pairs of bacteria whose common ancestor is u generations back; computed over 9 colonies of 9 generations.Distribution at each value of u has its maximum normalized to 1; color scale shown on the right.The spatial distance on average increases with increasing genealogical distance.However, there are large fluctuations.C) Probability that a randomly chosen nearest neighbor pair has a common ancestor u generations back.It is most likely to find distant cousins (u = 7) adjacent to each other.This is because there are exponentially more cousins going back each generation to the common ancestor.If we normalize for number of relatives at distance u (inset), we observe that with probability ∼ 0.4 the sister will be adjacent to its sibling.However, a particular distant cousin is a neighbor with probability less than 0.03.D) Second and third eigenvalues of the spatial correlation matrix G (2) mn(d) as a function of separation distance; the first eigenvalue is trivially equal to one.The solid lines are the prediction of the model (with no fitting parameters) and capture the change in correlation with decreasing separation distance.

Analyzing the experimental data
The experimental data was in the from of a series of images captured from the growth of P. aeruginosa microcolonies -for details of the experiments, see [31].9 microcolonies were analyzed.The boundary was defined to be the population on the last image.The distance of a pair of boundary nodes was calculated by counting the number of divisions from each node to their common ancestor (CA) -determined by going tracing back their history in the images.Although the division time of the bacteria was on average 40 mins, fluctuations were observed; number of generations to the CA were sometimes not the same for the two nodes.For these cases, we randomly selected the value for one of the nodes as the distance.Same method was used to determine the distances between three boundary points (values of u and v).
The signal (Pvd concentration) in each image was calculated as follows: the fluorescence intensity in the cell was subtracted from background fluorescence in that image and then normalized by the mean signal of all the cells in the image.Normalization removes the effect of increase in the total Pvd concentration in the microcolony over time.The resultant signal distribution is stationary (see Supporting information).For the boundary cells, we discretized the signal into three phenotypes (low, medium, and high Pvd levels; respectively 1 to 3) by binning the signal to ensure a uniform distribution (equal numbers) of each phenotype.Fig. 2C.The statistical error of the experimental data was estimated by simulating the inferred transition matrix for 64000 iterations of 9 trees of 9 generations.The eigenvalues and eigenvectors of the inferred transition matrix were obtained from the observed two-point correlation at u = 5. φ α m are the eigenvectors of G (2) (5) and λ α are its eigenvalues to the power of 1/10.C 0 is calculated using the φ α m and Eq.( 12).C obs is estimated from the 3rd order correlators using Eq.(11).In Fig. 2C, the deviation is calculated using the matrix norm, by the standard deviation of C 0 (u, v) over the 64000 iterations.We use Frobenius norm, where, for example, a N × N matrix has norm ij .The bulk transition rates in Fig. 4 were determined by counting all occurrences of the phenotypic states of parent and daughter cells in the observed trees.The phenotypic state of a bulk node was taken to be the Pvd state at the last time point of the cell cycle.The results were not sensitive to this choice.The phenotypic state of the bulk nodes were discretized using the same threshold as the boundary nodes.

Simulations
We simulated 320000 trees of size 6 generations using three states (phenotypes), M=3.The transition matrix T (m|n) is a randomly chosen symmetric 3 × 3 matrix whose columns add to 1.We also required that the minimum eigenvalue of T be larger than 0.5 to ensure longlasting fluctuation modes.For each tree the root was uniformly drawn number from 1 to N. For non-interacting trees, at each generation, every node gives rise to two offsprings.Each off-spring is independently assigned color m with probability P (m|n), where n is the color of the parent.
We did so by starting from a non-interacting version constructed from the randomly generated marginal distribution T (m|n) (see above), Γ 0 (m 1 , m 2 |n d ) = T (m 1 |n d )T (m 2 |n d ).An inseparable distribution was created by adding a noise term to each element of the matrix (drawn from a Gaussian distribution with mean 0 and standard deviation 0.01), in such a way that the marginal distribution was not changed, and the symmetry condition satisfied, Two-point correlation functions G (2)  na,n b (u) was computed by counting all pairs of the final generation with colors na and n b whose common ancestor is u generations back and dividing by total number of such pairs.The ordering of the pair is discarded ensuring that matrix na,n b (u) is symmetric.Three-point correlation function na,n b ,nc (u, v) was similarly computed.The order of the closest pair is discarded ensuring that G (3) is symmetric in indices n b and nc.
In Fig. 3C, we plot the difference between the predicted minimal model three-point correlators (computed using Eqs.(11), (12) with φ α m and λα equal to eigenvectors of G (2) (u = 6) and its eigenvalues to the power of 1/12 respectively) and the the measured three-point correlators.For each point u and v, we have plotted the norm of the difference of the two matrices.For the interacting model, the predicted correlation functions are determined by first fitting the 3 bαβ to the observed two-point correlators, and three Γ(α, β|1) parameters to the observed three-point correlators.
The parameters of the interacting theory were fit to the simulated/experimental data using an unconstrained nonlinear optimizer implemented using the line search algorithm [45], programmed in Matlab R2011.In Fig. 4B, the deviation between the predicted interaction matrix and the actual one is defined as: ∆ Γ(s) = | Γ(s) − Γ| F , where s=0,1,2 denotes respectively the prediction of the minimal model, fit to the 2nd order correlators, and the fit to 2nd and 3rd order correlators.

Lineages in space
Distance between any pair of bacteria in a colony is defined as the minimum distance between either pole or centroid of one bacterium to either pole or centroid of the other.Nearest neighbors are defined as pairs whose distance is less than 1.5 times the average cell width in the colony.Fig. 5 was computed using spatial information from 9 colonies of 9 generations.The average coordination number is 7.

Predicting spatial correlations
The descendants of an ancestor at lineage distance u > u that remain nearest neighbors of the ancestor at lineage distance u have undergone exchange with the latter ancestor for u − u generations.These bacteria contribute to the spatial correlations not as relatives of distance u but rather u.
We include the contribution of these "effective" ancestors as follows, where θ(∆u) = 1 τ e −∆u/τ is the probability that exchange has not happened in ∆u generations.τ is the mean waiting time for exchange, which we estimated roughly as two generations (see main text), τ = 2. p(r|u) is probability of finding an individual of relatedness u at spatial distance r. q(u) is the empirically observed probability that a particular cousin at lineage distance u is a nearest neighbor (Fig. 5C inset).N is the normalization constant.
The authors thank Stephen Shenker, Douglas Stanford, Richard Neher, and David Bensimon for stimulating discussions and helpful comments.This research was supported in part by the National Science Foundation under Grant No. NSF PHY11-25915.BIS also acknowledges support from NIH Grant R01-GM086793.ND acknowledges support from grant ANR-2011-JSV5-005-01.

Ĝαβγδ
Following the minimal model, we can use the diagonal form of the two point correlation functions, Above correlation functions can be converted to correlation functions in space of phenotypic states by performing a transformation.

Perturbative corrections to the conformal limit
We calculate explicitly the leading order corrections to the conformal limit for M = 3.In the main text, we showed that with no interactions Γ0 (α, β|δ) = λ α λ β C αβδ .We account for interactions by adding a correction to the structure constants, Our goal is to fit the corrections D αβδ to the observed correlation functions.The two-point correlation function with interactions can be represented in a matrix form.
An observer, unaware of local interactions, would diagonalize the above matrix and infer a set of effective eigenvalues and eigenvectors.The eigenvalues are to zeroth order the minimal model eigenvalues -those of a transition matrix constructed from the marginal distribution of Γ-plus a correction that vanishes with increasing u.
In the limit u → ∞ (distant boundary points), the inferred eigenvalues and eigenvectors approach their noninteracting values.u-dependence of the eigenvectors and the scaled eigenvalues of the two-point correlation function is demonstrated using simulations in Fig. 3 of the main text.
To fit the measured three-point correlators, we introduced three more parameters, D 111 , D 121 , and D 221 .These parameters capture the leading order corrections to three-point and higher-order correlators.

Broken detailed balance
Let us consider the general case of Markovian dynamics, which does not satisfy Detailed Balance along a single lineage: with n φ α n ψ β n = δ αβ .{ψ α } form a non-orthogonal basis (assuming no degeneracies).If both φ and ψ are or- thonormal basis then it follows that trivially φ = ψ.Ĝ( 2) where B αβ = n p n ψ α n ψ β n and Cν αβ = n φ ν n ψ α n ψ β n is the generalized "structure constant", which is no longer fully symmetric -the fact we acknowledge by setting index ν apart from α, β. (This asymmetry reflects the directionality of the dynamics in the absence of detailed balance, distinguishing the "mother" state, here corresponding to index ν, from the daughters α, β.).
An observer can infer the ψ vectors and the eigenvalues λ from observing the two-point correlation functions.However, the naive structure constants constructed from the ψs in the form of the minimal model is clearly different than the actual structure constants above.The predicted three-point correlators using these structure constants would differ from the observed correlation functions.
Broken detailed balance, much like interactions, introduces 'mixing' of modes.In the main text, we showed that a signature of sibling interactions was that the two point correlator matrix bαβ was no longer diagonal.Can an observer distinguish sibling interactions from broken detailed balance by observing only the two-point correlators?With enough data, the observer can infer the ψ vectors and eigenvalues λ.If the ψs form an orthogonal basis, the dynamics were set by sibling interactions, if not, the dynamics did not satisfy detailed balance.The difference between the two cases becomes easier to spot at higher order correlators.The structure constants for sibling interactions can be constructed form eigenvectors of two point correlators to leading order.Structure constants under broken detailed balance, however, have a φ dependence (as shown above) that can not be determined from the two point correlation functions.

Statistical analysis of the experimental data
In the main text, we showed that the eigenvalues of the transition matrix, λ i , computed from the two-point correlation function at various distances, u, deviated from the minimal model prediction.Here, we demonstrate that this deviation is statistically significant given the finite size of our data -9 trees of 9 generations.First, we checked whether eigenvalues computed from the two-point correlation functions had predictive power given the statistical fluctuations in a finite data set.We simulated many trees of the same size as the data for random transition matrices of size 3. Fig. 7 depicts the predicted eigenvalues versus the actual ones for two-point correlation functions at different distances.The first eigenvalue is trivially one; only the second and third eigenvalues are shown.The predictions correlate well with the actual values when the eigenvalues are large.A large eigenvalue corresponds to a long-lived fluctuation mode -see main text-and is easier to detect.Fortunately, the eigenvalues estimated from the correlation functions in the data are large enough (roughly 1, 0.8, and 0.6 -see figure) to fall in the region that the predictions are well-correlated with the actual values.

Analysis of Pvd signal in the bulk
We had access to the Pvd signal in the bulk of the tree from time-lapse fluorescent microscopy during the growth of the colonies.As discussed in the main text, the bulk dynamics of Pvd is more complex than the effective theory constructed using the boundary nodes.For instance, the Pvd concentration in a given bacterium can change significantly between divisions.Nevertheless, we set out to confirm the key attributes of the bulk dynamics that must hold for our approach to be valid.We considered fluorescence of Pvd from images taken at earlier time points (frames).Same normalization was used as for the boundary nodes (see Methods in the main text), but involving only the cells in the given frame.Fig. 9 shows the distribution of Pvd in the micro-colony for different time points.The distribution is stationary consistent with existence of detailed balance.

FIG. 1 .
FIG. 1. (A)Phenotypic dynamics on a tree.The phenotypic state (color) of each node stochastically changes when inherited from the parent.Only the boundary nodes are accessible at the time of observation.Stochastic variation on the boundary is characterized by kin correlation functions, for example, the probability of observing blue and red colors on two nodes whose common ancestor was three generations back (u = 3).Similarly, the three point correlation function can be investigated for a set of points, where the common ancestor of the two closest points is two generations back (u = 2), and that of all three, an additional two generations back (v = 2).(B) Pyoverdine distribution in a P. aeruginosa colony.Genealogy is captured by imaging the growth of the colony.At the last time-step (390') the phenotype of the bacteria -the intensity of Pvd fluorescence-is imaged and assigned to each node.

m
mn = A nm -a symmetric matrix.p m , the equilibrium probability of state m, satisfies n T 1 (m|n)p n = p m and hence n A mn p .It is useful to diagonalize the symmetric matrix: Aφ α = λ α φ α and rewrite the transition matrix in terms of its orthonormal eigenvectors φ α and eigenvalues λ α : lmn can be found by substituting T for T in the corresponding expressions for G

FIG. 2 .
FIG. 2. Kin correlations in P. aeruginosa data.A) Eigenvalues of the two-point correlation matrix G (2) mn(u) of the experimentally observed trees.The trivial distance dependance of the eigenvalues is scaled out (by taking the eigenvalues to the power of 1/2u), such that the minimal model would have constant eigenvalues.Solid grey lines are the actual eigenvalues.The largest eigenvalue is always 1 corresponding to the equilibrium mode α = 0.The dashed grey lines are the "naive" minimal model prediction calculated using the observed G (2) (u = 5).The colored lines are the best fit of the interacting theory to the observed two-point correlation functions.B) The three-point correlation function G (3) m 1 ,m2,m 3 computed from the data at distances u = v = 1, and u = v = 3.The correlation function becomes more uniform as the separation distance increases.C) Minimal model allows us to calculate the structure constants by observing only the 2nd order correlators.Deviation of the minimal model structure constants (Eq.(12)) from the actual structures constants inferred from the data, ∆C = C αβδ − G (3)

( 2 )
lm (u).Hence they can be directly estimated from the large u data.With φ α m obtained by diagonalizing G

FIG. 3 .
FIG.3.In silico test of the inference algorithm.(A) Eigenvalues of the two-point correlation matrix G(2) of nodes whose common ancestor is u generations back for trees generated from a randomly chosen interacting transition matrix Γ (methods).The trivial scaling with distance has been removed, such that the eigenvalues of the minimal model with no sibling interactions would be constant.Solid grey lines are the actual eigenvalues computed from simulated trees with interactions (Methods).The largest eigenvalue is always 1 corresponding to the stationary mode.The dashed grey lines are the minimal model prediction calculated using the observed G(2) (u = 5).The colored lines are the interactive model fit to the observed two-point correlation functions.(B) The observed three-point correlation function G (3) (m1, m2, m3) is depicted as 3×3×3 matrix for u = v = 1 (left).The fractional deviation of each element of the predicted three-point correlator (Eq.(11)) from its actual value, ∆G = G − G obs , (depicted schematically as the diameter of the spheres) for the minimal model and interacting model.The predicted correlationfunctions clearly improves by introducing interactions.(C) The cumulative deviation in the predicted G (3) at different distances (u, v) for the minimal model and with corrections from interactions.Deviation is the norm of the matrix ∆G.Deviations are larger for closer boundary points.(D) Deviation of the inferred transition matrix Γ from the actual one Γ for many randomly generated Γs.Distance based on the norm is used to quantify the deviation, ∆ Γs = | Γs − Γ| (Methods), for minimal model s = 0, interactions fit to the the 2-point correlators s = 2, and to the three-point correlators s = 3. Fitting the three-point correlation functions improves the inference.

2u D 2 120 1 +FIG. 7 .
FIG. 7. Simulated trees with random transition matrices.Predicted versus actual eigenvalues (second and third; left and right respectively) for simulated trees with random transition matrices that satisfy detailed balance and have no interactions.The colors denote the distance u at which the two-point correlation function is used to estimate the eigenvalues; blue, green, and red, correspond to u = 1 to u = 3 respectively.The red arrows show the eigenvalues estimated from the data.The small eigenvalues have almost no predictive power since the amplitude of their corresponding propagation mode is smaller than the statistical fluctuations. φ1

FIG. 8 .
FIG. 8. Statistical significance of the deviation in the observed two-point correlation functions from that of the null-hypothesis.Top) Histogram of the third eigenvalue of the two-point correlation matrix of simulated trees with the same transition matrix as the null-hypothesis minimal model, for distances u = 1 to u = 4, from left to right.The red arrow shows the value from the measured two-point correlation function of the data.The p-values are shown on the top.

Fig.2A in
Fig.2Ain the main text depicts the departure in the measured two-point correlation functions from the prediction of the minimal model.However, deviations are expected simply due to the statistical fluctuations in the correlation functions from finite size effects.To quantify the statistical significance of the observed deviation, we simulated 10000 iterations of 9 trees of 9 generations using a transition matrix deduced from the observed twopoint correlation function at u = 6 (the starting point of the minimal model) and with no interactions.Fig.7shows the distribution of the third eigenvalue of the resulting two-point correlation function of these trees for different distances.The observed values from data are also shown as red-arrows.The p-values show that the deviation is statistically significant for distances u = 1 to u = 3.

FIG. 9 .
FIG. 9. Distribution of the Pvd signal of all cells in a given frame (time-point) averaged over 10 adjacent frames and all 9 trees of the data.The legend shows the frame-number.Each generation is roughly 10 frames.The distribution is stationary.