A Markovian dynamics for Caenorhabditis elegans behavior across scales

Significance Complex phenotypes, such as an animal’s behavior, generally depend on an overwhelming number of processes that span a vast range of scales. While there is no reason that behavioral dynamics permit simple models, by subsuming inherent nonlinearities and memory into maximally predictive microstates, we find one for Caenorhabditis elegans foraging. The resulting “Markov worm” is effectively indistinguishable from real worm motion across a range of timescales, and we can decompose our model dynamics both to recover and reveal behavioral states. Finally, we connect postures to trajectories, illuminating how worms explore the environment in different behavioral states.


INTRODUCTION
From molecular motors contracting muscles, to neurons processing an ever changing environment, or the large-scale diffusion of hormones and other neuromodulatory chemicals, animal behavior arises from biological activity across innumerable spatial and temporal scales.With an instantaneous snapshot of all of these variables, the future behavioral state of the animal would be uniquely defined, a biological setting for the demon of Laplace (see e.g.[1]).Of course, such an approach is practically unrealizable.We are limited to a much smaller set of observations and the unobserved degrees of freedom will generally induce non-Markovianity, or memory, to the dynamics of the variables that we do measure [2,3].In animal behavior, interpretations of this memory guide our understanding of the complexity of the process [4,5].But what if we could use our observations to construct memory-full state variables that admit predictive, yet minimal-memory dynamics [6,7]?
The construction of such dynamics appears daunting.We may even conclude that this is impossible if it were not for the fact that it is done routinely in physical systems.Indeed, it is often the case that a subset of observable functions is enough to capture behavior at a particular scale.Hydrodynamics, for example, can be formulated with effective variables such as fluid velocity, density, or temperature: their memory only coming from the previous state.In behavior, we expect the emergent reconstructed dynamics to be generally high-dimensional in order to account for the multitude of unobserved mechanisms.Yet our approach also suggests a principled coarse-graining.Since the dynamics of the reconstructed states are Markovian, the emergent timescales of the (nonlinear) system are naturally ordered by the eigenvalue spectrum of a linear evolution operator, or transition matrix in the case of discrete states.The eigenvectors associated with the gaps in the spectrum indicate slow collective modes and provide natural targets for coarse-graining.In the hydrodynamic example of ∼ 10 23 interacting molecules, these modes are the effective variables.
Here we seek such Markov dynamics from the time series of posture in the foraging behavior of the nematode worm C. elegans, an important model organism in genetics and neuroscience [8][9][10].For both the worm and animals generally, the collection of high-resolution behavioral data has been greatly accelerated by advancing techniques for pose estimation via machine vision [11][12][13][14][15], combined with computational and imaging improvements.Such measurement advances demand new behavioral understanding: analyses, models, and theory of posture-scale dynamics [4,16,17].
We implement a principled, generally-applicable framework which combines delay embedding with Markov modeling [6].In this approach, we seek to overcome the partial observability of behavioral dynamics; variables which influence behavior but are instantaneously hidden become apparent over time and Markov predictability provides the quantitative measure of a selfdetermined system.Posture itself is a very complicated function of its underlying biological variables.In such situations, an initial expansion of dimensionality can simplify computations like function estimation and classification.We thus trade the complex modeling of a low-dimensional time series for the simpler modeling of a much higher-dimensional state space: the encoding of the unobserved degrees of freedom through time delays drastically simplifies our theory, leading to a powerful yet simple description of the emergent nonlinear Markovian dynamics.
While Markov approaches have an extensive history, perhaps most familiarly in Markov Chain Monte Carlo sampling of equilibrium distributions [18], substantially less attention has focused on a Markov encoding of actual dynamics, especially with a large number of states.Importantly, we note that there is no guarantee that our approach will work; for example, the number of necessary delays may be computationally prohibitive.But even this "failure" would provide important information about the memory of the system.On the other hand, if we are successful, we will be left with a finite set of observables that are approximately self-determined, measurable, and whose dynamics span the timescales that are relevant to the phenomena of interest.Such observables are likely to be biologically meaningful.
We find state variables for worm behavior that exhibit Markovian evolution across the multiple timescales of C. elegans foraging behavior: from fine-scale posture movements to "run-and-pirouette" strategies.Additionally, the macroscopic variables we reveal are not some baroque non-physical mathematical functions but rather correspond to interpretable behavioral motifs.We rediscover canonical behaviors from the rich history of C. elegans ethomics, as well as describe new ones.Each of these motifs is associated with its own characteristic timescale, and with them we provide a new hierarchical subdivision of behavior.We show how the dynamics of these macroscopic variables can be propagated through a model of the organisms physical interaction with the environment to accurately predict locomotion from posture.Finally, we dissect the function of these behavioral motifs by investigating their relation to the exploration and exploitation of food sources.

SHORT-TIME BEHAVIORS AS MAXIMALLY-PREDICTIVE POSTURE SEQUENCES
On a 2D agar plate, worms move by making dorsoventral bends along their bodies [19].At the shortest timescales (∼ 1 s) these traveling waves along the body give rise to forward, backward, and turning locomotion [20][21][22].We show here that this organization emerges naturally from short posture sequences, formally a delay embedding first introduced in the theory of dynamical systems [23][24][25][26][27].
We employ a previously analyzed dataset [28] composed of 35 minute recordings of 12 young-adult labstrain N2 worms freely moving on an agar plate, sampled at δt = 1/16 s (a total of 33,600 frames per worm).From high-resolution videos, we measure the worm's body posture using a rich low dimensional representation of the centerline, expressed as five "eigenworm" coefficients ⃗ a = [a 1 , a 2 , a 3 , a 4 , a 5 ] ∈ R 5 [20,28], Fig. 1(a).Posture itself already carries important behavioral information, for example the joint distribution of the first two eigenworm modes is approximately circular, describing the phase of the crawling locomotory wave [20].But posture dynamics carries even more information, such as the direction of the wave which distinguishes forward crawls from reversals.
We capture posture dynamics through a maximally predictive sequence space [6] constructed by stacking K delays of the posture time series, and increasing K until we have maximized the predictability of the resulting dynamics, as measured by the entropy rate, Fig. 1(bleft).While other quantitative approaches to behavior also start with short dynamical motifs [29,30] our work is distinguished by a focus on maximal predictability to determine the motif length.To estimate the entropy rate, we partition each sequence space (indexed by K) into N microstates s i using k-means clustering so that the posture dynamics now appear as transitions between microstates, a Markov chain.We build the partition using all 12 × 33600 observations so that the microstate labelling is consistent across worms.We approximate the entropy rate as that of the Markov chain inferred from each worm's symbolic sequence and choose the largest N before finite size sampling reduces the estimated entropy 1 .This "maximum entropy" partitioning (with N * = 1000 partitions) requires a large number of microstates, but enables our model to be maximally expressive within the limits of the data (which set the onset of finite size effects in the entropy estimation).After K ≈ 8 frames = 0.5 s and the entropy rate curves 1 The entropy rate of the Markov chain should not be confused with the Kolmogorov-Sinai (KS) entropy, which is an intrinsic property of the dynamics.Indeed, the KS entropy can also be estimated using our approach, yielding accurate estimates that agree with the sum of positive Lyapunov exponents [6,22].
start to collapse, and we set K * = 11 frames = 0.6875 s to define the maximally-predictive sequence space X K * of dimensions 5 × 11, Fig. 1(b-right); lengthening the sequences does not increase predictability, Fig. S1(a).While Fig. S1(a) suggests that K * = 6 frames is enough to maximize predictability, we choose a slightly higher value of K * to ensure that we fully resolve the predictive information lying in past posture measurements.We visualize the high-dimensional X K * by projecting into two dimensions using the UMAP manifold learning algorithm [31] (see SI Appendix: Methods), Fig. 1(c).Color-coding according to the worm's body wave phase velocity ω = − 1 2π d dt tan −1 (a 2 /a 1 ) , [20] Fig. 1(c, left), and overall curvature (obtained by summing the tangent angles along the body), γ = i θ i , reveals that distinct short-time behavioral motifs, corresponding to forward, reversal, and turning movements, naturally correspond to different regions of the maximally predictive state space, Fig. 1(c, right).In other words, while the instantaneous posture itself ⃗ a(t) is not enough to disentangle different behaviors, a point in the sequence space X K * (t) uniquely corresponds to a particular short-time behavioral motif.

THE MARKOV WORM
Our combination of sequence embedding and partitioning naturally results in a symbolic dynamics where each state probabilistically transitions to a new state [6].These dynamics are described through the master equation where p i (t) is the probability of observing state s i at time t, and P ij (τ ) is a transition matrix constructed by counting transitions between microstates s i (t) and s j (t + τ ) after a delay τ .We note that P is a stochastic matrix so that each transition probability P ij ≥ 0 and j P ij = 1 for all microstates i.The waiting time τ is a free parameter which can be as short as δt, as we used previously for the maximally-predictive embedding, and we will leverage this freedom to emphasize long-lived aspects of the worm's movement.We have thus transformed the behavioral dynamics into a (generally non-equilibrium) Markov chain with many states, and where each state itself carries short-time dynamical information.
As described in the previous section, for the worm's posture dynamics, we set the number of microstates as N * = 1000 so as to maximize the amount of information with respect to the partitioning, Fig. 1(b).We then set the transition time as τ * = 0.75 s so that the relaxation times of the long-lived dynamics are approximately independent of τ , as rigorously true in a Markov process (see Fig. S1(b,c) and SI Appendix: Methods).In practice, from the 33,600 − K * − τ * observations of each worm, we count how many times state s i reaches state s j in a timescale τ * .Since τ * is much shorter than the mixing time of the dynamics, most transitions occur locally in X K * , so we expect that the number of nonzero transition probabilities is much smaller than N * × N * .Indeed we find that from each s i the dynamics only reaches ∼ 10 other states within τ * yielding transition matrices with ≈ 10, 000 entries, Fig. S4.Despite the conceptual simplicity of our Markov chain model, Eq.1, we next show that it accurately predicts C. elegans foraging behavior across scales, from fine-scale posture movements to long time scale transitions between behavioral states, Fig. 2,S4.

Predicting behavior across scales
Starting from the initial state of an individual worm, which corresponds to a discrete microstate s i (t = 0) of the K * space of posture sequences, we simulate symbolic sequences by sampling from the conditional probability distribution P w (s j |ŝ i ), Fig. 2(a), where ŝi is the current microstate, s j are all possible future microstates after a time scale τ * and P w (s j |ŝ i ) is the i-th row of the transition matrix inferred for worm w.The result is a sequence of microstates with the same duration as the worm trajectories, but with a sampling time δt = τ * .From each microstate we can then obtain a nearly continuous time series of "eigenworm" coefficients {⃗ a(t)} through the sequence of K * postures in each state X K * (note that K * and τ * are quite close in this case).These dynamics are effectively diffusive in the space of posture sequences: hopping between microstates according to the Markov dynamics, followed by random selection from the set of posture sequences X i K * within each visited microstate s i .The posture time series generated through this procedure are nearly indistinguishable from the data, Fig. S2 and SI Movie 1. Quantitatively, the autocorrelation functions of the simulated time series, Fig. 2(b), S4(a), capture the correlations observed in the data, and the distribution of mode coefficients agrees with the steadystate distribution, Fig. S3.
In addition to the finescale posture dynamics, our model also predicts the rate at which forward movements are interrupted by biologically relevant behaviors [33] such as reversals, dorsal turns or ventral turns (identified by thresholding the body wave phase velocity [20] and the overall body curvature, see SI Appendix: Methods), Fig. 2(c), S4(b).
At larger spatio-temporal scales the foraging randomwalk can be coarsely split into forward "runs" interrupted by "pirouettes", a single or sequence of sharp centroid turns, which are used by the worm to reorient itself between longer-lived runs [32].We note that the term "pirouette" was originally introduced to broadly describe the reversals, deep turns and combinations of reversals and deep turns seen during chemotaxis [32].This definition has also been applied to the FIG. 1. Maximally-predictive posture sequences reveal the space of short-time behaviors.(a-left) We represent the posture at each frame t using an "eigenworm" basis [20].We extract the worm's centerline and measure the tangent angles between body segments θi(t).We then subtract the average angle to obtain a worm-centric representation θ = θ − ⟨ θ⟩, and project the mean-subtracted angles onto a set of eigenworms [20], obtaining a 5D (eigenworm coefficient) time series ⃗ a(t).(a-bottom) A segment of the time series: in food-free conditions, the short time scale behavior roughly consists of forward and reversal waves (clearly visible as oscillations in a1 and a2), as well as sharp turns (characterized by large amplitude a3 [28]).(b) Entropy rate as a function of the sequence length K and number of partitions N .We partition each sequence space (indexed by length K) into N microstates using k-means clustering and compute the entropy rate of the resulting Markov chain.The curves collapse after K ∼ 0.5 s, indicating that the entropy rate is approximately constant meaning that there is no further gain in predictability by including more time delays.We choose K * = 11 frames = 0.6875 s to define a maximally predictive sequence space XK * .Error bars are bootstrapped standard deviations across worms.(c) We visualize XK * by projecting onto two-dimensions using UMAP [31], and coloring each point by the body wave phase velocity ω = − 1 2π d dt tan −1 (a2/a1) [20] (left) and the overall body curvature (right) obtained by summing the tangent angles θi along the body γ = i θi.Each point on this space corresponds to a K * sequence of postures (from light to dark colors), and different short-time behaviors naturally correspond to different regions on the projection.
worm's navigation behavior in food-free environments [34], similar conditions to the recordings described here.
Here we identify long-lived behavioral states from posture dynamics by using the inferred transition matrix to find sets of movements that typically occur in a sequence (we call these "macrostates", see the following section COARSE-GRAINING BEHAVIOR THROUGH ENSEMBLE DYNAMICS).At the two-state level, the macrostates split the centroid trajectory into relativelystraight paths and more abrupt reorientations, and we thus identify these macrostates as the posture-scale expression of "runs" and "pirouettes" described above, Fig. 2(d-inset).We estimate the kinetic transition rates from runs-to-pirouettes κ R→P and from pirouettes-toruns κ P →R and find close agreement between data and simulations across worms, Fig. 2(d),S4(c).

Posture to Path
Our Markov model on posture sequences is remarkably powerful at predicting worm behavior from an egocentric point of view.However, to understand how such dynamics relate to biological function, we need to connect posture dynamics to motility in the 2D space.With such a bridge we could, for the first time, connect the neuromechanical control of posture with movement strategies such as optimal search.Adding posture-to-path to our predictive Markov models offers the possibility of going beyond the limits of experimental observations and offering accurate in silico trajectories for studying the navigational role of distinct coarse-grained behaviors.To do so, however, we must first connect posture deforma- Starting from the initial microstate of each worm s0, the next microstate is obtained by sampling from P0j(τ * ).We add new microstates in the same fashion, resulting in a symbolic sequence of length N = T δt/τ * .Within each microstate, we randomly choose a point in the maximally-predictive sequence space, XK * , and use this point to identify the associated sequence of body postures ⃗ at:t+K * .(b) The simulated posture dynamics accurately predicts the autocorrelation function of the "eigenworm" coefficient time series.Shaded regions correspond to 95% confidence intervals of the estimate of the mean autocorrelation function bootstrapped across worms.(c) The simulated posture dynamics accurately predicts the average rate of behavioral events across worms.We estimate the number of reversals, ventral turns and dorsal turns per unit time, and compare the result obtained from the data with simulations of each of the 12 recorded worms, finding excellent agreement.Error bars correspond to bootstrapped 95% confidence intervals on the estimates of the mean.(d) On longer time scales, worms transition between relatively straight "runs" and abrupt reorientations through "pirouettes" [32], which are combinations of reversals and turns (inset, bottom right).We assign each microstate to either a "run" or a "pirouette" by leveraging the inferred transition matrix to identify long-lived stereotyped sequences of posture movements (see section COARSE-GRAINING BEHAVIOR THROUGH ENSEMBLE DYNAMICS).We then estimate the mean transition rates from the run to the pirouette state κR→P (red) and from the pirouette to the run state κP →R (blue) for each of the 12 recorded worms.Data error bars are 95% confidence intervals of the mean bootstrapped across run and pirouette segments, simulation error bars are 95% confidence intervals of the mean bootstrapped across 1000 simulations.
tions with movement in the environment.
Following previous work [35], we approximate the interaction between the worm's body and the viscous agar surface through resistive force theory (RFT) [36].This phenomenological approach assumes that each segment along the body experiences independent drag forces.Despite its simplicity, this approximation has been successfully applied to predict the motility of various organisms in viscous fluids [37][38][39] and granular materials [40].
To propel the worm, we first reconstruct the skeleton positions in each frame x i (t) from the instantaneous tangent angles θ i (t) along each body segment i, Fig. 3(aleft).From these we derive the worm-centric velocities v i (t) = x i (t + 1) − x i (t) and displacements with respect to the center-of-mass position x CM , ∆x i (t) = x i (t) − x CM (t).This results in an expression for the underlying velocities at each body segment as a function of the measured worm-centric v and ∆x and unknown overall translational Ṽ(t) and angular Ω(t) velocities.As in [35], we use linear resistive force theory to decompose the force acting on each body segment into tangent and normal components Fi (t) = α t ṽt i t + α n ṽn i n, with drag coefficients α n and α n , Fig. 3(a-middle).Using this approximation, we can then recover the unknown underlying velocities ṽi (t) by imposing a zero net force and torque condition.The only free parameter in this model is the ratio between the normal and tangential drag coefficients α = α n /α t , which we infer by minimizing the distance between the reconstructed centroid trajectories and the real data (see SI Appendix: Methods), Fig. S5(a).In agreement with the results of Keaveny et al. [35], we find that in such food-free conditions, the value of α that optimizes the reconstruction of centroid trajectories is α * = 30 (29,31).Using such α * , we then reconstruct the centroid path corresponding to the posture time series simulated according to our Markov chain, and show that these qualitatively resemble real worm trajectories, Fig. 3

(a-right).
To further quantify the similarity between the centroid trajectories reconstructed from posture simulations and the data, we estimate the mean squared displacement MSD(τ ) = ⟨(x CM (t + τ ) − x CM (t)) 2 ⟩, which exhibits a transition between super-diffusive (nearly ballistic) and diffusive behavior between 10 s and 100 s [41][42][43][44], Fig. 3(b-left), Fig S5(b).The foraging trajectories corresponding to the operator-based simulations accurately capture the MSD across a wide range of scales, including the ballistic-to-diffusive transition.To further assess the quality of the simulations, we estimate an effective diffusion coefficient by fitting MSD = 4Dτ in the linear regime 2 and find that, across worms, the resulting diffusion coefficients obtained from simulations closely follow 2 We note that on longer time scales the MSD exhibits the behavior of a confined random walk due to the rigid boundaries of the agar plate, which makes it non-trivial to accurately estimate the diffusion coefficient [45].We fit the diffusion coefficient in the the data, Fig. 3(b-right).Our results demonstrate that it is possible to predict long time scale diffusive properties from fast posture dynamics in animals.

COARSE-GRAINING BEHAVIOR THROUGH ENSEMBLE DYNAMICS
As highlighted in Fig. 2, C. elegans foraging behavior exhibits multiple time scales: from the body waves that define short-time behaviors (e.g., forward, reversal, turns), to longer-time sequences (e.g., run, pirouette) that the worm uses to navigate its environment.Typically, these longer-time sequences have been identified phenomenologically by setting thresholds on heuristically defined quantities, as was done in Fig. 2(c).Here, we show that it is possible to reveal the multiple scales of C. elegans locomotion directly from the posture dynamics.Intuitively, stereotyped behaviors correspond to regions of the behavioral space that the animal visits often.In an analogy with statistical mechanics, we can imagine behavior as evolving on a complex potential landscape, where each well corresponds to a particular stereotyped behavior, and the barrier heights set the transition time scales.Such a picture emerges naturally when analyzing the dynamics through an ensemble approach, and we leverage our inferred Markov chain to directly identify metastable behaviors.

"Run-and-Pirouette"
The eigenvalues of the transition matrix provide direct access to the long time-scale properties of the dynamics, even when these are not directly apparent from the original trajectories or the equations of motion.The real part of the eigenvalues {λ i } of P ij (τ * ) quantify the relaxation time to the steady state, The spectrum of relaxation times is shown in Fig. 4(aright), and exhibits both an isolated, longer-lived mode with Λ −1 2 = 2.68 (2.11, 3.27) s, as well as an additional ≈ 10 significant modes.
The eigenvectors corresponding to the long-lived dynamics provide continuous reaction coordinates for a natural coarse-graining.In Fig. 4(b) we color-code the maximally-predictive sequence space according to ϕ 2 ; comparing with Fig. 1(c), we find that negative values of ϕ 2 (red) correspond to positive phase velocities and low curvatures, indicative of "forward runs", while positive regime τ ∈ [60,100], which corresponds to a time scale within which finite-size effects are negligible and the mean squared displacement is approximately a linear function of τ , MSD ∼ τ .

FIG. 3.
Recovering foraging trajectories from posture simulations.(a) We use resistive force theory (RFT) [35] to translate the simulated posture dynamics into movement.We simulate the Markov chain posture dynamics as described in Fig. 2. At each frame t of the simulated posture time series, we reconstruct the coordinates of the i-th segment of the skeleton, xi(t), from the tangent angles θi(t), which are themselves a linear combination of eigenworms with the mode weights particular for each frame.The measured velocities, vi, in the frame-of-reference of the worm, correspond to subtracting the overall velocity Ṽ(t) and angular velocity Ω(t) of the worm from the lab-frame velocities ṽi(t), which are unknown.Following the results Ref. [35], we recover the lab-frame translational Ṽ(t) and angular Ω(t) velocities by leveraging resistive force theory to equate the tangent and normal forces acting in each local segment to the local velocities and imposing a zero net-force and net-torque condition.The ratio between tangent αt and normal αn damping coefficients is the single free parameter α = αt/αn, which we find by minimizing the distance between simulated and real trajectories, Fig. S5(a).We show an example worm trajectory (black), as well as simulated trajectories reconstructed from posture time series generated from the operator dynamics (blue).(b-left) Mean square displacement of centroid trajectories obtained from model simulations (blue) and data (black) for an example worm.Simulation error bars are 95% confidence intervals of the mean bootstrapped across 1000 simulations.The results for all 12 worms analyzed here can be found in Fig. S5 values of ϕ 2 (blue) generally correspond to sequences of reversals, dorsal and ventral turns used during abrupt reorientations, see inset of Fig. 4(b).The inset shows an example 10 minute long centroid trajectory color coded by the projection along ϕ 2 .Negative projections occur during "runs", while positive values are found during abrupt reorientation events composed of sequences of reversals and turns.We thus obtain a slow reaction coordinate that captures the dynamics along a posture-based expression of a "run-and-pirouette" axis.
For example, in Fig. S6 we show the maximally-predictive sequence space color coded by the following 3 long-lived eigenvectors, {ϕ 3 , ϕ 4 , ϕ 5 }.We see that ϕ 3 differentiates dorsal and ventral turns, ϕ 4 differentiates turning and reversals, and ϕ 5 differentiates the compound motifs of shallow turns following a pause, from reversals that are followed by deep δ-turns.Projections onto the slow eigenvectors thus reveal continuous modulations along the short nematode locomotory movements first observed by Croll [19], Fig. S6, as well as longer-lived, compound motifs, such as runs-and-pirouettes, Fig. 4(b).
While the continuous variation along these slow modes neatly highlights the temporal organization of distinct features of worm behavior, we can also use them to identify stereotyped behavioral states as "macroscopic" metastable sets [49,50]: groups of microstates that transition more often within rather than between groups.Stereotyped behaviors correspond to finescale move-FIG.4. Slow modes for coarse-graining: from "runs-and-pirouettes" to stereotyped body waves.(a-left) Schematic of the inferred Markov chain.We partition the maximally predictive sequence space XK * (11 × 5 dimensions) into Voronoi cells, here represented as points in the 2-dimensional UMAP embedding space.The size of each point is proportional to the probability of visiting a given microstate si, and the line width corresponds to the probability of transitioning among distinct states after τ = 1 frame (we only show transitions with Pij > 0.025 for simplicity).(a-right) Relaxation timescales obtained from the 30 eigenvalues with the largest real part of Pij(τ * ) with τ * = 0.75 s.The horizontal gray bar is the largest non-trivial eigenvalue of the transition matrix computed from a shuffled symbolic sequence (see SI Appendix: Methods for details).Error bars are 95% confidence intervals bootstrapped across worms.(b-left) We color the sequence space and an exemplar centroid trajectory (inset) by the first nontrivial eigenvector of the reversibilized transition matrix ϕ2, which optimally separates metastable states [6,46].Positive values correspond to combinations of reversals and turns that reorient the worm, these are "pirouettes", while negative values correspond to "forward runs".(b-right) We use ϕ2 to identify two coarse-grained regions, which we denote as "macroscopic" behavioral states, and find that the complementary cumulative distribution function of the resulting dwell times 1 − P (τ dwell ≤ t) is characterized by two time scales, which we extract by fitting a sum of exponential functions.The inferred time scales are in agreement with previous phenomenological observations of worm behavior [32], and result in a relaxation time Λ−1 = (τ −1 2 ) −1 = 3.31 (2.99, 3.71) s [47] that agrees with the largest eigenvalue of Pij(τ * ), Λ −1 2 , within error bars.The timescale errors and error bars are 95% confidence intervals bootstrapped across events.(c) The subdivision process.At each iteration step, we subdivide the metastable state with the largest measure, S * k , along the first nontrivial eigenvector obtained from the transition probability matrix conditioned only on the states within that metastable state [48].This results in a top-down subdivision of behavior that follows the structure of an effective free energy landscape.The size of the circles represents the relative measure in each state.For interpretability, we stop at the 5th subdivision, yielding 7 "mesoscopic" states (characterized below).(c-right) Cumulative distribution of the mesoscopic state dwell times: the duration is short (≈ 2τ * ).The inset shows a transition diagram in which the size of the nodes and the edge widths are proportional to the measure in each behavior and the transition probabilities, respectively (transition probabilities < 0.05 are not graphed for simplicity).
ments that occur more often than not in a sequence.In other words, stereotypy implies a timescale separation between variations on what we call a behavioral state, and transitions among distinct stereotyped behaviors.This notion of stereotypy is naturally captured by the notion of coherent/metastable sets, which can be directly identified through the eigenvalues and eigenvectors of the Markov chain.In this way, the structure of these sets and the kinetics between them offer a coarse-graining of behavior that directly follows the multiple timescales of the dynamics.We note that this construction provides a precise meaning to the discreteness of such behavioral states: the discrete approximation is appropriate at a particular timescale of interest.Faster timescales require finer scale states, while longer timescales can be neatly approximated by larger coarse-grained states.
More principedly, we start by identifying long-lived behavioral states as almost invariant sets [46] in the reconstructed state space.These sets are optimally determined by the second eigenvector ϕ 2 of the timesymmetric reversibilized transition matrix P r , which we infer from the ensemble of worms (see SI Appendix: Methods) [51,52].We search along ϕ 2 for a single threshold that maximizes the metastability of both resulting coarse-grained sets (see SI Appendix: Methods) [6], Fig. S7, and we identify the resulting two macrostates as "run" and "pirouette".The complementary cumulative distribution of the time spent in either of the two states 1 − P (t beh ≤ t) is roughly characterized by two time scales, Fig. 4(b-right), fit by a sum of exponential functions and in excellent agreement with previous phenomenological observations [32].In addition, these transition timescales are related to the timescale of relaxation to the steady state distribution as Λ−1 = 1/(τ −1 2 ) = 3.313 (2.985, 3.709) s [47], which agrees with the relaxation times of the transition matrix within statistical accuracy, Fig. 4(a-right).In our analysis "run-and-piroutte" kinetics emerge directly from worm-centric posture dynamics, without any positional information.

"Run(s)-and-Pirouette(s)"
While dividing the dynamics along ϕ 2 identifies the longest lived states, the existence of other significant eigenvectors, Fig. 4(a), indicates that there are finerscale divisions.To identify such states in a way that naturally reflects the behavioral landscape, we iteratively subdivide the largest metastable state into faster mesoscopic states [48], Fig. 4(c).In practice, we identify the most-visited metastable state, construct a reversibilized transition matrix using only the microstates within that metastable set, and use its first nontrivial eigenvector to subdivide the dynamics (see SI Appendix: Methods for details).This is akin to subdividing a free-energy landscape; at each iteration, we subdivide the system along the largest energy barrier within the highest measure basin.We also note that our subdivisioning proceeds from the longest-lived states down rather than from the shortest-lived states up, where the latter is more common in behavior coarse-graining approaches [29,53,54].
In the foraging behavior of C. elegans, beyond the initial division into runs and pirouettes (which we denote as "macroscopic" states), we further subdivide the dynamics into 7 "mesoscopic" interpretable states: 4 distinct run states and 3 subdivisions of the pirouette state Fig. 4(c), S8.The run state essentially splits into two fast states and two slower states, which can be distinguished either by the wave length of the body, or by having a particular bias towards the dorsal or ventral sides: the dorsally-biased slow state is akin to a headcasting state [55], while the ventrally-biased stated is akin to a "dwelling" state [56][57][58], with incoherent head and tail movements and no propagating wave [13].On the other hand, the pirouette state neatly splits into dorsal turns, deep ventral δ-turns, and reversals followed by shallow Ω-turns.
These mesoscopic states that decorate the worm's foraging behavioral landscape are short-lived, with a characteristic time scale of ⟨τ dwell ⟩ = 1.65 (1.63, 1.68) s ≈ 2τ * , Fig. 4(c-right).The transition diagram between them, Fig. 4(c-right, inset), reveals the fine-scale organization of the worms' foraging strategy.Further subdivisions result in even shorter-lived states, which are increasingly challenging to interpret.

EXPLORING THE ROLE OF THE MESOSCOPIC STATES
In the data analyzed here, worms were grown in a food-rich environment, but then placed on food-free agar plates and allowed to move without restrictions.Under these conditions, the worm's behavior has been qualitatively described as foraging [59,60].We apply our approach to better understand the role of the mesoscopic states in the worm's search for food.
We use the Markov model to simulate in silico worms that are forced to remain in a particular mesoscopic state and the posture-to-path framework to investigate the properties of the trajectories resulting from the posture dynamics in each of the states.We can simulate trajectories that are much longer than those observed in the data, Fig. 4(c-right), allowing us to dissect how different states produce distinct large-scale tracks.In Fig. 5(a), we show simulated 10 min long trajectories for each of the 7 mesoscopic states.Notably, the difference in posture wavelengths exhibited by the two fast "run" states, Fig. S8(b), results in dramatically different trajectories, with the longer wavelength state (fast wide runs) resulting in overall straighter paths, and the shorter wavelength state (fast narrow runs) resulting in ventrally-biased curved trajectories with a diameter that is several times the body length and a period orders of magnitude longer than the body wave period, Fig. S9.Interestingly, the dorsally-biased slow state also results in loopy trajectories, but with a shorter diameter and FIG. 5. Leveraging our multiscale description to explore the role of behavioral states.
We simulate long trajectories while forcing the worm's dynamics to remain within a single mesoscopic state S obtained through a subdivision of the behavioral space into 7 states, Fig. 4(c).We proceed as in Figs.2,3, but sample each new state according to a transition matrix built only within the partitions belonging to S, ŝj(t + τ ) ∼ PS(sj|ŝi(t)), i, j ∈ S. (a) Example 10 min long centroid trajectories of each state.(b) Probability of finding food in a uniform food patch of radius r for a particular state.We find that the most efficient behavioral states change depending on the radius of the food distribution: at short distances, "pirouette" and slow "run" states provide better chances of finding food, whereas at larger distances the two fast "run" states are most efficient.The black line represents the strategy of an average worm p(food|r, average worm) = S π(S)p(food|r, S), obtained as a weighted average over the probability of finding a worm in a particular mesoscopic state S, π(S).While the worm's movement strategy is more likely to find food sources scattered at larger distances, the ensemble of mescoscopic states can be utilized for foraging success across scales.(c) The likelihood of finding food in a large uniform food patch for each of the states closely matches the probability of finding a worm in a particular state.
faster recurrence time.In addition, the ventrally-biased "dwelling"-like slow state [56][57][58] with its frequent head retractions results in a denser sampling of a local patch.Finally, the three "pirouette" states result in a denser sampling of space and a reduced centroid displacement.
We next interrogate the efficiency of each of the 7 mesoscopic states at encountering food uniformly distributed within a disc of radius r around the initial position, Fig. 5(b), a simple but informative condition.We find that the "pirouette" states as well as the slow "run" states are most efficient at finding food at shorter distances, while at larger distances the two fast "run" states perform best.Such a differential use of behaviors, evocative of an exploration-exploitation trade-off, is also seen in nature.Upon encountering food, C. elegans, as well as many other species, engage in area restricted search, which is characterized by shorter paths and a high frequency of large angle turns [34,43,59,[61][62][63][64][65].Conversely, upon removal from food, C. elegans lowers its turning rate [42,66] to engage in global search or long distance travel [34,59,[63][64][65].
Remarkably, we find that instead of only using the most efficient behavioral state ("fast wide runs"), worms engage in a strategy that employs each mesoscopic state in a proportion that closely matches the relative efficiency of the different states at finding food uniformly distributed in a large patch (several body lengths), Fig. 5(c).This "probability matching" behavior has been observed across several species, including humans (see, e.g., [67][68][69][70][71][72]), and emerges naturally in "multi-armed bandit" situations in which agents must decide among different actions that yield variable amounts of reward without knowing a priori the relative reward of each action (see, e.g., [73]).

DISCUSSION
We combine maximally-predictive short posture sequences with a Markov chain model to bridge disparate scales in the foraging dynamics of the nematode worm C. elegans.Rather than seeking low-dimensional descriptions of the data directly (e.g.[43,58,[74][75][76][77]), we instead first expand in representation complexity: enlarging the variable of interest to include time in the form of posture sequences and constructing a maximum entropy partition to capture as much predictive information as possible.This expansion both in time and number of microstates is similar in spirit to that currently found in large language models, though our conceptual approach is dramatically simpler.
The maximally-predictive sequence space combines worm postures from roughly a quarter of the duration of a typical body wave, in agreement with previous work [22].On longer timescales, the posture-based "run-andpirouette" navigation strategy [32,78] derived from the inferred Markov dynamics provide an accurate and principled coarse graining of foraging behavior, disentangling motions that are confounded by centroid-derived measurements (see e.g.[79]).This is particularly evident in our subdivision of the behavioral space.For example, we identify distinct "run" gaits that exhibit comparable centroid speeds, but are clearly distinguishable by the posture dynamics.Additionally, our top-down subdivision of behavior reflects the hierarchy of timescales in C. elegans foraging behavior [55].Our approach systematically identifies such a control hierarchy from behavioral recordings alone, connecting posture timescales to "run-and-pirouette" kinetics.It will be interesting to investigate how the mesoscopic states identified here are controlled by the nervous system of the worm, and recent advances in experimental techniques that permit simultaneous neural and behavioral imaging in C. elegans provide an exciting path toward such discoveries [80][81][82][83][84].
The power of our modeling approach is in its simplicity; we bridge scales using a simple but effective Markov model, and this is only possible by recognizing and exploiting the mutual dependence between modeling and representation.Instead of directly modeling the posture time series (which can require higher-order and highly non-linear terms, see e.g.[20]), we search for maximally predictive states such that a simpler Markovian description can nevertheless accurately predict behavior.These emergent Markov dynamics offer a promising and powerful demonstration of quantitative connections across the hierarchy of movement behavior generally exhibited by all organisms [76,85].
By finely partitioning the space of posture sequences, we encode continuous nonlinear dynamics through a Markov chain with a large number of states.This is analogous to building a hidden Markov model (HMM), but one in which the "hidden" states are actually observable (through time delays of our observations), and for which there is a one-to-one correspondence between "hidden" states and emitted symbols: each observation in the posture sequence ⃗ a(t) uniquely determines the state X K * (t).While HMMs are commonly used in behavioral analysis (see e.g.[13,86]), they are rarely built with so many states and with the goal of correctly predicting dynamics.In particular, most approaches employ a small number of discrete behavioral states, where the number of states is a hyperparameter of the model and the discretization is not unique.In contrast, we let the data reveal the "hidden" states through time delays, and set the discretization so as to maximize predictive information.In this sense, the HMM we build is unique: by revealing the hidden dynamics through time delays, the "hidden" states are uniquely determined by the observations, making the HMM unifilar [87].In other words, the "hidden" states themselves have a very definite meaning in our approach: we effectively group together "pasts" that have equal predictability over the future up to an ϵ-resolution (set by the number of partitions N ∼ ϵ −D emb , where D emb is the intrinsic embedding dimension of the dynamics), approximating the system's causal states [88].
This set of states together with the resulting Markov chain effectively constitutes an ϵ-machine [89], the minimal maximally-predictive machine.Any other HMM in which hidden states are not causal returns models that severely overestimate the complexity of the dynamics.In addition, even though we start with a large transition matrix, we can coarse-grain it by identifying which states commonly follow each other in time to generate stereotyped sequences.In this way, instead of imposing discrete states from the start (as is common with HMM approaches), we first identify a large number of predictive causal states and only then leverage the resulting Markovian dynamics to identify coarse-grained stereotyped behaviors.
Our information theoretic framework also frees us from the constraint of linearity that is commonly imposed in graphical models applied to animal behavior (such as autoregressive hidden Markov models [30]).In particular, while the stereotyped states found through such models are encoded by linear dynamics, the states we identify can exhibit much more complex nonlinear dynamics, allowing us to capture longer time-scale structures in behavior.
Are Markov models enough to capture the richness of animal behavior more universally?It is important to distinguish between two sources of non-Markovianity.The first one is general and is simply induced by the fact that time series data are typically only a partial observation of the full dynamical state [6,7].Projecting the full unobserved dynamics onto a subset of observable degrees of freedom inherently results in non-Markovian dynamics for the measured variables [2,3,90,91].Such an "under-embedding" might result in apparent memory when naively constructing behavioral states.The second source of non-Markovianity, which is less trivial and likely ubiquitous in behavior, derives from the fact that there may be "hidden" latent variables that modulate behavior over timescales comparable to the measurement time [5].In this case, the steady-state distribution itself is changing slowly over time, rendering the dynamics explicitly non-stationary [92].
A relevant demonstration of non-stationarity is provided by the adaptive changes in pirouette rate seen in the behavior of C. elegans upon removal from a foodrich environment [34,42,43,59].This adaptation is present in the data we analyze and is not captured by our Markov model [92], Fig. S10.To characterize such non-ergodic latent variables requires explicitly timedependent Markov models, which we leave for future work.We note, however, that our coarse graining can be easily extended to capture non-stationary dynamics through the discovery of τ -dependent coherent sets that identify moving regions of the state space that remain coherent within a time scale τ [93][94][95][96][97][98].
Particularly interesting future directions include the analysis of even longer dynamics in C. elegans [13,[99][100][101], where we expect to be able to extract longer-lived behavior strategies, such as the minutes-long transitions between "roaming" and "dwelling" states in food-rich environments [56][57][58].Our modeling approach can also be used as means to obtain a deeper understanding of the effects of genetic, neural, or environmental perturbations on the multiple scales of C. elegans behavior.Indeed, the inferred transition matrices are a powerful phenotype that encapsulates multiple scales of C. elegans foraging behavior, and has the power to reveal how behavior is affected by a given perturbation.Of particular relevance for the study of long timescales in behavior would be to focus on mutations that impair neuromodulatory pathways and are thus likely to impact the spectrum of relaxation times of the inferred Markov chain.
The effectiveness of our Markov model at capturing the nonlinear dynamics of C. elegans body pose, combined with the ability to translate those spatiotemporal dynamics into movement, have allowed us to investigate how different behavioral states result in distinct ways of exploring the environment at much larger scales.Our analysis recovered the two main foraging modes exhibited by C. elegans: one that combines different pirouette states and slow runs resulting in a local search, and another one that mostly leverages fast run states to search for food more globally [34,43,59,63].
We also discovered that the relative use of different behavioral states closely follows the relative efficiency of each state in food discovery.In fact, instead of using the behavioral state that would maximize its chances of finding food in its environment, worms match their strategy with the relative efficiency of each state.Interestingly, such a strategy, termed probability matching [71], or Thompson sampling [102], is a well-studied heuristic solution for the multi-armed bandit problem, a game in which different actions have variable rewards that are a priori unknown to the player, whose goal is to maximize total pay-out.Evidence for probability matching in decision making tasks has been previously demonstrated in experiments in animals and humans [103,104], and is an active area of research in cognitive science of decision making [105].
While this strategy seems "irrational" in the context of maximizing reward in a fixed environment, that is not the condition in which worms have evolved: in ecologically-relevant situations the environment changes over time, rendering the distribution of rewards nonstationary and the subsequent sampling events correlated.Interestingly, reinforcement learning agents that have been evolved in a changing environment also develop probability matching strategies [106].In addition, it has been shown that optimal Bayesian learners engage in probability matching when they expect sampling events to have temporal dependencies [107], as is the case in most ecologically relevant scenarios in which samples of the environment are not independent, but exhibit temporal correlations (at least as a result of the actions taken by the agent).This suggests that probability matching may reflect an exploration-exploitation tradeoff that robustly maximizes reward in an ever-changing environment.Our results indicate that C. elegans may implement such a heuristic in its foraging strategy.If the worm is indeed probability matching, it may have a way of storing its estimate of the current probability of success for each strategy (which may be reflected the dynamics itself).It will be fascinating to look for signatures of this in situations where we can experimentally adjust the pay-out probabilities.It may also be possible in a tractable organism like C. elegans to estimate metabolic costs to utilize each behavioral state [108].The multitude of genetic tools [109], the ability to image neurons in behaving animals [80][81][82][83][84] and to quantify behavior in detail makes C. elegans an ideal system to look inside of an organism as it experiences the world, makes decisions based on those observations and its internal model, and updates that internal model based on the outcomes of those decisions and their effect on the environment.

MATERIALS AND METHODS
Details of the reconstruction of maximally-predictive posture sequences, Fig. 1, and the top-down subdivision of C. elegans behavior, Fig. 4, can be found in SI Appendix, and build upon the implementation of [6].The SI Appendix also includes additional information about the simulation of symbolic sequences and posture time series presented in Fig. 2. Finally, the details of the posture-to-path simulations used in Figs. 3, 5 can also be found in the SI Appendix and build upon the implementation of [35].Software and data availability: Code for reproducing our results is publicly available: https://github.com/AntonioCCosta/markov_worm/.Data can be found in [110].

MATERIALS AND METHODS
Software and data availability: Code for reproducing our results is publicly available: https://github.com/AntonioCCosta/markov_worm/.Data can be found in [110].

C. elegans foraging dataset:
We used a previously-analyzed dataset [20], in which young-adult N2-strain C. elegans were originally imaged at f = 32 Hz with a video tracking microscope on a food-free plate and then downsampled to f = 16 Hz to speed-up the process of resolving coiled postures [28].Worms were grown at 20 • C under standard conditions [111].Before imaging, worms were removed from bacteria-strewn agar plates using a platinum worm pick, and rinsed from E. coli by letting them swim for 1 min in NGM (Nematode Growth Medium) buffer.They were then transferred to an assay plate (9 cm Petri dish) that contained a copper ring (5.1 cm inner diameter) pressed into the agar surface, preventing the worm from reaching the side of the plate.Recording started approximately 5 min after the transfer and lasted for 2100 s, for a total of T = 33600 frames.Each frame is converted into a 5dimensional "eigenworm" representation ⃗ a(t) by projecting the local tangent angles along the worm's centerline onto an "eigenworm" basis [20], Fig. 1(a).
Maximally predictive states: Given the measurement time series, ⃗ a(t), with t ∈ {δt, . . ., T δt} and ⃗ a ∈ R 5 , we build a trajectory matrix by stacking K time-shifted copies of ⃗ a, yielding a (T − K) × Kd matrix X K .For each K, we partition the candidate state space and estimate the entropy rate of the associated Markov chain (see below).We choose K * such that ∂ K h(K * ) ∼ 0, which defined X * K as the maximally predictive states [6], Fig. S1(a).State space partitioning: We partition the state space constructed from the ensemble of worms into N Voronoi cells, s i , i ∈ {1, . . ., N }, through k-means clustering with a k-means++ initialization using scikit-learn [112].
Transition matrix estimation: We build a finite dimensional approximation of the Perron-Frobenius operator using an Ulam-Galerkin discretization [50].In practice, given T observations, a set of N partitions, and a transition time τ , we compute where ζ i (x) are the Ulam basis functions, which are characteristic functions 1, for x ∈ s i 0, otherwise set by the k-means clustering.The maximum likelihood estimator of the transition matrix is obtained by simply row normalizing the count matrix, , which yields an approximation of the Perron-Frobenius operator.
Invariant density estimation: Given a transition matrix P , the invariant density is obtained through the left eigenvector of the non-degenerate eigenvalue 1 of P , πP = π: π i is the probability of finding the system in a partition s i .
Short-time entropy rate estimation: Given a number of partitions N and a sampling time scale τ = δt, we estimate the Markov transition matrix P and the corresponding invariant density π as detailed above and compute the short-time entropy rate as, To obtain error bars in Fig. 1, S1(a), we estimate a transition matrix for each worm using its symbolic sequence, and then estimate its corresponding entropy rate.
Two-dimensional UMAP embedding: We use the UMAP embedding [31] as a tool to visualize the maximally predictive states of C. elegans posture dynamics.In a nutshell, the UMAP algorithm searches for a low dimensional representation of the data that preserves its topological structure.We use a publicly available implementation of the algorithm found in https://github.com/lmcinnes/umap,within which we chose the Chebyshev distance metric to compute distances in the high-dimensional space, n neighbors=50 nearest neighbors and min dist=0.05as the minimum distance.
Matrix diagonalization: The high dimensionality and the sparsity of the transition matrices for large N in numerical errors when using a naive estimator for the full spectrum of eigenvalues.In addition, since we are interested in the longest lived dynamics, we focus on finding only the n modes largest magnitude real eigenvalues using the ARPACK [113] algorithm.
Choice of transition time τ * : We choose τ * such that the resulting Markovian dynamics approximate the longterm behavior of the system accurately, as in [6].In practice, we find the shortest transition time scale after which the inferred implied relaxation times reach a plateau, Fig. S1(b,c).For τ too short, the approximation of the operator yields a transition matrix that is nearly identity (due to the finite size of the partitions and too short transition time), which results in degenerate eigenvalues close to λ ∼ 1: an artifact of the discretization and not reflective of the underlying dynamics.For τ too large, the transition probabilities become indistinguishable from noisy estimates of invariant density, which results in a single surviving eigenvalue λ 1 = 1 while the remaining eigenvalues converge to a noise floor resulting from a finite sampling of the invariant density.Between such regimes, we find a region with the largest time scale separation which also corresponds to the regime for which the longest relaxation times, Eq. ( 2), are robust to the choice of τ , Fig. S1(b,c).To obtain a noise floor (horizontal line in Fig. 4(a-right)), we shuffle the symbolic sequence, reestimate the transition matrix, and compute its first nontrivial eigenvalue.In the limit of infinite data, this shuffle contains only one surviving nonzero eigenvalue corresponding to the steady-state distribution (infinite relaxation time).The observation that the second largest eigenvalue is nonzero even in the shuffle is due to finite-size effects that result in small deviations from the invariant density.For further discussion see [6].To estimate error bars in Figs.4,S1, we estimate the eigenvalues of transition matrices obtained for each worm.

Cross-validation experiments:
We chose the number of partitions N * so as to capture as much finescale detail of the dynamics without inducing finite-size effects on the estimates of the entropy rate, Fig. 1.To further attest that this choice results in generalizable models, we additionally performed cross-validation experiments as follows.
We split each worm dataset into 10 segments of 3.5 minutes each, and randomly select 3 segments as a test set and the remaining 7 segments as a training set.We repeat this process over 50 random shuffles, estimating a transition matrix from the training set and making predictions over the unseen test data.In Fig. S4(a-c), we simulate the test data with the model parameters estimated in the training data, and compare such simulations against the data bootstrapping over 50 random reshuffles of train-test sets.
C. elegans posture simulations: At each iteration, we sample from the conditional distribution given by the Markov chain inferred for each worm P w (s j (t + τ * )|s i (t)) to generate a symbolic sequence sampled on a timescale τ * .We then randomly sample a state space point X K * within the partition s i , and unfold it to obtain a sequence of postures ⃗ a t:t+K * at each τ * .We can thus generate artificial posture time series with the same duration as the experimental time series (35 minutes), but with a missing frame every τ * frames (the gap between K * and τ * ), which we interpolate across using a cubic spline with scipy's interpolate package [114], and smooth with a cubic polynomial and a window size of 11 frames using the signal.savgolfilter package from Scipy [114].We then take the simulated ⃗ a(t) time series and transform it back to the tangent angles at each body segment θ i (t) using the "eigenworms" [20].
Resistive force theory simulations: We recover the rigid body motion from the tangent angle time series using linear resistive force theory, as in [35].We approximate the forces acting independently on each body segment as where ṽt,n i are the tangent and normal components of the velocity at each segment i, which can be written in terms of the velocity and displacements measured after subtracting the overall rigid body motion, ṽi (t) = v i (t) + Ṽ(t) + Ω(t) × ∆x i (t).
Then, by imposing a zero net-force and net-torque condition at each frame, i Fi = 0 i Fi × ∆x i = 0, we obtain a system of linear equations that for a given α = α n /α t can be solved for the components of the worm's velocity Ṽ(t) and angular velocity Ω(t) [35].From these we can integrate the path taken by the worm's body to obtain a reconstructed xCM (t).
We optimize the single free parameter α by comparing the reconstructed trajectories with the real worm trajectories x data CM , Fig. S5(a).In particular, we minimize the maximum distance between 100 s trajectories randomly sampled from the dataset L(α) = max(∥x α CM (t) − x data CM (t)∥ 2 ), t ∈ [t 0 , t 0 + 100 s].To minimize L(α) we use the Nelder-Mead algorithm through the scipy.optimizelibrary of Scipy [114].The software to translate posture into path can be found in https://github.com/AntonioCCosta/markov_worm,and follows closely the implementation of [35].
Metastable states: Metastable states correspond to collections of short-time movements that typically follow each other in time to give rise to stereotyped sequences.Leveraging our previous work [6], we search for metastable states along the slowest mode of the reversibilized dynamics [51].As shown in [46], the second eigenvector ϕ 2 of a time-reversibilized transition matrix P r provides an optimal subdivision of the state space into almost invariant sets.In practice, we use the ensemble of worms to estimate P r as where, is the stochastic matrix governing the time-reversal of the Markov chain.The first non-trivial (λ < 1) right eigenvector of P r , ϕ 2 , allows us to define macrostates as collections of microstates s i , where ϕ c 2 is a threshold that is chosen to maximize the metastability of a set.We measure the metastability of each set S by estimating how much of the probability density remains in S after a time scale τ , To estimate the overall measure of metastability across both sets S + and S − , we define which we maximize with respect to ϕ c 2 .Metastable states are then defined with respect to the sign of ϕ 2 − ϕ c 2 .See [6] for further details and applications to known dynamical systems.In Fig. S7 we show the overall coherence measure as a function of ϕ 2 for the worm data.
Operator-based state space subdivision: We leverage the notion of relatively coherent sets [48] to subdivide the state space.However, instead of subdividing both metastable state at each iteration k, we identify the state with the most measure S * k and build a new transition matrix only with partitions belonging to that state, P S * k (τ ) = p(s j (t + τ )|s i (t)), i, j ∈ S * k .
From P S * k we proceed as before: we compute the stationary distribution of S * k through the first left eigenvector of P S * k , π * i , build the corresponding reversibilized transition matrix P r,S * k and identify relatively metastable states through its first non-trivial eigenvector by maximizing Eq. (S5) where π i and P ij (τ ) are replaced by their relative counterparts π * i and P S * k .Simulating posture-to-path within mesoscopic behavioral states: To generate a centroid trajectory within a given state, we construct a transition matrix among the partitions corresponding to each of the mesoscopic states identified in Fig. 2(c) using data from all worms.We then proceed as in Figs.2,3 to generate both posture time series and centroid trajectories.We first generate a symbolic sequence by sampling states according to the corresponding transition probability matrix ŝj (t + τ ) ∼ P S (s j |ŝ i (t)), i, j ∈ S. From the symbolic sequence, we then sample a time series segment ⃗ a t:t+K * within each sampled partition, and use resistive force theory to translate the resulting θ(t) time series into locomotion.In this way, we can simulate posture and centroid trajectories for in silico worms that are forced to remain within a particular mesoscopic behavioral state for an arbitrary amount of time.
Probability of finding food as a function of distance and behavioral state: We estimate the likelihood of finding food in a given radius r by estimating the fraction of the area within a disc of radius r covered by the worm's body during 100 s trajectories, taking the worm's width to be 5% of its length.We then normalize these area fractions by the total across states, obtaining the p(food|r, state) showed in Fig. 5(b).As expected due to the unpredictable nature of the dynamics [22], the quality of the predictions worsens as time progresses.Nonetheless, the structure of the dynamics is well preserved, making it hard to know a priori which of the two time traces is the data and which one is a simulation.(b) -We assess the predictive power of the Markov model by estimating how prediction errors grow over time.We define E(τ ) by estimating the average distance between two trajectories starting within the same partition: t is chosen at random and t ′ ̸ = t chosen such that ⃗ a t ′ belongs to the same partition as ⃗ at; the expectation value is then taken over multiple samples of t.We compare E(τ ) estimated from simulations (blue) against sampling a trajectory from the data starting from the same partition (black).As summary statistics, we compute the time it takes before predictions completely mix, obtaining τ mix data = 5.19 (4.50, 5.88) s for the data, just slightly higher than τ mix sim = 4.56 (4.31, 5.06) s from simulations.In practice, we estimate the average distance between two randomly sampled points e∞ = ⟨∥⃗ ai(t) − ⃗ ai(t ′ )∥2⟩ t,t ′ , and find the time it takes for E(τ ) ≤ 0.95e∞.Details of the posture-to-path simulations.(a) Optimizing the free parameter α = αn/αt in RFT (see Methods for details).We define a loss function L(α) as the maximum distance between the real worm trajectory and a reconstructed one, and sample randomly from 100 s segments.We the find the optimal α * through the Nelder-Mead algorithm (see Methods).The resulting distribution of α * is broad, as shown through the cumulative distribution function (CDF), with a median of α * = 30 (29, 31) s −1 .On the right, we display two example trajectory reconstructions for different values of α.For α = 5 (comparable to experimental measures of [115]), RFT typically results in a substantial undershoot of the observed trajectories (as observed in [35]).For the no-slip condition, α ≫ 1, the resulting trajectories overshoot the real worm trajectories, indicating that some degree of slip is needed to accurately predict worm trajectories.With values of α * ≈ 30 we get an accurate reconstruction of the worm trajectories.5 .We see that ϕ3 differentiates dorsal and ventral turns, ϕ4 differentiates turning and reversals, and ϕ5 differentiates the compound motifs of shallow turns following a pause, from reversals that are followed by deep δ-turns.Together with ϕ2 these modes provide a principled encoding of C. elegans off-food behavior across scales.FIG.S7.Coherence measure used to define the metastable states.We define metastable states by maximizing the overall coherence, Eq. (S5), of the two macroscopic states obtained by partitioning the state-space along ϕ2 (see Methods for details).We here plot the coherence of each set (orange and blue), as well as the overall minima across sets χ, Eq. (S5) (gray dashed line).The maximum of χ is highlighted with a red cross and indicates the value of ϕ2 that defines the metastable states, ϕ2 = ϕ c 2 .

FIG. S8.
Characterization of the 7 mesoscopic states obtained through a top-down subdivision of the C. elegans posture state space.(a) -Probability Distribution Function (PDF) of the body wave phase velocity ω = − 1 2π d dt tan −1 (a2/a1) (left) and the dorso-ventral turning amplitude (right), as measured through the third "eigenworm" coefficient a3 [20,28].The state labels were given based on these probability distributions.(b) Example local tangent angles θi as a function of time in each of the states, as well as illustrative postures sampled at different time points (black and gray vertical lines).The two fast states (top) correspond to distinct gaits with different wavelengths (right), while the slow states (middle) lack a coherent body wave traveling from head to tail.Instead, the dorsally-biased slow states exhibits short timescale head-casting behavior [55], while the ventrally-biased state exhibits incoherent body motion with partial reversal and forward waves akin to a dwelling state [56].

FIG. S9.
Statistical properties of the trajectories simulated for each of the 7 mesoscopic states obtained through a top-down subdivision of the C. elegans posture state space.(a) -Mean square displacement (MSD) of the center-of-mass trajectories for each of the 7 states revealed in Fig. 4(c).Trajectories in the different states exhibit clearly distinct statistical properties: while "run" states generally exhibit a transition between super-diffusive behavior (MSD ∼ τ β , β > 1) at short times and diffusive (or sub-diffusive) behavior (MSD ∼ τ β , β ≤ 1) at large times, the "pirouette"' states are mostly diffusive even at short times.In addition, some of the states exhibit non-trivial fluctuations in the MSD that result from the quasi-periodic loops observed in the trajectories shown in Fig. 5 Besides the fast oscillations due to the body wave dynamics (the vertical dashed line represents the body wave phase velocity in the fast wide run state), some states also exhibit low frequency peaks due to the loopy nature of the trajectories, which recur on a time scale orders of magnitude longer than the body wave period.The power spectral density was estimated using Welch's method [116] implemented through the signal.welchpackage from Scipy [114] with a Hann window and 10 min long trajectory segments.The error bars corresponding to 95% confidence bootstrapped across 1000 simulations for each state are too small to show.

FIG. S10.
The rate of pirouettes changes with time on the food-free plate, a non-stationary dynamics not captured in our Markov model.We estimate the rate of pirouettes as a function of time from the data (black), and find that it slowly decreases, reflecting a change in search strategy possibly as a result of updates to the animal's prior over the food distribution.Our simulations (blue) do not capture this slow change in the pirouette rate, as that would require an explicit time dependence in the transition probability matrix.Error bars correspond to 95% confidence bootstrapped across the 12 worms in our dataset.To estimate the rate of pirouettes, we coarse-grain the dynamics into "runs" and "pirouettes" as in Fig. 4, and find the number of pirouette events per minute in sliding 5 min windows with 2.5 min overlap.Since the sampling time of the Markov dynamics is τ * , we discard pirouette events with a duration shorter than τ * from this analysis.

FIG. 2 .
FIG. 2. A Markov model captures posture dynamics across timescales.(a) Schematic of the simulation method.Starting from the initial microstate of each worm s0, the next microstate is obtained by sampling from P0j(τ * ).We add new microstates in the same fashion, resulting in a symbolic sequence of length N = T δt/τ * .Within each microstate, we randomly choose a point in the maximally-predictive sequence space, XK * , and use this point to identify the associated sequence of body postures ⃗ at:t+K * .(b) The simulated posture dynamics accurately predicts the autocorrelation function of the "eigenworm" coefficient time series.Shaded regions correspond to 95% confidence intervals of the estimate of the mean autocorrelation function bootstrapped across worms.(c) The simulated posture dynamics accurately predicts the average rate of behavioral events across worms.We estimate the number of reversals, ventral turns and dorsal turns per unit time, and compare the result obtained from the data with simulations of each of the 12 recorded worms, finding excellent agreement.Error bars correspond to bootstrapped 95% confidence intervals on the estimates of the mean.(d) On longer time scales, worms transition between relatively straight "runs" and abrupt reorientations through "pirouettes"[32], which are combinations of reversals and turns (inset, bottom right).We assign each microstate to either a "run" or a "pirouette" by leveraging the inferred transition matrix to identify long-lived stereotyped sequences of posture movements (see section COARSE-GRAINING BEHAVIOR THROUGH ENSEMBLE DYNAMICS).We then estimate the mean transition rates from the run to the pirouette state κR→P (red) and from the pirouette to the run state κP →R (blue) for each of the 12 recorded worms.Data error bars are 95% confidence intervals of the mean bootstrapped across run and pirouette segments, simulation error bars are 95% confidence intervals of the mean bootstrapped across 1000 simulations.
FIG.3.Recovering foraging trajectories from posture simulations.(a) We use resistive force theory (RFT)[35] to translate the simulated posture dynamics into movement.We simulate the Markov chain posture dynamics as described in Fig.2.At each frame t of the simulated posture time series, we reconstruct the coordinates of the i-th segment of the skeleton, xi(t), from the tangent angles θi(t), which are themselves a linear combination of eigenworms with the mode weights particular for each frame.The measured velocities, vi, in the frame-of-reference of the worm, correspond to subtracting the overall velocity Ṽ(t) and angular velocity Ω(t) of the worm from the lab-frame velocities ṽi(t), which are unknown.Following the results Ref.[35], we recover the lab-frame translational Ṽ(t) and angular Ω(t) velocities by leveraging resistive force theory to equate the tangent and normal forces acting in each local segment to the local velocities and imposing a zero net-force and net-torque condition.The ratio between tangent αt and normal αn damping coefficients is the single free parameter α = αt/αn, which we find by minimizing the distance between simulated and real trajectories, Fig.S5(a).We show an example worm trajectory (black), as well as simulated trajectories reconstructed from posture time series generated from the operator dynamics (blue).(b-left) Mean square displacement of centroid trajectories obtained from model simulations (blue) and data (black) for an example worm.Simulation error bars are 95% confidence intervals of the mean bootstrapped across 1000 simulations.The results for all 12 worms analyzed here can be found in Fig.S5(b).(b-right) Effective diffusion coefficients obtained from simulations and from the data.We estimate D by fitting the slope of the mean square displacement curves in the range τ ∈ [60, 100] s, MSD(t) = 4Dt.Errors are standard deviations of the estimated diffusion coefficients across simulations.

FIG. S1 .
FIG. S1.Details of the Markovian approximation of the C. elegans posture dynamics through maximally predictive states.(a) -Change in short-time entropy rate as a function of delays K for N = 1000 partitions.The entropy rates reaches a plateau after K ≳ 0.5 s and we choose K * = 11 frames = 0.6875 s.Error bars represent 95% confidence intervals bootstrapped across worms.(b) -The ten largest relaxation timescales of the reversibilized transition matrix as a function of transition time τ , and corresponding eigenvalues (inset).For τ → δt the transition matrix is nearly the identity matrix (within τ most transitions occur within each partition), resulting in nearly degenerate eigenvalues close to 1 and an overestimation of the relaxation timescales of the reversibilized dynamics.On the other hand, when τ ≳ 5 s the dynamics is mostly mixed, meaning that the transition matrix is composed of near copies of the steady-state distribution.In this regime, the eigenvalues of Pr, λi, become approximately constant, and therefore Λ −1 i (τ ) = −τ / log λi(τ ) grows linearly with τ .Between these two regimes, the relaxation timescales are approximately constant, and this robustness to τ is indicative of Markovian dynamics.We choose τ * = 0.75 s as the shortest τ * consistent with Markovian dynamics.Error bars are 95% confidence intervals bootstrapped across worms.(c) -The reversibilized transition matrix provides an optimal partition into almost invariant sets (see Section COARSE-GRAINING BEHAVIOR THROUGH ENSEMBLE DYNAMICS for details), but the resulting kinetics does not necessarily capture the underlying dynamics.In fact, the obtained relaxation times are only an upper bound to the true relaxation timescales of the locally irreversible dynamics.To directly probe the Markovianity of the underlying slow dynamics, we estimate the relaxation times for the non-reversibilized coarse-grained transition matrix, which should not change with τ when the dynamics is Markovian.We approximate the slow relaxation dynamics by using the metastable states to build a two-state, coarse-grained Markov chain Pc, which necessarily has only real eigenvalues λc ∈ R. The corresponding relaxation time is then obtained through |Λ −1 c | = −τ / log λc(τ ).In general, we find that the regime in which | Λ−1 2 | from Pr is constant (b) overlaps with regime in which |Λ −1 c | is also constant.In addition, while |Λ −1 2 |(τ * ) from Pr overestimates the expected |Λ −1 | from "run" and "pirouette' transition rates, Fig. 4(b), the timescales obtained from Pc, |Λ −1 c |(τ * ) = 3.48(3.03,3.86) s are comparable to the ones estimated from the entire Markov chain in Fig. 2(a) and accurately predict the hopping dynamics.Error bars are 95% confidence intervals bootstrapped across worms.
FIG.S3.Probability Density Function (PDF) of the "eigenworm" coefficients show a tight agreement between the data (colors) and simulations (black error bars), indicating that the inferred dynamics capture the steady-state distribution.Error bars and shaded areas correspond to 95% confidence intervals bootstrapped across worms.

3 > Λ − 1 4> Λ − 1
FIG. S5.Details of the posture-to-path simulations.(a) Optimizing the free parameter α = αn/αt in RFT (see Methods for details).We define a loss function L(α) as the maximum distance between the real worm trajectory and a reconstructed one, and sample randomly from 100 s segments.We the find the optimal α * through the Nelder-Mead algorithm (see Methods).The resulting distribution of α * is broad, as shown through the cumulative distribution function (CDF), with a median of α * = 30 (29, 31) s −1 .On the right, we display two example trajectory reconstructions for different values of α.For α = 5 (comparable to experimental measures of[115]), RFT typically results in a substantial undershoot of the observed trajectories (as observed in[35]).For the no-slip condition, α ≫ 1, the resulting trajectories overshoot the real worm trajectories, indicating that some degree of slip is needed to accurately predict worm trajectories.With values of α * ≈ 30 we get an accurate reconstruction of the worm trajectories.(b) Mean square displacements for the data (dashed line) of each worm, as well as for 1000 centroid trajectory simulations generated from symbolic sequences simulated with the Markov model (gray).By fitting a linear function in the interval τ ∈ [60, 100] s we obtained the effective diffusivity estimate of Fig. 4(b-right).
FIG. S9.Statistical properties of the trajectories simulated for each of the 7 mesoscopic states obtained through a top-down subdivision of the C. elegans posture state space.(a) -Mean square displacement (MSD) of the center-of-mass trajectories for each of the 7 states revealed in Fig.4(c).Trajectories in the different states exhibit clearly distinct statistical properties: while "run" states generally exhibit a transition between super-diffusive behavior (MSD ∼ τ β , β > 1) at short times and diffusive (or sub-diffusive) behavior (MSD ∼ τ β , β ≤ 1) at large times, the "pirouette"' states are mostly diffusive even at short times.In addition, some of the states exhibit non-trivial fluctuations in the MSD that result from the quasi-periodic loops observed in the trajectories shown in Fig.5(a).(b) -Power spectral density of the velocity bearing angle η, ⃗ vCM = v cos(η) êx + v sin(η) êy, where v = ||⃗ vCM||.Besides the fast oscillations due to the body wave dynamics (the vertical dashed line represents the body wave phase velocity in the fast wide run state), some states also exhibit low frequency peaks due to the loopy nature of the trajectories, which recur on a time scale orders of magnitude longer than the body wave period.The power spectral density was estimated using Welch's method[116] implemented through the signal.welchpackage from Scipy[114] with a Hann window and 10 min long trajectory segments.The error bars corresponding to 95% confidence bootstrapped across 1000 simulations for each state are too small to show.