
# Nonlinear Laplacian spectral analysis for time series with intermittency and low-frequency variability

Contributed by Andrew J. Majda, November 22, 2011 (sent for review October 3, 2011)

## Abstract

Many processes in science and engineering develop multiscale temporal and spatial patterns, with complex underlying dynamics and time-dependent external forcings. Because of the importance of understanding and predicting these phenomena, extracting the salient modes of variability empirically from incomplete observations is a problem of wide contemporary interest. Here, we present a technique for analyzing high-dimensional, complex time series that exploits the geometrical relationships between the observed data points to recover features characteristic of strongly nonlinear dynamics (such as intermittency and rare events), which are not accessible to classical singular spectrum analysis. The method employs Laplacian eigenmaps, evaluated after suitable time-lagged embedding, to produce a reduced representation of the observed samples, where standard tools of matrix algebra can be used to perform truncated singular-value decomposition despite the nonlinear geometrical structure of the dataset. We illustrate the utility of the technique in capturing intermittent modes associated with the Kuroshio current in the North Pacific sector of a general circulation model and in the dimensional reduction of a low-order atmospheric model featuring chaotic intermittent regime transitions, where classical singular spectrum analysis is already known to fail dramatically.

The analysis of complex spatio-temporal signals is a central problem in science and engineering, arising in a diverse range of disciplines, including experimental and theoretical fluid dynamics (1, 2), climate science (3–5), molecular dynamics (6, 7), and astrophysics (8, 9). In these and other applications, there is a strong interest in extracting physically meaningful information about the spatial and temporal variability of data from models, experiments, or observations, with the goal of enhancing the understanding of the underlying dynamics, and improving predictive capabilities. In many cases, observations of the system under study are incomplete; i.e., only a subset of the full phase space is accessible.

A classical way of attacking this problem is through singular spectrum analysis (SSA), or one of its variants (3, 10–12). Here, a low-rank approximation of a dynamic process is constructed by first embedding a time series of a scalar or multivariate observable in a high-dimensional vector space *H* (called embedding space) using the method of delays (13–15) and then performing a truncated singular-value decomposition (SVD) of the matrix *X* containing the embedded data (16). In this manner, information about the dynamical process is extracted from the spatial and temporal singular vectors of *X* with the *K* largest singular values (12). The spatial singular vectors form a set of empirical orthogonal functions (EOFs) in *H*, which, at each instant of time, are weighted by the corresponding principal components (PCs), determined from the temporal singular vectors, to yield a rank-*K* reconstruction of *X*.

A potential drawback of this approach is that it is based on minimizing an operator norm that may be unsuitable for signals generated by nonlinear dynamical systems. Specifically, the PCs are computed by projecting onto the principal axes of the *K*-dimensional ellipsoid that best fits the covariance of the data in *H* in the least-squares sense. This construction is optimal when the underlying dynamics is linear, but nonlinear processes in general will produce data lying on a curved manifold *M* in embedding space, with non-Gaussian distributions departing significantly from the ellipsoid defined by the covariance operator in *H*. Physically, a prominent manifestation of this phenomenon is failure to capture via SSA the intermittent patterns arising in turbulent dynamical systems; i.e., temporal processes that carry low variance but play an important dynamical role (17, 18).

Despite their inherently nonlinear character, such datasets possess a natural linear structure, namely the Hilbert space *L*^{2}(*M*,*μ*) of square-integrable functions on *M* with inner product inherited from the volume element *μ* of *M* (the Riemannian measure). This space may be thought of as the collection of all possible weights that can be assigned to the data samples when making a reconstruction, i.e., it is analogous to the temporal space in SSA. Therefore, it is reasonable to develop algorithms that seek to approximate suitably defined maps from *L*^{2}(*M*,*μ*) to the space of spatial patterns *H*. Such maps, denoted here by *A*, have the advantage of being simultaneously linear and compatible with the nonlinear manifold comprised by the data.

In this paper, we advocate that this approach, implemented via algorithms developed in machine learning, can reveal important aspects of complex, high-dimensional signals, which are not accessible to classical SSA. In this framework, which we refer to as nonlinear Laplacian spectral analysis (NLSA), an orthonormal basis for *L*^{2}(*M*,*μ*) is constructed through eigenfunctions of the Laplace–Beltrami operator on *M*, computed efficiently via sparse graph-theoretic algorithms (19, 20). Projecting the data from embedding space *H* onto these eigenfunctions then gives a matrix representation of *A*, such that the optimal rank-*K* reconstruction with respect to the Frobenius norm of maps from *L*^{2}(*M*,*μ*) to *H* is given by standard truncated SVD.

We discuss applications of this technique to two problems, each with its own challenges to data analysis. First, we study the variability of the upper-ocean temperature in the North Pacific sector of a comprehensive coupled climate model (the National Center for Atmospheric Research’s Community Climate System Model Version 3, CCSM3) (21–23). Here, the dynamics takes place in a phase space of very high dimension, is strongly mixing, and exhibits variability over a broad range of time scales, including seasonal, interannual, and decadal. Imposing no a priori assumptions (such as periodicity in the statistics), NLSA recovers three distinct types of temporal processes: (1) periodic processes, including annual and semiannual cycles; (2) decadal-scale variability with spatial patterns resembling the Pacific Decadal Oscillation (PDO) (24); (3) intermittent processes associated with the Kuroshio current and variations in the strength of the subtropical and subpolar gyres. The latter carry little variance (and are therefore not captured by SSA), yet their dynamical role is expected to be significant. Second, we demonstrate the dynamical significance of Galerkin bases constructed through the spatial modes of NLSA in the context of a six-dimensional chaotic low-order model of the atmosphere (18, 25, 26). This model exhibits strong intermittency and metastability, and its attractor is highly inhomogeneous. Here, we test the quality of spatial patterns not by their explained variance but rather by their ability to reproduce chaotic regime behavior in reduced dynamical models. It has been established, for instance, that Galerkin projections onto the spatial modes of SSA or optimal persistence patterns (OPPs) (4) fail to reproduce chaotic regime behavior (18). In contrast, NLSA leads to models of dimension as low as three featuring chaotic regimes.

## Mathematical Framework

### Setting.

We consider a vector-valued signal *x*_{t}, sampled uniformly with time step *δt*. Here, *x*_{t} is generated by a dynamical system operating in some phase space, which may be only partially accessible to observations. For instance, in the ocean application ahead, *x*_{t} will be a depth-averaged sea temperature field restricted to the North Pacific sector of CCSM3; i.e., observations of *x*_{t} alone are not sufficient to uniquely determine the state of the system in phase space. Our objective is to produce a decomposition of *x*_{t} into spatial and temporal modes, taking explicitly into account the fact that the underlying trajectory of the dynamical system lies on a nonlinear manifold *M* in phase space. A further challenge, which will be highlighted in our atmospheric application, is that sampling of *M* may be highly inhomogeneous. In this case, regions of *M* associated with metastable regimes are sampled densely, but the dynamically important transitions between the regimes are observed only rarely.

The methodology developed here employs three basic steps to address these issues: (1) Time-lagged embedding of the input data to a high-dimensional Hilbert space *H*, to ensure that geometrical neighborhoods in *H* correspond to neighborhoods in phase space; (2) construction of a regularized linear map *A* taking a Hilbert space of scalar functions on *M* representing temporal patterns to the spatial patterns in *H*, in a manner compatible with the nonlinear geometry of *M*; and (3) singular-value decomposition of *A* in a basis of Laplacian eigenfunctions, evaluated via graph-theoretic algorithms with modified Gaussian kernels to deal with sampling inhomogeneity. Below we describe each step of the method in some detail. Pseudocode summarizing the procedure is listed in *SI Appendix*, Algorithm S1.

### Time-Lagged Embedding.

A consequence of incomplete observations is that distances between points in data space may be small, even if these points lie far apart in phase space. This is likely to adversely affect the dynamical significance of the modes identified by any algorithm (including SSA) that analyzes the data based on their geometrical relationships. Time-lagged embedding is a familiar technique for alleviating this issue, developed in the qualitative theory of dynamical systems (13–15, 27). Under generic conditions, the image of *x*_{t} in embedding space *H* under the delayed-coordinate mapping lies in a manifold that is in one-to-one correspondence with *M*, provided that the dimension of *H* is sufficiently large. Thus, given a sufficiently long embedding window Δ*t* = (*q* - 1)*δt*, we obtain a representation of the dataset such that geometrical neighborhoods in embedding space encode similar dynamical features. Augmenting the dimensionality of ambient space is also useful in applications involving nonautonomous dynamical systems, including systems with time-periodic statistics (28). After data analysis has been completed, a collection of elements may be projected to physical space by writing *X*_{t} = (*x*_{t},*x*_{t-δt},…,*x*_{t-(q-1)δt}) and taking the average (29), [**1**] *x̃*_{t} = (1/*q*)Σ_{j=1}^{q}[*X*_{t+(j-1)δt}]_{j}, where [*X*]_{j} denotes the *j*th *n*-dimensional block of *X*∈*H*.
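As a concrete illustration, the delayed-coordinate mapping and the averaging in [**1**] can be sketched in a few lines of NumPy (a minimal sketch under our own array-layout and lag-ordering conventions; the function names are ours, not part of the method):

```python
import numpy as np

def embed(x, q):
    """Delayed-coordinate mapping: map a series of s samples x[t] in R^n to
    s - q + 1 embedded points X[t] = (x[t], x[t-1], ..., x[t-q+1]) in R^(n*q)."""
    s, n = x.shape
    return np.stack([x[t - q + 1:t + 1][::-1].ravel() for t in range(q - 1, s)])

def project(X, q, n):
    """Project embedded samples back to physical space by averaging the q
    lagged copies of each physical-time snapshot, as in Eq. 1."""
    m = X.shape[0]                 # number of embedded samples
    s = m + q - 1                  # length of the original series
    acc = np.zeros((s, n))
    cnt = np.zeros(s)
    for t in range(m):
        blocks = X[t].reshape(q, n)  # block j holds x at physical time t + q - 1 - j
        for j in range(q):
            acc[t + q - 1 - j] += blocks[j]
            cnt[t + q - 1 - j] += 1
    return acc / cnt[:, None]
```

For an unmodified embedded dataset the average recovers the original series exactly; the averaging becomes nontrivial once the embedded samples have been filtered, e.g., truncated to a few spatio-temporal modes.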

### Riemannian Formulation.

Recall that SSA is essentially an SVD, [**2**] *X*_{t} = Σ_{k=1}^{r}*σ*_{k}*u*_{k}*v*_{k}(*t*), of the signal into biorthonormal “topos” (spatial) and “chronos” (temporal) modes, *u*_{k} and *v*_{k} (10). In the above, *T* denotes the set of times at which observations are made, and *L*^{2}(*T*) is the Hilbert space of square-integrable scalar functions on that set. That is, the signal is interpreted as a linear map *X*: *L*^{2}(*T*)↦*H* from the chronos to the topos space, given by [**3**] *Xf* = ∫_{*T*}d*t* *f*(*t*)*X*_{t}. The *u*_{k} and *v*_{k} are eigenfunctions of the spatial and temporal covariance operators, [**4a**] (*XX*^{∗}*u*)(*a*) = ∫d*a*′ *C*(*a*,*a*′)*u*(*a*′), [**4b**] (*X*^{∗}*Xv*)(*t*) = ∫_{*T*}d*t*′ *ρ*(*t*,*t*′)*v*(*t*′), respectively, where *a* denotes position in the spatial domain (taking embedding into account), *X*^{∗} is the adjoint of *X*, and *C* and *ρ* are the spatial and temporal two-point functions in embedding space, *C*(*a*,*a*′) = ∫_{*T*}d*t* *X*_{t}(*a*)*X*_{t}(*a*′) and *ρ*(*t*,*t*′) = ⟨*X*_{t},*X*_{t′}⟩, respectively. In particular, the temporal patterns *v*_{k}(*t*) correspond to time-dependent projection coefficients along the directions of highest signal variance in *H*. It is well known, however, that this type of projection is likely to yield a poor representation of the data if *M* is not flat (linear) (19, 30, 31). NLSA addresses this shortcoming through a generalization of [**3**] that explicitly accounts for the nonlinear geometrical structure of *M*.
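In code, the SSA decomposition [**2**] is simply a truncated SVD of the embedded data matrix. The following NumPy sketch (with our own row/column conventions) makes the topos/chronos split explicit:

```python
import numpy as np

def ssa(X, K):
    """Classical SSA: truncated SVD of the embedded data matrix X
    (rows = time samples in T, columns = embedding-space components).
    Columns of U are the chronos (temporal) modes v_k(t); columns of
    Vt.T are the topos (spatial) modes u_k, per Eq. 2."""
    U, sig, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:K].T, sig[:K], U[:, :K]   # spatial modes, singular values, temporal modes

def reconstruct(u, sig, v):
    """Rank-K reconstruction X_K(t) = sum_k sigma_k u_k v_k(t)."""
    return (v * sig) @ u.T
```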

For simplicity in notation and exposition, we first consider the case where *M* is a differentiable, compact manifold with no boundary, equipped (through its embedding in *H*) with a smooth metric tensor *g*, Riemannian density (volume) *μ* = (det *g*)^{1/2}, and Laplace–Beltrami operator Δ*f* = -div grad *f* (32). These conditions guarantee the well posedness of the eigenvalue problem [**5**] Δ*ϕ*_{i} = *λ*_{i}*ϕ*_{i}, with 0 = *λ*_{0} < *λ*_{1} ≤ *λ*_{2} ≤ ⋯, giving rise to an orthonormal basis {*ϕ*_{0},*ϕ*_{1},…} for the Hilbert space *L*^{2}(*M*,*μ*) of square-integrable functions on *M* with measure *μ*. Even though *M* may actually be neither differentiable nor compact, in practice we will always be dealing with graph-theoretic, coarse-grained approximations of *M* constructed through finite observation samples, where the discrete analog of [**5**] holds (33), irrespective of the properties of *M* in the limit of infinite samples. Rigorous theory providing links between the discrete and continuous cases has been developed using heat kernels (20, 34–36).

With this machinery in place, the key modification of NLSA relative to SSA is to replace the chronos space *L*^{2}(*T*) with the (*l* + 1)-dimensional subspace of *L*^{2}(*M*,*μ*) spanned by the leading *l* + 1 Laplace–Beltrami eigenfunctions in [**5**], viz., *V*_{l} = span{*ϕ*_{0},…,*ϕ*_{l}}. Thus, the temporal modes in NLSA are generated by a natural set of basis functions on the nonlinear dataset, with length scale (“resolution”) on *M* controlled by the parameter *l*. As we discuss in more detail below, *l* plays a regularization role, to prevent overfitting when sampling of *M* is not dense. Next, by direct analogy with [**3**], we introduce a family of linear maps *A*^{l}: *V*_{l}↦*H*, providing the link between the chronos and topos spaces of NLSA: [**6**] *A*^{l}*f* = ∫_{*M*}d*μ*(*X*) *f*(*X*)*X*, where each point *X*∈*M* is regarded as an element of *H* through the embedding. The strategy put forward in this paper is to recover the salient spatio-temporal modes associated with nonlinear signals through spectral decompositions of *A*^{l}, as follows.

Let {*ϕ*_{0},…,*ϕ*_{l}} be the orthonormal basis of *V*_{l} from [**5**] and {*e*_{1},*e*_{2},…} an orthonormal basis of *H*. The matrix elements of *A*^{l} with this choice of bases are [**7**] *A*_{ij} = ⟨*e*_{i},*A*^{l}*ϕ*_{j}⟩, where ⟨·,·⟩ is the inner product of *H*, and *r* ≤ *l* + 1 is the rank of *A*^{l}. This leads to a spectral decomposition of *A*^{l}, [**8**] *A*_{ij} = Σ_{k=1}^{r}*u*_{ik}*σ*_{k}*v*_{jk}, where *u*_{ik} and *v*_{jk} are matrix elements of unitary operators on *H* and *V*_{l}, respectively, and *σ*_{k} > 0 are singular values. The decomposition of the signal *X*_{t} in terms of the chronos and topos modes of *A*^{l}, analogous to [**2**], is then *X̃*_{t} = Σ_{k=1}^{r}*σ*_{k}*u*_{k}*v*_{k}(*X*_{t}), with *u*_{k} = Σ_{i}*u*_{ik}*e*_{i} and [**9**] *v*_{k}(*X*_{t}) = Σ_{j=0}^{l}*v*_{jk}*ϕ*_{j}(*X*_{t}). It is a standard result that *A*^{l,K} = Σ_{k=1}^{K}*σ*_{k}*u*_{k}*v*_{k} with *K* < *r* is the rank-*K* linear map from *V*_{l} to *H* minimizing the Frobenius norm of the residual, ‖*A*^{l} - *A*^{l,K}‖_{F}. Moreover, by completeness, *X̃*_{t} converges to the input signal *X*_{t} as *l* → ∞ (or *l* = *s* for a finite number of samples *s*).
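For a finite dataset, forming the operator matrix [**7**] and decomposing it as in [**8**] reduces to a small dense SVD once the Laplacian eigenfunctions are available. A sketch under the assumption that `phi` holds eigenfunctions orthonormal with respect to the discrete measure `mu` (variable and function names are ours):

```python
import numpy as np

def nlsa_svd(X, phi, mu, K):
    """Given embedded samples X (s x d), Laplacian eigenfunctions phi
    (s x (l+1)) orthonormal w.r.t. the weights mu (length s), form the
    operator matrix A_ij = sum_k mu_k X[k, i] phi[k, j] (the discrete
    analog of Eq. 7) and take its truncated SVD (Eq. 8)."""
    A = X.T @ (mu[:, None] * phi)          # d x (l+1) operator matrix
    U, sig, Vt = np.linalg.svd(A, full_matrices=False)
    u = U[:, :K]                           # spatial (topos) modes in H
    v = phi @ Vt[:K].T                     # temporal (chronos) modes v_k(X_t)
    return u, sig[:K], v
```

The spatial modes come out orthonormal in *H*, while the temporal modes are orthonormal with respect to the measure `mu`, mirroring the continuous statement.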

To gain insight on the properties of these modes, as well as their relation to the corresponding modes from SSA, it is instructive to study the action of the spatial and temporal covariance operators, *L* = *A*^{l}*A*^{l∗} and *R* = *A*^{l∗}*A*^{l}, on spatial and temporal patterns, *y*∈*H* and *f*(*X*_{t})∈*V*_{l}, respectively. The latter can be expressed as convolution operations, (*Ly*)(*a*) = ∫d*a*′ *C*_{l}(*a*,*a*′)*y*(*a*′) and (*Rf*)(*X*) = ∫_{*M*}d*μ*(*X*′) *ρ*_{l}(*X*,*X*′)*f*(*X*′), with kernels given by [**10a**] *C*_{l}(*a*,*a*′) = ∫_{*M*}d*μ*(*X*)∫_{*M*}d*μ*(*X*′) *X*(*a*)*δ*_{l}(*X*,*X*′)*X*′(*a*′), [**10b**] *ρ*_{l}(*X*,*X*′) = ∫_{*M*}d*μ*(*Y*) *δ*_{l}(*X*,*Y*)⟨*Y*,*X*′⟩, and *δ*_{l}(*X*,*X*′) = Σ_{i=0}^{l}*ϕ*_{i}(*X*)*ϕ*_{i}(*X*′). In particular, in the limit *l* → ∞, where *δ*_{l} becomes a Dirac *δ*-function, we have *C*_{∞}(*a*,*a*′) = ∫_{*M*}d*μ*(*X*) *X*(*a*)*X*(*a*′) and *ρ*_{∞}(*X*,*X*′) = ⟨*X*,*X*′⟩.

Consider now the corresponding kernels in SSA from [**4**]. If the Riemannian measure *μ* is uniform, and the system trajectory covers *M* densely over the observation time interval *T* (two very strong assumptions), it is possible to replace the integral over time in [**4b**] with an integral over *M*. Therefore, in this idealized case, the results from SSA and NLSA coincide. In typical applications, however, the Riemannian measure *μ* is not uniform, nor is the coverage of *M* dense. In NLSA, the latter deficiency is alleviated by restricting the analysis to the finite-dimensional subspace *V*_{l} of *L*^{2}(*M*,*μ*). That is, given a finite number of observations, the results of NLSA up to some upper value of *l* should be close to the corresponding results obtained in the infinite-sample limit (at the same *l*) and still reflect the coarse-grained nonlinear geometric properties of *M*. Moreover, nonuniformity of *μ*, which for finite datasets depends both on the geometrical properties of the embedding *M*↦*H* and the dynamics on *M*, is handled naturally by graph-theoretic algorithms. This will turn out to play a key role in the atmospheric application discussed below, where sampling of the attractor is highly inhomogeneous.

### Evaluating Spatio-Temporal Patterns of Finite Datasets.

To implement the spectral decomposition [**8**] in practical applications involving finite datasets, we proceed by first using graph-theoretic algorithms to evaluate the Laplacian eigenfunction basis for *V*_{l} and then performing SVD of the matrix of operator components from [**7**]. Given a dataset consisting of *s* samples, *G* = {*X*_{δt},*X*_{2δt},…,*X*_{sδt}}⊂*M*, graph-theoretic algorithms (19, 20) seek to construct a Markov process on *G* with transition probability matrix *P*, such that, for large-enough *s* and small-enough *i*, the right eigenvectors of *P* approximate the corresponding Laplace–Beltrami eigenfunctions *ϕ*_{i} in [**5**]; i.e., *Pϕ*_{i} = Λ_{i}*ϕ*_{i}, with *ϕ*_{i} = (*ϕ*_{i1},…,*ϕ*_{is})^{T} and *ϕ*_{ij} ≈ *ϕ*_{i}(*X*_{jδt}). These eigenvectors satisfy an orthonormality condition that is the discrete analog to [**5**], Σ_{k=1}^{s}*μ*_{k}*ϕ*_{ik}*ϕ*_{jk} = *δ*_{ij}, with *μ*_{k} given by the invariant measure (leading left eigenvector) of *P*; namely, Σ_{i=1}^{s}*μ*_{i}*P*_{ij} = *μ*_{j}, where *μ* = (*μ*_{1},…,*μ*_{s}), *μ*_{i} > 0, and Σ_{i=1}^{s}*μ*_{i} = 1.
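The discrete eigenproblem and invariant measure just described can be sketched as follows for a symmetric weight matrix (a dense NumPy illustration; production implementations use sparse eigensolvers, and the function name is ours):

```python
import numpy as np

def markov_eigs(W, nev):
    """Row-normalize a symmetric weight matrix W into a Markov matrix P
    and return its leading right eigenvectors (discrete Laplacian
    eigenfunctions, mu-orthonormalized) and the invariant measure mu
    (leading left eigenvector of P)."""
    d = W.sum(axis=1)
    # For symmetric W, P = D^{-1} W is similar to the symmetric matrix
    # S = D^{-1/2} W D^{-1/2}, so a symmetric eigensolver can be used.
    S = W / np.sqrt(np.outer(d, d))
    lam, V = np.linalg.eigh(S)
    idx = np.argsort(lam)[::-1][:nev]          # leading eigenvalues first
    phi = V[:, idx] / np.sqrt(d)[:, None]      # right eigenvectors of P
    mu = d / d.sum()                           # invariant measure of P
    phi /= np.sqrt((mu[:, None] * phi**2).sum(axis=0))  # mu-orthonormalize
    return lam[idx], phi, mu
```

The leading eigenvalue is 1, its eigenvector is constant, and the eigenvectors are orthonormal with respect to the weights *μ*, matching the discrete orthonormality condition above.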

In the present work, we evaluate *P* using the diffusion map (DM) algorithm of Coifman and Lafon (20), with a simple but important modification in the calculation of the Gaussian weight matrix. Specifically, we assign to each sample *X*_{iδt} a local velocity in embedding space, *ξ*_{i} = ‖*X*_{iδt} - *X*_{(i-1)δt}‖, and evaluate the Gaussian weights *W*_{ij} = exp(-‖*X*_{iδt} - *X*_{jδt}‖^{2}/(*εξ*_{i}*ξ*_{j})), where ‖.‖ denotes the norm of *H*. This approach was motivated by the clustering algorithm developed in ref. 37, with the difference that in the latter paper *ξ*_{i} is evaluated using spatial nearest neighbors, rather than the temporal nearest neighbors employed here. In the standard implementation of DM, *ε* must be sufficiently small in order for the diffusion process represented by *P* to be sensitive only to local neighborhoods around each data point. Here, the normalization by *ξ*_{i} enforces geometrical localization even for *ε* = *O*(1). The remaining standard steps needed to compute *P* given *W* are displayed in *SI Appendix*, Algorithm S2. The scalability of this class of algorithms to large problem sizes has been widely demonstrated in the machine learning and data mining literature. In particular, as a result of the Gaussian decay of the weights, the *W* matrix used in implementations is made sparse, e.g., by truncating *W* to the largest *b* nonzero elements per row with *b*/*s* ≪ 1, significantly reducing the cost of the eigenvalue problem for *P*. The least-favorable scaling involves the pairwise distance calculation between the data samples in embedding space, which scales like *s*^{2} dim(*H*) if done by brute force (the method employed here). Despite the quadratic scaling with *s*, the linear scaling with dim(*H*) is of key importance, as it guarantees that NLSA does not suffer from a “curse of dimension” as do neural-network-based methods (38).
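A minimal sketch of the modified Gaussian weight matrix (dense NumPy; dropping the first sample, whose local velocity is undefined, is our own handling choice):

```python
import numpy as np

def nlsa_weights(X, eps):
    """Gaussian weights with the local-velocity normalization of the text:
    xi_i = ||X_i - X_{i-1}|| and
    W_ij = exp(-||X_i - X_j||^2 / (eps * xi_i * xi_j)).
    The first sample is dropped, since its xi is undefined. In practice
    W is then sparsified to the largest b entries per row."""
    xi = np.linalg.norm(X[1:] - X[:-1], axis=1)     # local velocities
    Y = X[1:]                                       # samples with defined xi
    d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (eps * np.outer(xi, xi)))
```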

## North Pacific Data from a Coupled Climate Model

We first study the variability in the North Pacific sector of CCSM3; specifically, variability of the mean upper 300 m sea temperature field in the 700-year equilibrated integration used in (22, 23) in work on the initial-value and forced-response predictability in that model. Here, our objective is to diagnose the prominent modes of variability in a comprehensive climate model. In this analysis, the *x*_{t} observable is the mean upper 300 m temperature field sampled every month at *n* = 534 gridpoints (native ocean grid mapped to the model’s T42 atmosphere) in the region 20°N–65°N and 120°E–110°W. Below, we work with an embedding window Δ*t* = 2 y; i.e., the dimension of embedding space is dim(*H*) = *n* × 24 = 12,816. After embedding, *x*_{t}↦*X*_{t}, we are left with *s* = 700 × 12 - 24 = 8,376 samples for data analysis. Here, we present results obtained using the parameter values *l* = 23, *ε* = 2, and *b* = 1000. Figs. 1 and 2 display representative singular values and temporal patterns of the *A*^{l} operator in [**8**], respectively, and snapshots of the temperature anomaly fields (evaluated by applying [**1**] to the spatio-temporal patterns from [**9**]) are shown in Fig. 3 and Movie S1. We find that the spatio-temporal patterns fall into three distinct families of periodic, low-frequency, and intermittent modes, described below. We tested for robustness of these results in a series of calculations with *l*∈[20,100], *ε*∈[1,10], and *b*∈[100,2000]. Note that time-lagged embedding is essential to the separability of the spatio-temporal patterns into qualitatively distinct processes; their character is mixed if no embedding is performed. Also, Δ*t* is not directly related to the lowest frequencies captured by the temporal patterns *v*_{k}(*t*), but must be long enough to incorporate information about major sources of non-Markovianity in the data—here the annual cycle of solar forcing. We have tested our results for robustness using several Δ*t*∈[2,10] y.

### Periodic Modes.

Modes in this family come in doubly degenerate pairs and have the structure of sinusoidal waves with phase difference *π*/2 and frequency equal to integer multiples of 1/y. The leading periodic modes, {*v*_{1},*v*_{2}} (Fig. 2*A*), represent the seasonal cycle in the data. In the spatial domain (Fig. 3*A* and Movie S1*C*) these modes generate an annual oscillation of the temperature anomaly, whose amplitude is largest (of order 1 °C) in the western part of the basin (130°E–160°E) and for latitudes in the range 30°N–45°N. The semiannual modes (Fig. 3*C* and Movie S1*E*) exhibit significant amplitude in the western part of the domain, but also along the west coast of North America; the latter is consistent with semiannual variability of the upper ocean associated with the California current (39).

### Low-Frequency Modes.

These modes are characterized by high spectral power over interannual to interdecadal timescales. Featuring strongly suppressed power over annual or shorter time scales, they represent the low-frequency variability of the upper ocean, which has been well studied in the North Pacific sector of CCSM3 (21, 22). The leading mode in this family (*v*_{3}; see Fig. 2*B*) gives rise to a typical PDO pattern (Fig. 3*B* and Movie S1*D*), where the most prominent basin-scale structure is a horseshoe-like temperature anomaly pattern developing eastward along the Kuroshio extension, together with an anomaly of the opposite sign along the west coast of North America. The higher modes in this family gradually develop smaller spatial features and spectral content over shorter time scales than *v*_{3} but have no spectral peaks on annual or shorter time scales. As illustrated in Movie S1*B*, these modes also arise in classical SSA.

### Intermittent Modes.

A major result of this analysis, which highlights the importance of taking explicitly into account the nonlinear geometrical structure of complex spatio-temporal datasets, is the existence of intermittent patterns of variability in the North Pacific sector of CCSM3, which are not accessible through SSA. As illustrated in Fig. 2*C*, the key feature of modes of this family is temporal intermittency, arising out of oscillations at annual or higher frequency, which are modulated by relatively sharp envelopes with a temporal extent in the 2–10-y regime. Like their periodic counterparts, the intermittent modes form nearly degenerate pairs, and their base frequency of oscillation is an integer multiple of 1/y. The resulting Fourier spectrum is dominated by a peak centered at the base frequency, exhibiting some skewness towards lower frequencies. In the physical domain, these modes describe processes with relatively fine spatial structure, which are activated during the intermittent bursts, and become quiescent when the amplitude of the envelopes is small. The most physically recognizable aspect of these processes is enhanced transport along the Kuroshio extension region, shown for the leading two intermittent modes {*v*_{9},*v*_{10}} in Fig. 3*D* and Movie S1*F*. This process features sustained eastward propagation of small-scale temperature anomalies of order 0.2 °C during the intermittent bursts. The intermittent modes higher in the spectrum also encode rich spatio-temporal patterns, including retrograde (westward) propagating anomalies, and gyre-like patterns resembling the subpolar and subtropical gyres.

## A Chaotic Intermittent Low-Order Atmosphere Model

The spatial modes *u*_{k} from [**2**] arising in SSA are optimal from the point of view of capturing the maximal covariance in *H* using the smallest number of modes. However, this does not necessarily imply that the dynamical significance of these modes is correspondingly high. For instance, the SSA-based models of the Kuramoto–Sivashinsky (KS) equation by Aubry et al. (17) capture 99.99995% of the variance using only the leading six modes, but projecting the KS equation onto the subspace spanned by those modes fails to reproduce the right dynamics. Similar issues were encountered by Crommelin and Majda (18) in a study of a six-mode chaotic low-order model for the atmosphere (25, 26). Those authors explored a suite of reduced-modeling approaches, among which only spatial patterns derived through principal interaction patterns (PIPs) (5, 40) were able to reproduce chaotic regime transitions. PIPs, however, make use of significant information about the dynamical system generating the observed data. In contrast, spatial modes derived via purely data-driven methods, including SSA and OPPs, were found in ref. 18 to exhibit weak dynamical significance. In particular, Galerkin models projected onto those bases typically decayed to fixed points or locked on orbits of unrealistically high temporal regularity (18). Here, we demonstrate that the spatial patterns from [**8**], evaluated through NLSA using the same input data as SSA or OPPs, are able to reproduce chaotic regime transitions.

Specifically, setting *x*_{t} = (*x*_{1},…,*x*_{6}), we consider a deterministic dynamical system of the form d*x*_{t}/d*t* = *F*(*x*_{t}), with the governing equations and parameter values given in ref. 18 and reproduced in Table S1 for convenience. Here, the *x*_{i}’s are low-order Fourier expansion coefficients of the streamfunction in the barotropic vorticity equation for the atmosphere, and nondimensional time corresponds to 1 d. Physically, this model describes chaotic transitions between zonal and blocked regimes of the atmospheric streamfunction, caused by large-scale topography (e.g., continental mountain ranges) and barotropic instability (26). For our purposes, the modes that most closely represent these transitions are *x*_{1} and *x*_{4}. These modes correspond to purely zonal wind (east-west), with no variation in the meridional direction, forced by a zonal atmospheric jet profile as observed in the real atmosphere; see Figs. 4*A* and 5*A* for representative time series.

The goal of dimensional reduction is to produce a set of orthonormal spatial patterns *u*_{k} for Galerkin projection, with *k* ≤ *K* < 6, such that the reduced model d*z*_{t}/d*t* = *U*^{T}*F*(*Uz*_{t}), with *U* = [*u*_{1},…,*u*_{K}], exhibits chaotic transitions between zonal and blocked states. Note that by the orthogonality property, *U*^{T}*U* = *I*, a solution *z*_{t} generates a trajectory *x*_{t} = *Uz*_{t} in the phase space of the full model. Here, we constructed Galerkin bases using NLSA with a training time series *x*_{t} of length 10^{6} d, sampled daily. Note that because the full phase space of the model is accessible, no embedding was performed in these calculations, i.e., dim(*H*) = 6. We ran NLSA (*SI Appendix*, Algorithm S1) using *ε* = 1 throughout, and representative values of *b*∈[30,500] and *l*∈[3,20]. In each case, we carried out the decomposition in [**8**] and formed 6 × *K* projection matrices *U* = [*u*_{1},…,*u*_{K}] for *K* = 3, 4, or 5. We tested the resulting reduced models over long, 10^{6}-d integrations with initial conditions chosen randomly from the training data.
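Given the full-model vector field *F* and an orthonormal basis *U*, the reduced Galerkin dynamics d*z*/d*t* = *U*^{T}*F*(*Uz*) can be integrated with any standard scheme. A sketch with a classical fourth-order Runge–Kutta step (*F*, *U*, and the step size below are placeholders, not the model or parameters of ref. 18):

```python
import numpy as np

def reduce_and_integrate(F, U, z0, dt, nsteps):
    """Integrate the Galerkin-reduced dynamics z' = U^T F(U z) for an
    orthonormal basis U (full_dim x K) with a fourth-order Runge-Kutta
    step. The reduced trajectory lifts back to the full phase space
    as x_t = U z_t."""
    def G(z):
        return U.T @ F(U @ z)       # reduced vector field
    traj = np.empty((nsteps + 1, z0.size))
    traj[0] = z0
    z = z0.astype(float)
    for k in range(nsteps):
        k1 = G(z)
        k2 = G(z + 0.5 * dt * k1)
        k3 = G(z + 0.5 * dt * k2)
        k4 = G(z + dt * k3)
        z = z + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj[k + 1] = z
    return traj
```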

With moderate experimentation in the choice of *b* and *l*, it was possible to construct reduced models featuring chaotic regime transitions for all *K*∈{3,4,5}; examples of these models are illustrated in Figs. 4 and 5 and Movie S2. Qualitatively, the fidelity of these models increases with *K*; e.g., the *K* = 5 model reproduces excursions towards large (*x*_{1},*x*_{4}) values, i.e., away from the regions of high joint probability *p*(*x*_{1},*x*_{4}) (26), though with some bias. Note that the objective here is not to reproduce the trajectory of the full model in a pointwise sense; that would be futile (and of little use) given the system’s chaotic dynamics. The fact that models with dimension as low as *K* = 3 feature chaotic transitions between zonal and blocked regimes strongly supports the dynamical significance of NLSA modes.

A further important point concerns the behavior of the discrete Riemannian measure *μ*_{i}. As illustrated in Fig. 5*C*, when plotted as a function of time, the weights *μ*_{i} exhibit a bursting behavior, with spikes corresponding to transitions between regimes. Thus, the algorithm assigns high weight to transitory events in the operator components in [**6**], which compensates for the fact that they are observed only rarely. Heuristically, *μ*_{i} corresponds to the volume in *M* occupied by the sample *X*_{t} at time *t* = *iδt*, which tends to be large for states visited infrequently by the system. In contrast, the corresponding time integral in [**4b**] in classical SSA carries uniform temporal weights, and the influence of short-time transitions on the resulting spatial patterns is suppressed. The nonuniform Riemannian weighting thus enhances the capability of NLSA to capture rare events, which frequently turn out to be crucial in reproducing the right dynamics (17, 18).

## Concluding Remarks

Combining techniques from machine learning and the qualitative theory of dynamical systems, we have presented a method for time series analysis, referred to as NLSA, which explicitly accounts for the nonlinear geometrical structure of datasets generated by complex dynamical systems. Like classical SSA, NLSA utilizes time-lagged embedding and SVD to produce a spatio-temporal decomposition of a signal generated by partial observations of high-dimensional dynamical systems. However, the linear operator used here in the SVD step differs crucially from SSA in that its domain of definition is the Hilbert space of square-integrable functions on the nonlinear manifold *M* comprised by the data (in a suitable coarse-grained graph representation). These functions, analogous to the chronos modes in SSA (10), are tailored to the geometry of *M* through its Riemannian measure *μ*.

Applying this scheme to the upper-ocean temperature in the North Pacific sector of a comprehensive climate model, we found a family of intermittent processes that are not captured by SSA. These processes describe eastward-propagating, small-scale temperature anomalies in the Kuroshio current, as well as retrograde-propagating structures at high latitudes and in the subtropics. These modes carry little variance of the raw signal, and display burst-like behavior characteristic of strongly nonlinear dynamics (17). The remaining identified modes include the familiar PDO pattern of decadal variability, as well as annual and semiannual periodic processes.

Besides demonstrating NLSA’s skill in capturing qualitatively distinct spatio-temporal patterns, we have examined its utility in building high-quality bases for model reduction through Galerkin projection. As a particularly challenging example, involving strong intermittency and inhomogeneous sampling of the attractor, we considered a six-mode low-order model for the atmosphere featuring chaotic transitions between zonal and blocked regimes (26). Whereas other data-driven approaches fail to reproduce chaotic regime behavior (18), Galerkin modes based on NLSA led to reduced models of dimension as low as three featuring the salient aspects of regime behavior of the full model. More generally, NLSA is well-suited to capture short-time transitions and rare events via nonuniform weighting of the data through the Riemannian volume. This feature is encouraging for potential future research on NLSA-based empirical models for prediction in large-scale models, such as coupled climate models.

## Acknowledgments

We thank G. Branstator and H. Teng for providing access to the CCSM3 dataset used in this analysis. This work was supported by National Science Foundation Grant DMS-0456713 and Office of Naval Research Departmental Research Initiative Grants N00014-10-1-0554 and N00014-11-1-0306.

## Footnotes

^{1}To whom correspondence should be addressed. E-mail: jonjon{at}cims.nyu.edu.

Author contributions: D.G. and A.J.M. designed research; D.G. and A.J.M. performed research; and D.G. wrote the paper.

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1118984109/-/DCSupplemental.

Freely available online through the PNAS open access option.

## References

1. Holmes P, Lumley JL, Berkooz G
4. DelSole T
5. Kwasniok F
6. Deuflhard P, Dellnitz M, Junge O, Schütte C
7. Nadler B, Lafon S, Coifman RR, Kevrekidis I
8. Brunt CM, Heyer MH
9. Cadavid AC, Lawrence JK, Ruzmaikin A
11. Golyandina N, Nekrutkin V, Zhigljavsky A
12. Ghil M, et al.
13. Takens F
21. Alexander M, et al.
22. Teng H, Branstator G
23. Branstator G, Teng H
25. De Swart HE
28. Majda AJ, Wang X
29. Horenko I
30. Ham D, Lee DD, Mika S, Schölkopf B
31. Coifman RR, et al.
32. Bérard PH
33. Chung FRK
34. Belkin M, Niyogi P
35. Belkin M, Niyogi P
36. Jones PW, Maggioni M, Schul R
37. Zelnik-Manor L, Perona P
39. Mendelssohn R, Schwing FB, Bograd SJ
40. Kwasniok F
