# Data-driven discovery of coordinates and governing equations


Edited by David L. Donoho, Stanford University, Stanford, CA, and approved September 30, 2019 (received for review April 25, 2019)

## Significance

Governing equations are essential to the study of physical systems, providing models that can generalize to predict previously unseen behaviors. There are many systems of interest across disciplines where large quantities of data have been collected, but the underlying governing equations remain unknown. This work introduces an approach to discover governing models from data. The proposed method addresses a key limitation of prior approaches by simultaneously discovering coordinates that admit a parsimonious dynamical model. Developing parsimonious and interpretable governing models has the potential to transform our understanding of complex systems, including in neuroscience, biology, and climate science.

## Abstract

The discovery of governing equations from scientific data has the potential to transform data-rich fields that lack well-characterized quantitative descriptions. Advances in sparse regression are currently enabling the tractable identification of both the structure and parameters of a nonlinear dynamical system from data. The resulting models have the fewest terms necessary to describe the dynamics, balancing model complexity with descriptive ability, and thus promoting interpretability and generalizability. This provides an algorithmic approach to Occam’s razor for model discovery. However, this approach fundamentally relies on an effective coordinate system in which the dynamics have a simple representation. In this work, we design a custom deep autoencoder network to discover a coordinate transformation into a reduced space where the dynamics may be sparsely represented. Thus, we simultaneously learn the governing equations and the associated coordinate system. We demonstrate this approach on several example high-dimensional systems with low-dimensional behavior. The resulting modeling framework combines the strengths of deep neural networks for flexible representation and sparse identification of nonlinear dynamics (SINDy) for parsimonious models. This method places the discovery of coordinates and models on an equal footing.

Governing equations are of fundamental importance across all scientific disciplines. Accurate models allow for understanding of physical processes, which in turn gives rise to an infrastructure for the development of technology. The traditional derivation of governing equations is based on underlying first principles, such as conservation laws and symmetries, or from universal laws, such as gravitation. However, in many modern systems, governing equations are unknown or only partially known, and recourse to first-principles derivations is untenable. Instead, many of these systems have rich time-series data due to emerging sensor and measurement technologies (e.g., in biology and climate science). This has given rise to the new paradigm of data-driven model discovery, which is the focus of intense research efforts (1–14). A central tension in model discovery is the balance between model efficiency and descriptive capabilities. Parsimonious models strike this balance, having the fewest terms required to capture essential interactions (1, 3, 8, 10, 15), thus promoting interpretability and generalizability. Obtaining parsimonious models is fundamentally linked to the coordinate system in which the dynamics are measured. Without proper coordinates, standard approaches may fail to discover simple dynamical models. In this work, we simultaneously discover effective coordinates via a custom autoencoder (16–18), along with the parsimonious dynamical system model via sparse regression in a library of candidate terms (8). The joint discovery of models and coordinates is critical for understanding many modern systems.

Numerous recent approaches leverage neural networks to model time-series data (18–26). When interpretability and generalizability are primary concerns, it is important to identify parsimonious models that have the fewest terms required to describe the dynamics, which is the antithesis of neural networks whose parameterizations are exceedingly large. A breakthrough approach used symbolic regression to learn the form of dynamical systems and governing laws from data (1, 3). Sparse identification of nonlinear dynamics (SINDy) (8) is a related approach that uses sparse regression to find the fewest terms in a library of candidate functions required to model the dynamics. Because this approach is based on a sparsity-promoting linear regression, it is possible to incorporate partial knowledge of the physics, such as symmetries, constraints, and conservation laws (27). Successful modeling requires that the dynamics are measured in a coordinate system where they may be sparsely represented. While simple models may exist in one coordinate system, a different coordinate system may obscure these parsimonious representations. For modern applications of data-driven discovery, there is no reason to believe that we measure the correct variables to admit a simple representation of the dynamics. This motivates the present study to enable systematic and automated discovery of coordinate transformations that facilitate this sparse representation.

The challenge of discovering an effective coordinate system is as fundamental and important as model discovery. Many key scientific breakthroughs were enabled by the discovery of appropriate coordinate systems. Celestial mechanics, for instance, was revolutionized by the heliocentric coordinate system of Copernicus, Galileo, and Kepler, thus displacing Ptolemy’s doctrine of the perfect circle, which was dogma for more than a millennium. The Fourier transform was introduced to simplify the representation of the heat equation, resulting in a sparse, diagonal, decoupled linear system. Eigen-coordinates have been used more broadly to enable sparse dynamics, for example in quantum mechanics and electrodynamics, to characterize energy levels in atoms and propagating modes in waveguides, respectively. Principal component analysis (PCA) is one of the most prolific modern coordinate discovery methods, representing high-dimensional data in a low-dimensional linear subspace. Nonlinear extensions of PCA have been enabled by a neural network architecture, called an autoencoder (16, 17, 28). However, PCA and autoencoders generally do not take dynamics into account and, thus, may not provide the right basis for parsimonious dynamical models. In related work, Koopman analysis seeks coordinates that linearize nonlinear dynamics (29); while linear models are useful for prediction and control, they cannot capture the full behavior of many nonlinear systems. Thus, it is important to develop methods that combine simplifying coordinate transformations and nonlinear dynamics. We advocate for a balance between these approaches, identifying coordinate transformations where only a few nonlinear terms are present, as in near-identity transformations and normal forms.
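To make linear coordinate discovery concrete, the following sketch (illustrative only, not from the paper; all variable names are our own) computes PCA coordinates via the singular value decomposition and recovers the intrinsic 2D subspace of a synthetic high-dimensional dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic high-dimensional data that actually lives on a 2D subspace:
# 100-dimensional snapshots built from 2 latent coordinates plus small noise.
t = np.linspace(0, 10, 500)
latent = np.stack([np.sin(t), np.cos(2 * t)], axis=1)        # (500, 2)
modes = rng.standard_normal((2, 100))                         # fixed spatial modes
X = latent @ modes + 1e-3 * rng.standard_normal((500, 100))

# PCA via SVD of the mean-centered snapshot matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# The singular value spectrum reveals the intrinsic dimensionality:
# nearly all variance falls in the first 2 components.
energy = S**2 / np.sum(S**2)

# Reduced coordinates: project onto the leading 2 right singular vectors.
Z = Xc @ Vt[:2].T   # (500, 2) low-dimensional representation
```

An autoencoder generalizes this construction by replacing the linear projection and lift with nonlinear encoder and decoder networks; as the text notes, neither takes the dynamics into account.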

In this work we present a method to discover nonlinear coordinate transformations that enable parsimonious dynamics. Our method combines a custom autoencoder network with a SINDy model for parsimonious nonlinear dynamics. The autoencoder enables the discovery of reduced coordinates from high-dimensional data, with a map back to reconstruct the full system. The reduced coordinates are found along with nonlinear governing equations for the dynamics in a joint optimization. We demonstrate the ability of our method to discover parsimonious dynamics on 3 examples: a high-dimensional spatial dataset with dynamics governed by the chaotic Lorenz system, the nonlinear pendulum, and a spiral wave resulting from the reaction–diffusion equation. These results demonstrate how to focus neural networks to discover interpretable dynamical models. Critically, the proposed method provides a mathematical framework that places the discovery of coordinates and models on equal footing.

## Background

We review the SINDy (8) algorithm, which is a regression technique for extracting parsimonious dynamics from time-series data. The method takes snapshot data $\mathbf{x}(t) \in \mathbb{R}^n$ and seeks a best-fit dynamical system of the form $\frac{d}{dt}\mathbf{x}(t) = \mathbf{f}(\mathbf{x}(t))$.

SINDy frames model discovery as a sparse regression problem. If snapshot derivatives are available, or can be calculated from data, the snapshots are stacked to form data matrices $\mathbf{X} = [\mathbf{x}(t_1), \ldots, \mathbf{x}(t_m)]^\top$ and $\dot{\mathbf{X}} = [\dot{\mathbf{x}}(t_1), \ldots, \dot{\mathbf{x}}(t_m)]^\top$. A library $\mathbf{\Theta}(\mathbf{X})$ of candidate functions is evaluated on the data, and the dynamics are approximated by the sparse regression $\dot{\mathbf{X}} \approx \mathbf{\Theta}(\mathbf{X})\mathbf{\Xi}$, where the coefficient matrix $\mathbf{\Xi}$ is encouraged to have few nonzero entries. The standard SINDy approach uses a sequentially thresholded least-squares algorithm to find the coefficients (8), which is a proxy for the intractable $\ell_0$-regularized regression.
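The sequentially thresholded least-squares step at the heart of SINDy is short enough to sketch directly. The following minimal implementation is our own illustrative code (the authors' reference implementation is the cited repository); it recovers the coefficients of a simple 2D linear system from snapshot data:

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (the SINDy regression step):
    repeatedly solve least squares and zero out small coefficients."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):          # refit on the surviving terms
            big = ~small[:, k]
            if big.any():
                xi[big, k] = np.linalg.lstsq(theta[:, big], dxdt[:, k], rcond=None)[0]
    return xi

# Snapshot data from a simple 2D linear oscillator dx/dt = A x.
A = np.array([[-0.1, 2.0], [-2.0, -0.1]])
rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 2))     # random snapshots of the state
dX = X @ A.T                           # exact derivatives at those states

# Candidate library: polynomials up to degree 2 in (x1, x2).
x1, x2 = X[:, 0], X[:, 1]
theta = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])

xi = stlsq(theta, dX, threshold=0.05)
print(xi)   # only the linear terms survive, matching A
```

With exact derivatives, the regression recovers the entries of `A` in the linear rows of `xi` and zeroes out the constant and quadratic terms; noisier data would require a carefully tuned threshold.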

SINDy has been widely applied to identify models for fluid flows (27), optical systems (32), chemical reaction dynamics (33), convection in a plasma (34), and structural modeling (35) and for model predictive control (36). There are also a number of theoretical extensions to the SINDy framework, including for identifying partial differential equations (10, 37), and models with rational function nonlinearities (38). It can also incorporate partially known physics and constraints (27). The algorithm can also be reformulated to include integral terms for noisy data (39) or handle incomplete or limited data (40, 41). The selected modes can also be evaluated using information criteria for model selection (42). These diverse mathematical developments provide a mature framework for broadening the applicability of the model discovery method.

### Neural Networks for Dynamical Systems.

The success of neural networks (NNs) on image classification and speech recognition has led to the use of NNs to perform a wide range of tasks in science and engineering (17). One recent focus has been the use of NNs to study dynamical systems, which has a surprisingly rich history (43). In addition to improving solution techniques for systems with known equations (24–26), deep learning has been used to understand and predict dynamics for complex systems with unknown equations (18–23). Several methods have trained NNs to predict dynamics, including a time-lagged autoencoder, which takes the state at time $t$ as input data and uses an autoencoder-like structure to predict the state at a future time $t + \tau$.

Another class of NNs uses deep learning to discover coordinates for Koopman analysis. Koopman theory seeks to discover coordinates that linearize nonlinear dynamics (29). Methods such as dynamic mode decomposition (DMD) (4, 5, 9), extended DMD (48), and time-delay DMD (49) build linear models for dynamics, but these methods rely on a proper set of coordinates for linearization. Several recent works have focused on the use of deep-learning methods to discover the proper coordinates for DMD and extended DMD (22, 23). Other methods seek to learn Koopman eigenfunctions and the associated linear dynamics directly using autoencoders (18). While autoencoders are particularly useful when reconstruction of the original state space is necessary, there are many applications in which full reconstruction is unnecessary. Koopman analysis and its combination with neural networks have also shown impressive results for use in such forecasting applications (19, 50).

Despite their widespread use, NNs face 3 major challenges: generalization, extrapolation, and interpretation. The hallmark success stories of NNs (computer vision and speech, for instance) have been on datasets that are fundamentally interpolatory in nature. The ability to extrapolate, and as a consequence generalize, is known to be an underlying weakness of NNs. This is especially relevant for dynamical systems and forecasting, which is typically an extrapolatory problem by nature. Thus models trained on historical data will generally fail to predict future events that are not represented in the training set. An additional limitation of deep learning is the lack of interpretability of the resulting models. While attempts have been made to interpret NN weights, network architectures are typically complicated with the number of parameters (or weights) far exceeding the original dimension of the dynamical system. The lack of interpretability also makes it difficult to generalize models to new datasets and parameter regimes. However, NN methods still have the potential to learn general, interpretable dynamical models if properly constrained or regularized. In addition to methods for discovering linear embeddings (18), deep learning has also been used for parameter estimation of partial differential equations (PDEs) (24, 25).

## SINDy Autoencoders

We present a method for the simultaneous discovery of sparse dynamical models and coordinates that enable these simple representations. Our aim is to leverage the parsimony and interpretability of SINDy with the universal approximation capabilities of deep neural networks (51) to produce interpretable and generalizable models capable of extrapolation and forecasting. Our approach combines a SINDy model and a deep autoencoder network to perform a joint optimization that discovers intrinsic coordinates which have an associated parsimonious nonlinear dynamical model. The architecture is shown in Fig. 1. We again consider dynamical systems of the form $\frac{d}{dt}\mathbf{x}(t) = \mathbf{f}(\mathbf{x}(t))$. While this dynamical model may be dense in terms of functions of the original measurement coordinates $\mathbf{x}$, our method seeks a set of reduced coordinates $\mathbf{z}(t) = \varphi(\mathbf{x}(t)) \in \mathbb{R}^d$ (with $d \ll n$), along with an associated dynamical model $\dot{\mathbf{z}}(t) = \mathbf{g}(\mathbf{z}(t))$ that is as sparse as possible.

The coordinate transformation is achieved using an autoencoder network architecture. The autoencoder is a feedforward neural network with a hidden layer that represents the intrinsic coordinates. Rather than performing a task such as prediction or classification, the network is trained to output an approximate reconstruction of its input, and the restrictions placed on the network architecture (e.g., the type, number, and size of the hidden layers) determine the properties of the intrinsic coordinates (17); these networks are known to produce nonlinear generalizations of PCA (16). A common choice is that the dimensionality of the intrinsic coordinates $\mathbf{z}$, determined by the number of units in the corresponding hidden layer, is much lower than that of the input data $\mathbf{x}$: In this case, the autoencoder learns a nonlinear embedding into a reduced latent space. Our network takes measurement data $\mathbf{x}(t) \in \mathbb{R}^n$ as input and learns intrinsic coordinates $\mathbf{z}(t) = \varphi(\mathbf{x}(t)) \in \mathbb{R}^d$, where $d$ is chosen as a hyperparameter, along with a decoder $\psi$ that approximately inverts the map: $\psi(\varphi(\mathbf{x})) \approx \mathbf{x}$.

While autoencoders can be trained in isolation to discover useful coordinate transformations and dimensionality reductions, there is no guarantee that the intrinsic coordinates learned will have associated sparse dynamical models. We require the network to learn coordinates associated with parsimonious dynamics by simultaneously learning a SINDy model for the dynamics of the intrinsic coordinates $\mathbf{z}$. This regularization is achieved by constructing a library $\mathbf{\Theta}(\mathbf{z})$ of candidate functions of the intrinsic coordinates and learning a sparse coefficient matrix $\mathbf{\Xi}$ such that $\dot{\mathbf{z}} \approx \mathbf{\Theta}(\mathbf{z})\mathbf{\Xi}$. Loss terms penalizing the error of this model in predicting the derivatives of both $\mathbf{z}$ and $\mathbf{x}$ are combined with the standard autoencoder reconstruction loss $\|\mathbf{x} - \psi(\varphi(\mathbf{x}))\|_2^2$ and an $\ell_1$ regularization on $\mathbf{\Xi}$ that promotes sparsity.
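A minimal sketch of how such loss terms fit together is given below, assuming a one-hidden-layer sigmoid encoder and decoder with randomly initialized (untrained) weights; the layer sizes, weight values, and loss weights `lam1`–`lam3` are placeholders of our own, and a real implementation would minimize this objective with a gradient-based optimizer:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, hidden = 16, 3, 32      # full dim, latent dim, hidden width

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

# Random weights stand in for trained ones; only the loss assembly is shown.
W1, b1 = rng.standard_normal((hidden, n)) * 0.1, np.zeros(hidden)
W2, b2 = rng.standard_normal((d, hidden)) * 0.1, np.zeros(d)
V1, c1 = rng.standard_normal((hidden, d)) * 0.1, np.zeros(hidden)
V2, c2 = rng.standard_normal((n, hidden)) * 0.1, np.zeros(n)

def encode(x):
    h = sigmoid(W1 @ x + b1)
    z = W2 @ h + b2
    J = W2 @ (np.diag(h * (1 - h)) @ W1)   # encoder Jacobian dz/dx
    return z, J

def decode(z):
    h = sigmoid(V1 @ z + c1)
    xhat = V2 @ h + c2
    J = V2 @ (np.diag(h * (1 - h)) @ V1)   # decoder Jacobian dxhat/dz
    return xhat, J

def library(z):
    # Candidate functions of the latent coordinates: constant, linear, quadratic.
    quads = np.outer(z, z)[np.triu_indices(len(z))]
    return np.concatenate([[1.0], z, quads])

Xi = rng.standard_normal((1 + d + d * (d + 1) // 2, d)) * 0.1  # SINDy coefficients

def sindy_autoencoder_loss(x, xdot, lam1=1e-4, lam2=1e-4, lam3=1e-5):
    z, Jenc = encode(x)
    xhat, Jdec = decode(z)
    zdot = Jenc @ xdot                    # chain rule: latent derivative
    zdot_model = library(z) @ Xi          # SINDy prediction of zdot
    xdot_model = Jdec @ zdot_model        # pushed back to measurement space
    L_recon = np.sum((x - xhat) ** 2)     # autoencoder reconstruction
    L_dx = np.sum((xdot - xdot_model) ** 2)   # SINDy loss in xdot
    L_dz = np.sum((zdot - zdot_model) ** 2)   # SINDy loss in zdot
    L_reg = np.sum(np.abs(Xi))            # l1 sparsity penalty on Xi
    return L_recon + lam1 * L_dx + lam2 * L_dz + lam3 * L_reg

x = rng.standard_normal(n)
xdot = rng.standard_normal(n)
loss = sindy_autoencoder_loss(x, xdot)
```

The key structural point is that the derivative of the latent state is obtained by pushing $\dot{\mathbf{x}}$ through the encoder Jacobian, so the SINDy model and the coordinate transformation are coupled inside a single objective.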

In addition to the $\ell_1$ regularization, sequential thresholding of the SINDy coefficients is applied during training to further promote sparsity; full training details are provided in *SI Appendix*. Alternatively, one might attempt to learn the library functions using another neural network layer, a double sparse library (53), or kernel-based methods (54) for more flexible library representations.

## Results

We demonstrate the success of the proposed method on 3 example systems: a high-dimensional system with the underlying dynamics generated from the canonical chaotic Lorenz system, a 2D reaction–diffusion system, and a 2D spatial representation (synthetic video) of the nonlinear pendulum. Results are shown in Fig. 2.

### Chaotic Lorenz System.

We first construct a high-dimensional example problem with dynamics based on the chaotic Lorenz system. The Lorenz system is a canonical model used as a test case, with dynamics given by the following equations:

$$\dot{z}_1 = \sigma(z_2 - z_1), \qquad \dot{z}_2 = z_1(\rho - z_3) - z_2, \qquad \dot{z}_3 = z_1 z_2 - \beta z_3.$$

The high-dimensional dataset is generated by mapping Lorenz trajectories into a high-dimensional measurement space through a set of fixed spatial modes; example data are shown in Fig. 3*A*. The spatial and temporal modes that combine to give the full dynamics are shown in Fig. 3*B*. Full details of how the dataset is generated are given in *SI Appendix*.
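A sketch of this kind of dataset construction follows. For illustration we lift a simulated Lorenz trajectory linearly through random fixed spatial modes; the paper's actual construction (see *SI Appendix*) uses Legendre polynomial modes, and the integrator and parameters here are standard choices rather than the paper's exact setup:

```python
import numpy as np

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4(f, x0, dt, n_steps):
    """Fixed-step 4th-order Runge-Kutta integrator."""
    traj = np.empty((n_steps, len(x0)))
    x = np.asarray(x0, dtype=float)
    for i in range(n_steps):
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        x = x + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
        traj[i] = x
    return traj

# Latent Lorenz trajectory: the low-dimensional dynamics to be rediscovered.
Z = rk4(lorenz, [-8.0, 7.0, 27.0], 0.002, 5000)

# Lift into a 128-dimensional measurement space with fixed spatial modes.
rng = np.random.default_rng(0)
modes = rng.standard_normal((3, 128))
X = Z @ modes                # (5000, 128) high-dimensional snapshots
```

The rows of `X` play the role of the high-dimensional measurements $\mathbf{x}(t)$; the method is then asked to recover both the 3-dimensional latent coordinates and their sparse dynamics.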

Fig. 3*D* shows the dynamical system discovered by the SINDy autoencoder. While the resulting model does not appear to match the original Lorenz system, the discovered model is parsimonious, with only 7 active terms, and the dynamics exhibit an attractor with a 2-lobe structure, similar to that of the original Lorenz attractor. Additionally, by choosing a suitable variable transformation, the discovered model can be rewritten in the same form as the original Lorenz system. This demonstrates that the SINDy autoencoder is able to recover the correct sparsity pattern of the dynamics. The coefficients of the discovered model are close to the original parameters of the Lorenz system, up to an arbitrary scaling of the latent coordinates, which accounts for the differences in the magnitudes of the discovered coefficients.

On test trajectories from 100 initial conditions sampled from the training distribution, the relative $L_2$ error of the model remains low, for both reconstruction of the input and prediction of the latent dynamics.

### Reaction–Diffusion.

In practice, many high-dimensional datasets of interest come from dynamics governed by PDEs with more complicated interactions between spatial and temporal dynamics. To test the method on data generated by a PDE, we consider a lambda–omega reaction–diffusion system governed by

$$u_t = (1 - (u^2 + v^2))u + \beta(u^2 + v^2)v + d_1 \nabla^2 u,$$
$$v_t = -\beta(u^2 + v^2)u + (1 - (u^2 + v^2))v + d_2 \nabla^2 v,$$

which generates a spiral wave formation.

We train the SINDy autoencoder with a latent dimension of $d = 2$. The network discovers a model with nonlinear oscillatory dynamics. On test data, the relative $L_2$ error of both the reconstruction and the dynamical model remains low.

### Nonlinear Pendulum.

As a final example, we consider a simulated video of a nonlinear pendulum. The nonlinear pendulum is governed by the following second-order differential equation:

$$\ddot{z} = -\sin z.$$

For this example, we use a second-order SINDy model, in which the library includes candidate functions of both the latent coordinate and its first derivative; full details are given in *SI Appendix*.
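The essence of the second-order regression can be sketched as follows: given samples of the latent coordinate, its derivative, and its (here exact, noise-free) second derivative, a least-squares fit over a candidate library isolates the $\sin z$ term. This is our own illustrative code, not the paper's pipeline, which must also learn the latent coordinate itself:

```python
import numpy as np

# Sample pendulum states; the second derivative follows z'' = -sin(z).
rng = np.random.default_rng(2)
z = rng.uniform(-np.pi, np.pi, 2000)
zdot = rng.uniform(-2.0, 2.0, 2000)
zddot = -np.sin(z)

# Candidate library for a second-order model in (z, z').
theta = np.column_stack(
    [np.ones_like(z), z, zdot, np.sin(z), np.cos(z), z * zdot]
)
xi = np.linalg.lstsq(theta, zddot, rcond=None)[0]
xi[np.abs(xi) < 0.05] = 0.0    # one thresholding pass for sparsity
print(xi)   # only the sin(z) coefficient survives, at -1
```

Note that although $z$ and $\sin z$ are strongly correlated near the origin, sampling the full range $[-\pi, \pi]$ lets the regression distinguish them.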

The SINDy autoencoder is trained with a latent dimension of $d = 1$. The resulting models recover dynamics with the same sparsity pattern as the nonlinear pendulum equation.

## Discussion

We have presented a data-driven method for discovering interpretable, low-dimensional dynamical models and their associated coordinates from high-dimensional data. The simultaneous discovery of both is critical for generating dynamical models that are sparse and hence interpretable. Our approach takes advantage of the power of NNs by using a flexible autoencoder architecture to discover nonlinear coordinate transformations that enable the discovery of parsimonious, nonlinear governing equations. This work addresses a major limitation of prior approaches for model discovery, which is that the proper choice of measurement coordinates is often unknown. We demonstrate this method on 3 example systems, showing that it is able to identify coordinates associated with parsimonious dynamical equations. Our code is publicly available at http://github.com/kpchamp/SindyAutoencoders (55).

A current limitation of our approach is the requirement for clean measurement data that are approximately noise-free. Fitting a continuous-time dynamical system with SINDy requires reasonable estimates of the derivatives, which may be difficult to obtain from noisy data. While this represents a challenge, approaches for estimating derivatives from noisy data, such as the total variation regularized derivative, can prove useful in providing derivative estimates (56). Moreover, there are emerging NN architectures explicitly constructed for separating signals from noise (57), which can be used as a preprocessing step in the data-driven discovery process advocated here. Alternatively, our method can be used to fit a discrete-time dynamical system, in which case derivative estimates are not required. It is also possible to use the integral formulation of SINDy to abate noise sensitivity (39).
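The sensitivity of derivative estimates to noise is easy to demonstrate. The sketch below uses simple pre-smoothing as a crude stand-in for total-variation regularized differentiation (56), which requires an optimization solver; all parameters here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
dt = 0.01
t = np.arange(0, 10, dt)
x_noisy = np.sin(t) + 1e-3 * rng.standard_normal(t.size)
dx_true = np.cos(t)

# Naive finite differences amplify measurement noise by a factor ~ 1/dt.
dx_raw = np.gradient(x_noisy, dt)

# Smoothing before differencing tames the amplification (a crude stand-in
# for total-variation regularized differentiation).
window = 25
kernel = np.ones(window) / window
x_smooth = np.convolve(x_noisy, kernel, mode="same")
dx_smooth = np.gradient(x_smooth, dt)

interior = slice(window, -window)   # ignore boundary artifacts
err_raw = np.sqrt(np.mean((dx_raw[interior] - dx_true[interior]) ** 2))
err_smooth = np.sqrt(np.mean((dx_smooth[interior] - dx_true[interior]) ** 2))
print(err_raw, err_smooth)   # smoothing reduces the derivative error
```

Even tiny measurement noise dominates a raw finite-difference derivative at small step sizes, which is why regularized differentiation, denoising preprocessing, or derivative-free formulations matter in practice.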

A major problem with deep-learning approaches is that models are typically neither interpretable nor generalizable. Specifically, NNs trained solely for prediction may fail to generalize to classes of behaviors not seen in the training set. We have demonstrated an approach for using NNs to obtain classically interpretable models through the discovery of low-dimensional dynamical systems, which are well studied and often have physical interpretations. While the autoencoder network still has the same limited interpretability and generalizability as other NNs, the dynamical model has the potential to generalize to other parameter regimes of the dynamics. Although the coordinate transformation learned by the autoencoder may not generalize to data regimes far from the original training set, if the dynamics are known, the autoencoder can be retrained on new data with fixed terms in the latent dynamics space (see *SI Appendix* for discussion). The problem of relearning a coordinate transformation for a system with known dynamics is simplified from the original challenge of learning the correct form of the underlying dynamics without knowledge of the proper coordinate transformation.

The challenge of utilizing NNs to answer scientific questions requires careful consideration of their strengths and limitations. While advances in deep learning and computing power present a tremendous opportunity for new scientific breakthroughs, care must be taken to ensure that valid conclusions are drawn from the results. One promising strategy is to combine machine-learning approaches with well-established domain knowledge: For instance, physics-informed learning leverages physical assumptions into NN architectures and training methods. Methods that provide interpretable models have the potential to enable new discoveries in data-rich fields. This work introduced a flexible framework for using NNs to discover models that are interpretable from a standard dynamical systems perspective. While this formulation used an autoencoder to achieve full state reconstruction, similar architectures could be used to discover embeddings that satisfy alternative conditions. In the future, this approach could be adapted using domain knowledge to discover new models in specific fields.

## Acknowledgments

This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant DGE-1256082. We also acknowledge support from the Defense Advanced Research Projects Agency (PA-18-01-FP-125) and the Army Research Office (W911NF-17-1-0306 and W911NF-19-1-0045). This work was facilitated through the use of advanced computational, storage, and networking infrastructure provided by Amazon Web Services cloud computing credits funded by the Student Technology Fee at the University of Washington. This research was funded in part by the Argonne Leadership Computing Facility, which is a Department of Energy Office of Science User Facility supported under Contract DE-AC02-06CH11357. We also thank Jean-Christophe Loiseau and Karthik Duraisamy for valuable discussions about sparse dynamical systems and autoencoders.

## Footnotes

- ↵
^{1}To whom correspondence may be addressed. Email: kpchamp{at}uw.edu.

Author contributions: K.C., B.L., J.N.K., and S.L.B. designed research; K.C. performed research; and K.C., B.L., J.N.K., and S.L.B. wrote the paper.

The authors declare no competing interest.

This article is a PNAS Direct Submission.

Data deposition: The source code used in this work is available at GitHub (https://github.com/kpchamp/SindyAutoencoders).

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1906995116/-/DCSupplemental.

- Copyright © 2019 the Author(s). Published by PNAS.

This open access article is distributed under Creative Commons Attribution License 4.0 (CC BY).

## References

- ↵
- J. Bongard,
- H. Lipson

- ↵
- C. Yao,
- E. M. Bollt

- ↵
- M. Schmidt,
- H. Lipson

- ↵
- C. W. Rowley,
- I. Mezić,
- S. Bagheri,
- P. Schlatter,
- D. Henningson

- ↵
- ↵
- P. Benner,
- S. Gugercin,
- K. Willcox

- ↵
- B. Peherstorfer,
- K. Willcox

- ↵
- S. L. Brunton,
- J. L. Proctor,
- J. N. Kutz

- ↵
- J. N. Kutz,
- S. L. Brunton,
- B. W. Brunton,
- J. L. Proctor

- ↵
- S. H. Rudy,
- S. L. Brunton,
- J. L. Proctor,
- J. N. Kutz

- ↵
- O. Yair,
- R. Talmon,
- R. R. Coifman,
- I. G. Kevrekidis

- ↵
- K. Duraisamy,
- G. Iaccarino,
- H. Xiao

- ↵
- J. Pathak,
- B. Hunt,
- M. Girvan,
- Z. Lu,
- E. Ott

- ↵
- P. W. Battaglia et al.

- ↵
- H. Schaeffer,
- R. Caflisch,
- C. D. Hauck,
- S. Osher

- ↵
- ↵
- I. Goodfellow,
- Y. Bengio,
- A. Courville,
- Y. Bengio

- ↵
- B. Lusch,
- J. N. Kutz,
- S. L. Brunton

- ↵
- A. Mardt,
- L. Pasquali,
- H. Wu,
- F. Noé

- ↵
- ↵
- C. Wehmeyer,
- F. Noé

- ↵
- E. Yeung,
- S. Kundu,
- N. Hodas

- ↵
- N. Takeishi,
- Y. Kawahara,
- T. Yairi

- ↵
- M. Raissi,
- P. Perdikaris,
- G. E. Karniadakis

- ↵
- M. Raissi,
- P. Perdikaris,
- G. E. Karniadakis

- ↵
- Y. Bar-Sinai,
- S. Hoyer,
- J. Hickey,
- M. P. Brenner

- ↵
- J. C. Loiseau,
- S. L. Brunton

- ↵
- M. Milano,
- P. Koumoutsakos

- ↵
- ↵
- P. Zheng,
- T. Askham,
- S. L. Brunton,
- J. N. Kutz,
- A. Y. Aravkin

- ↵
- L. Zhang,
- H. Schaeffer

- ↵
- M. Sorokina,
- S. Sygletos,
- S. Turitsyn

- ↵
- M. Hoffmann,
- C. Fröhner,
- F. Noé

- ↵
- M. Dam,
- M. Brøns,
- J. Juul Rasmussen,
- V. Naulin,
- J. S. Hesthaven

- ↵
- Z. Lai,
- S. Nagarajaiah

- ↵
- ↵
- ↵
- N. M. Mangan,
- S. L. Brunton,
- J. L. Proctor,
- J. N. Kutz

- ↵
- H. Schaeffer,
- S. G. McCalla

- ↵
- G. Tran,
- R. Ward

- ↵
- H. Schaeffer,
- G. Tran,
- R. Ward

- ↵
- ↵
- R. Gonzalez-Garcia,
- R. Rico-Martinez,
- I. Kevrekidis

- ↵
- ↵
- K. T. Carlberg et al.

- ↵
- F. J. Gonzalez,
- M. Balajewicz

- ↵
- K. Lee,
- K. Carlberg

- ↵
- M. O. Williams,
- I. G. Kevrekidis,
- C. W. Rowley

- ↵
- S. L. Brunton,
- B. W. Brunton,
- J. L. Proctor,
- E. Kaiser,
- J. N. Kutz

- ↵
- H. Wu,
- F. Noé

- ↵
- ↵
- D. P. Kingma,
- J. Ba

- ↵
- R. Rubinstein,
- M. Zibulevsky,
- M. Elad

- ↵
- H. Van Nguyen,
- V. M. Patel,
- N. M. Nasrabadi,
- R. Chellappa

- ↵
- K. Champion

- ↵
- R. Chartrand

- ↵
- S. H. Rudy,
- J. N. Kutz,
- S. L. Brunton

## Article Classifications

- Physical Sciences
- Applied Mathematics