## New Research In

### Physical Sciences

### Social Sciences

#### Featured Portals

#### Articles by Topic

### Biological Sciences

#### Featured Portals

#### Articles by Topic

- Agricultural Sciences
- Anthropology
- Applied Biological Sciences
- Biochemistry
- Biophysics and Computational Biology
- Cell Biology
- Developmental Biology
- Ecology
- Environmental Sciences
- Evolution
- Genetics
- Immunology and Inflammation
- Medical Sciences
- Microbiology
- Neuroscience
- Pharmacology
- Physiology
- Plant Biology
- Population Biology
- Psychological and Cognitive Sciences
- Sustainability Science
- Systems Biology

# Sequential sampling strategy for extreme event statistics in nonlinear dynamical systems

Edited by Michael P. Brenner, Harvard University, Cambridge, MA, and accepted by Editorial Board Member David A. Weitz September 24, 2018 (received for review August 1, 2018)

## Significance

Understanding the statistics of extreme events in dynamical systems of high complexity is of vital importance for reliability assessment and design. We formulate a method to pick samples optimally so that we have rapid convergence of the full statistics of a quantity of interest, including the tails that describe extreme events. This is important for large-scale problems in science and engineering, where we desire to predict the statistics of relevant quantities but can only afford a limited number of simulations or experiments due to their very expensive cost. We demonstrate our approach in a hydromechanical system with millions of degrees of freedom, where only 10–20 carefully selected samples can lead to accurate approximation of the extreme event statistics.

## Abstract

We develop a method for the evaluation of extreme event statistics associated with nonlinear dynamical systems from a small number of samples. From an initial dataset of design points, we formulate a sequential strategy that provides the “next-best” data point (set of parameters) that when evaluated results in improved estimates of the probability density function (pdf) for a scalar quantity of interest. The approach uses Gaussian process regression to perform Bayesian inference on the parameter-to-observation map describing the quantity of interest. We then approximate the desired pdf along with uncertainty bounds using the posterior distribution of the inferred map. The next-best design point is sequentially determined through an optimization procedure that selects the point in parameter space that maximally reduces uncertainty between the estimated bounds of the pdf prediction. Since the optimization process uses only information from the inferred map, it has minimal computational cost. Moreover, the special form of the metric emphasizes the tails of the pdf. The method is practical for systems where the dimensionality of the parameter space is of moderate size and for problems where each sample is very expensive to obtain. We apply the method to estimate the extreme event statistics for a very high-dimensional system with millions of degrees of freedom: an offshore platform subjected to 3D irregular waves. It is demonstrated that the developed approach can accurately determine the extreme event statistics using a limited number of samples.

For many natural and engineering systems, extreme events, corresponding to large excursions, have significant consequences and are important to predict. Examples include extreme economic events, such as credit shocks (1), rogue waves in the ocean (2), and extreme climate events (3). Extreme events “live” in the tails of a probability distribution function (pdf); thus, it is critical to quantify the pdf many standard deviations away from the mean. For most real-world problems, the underlying processes are far too complex to enable estimation of the tails through direct simulations or repeated experiments. This is a result of the low probabilities of extreme events, which necessitate a large number of experiments or ensembles to resolve their statistics. For random dynamical systems with inherently nonlinear dynamics (expressed through intermittent events, nonlinear energy transfers, broad energy spectrum, and large intrinsic dimensionality), we are usually limited to a few ensemble realizations.

The setup in this article involves a stochastic dynamical system that depends on a set of random parameters with known probability distribution. We assume that the dimensionality of the random parameters is or can be reduced to a moderate size. Because of the inherent stochastic and transient character of extreme responses, it is not sufficient to consider the dynamical properties of the system independently from the statistical characteristics of solutions. A statistical approach to this problem has important limitations, such as requiring various extrapolation schemes due to insufficient sample numbers (see extreme value theorems) (4). Another strategy is large deviations theory (5, 6), a method for the probabilistic quantification of large fluctuations in systems, which involves identifying a large deviations principle that explains the least unlikely rare event. While applied to many problems, for complex systems, estimating the rate function can be very costly and the principle does not characterize the full probability distribution. The resulting distributions via such approaches cannot always capture the nontrivial shape of the tail, dictated by physical laws in addition to statistical characteristics. On the other hand, in a dynamical systems approach, there are no sufficiently generic efficient methods to infer statistical information from dynamics. For example, the Fokker–Planck equation (7) is challenging to solve even in moderate to low dimensions (8). To this end, it is essential to consider blended strategies. The utilization of combined dynamic–stochastic models for the prediction of extreme events have also been advocated and used in climate science and meteorology by others (9⇓–11). In refs. 12 and 13, a probabilistic decomposition of extreme events was used to efficiently characterize the probability distribution of complex systems, which considered both the statistical characteristics of trajectories and the mechanism triggering the instabilities (extreme events). While effective, the proposed decomposition of intermittent regimes requires explicit knowledge of the dynamics triggering the extremes, which may not be available or easily determined for arbitrary dynamical systems.

We formulate a sequential method for capturing the statistics of an observable that is, for example, a functional of the state of a dynamical system or a physical experiment. The response of the observable is modeled using a machine learning method that infers the functional form of the quantity of interest by using only a few strategically sampled numerical simulations or experiments. Combining the predictions from the machine learning model, using the Gaussian process regression (GPR) framework, with available statistical information on the random input parameters, we formulate an optimization problem that provides the next-best or most informative experiment that should be performed to maximally reduce uncertainty in the pdf prediction of extreme events (tails of the distribution) according to a proposed “distance” metric. To account for tail features, the metric uses a logarithmic transformation of the pdfs, which is similar to the style of working on the rate function in the large deviation principle. The optimization process relies exclusively on the inferred properties on the parameter-to-observation map, and no additional simulations are required in the optimization. For the optimization problem to be practically solvable, we require the parameter space to be of moderate size. The proposed method allows us to sequentially sample the parameter space to rapidly capture the pdf and, in particular, the tails of the distribution of the observable of interest.

## Problem Setup and Method Overview

Consider a dynamical system with state variable

Our objective is to estimate the statistics, especially non-Gaussian features, of the observable q—that is, the pdf of the random variable induced by the mapping

Consider the quantity of interest q with pdf

If we consider the augmented θ-parameterized dataset

Determining the next-best experiment

The overall aim is to accurately capture the tail statistics of

## Method Description

An important component of our proposed method is the construction of an inexpensive surrogate for the map *SI Appendix*, 1 and numerous references, such as ref. 14). An important property of GPR is that the posterior distribution is a Gaussian process with an explicit mean and kernel function. The variance of the posterior can be used as a proxy for the error or uncertainty of the prediction, which we use along with

We learn the parameter-to-observation map *SI Appendix*, 1 for the expressions). We can then construct the following estimate for the pdf

We now formulate the optimization problem for the next sample point

The proposed criterion Q is then based on minimization of a distance metric between the pdfs of the confidence bounds of

The distance metric we propose for the selection of the next sample point is given by

We make a few comments on the optimization problem and the sequential algorithm. The sequential strategy is summarized in pseudocode in *SI Appendix*, 3. For the optimization problem, we use a derivative-free method, specifically a particle swarm optimizer. The integrations for the pdfs are computed explicitly from the definition of the Lebesgue integral by partitioning the codomain of map T. This is much more efficient compared with a Monte Carlo approach that would result in a very expensive computational task, as long as the dimensionality of the parameter space is low. For high-dimensional parameter spaces, the computational cost of the integration can become prohibitive, and in such cases, an order reduction in the parameter space should be first attempted, or alternatively an importance sampling algorithm could be used to compute the integral. In the numerical problems, the optimization problem is practically solvable for low-dimensional parameter spaces *SI Appendix*, 2). (For low-dimensional θ, since we presume the dataset size is small, the cost difference may be negligible.)

### Asymptotic Behavior.

The first theoretical result relates to the convergence of the proposed method to the true pdf as the number of samples goes to infinity (see *SI Appendix*, 4 for the proof). The second result shows that the asymptotic form of the criterion is given by (see *SI Appendix*, 4 for the proof):

Theorem 1: Let

Note that the pdf in the denominator under the integral in 9 is a direct implication of our choice to consider the difference between the logarithms of the pdfs in the optimization criterion. The pdf of the parameter-to-observation map *SI Appendix*, 4).

Corollary 1: Let **10**.

Therefore, for a large value of n, *Left*. The way the integral is computed guarantees that the integrand

On the other hand, if we had instead chosen a criterion that focused directly on the convergence of low order statistics for *Right*. In such a case, sampling the function

## Applications

We illustrate the proposed algorithm to two problems. The first example consists of a nonlinear oscillator stochastically forced by a colored noise process; this application serves to illustrate the main ideas of the proposed method. The second application, involving 3D hydrodynamic wave impacts on an offshore platform (a system with millions of degrees of freedom), showcases the applicability of the proposed method to real-world setups where computation of extreme event statistics using traditional approaches is prohibitively expensive, since the simulation times of experimental runs are on the order of several hours.

### Nonlinear Oscillator Driven by Correlated Stochastic Noise.

Consider the nonlinear oscillator*SI Appendix*, 5) to obtain

We consider a three-term truncation

In Fig. 3, we illustrate the sampling as determined by the proposed algorithm in addition to the **8** before moving on to the criterion using the distance metric Q in Eq. **6** involving the logarithms of the pdfs. Observe the unique shape that the sampling algorithm has identified in θ space, which spans regions in θ associated with finite probability and large values of q. Fig. 4 demonstrates the progression of the estimated pdf as a function of the iteration count. Even after only 100 samples, we have already captured the qualitative features of the exact pdf and have very good quantitative agreement.

We have explored the convergence properties of the algorithm, and in Fig. 5, we compare the proposed sampling method to space-filling LH sampling. The LH strategy is not iterative and thus must be started anew, which puts the LH sampling at a large disadvantage. Nonetheless, this serves as a benchmark to a widely used reference method for the design of experiments due to its simplicity. In the figure, the purple curve represents the mean LH design error and the shaded region represents the SD about the mean, which are computed by evaluating 250 number of random LH designs per fixed dataset size. Even considering the variance of the LH curve, the proposed algorithm under various parameters (initial dataset size or number of “core” iterations where the Q criterion uses the

### Hydrodynamic Forces and Moments on an Offshore Platform.

Here we apply the sampling algorithm to compute the probability distributions describing the loads on an offshore platform in irregular seas. The response of the platform is quantified through direct, 3D numerical simulations of Navier–Stokes using the smoothed particle hydrodynamics (SPH) method (15) (Fig. 6). Our numerical setup parallels that of a physical wave tank experiment and consists of a wave maker on one end and a sloping “beach” on the other end of the tank to quickly dissipate the energy of incident waves and avoid wave reflections. Further details regarding the simulations are provided in *SI Appendix*, 6.

Wind-generated ocean waves are empirically described by their energy spectrum. Here, we consider irregular seas with JONSWAP spectral density (see *SI Appendix*, 6 for details and parameters). While realizations of the random waves have the form of time series, an alternative description can be obtained by considering a sequence of primary wave groups, each characterized by a random group length scale L and height A (see, e.g., ref. 16). This formulation allows us to describe the input space through just two random variables (much fewer than what we would need with a Karhunen–Loeve expansion). Following ref. 16, we describe these primary wave groups by the representation **1**. The statistical characteristics of the wave groups associated with a random wave field (such as the one given by the JONSWAP spectrum) can be obtained by applying the scale-selection algorithm described in ref. 17. Specifically, by generating many realizations consistent with the used spectrum, we use a group detection algorithm to identify coherent group structures in the field along with their length scale and amplitude (L and A). This procedure provides us with the empirical probability distribution

The quantities of interest in this problem are the forces and moments acting on the platform. The incident wave propagates in the x direction, and as such, we consider the pdf of the force in the x direction *SI Appendix*, Figs. S5–S7, we also present the sampled map where it can be seen that the algorithm selects points associated with large forces and nonnegligible probability of occurrence. In the same figures, results for the momentum and an additional spectrum are included. Note that for this problem, the GPR operates on the logarithm of the observable because the underlying function is always positive.

## Conclusions

We developed and studied a computational algorithm for the evaluation of extreme event statistics associated with systems that depend on a set of random parameters. The algorithm is practical for high-dimensional systems but with parameter spaces of moderate dimensionality and where each sample is very expensive to obtain relative to the optimization of the next sample. The method provides a sequence of points that lead to improved estimates of the probability distribution for a scalar quantity of interest. The criterion for the selection of the next design point emphasizes the tail statistics. We prove asymptotic convergence of the algorithm and provided analysis for its asymptotic behavior. We also demonstrated its applicability to two problems, one involving a demanding system with millions degrees of freedom.

## Acknowledgments

We are grateful to the referees for providing numerous suggestions that led to important improvements and corrections. We also thank A. Crespo for helpful comments on the SPH simulations. This work has been supported through the Office of Naval Research Grant N00014-15-1-2381, the Air Force Office of Scientific Research Grant FA9550-16-1-0231, the Army Research Office Grant W911NF-17-1-0306, and a grant from the American Bureau of Shipping.

## Footnotes

- ↵
^{1}To whom correspondence should be addressed. Email: sapsis{at}mit.edu.

Author contributions: M.A.M. and T.P.S. designed research, performed research, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. M.P.B. is a guest editor invited by the Editorial Board.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1813263115/-/DCSupplemental.

- Copyright © 2018 the Author(s). Published by PNAS.

This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).

## References

- ↵
- Fouque J,
- Papanicolaou G,
- Sircar R,
- Sølna K

- ↵
- Liu P

- ↵
- Albeverio S,
- Jentsch V,
- Kantz H

- Hense A,
- Friederichs P

- ↵
- Nicodemi M

*Extreme value statistics*. Computational Complexity: Theory, Techniques, and Applications, ed Meyers RA (Springer, New York), pp 1066–1072. - ↵
- Varadhan S

*CBMS-NSF Regional Conference Series in Applied Mathematics*(Society for Industrial and Applied Mathematics, Philadephia). - ↵
- Dematteis G,
- Grafke T,
- Vanden-Eijnden E

- ↵
- Sobczyk K

*Stochastic Differential Equations*, Mathematics and Its Applications (East European Series) (Springer, The Netherlands). - ↵
- Masud A,
- Bergman L

- ↵
- Franzke C

- ↵
- Majda A

- ↵
- Chen N,
- Majda A

- ↵
- Mohamad M,
- Cousins W,
- Sapsis T

- ↵
- Mohamad M,
- Sapsis T

- ↵
- Rasmussen C,
- Williams C

*Gaussian Processes for Machine Learning*, Adaptive Computation and Machine Learning (MIT Press, Cambridge, MA). - ↵
- Crespo AJC

- ↵
- Cousins W,
- Sapsis T

- ↵
- Cousins W,
- Sapsis T

## Citation Manager Formats

## Sign up for Article Alerts

## Article Classifications

- Physical Sciences
- Applied Mathematics