
# State estimation and prediction using clustered particle filters

Contributed by Andrew J. Majda, October 20, 2016 (sent for review July 14, 2016; reviewed by Jeffrey Anderson and Sebastian Reich)

## Significance

Particle filtering is an essential tool for the estimation and prediction of complex systems including non-Gaussian features. A class of particle filters, clustered particle filters, is introduced for high-dimensional dynamical systems such as geophysical systems. The proposed method uses relatively few particles compared with the standard particle filter and captures the non-Gaussian features of the true signal, which are typical in complex nonlinear systems. The method is also robust for the difficult regime of high-quality sparse and infrequent observations and does not show any filter divergence in our tests. In the clustered particle filter, coarse-grained localization is implemented through the clustering of state variables and particles are adjusted to stabilize the filter.

## Abstract

Particle filtering is an essential tool to improve uncertain model predictions by incorporating noisy observational data from complex systems including non-Gaussian features. A class of particle filters, clustered particle filters, is introduced for high-dimensional nonlinear systems, which uses relatively few particles compared with the standard particle filter. The clustered particle filter captures non-Gaussian features of the true signal, which are typical in complex nonlinear dynamical systems such as geophysical systems. The method is also robust in the difficult regime of high-quality sparse and infrequent observations. The key features of the clustered particle filter are coarse-grained localization through the clustering of the state variables and particle adjustment to stabilize the method; each observation affects only neighboring state variables through clustering, and particles are adjusted to prevent particle collapse due to high-quality observations. The clustered particle filter is tested for the 40-dimensional Lorenz 96 model with several dynamical regimes including strongly non-Gaussian statistics. The clustered particle filter shows robust skill in both achieving accurate filter results and capturing non-Gaussian statistics of the true signal. It is further extended to multiscale data assimilation, which provides the large-scale estimation by combining a cheap reduced-order forecast model and mixed observations of the large- and small-scale variables. This approach enables the use of a larger number of particles due to the computational savings in the forecast model. The multiscale clustered particle filter is tested for one-dimensional dispersive wave turbulence using a forecast model with model errors.

Data assimilation or filtering combines numerical forecast models with observational data to provide the best statistical estimation and prediction of complex systems. Due to the high dimensionality of complex nonlinear systems such as geophysical systems, accurate and efficient estimation and prediction of such systems are formidable tasks. They require enormously large computational resources to run forecast models, and observations can be sparse and infrequent, as in oceanography. As a Monte Carlo approach, ensemble-based methods (1, 2) combined with covariance inflation and localization are indispensable tools because they allow computationally cheap, low-dimensional ensemble state approximations for the systems. They have performed well for operational applications such as numerical weather prediction (3, 4). Nevertheless, because the ensemble-based methods approximate the forecast distribution using Gaussian statistics, they can lead to inaccurate estimation and prediction when the ensemble is not sufficiently large and the true signal has non-Gaussian statistics, which are typical for a wide range of systems. Particle filtering captures non-Gaussian features using different weights for different samples (or particles) and is an established discipline for low-dimensional dynamical systems (5⇓–7). Particle filtering does not require any assumption on the prior distribution and leads to consistent Bayesian posterior statistics. The particle weights also determine which particles to remove and duplicate [i.e., resampling strategies (8)] to prevent particle collapse, in which a small fraction of the particles carry nearly all of the weight whereas the rest have minimal or zero weights. Despite the successful applications of the particle filter for low-dimensional systems, the particle filter suffers from the curse of dimensionality for high-dimensional systems: the required number of particles grows exponentially with the dimension of the system (9, 10).
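The reweight-and-resample cycle just described can be sketched in a few lines of Python. This is a minimal illustration of one bootstrap particle filter analysis step with a Gaussian observation likelihood; the function name and parameter values are our own, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)

def particle_filter_step(particles, v, G, r_o, rng):
    """One analysis step: reweight forecast particles by the Gaussian
    observation likelihood, then resample back to equal weights."""
    innov = v - particles @ G.T                   # per-particle innovation, shape (L, d)
    logw = -0.5 * np.sum(innov**2, axis=1) / r_o  # Gaussian log-likelihood (up to a constant)
    w = np.exp(logw - logw.max())                 # stabilized, unnormalized weights
    w /= w.sum()                                  # normalized particle weights
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]                         # equally weighted posterior particles
```

With a high-quality (small `r_o`) observation, the normalized weights concentrate on very few particles, which is exactly the collapse mechanism discussed above.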

There have been several attempts to overcome the curse of dimensionality and enhance the performance of the particle filter with a small number of particles. The blended particle filter (11, 12) uses forecast models that have adaptively varying reduced-order models to capture non-Gaussian statistics through particle filtering, whereas high-dimensional quasi-Gaussian statistics are maintained by Gaussian mixtures. The maximum entropy particle filter in chapter 15 of ref. 13 makes judicious use of partial marginal distributions to avoid particle collapse. Another approach solves an optimal transport problem for the transition from the prior to the posterior (14, 15) to avoid the random sampling aspects of particle filters. Motivated by the success of the ensemble-based method using covariance localization (16, 17), there are other methods that implement localization for non-Gaussian statistics, including the rank histogram filter (18), the sequential particle filter (19), the hybrid ensemble transform particle filter (15), and the localized particle filter (20). The localized particle filter uses vector particle weights instead of the scalar weights of the standard particle filter and provides a Bayesian update in regions near the physical locations of observations. This method shows successful results in capturing non-Gaussian statistics but is not robust in the difficult regime of high-quality (i.e., small error) infrequent observations.

In this paper, we introduce a class of particle filters that is robust in the difficult regime of high-quality, infrequent, sparse observations while capturing non-Gaussian features of the true signal. In our particle filter, each observation at a given spatial location affects only the neighboring state variables through the clustering of the state variables, which implements coarse-grained localization. The clustering is based on the observation network, and the data assimilation of the whole space is divided into smaller data assimilation problems in each cluster that do not influence one another (see ref. 21 for a similar theoretical formulation). Our method also adjusts particles so that the prior particles are sufficiently close to the observation when the observation is not in the convex hull of the predicted observations of the prior particles (see ref. 6 and chapter 15 of ref. 13 for similar issues).

## Standard Particle Filter and Localization

Throughout this paper, we consider the data assimilation of the true signal $u \in \mathbb{R}^{N}$ generated by a dynamical system

$$\frac{du}{dt} = f(u).$$

Observations $v_m$ are available at discrete observation times $t_m$,

$$v_m = G u(t_m) + \xi_m, \qquad \xi_m \sim \mathcal{N}(0, r^{o} I),$$

using a linear observation operator $G$ that selects the observed components of the state. Here $\xi_m$ is unbiased Gaussian observation noise with observation error variance $r^{o}$; we focus on the difficult regime where the observations are sparse (only a subset of the state variables is observed), infrequent, and of high quality (small $r^{o}$).
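As an illustration of this setup, the sparse linear observation operator can be realized as a selection matrix. The dimensions and noise level below are illustrative choices echoing the Lorenz 96 tests later in the paper, not prescribed values:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 40          # state dimension (as in the Lorenz 96 tests)
obs_every = 2   # observe every other grid point -> 20 sparse observations
r_o = 0.05      # observation error variance (high-quality observations)

obs_idx = np.arange(0, N, obs_every)         # observed components of the state
G = np.zeros((obs_idx.size, N))
G[np.arange(obs_idx.size), obs_idx] = 1.0    # selection observation operator

u_true = rng.standard_normal(N)              # stand-in for the true signal u(t_m)
v = G @ u_true + np.sqrt(r_o) * rng.standard_normal(obs_idx.size)
```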

## Clustered Particle Filter

One of the key features of our proposed method, the clustered particle filter (CPF), is particle adjustment. Particle adjustment moves the prior particles closer to the observation, instead of merely reweighting them, when the prior is too far from the observation likelihood. When observations are sparse, unobserved adjacent state variables must have the same particle weights as the observed variable because they are updated using cross-correlations. For this purpose, the CPF partitions the state variables into nonoverlapping clusters, each containing one observation location together with its neighboring unobserved variables.

For the substate vector of each cluster, the particle weights are computed using only the observation belonging to that cluster, so each cluster carries its own set of particle weights.

Thus, the assimilation of the full state vector is decomposed into independent, smaller data assimilation problems, one per cluster.
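A coarse-grained partition of this kind can be sketched by assigning each state variable to the nearest observation site on a periodic domain, so that every cluster contains exactly one observation; the construction below is our illustrative reading of the clustering, not the paper's code:

```python
import numpy as np

def make_clusters(N, obs_idx):
    """Partition state indices 0..N-1 into nonoverlapping clusters by
    assigning each variable to its nearest observation location on a
    periodic domain; each cluster contains one observation point."""
    obs_idx = np.asarray(obs_idx)
    clusters = {j: [] for j in range(obs_idx.size)}
    for i in range(N):
        d = np.abs(i - obs_idx)
        d = np.minimum(d, N - d)            # periodic distance to each observation
        clusters[int(np.argmin(d))].append(i)
    return [np.array(c) for c in clusters.values()]
```

Because the clusters are disjoint and cover the whole domain, the Bayesian update in one cluster never touches the state variables of another, which is the coarse-grained localization described above.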

### Particle Adjustment.

One of the central issues of particle filtering is particle collapse, where only a small fraction of particles carry most of the weight whereas the rest of the particles have minimal weights. Particle collapse can happen when the observation is not in the convex hull of the predicted observations of the prior particles, which is likely for high-quality (small error) observations whose sharply peaked likelihood leaves nearly all particles with negligible weight.
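For scalar observations, the convex-hull condition reduces to a simple range check, which makes the collapse trigger cheap to detect; a sketch (function name ours):

```python
import numpy as np

def outside_convex_hull(v, predicted_obs):
    """For a scalar observation, the convex hull of the predicted
    observations is just their range [min, max]; the trigger fires
    when the actual observation falls outside it."""
    return bool(v < predicted_obs.min() or v > predicted_obs.max())
```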

Because the observation error is small in the high-quality regime, reweighting alone cannot recover from such a mismatch between the prior particles and the observation; instead, the prior particles are adjusted toward the observation so that the adjusted particles are consistent with the observation likelihood. See *Supporting Information* for the method to find the adjustment matrix $A$.

Coarse-grained localization and resampling in each cluster can lead to discontinuous, nonphysical state variables, and some hybrid methods are designed to avoid this (15). Below we show that the CPF method has robust filter performance for a turbulent system with moderately turbulent behavior dominated by two large-scale Fourier modes.

Now we summarize the hard threshold version of the CPF, which triggers the particle adjustment if the following condition is satisfied.

The hard threshold criterion for particle adjustment is as follows: the observation lies outside the convex hull of the predicted observations of the prior particles (for a scalar observation, outside the interval between the minimum and maximum predicted observations).

### The one-step assimilation of the hard threshold CPF algorithm.

1. Propagate each particle with the forecast model to the next observation time.
2. In each cluster, compute the particle weights from the likelihood of the observation belonging to that cluster.
3. If the hard threshold criterion is satisfied in a cluster, adjust the prior particles of that cluster toward the observation.
4. Resample the particles in each cluster.

In addition to the hard threshold criterion, other criteria can be used to trigger particle adjustment using additional information such as innovation statistics (see *Supporting Information* for a soft threshold version CPF using innovation statistics). There can be many variants of the CPF incorporating different criteria for the particle adjustment and other factors such as several resampling strategies and inflation techniques. Throughout this study, we use the hard threshold CPF with residual resampling (8), which shows robust filtering skills in our tests.
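Residual resampling, the scheme used throughout this study, first keeps the deterministic integer part of each particle's expected copy count and randomizes only the remainder; this is a sketch of the standard scheme of ref. 8, with implementation details ours:

```python
import numpy as np

def residual_resample(weights, rng):
    """Residual resampling: keep floor(L * w_i) deterministic copies of
    particle i, then draw the remaining particles from the residual
    weights.  Returns the indices of the resampled particles."""
    L = weights.size
    counts = np.floor(L * weights).astype(int)   # deterministic copies
    residual = L * weights - counts              # leftover fractional weights
    n_rest = L - counts.sum()                    # particles still to be drawn
    if n_rest > 0:
        residual /= residual.sum()
        extra = rng.choice(L, size=n_rest, p=residual)
        counts += np.bincount(extra, minlength=L)
    return np.repeat(np.arange(L), counts)
```

Compared with plain multinomial resampling, the deterministic copies reduce the sampling variance of the resampled ensemble.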

## Numerical Tests

As a test model for filter performance, we use the 40-dimensional Lorenz 96 system (23)

$$\frac{du_j}{dt} = (u_{j+1} - u_{j-2})\,u_{j-1} - u_j + F, \qquad j = 1, 2, \ldots, 40,$$

with periodic boundary conditions, where $F$ is a constant forcing.
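For reference, the Lorenz 96 right-hand side and a fourth-order Runge–Kutta step can be written compactly with periodic index shifts; the time step and the integrator choice here are illustrative, not the paper's:

```python
import numpy as np

def lorenz96_rhs(u, F=8.0):
    """du_j/dt = (u_{j+1} - u_{j-2}) u_{j-1} - u_j + F, periodic in j."""
    return (np.roll(u, -1) - np.roll(u, 2)) * np.roll(u, 1) - u + F

def rk4_step(u, dt, F=8.0):
    """One classical fourth-order Runge-Kutta time step."""
    k1 = lorenz96_rhs(u, F)
    k2 = lorenz96_rhs(u + 0.5 * dt * k1, F)
    k3 = lorenz96_rhs(u + 0.5 * dt * k2, F)
    k4 = lorenz96_rhs(u + dt * k3, F)
    return u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
```

Note that the uniform state $u_j \equiv F$ is a fixed point of the dynamics; the chaotic regimes used in the tests arise from perturbations of it.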

### Lorenz 96 with *F* = 8, Standard Test Regime.

The Lorenz 96 model with *F* = 8 is the standard chaotic test regime [we use 20 regularly spaced observations of the form of Eq. **2** with observation error variance 0.05, which corresponds to about 1% of the true signal variance]. EAKF, which uses a localization radius 8 and 20% inflation, has the best result, with rms errors staying below the observation error (Fig. 2, dashed and dotted line). CPF is slightly worse than EAKF. However, CPF shows filtering skill without tuning localization; rms errors are around the observation error using 20 observations out of 40 variables. The localized particle filter suffers from poor performance even with tuned parameters; its rms errors are comparable to the climatological error (Fig. 2, dashed line). With 40 and 10 regularly spaced observations (see Fig. S1 for these results), all methods have results comparable to the 20-observation case except CPF with 10 observations; the hard threshold version of the CPF loses skillful performance with 10 observations. In this case, the soft threshold version of the CPF uses innovation statistics to trigger particle adjustment and recovers skillful filter performance. See Fig. S2 for the results of the CPF using the soft threshold version with 10 observations.

In the experiments performed in this study, the particle adjustment is triggered in fewer than 30% of the assimilating clusters on average, and at most 70% at rare times, at each observation time. Nevertheless, the particle adjustment plays a crucial role in stabilizing the filter. Fig. 3 shows two time series of the rms errors of a CPF data assimilation experiment with and without particle adjustment. The CPF without particle adjustment shows stable filter performance up to 50 cycles, after which its performance quickly degrades and the filter loses skill. In contrast, the CPF with particle adjustment shows a robust and stable result throughout long assimilation cycles.

### Lorenz 96 with *F* = 5, Strongly Non-Gaussian Test Regime.

The Lorenz 96 model with *F* = 5 has strongly non-Gaussian statistics. In addition to the linear observations of the form of Eq. **2**, we also test a nonlinear logarithmic observation operator. Fig. 4 shows the rms errors for the linear (Fig. 4, *Top*) and log (Fig. 4, *Bottom*) observations (the tuned parameters for EAKF are 6 for the localization radius and 25% for the covariance inflation). With linear observations, CPF, which does not require localization tuning, has a result comparable to EAKF, in which localization tuning plays an important role. In the experiment with the log observation there is a significant performance difference between EAKF and CPF. The maximum values of the rms errors of both methods are comparable. However, the high rms error is infrequent in CPF, whereas EAKF stays around this high rms error. EAKF might achieve a better performance by using a spatially varying localization radius because the observations are irregular. However, this process is extremely expensive and might be impractical in real applications.

The accuracy of the filtering methods relies on the forecast model's skill in obtaining the right statistics of the true signal. We now check the filter performance of the two methods in capturing the non-Gaussian features of the true signal. Due to the ergodicity of the Lorenz 96 system Eq. **12**, the PDFs of the model can be obtained by running an ensemble of the model in statistical steady state for a long time. The PDFs for the physical space variables confirm that the CPF captures the skewed, non-Gaussian statistics of the true signal, whereas EAKF produces nearly Gaussian statistics.

## Application to Multiscale Data Assimilation

The tremendous computational cost of running forecast models for high-dimensional complex systems limits the number of samples, which degrades the filter performance. The multiscale data assimilation method (25) aims at increasing the sample number using computationally cheap but robust reduced-order coarse forecast models and provides the resolved large-scale estimation using mixed observations of the large and small scales. The coarse forecast model resolves the large-scale variables, whereas the small-scale variables are approximated using conditional Gaussian distributions that can represent non-Gaussian statistics through the interactions between the large and small scales. The multiscale filtering also provides a mathematical framework for representation errors, which are due to the contribution of unresolved scales (26, 27) in the observations. Under this framework, the multiscale data assimilation method has been applied to an ensemble-based method and has shown successful applications in several stringent multiscale problems, including wave turbulence (28) and baroclinic turbulence (29). In addition to the ensemble-based approach, the multiscale particle filter in ref. 25 uses particle filtering for the resolved large scales, whereas the unresolved small scales are updated using the Kalman update, at significantly lower computational cost than the standard particle filter. Here we use the CPF algorithm for multiscale particle filtering and test it for a one-dimensional wave turbulence model. Because the reduced-order forecast method has model errors, the experiment in the next section also serves as a data assimilation test for the CPF with model errors.

### Multiscale Filtering of Wave Turbulence, Test with Model Errors.

We apply the multiscale clustered particle filter to the wave turbulence model introduced by Majda, McLaughlin, and Tabak (MMT) in refs. 30 and 31 as a computationally tractable model of wave turbulence. The model is described by the following one-dimensional partial differential equation for a complex scalar $\psi$:

$$i\psi_t = |\partial_x|^{1/2}\psi + \lambda|\psi|^{2}\psi + iD\psi,$$

where $|\partial_x|^{1/2}$ is the fractional derivative defined through its Fourier symbol $|k|^{1/2}$, $\lambda = \pm 1$ sets the defocusing or focusing nonlinearity, and $D$ is a large-scale forcing and selective dissipation operator.

Here, we compare the filtering results of the ensemble-based multiscale data assimilation method (28) and the multiscale CPF for the MMT model. As the forecast model for both filtering methods, we use the stochastic superparameterization multiscale method (32, 33), which is a seamless, multiscale, coarse-grid method using conditional Gaussian statistics for the unresolved small scales. The forecast model uses only 128 grid points whereas the full resolution uses 8,192 grid points, which yields a computational saving of a factor of about 250 (accounting for the saving in the time step). Because the forecast model has a low computational cost compared with the full-resolution model, it has significant model errors. Observations of the full-scale variables are available at 64 uniformly spaced grid points (extremely sparse compared with the full-resolution 8,192 grid points) with an observation error variance corresponding to 3% of the total climatological variance, at every time interval of 0.25. The ensemble-based method uses the tuned parameters of ref. 28 (i.e., a short localization radius 1 and 2% covariance inflation). For the hard threshold version of the CPF, the particle adjustment is triggered if either the real or the imaginary part is not in the convex hull of the corresponding predicted observations, as we observe both parts of the true signal. Both the particle and the ensemble-based methods use 129 samples.

The time series of the large-scale estimation rms errors of the ensemble-based filter and the clustered multiscale particle filter are shown in Fig. 6. The dashed and dotted line is the effective observation error 0.34, which accounts for both the measurement error and the representation error due to the unresolved small scales in the observations.

## Concluding Discussion

The CPF introduced in this paper shows robust state estimation and prediction skill for high-dimensional systems using high-quality sparse observations and relatively few particles. In addition to its accurate filter results measured by rms errors, the CPF captures the non-Gaussian features of the true signal whereas the ensemble-based EAKF captures incorrect Gaussian features. The CPF is also extended to multiscale particle filtering. Using a cheap reduced-order forecast model with model errors, our method is successfully applied to the stringent one-dimensional wave turbulence MMT model. It is also shown that particle adjustment is an essential step, in addition to localization, to stabilize the filter and achieve accurate estimation. This suggests an interesting research direction: investigating various criteria to trigger particle adjustment and combining them with covariance inflation techniques (34, 35) that can affect the performance of the CPF. The locality of the observation operator is essential in our method, and we have considered only one-dimensional observation networks where each cluster contains only one observation point. It would be natural to extend the current method to general observation networks, including 2D and 3D spaces, using, for example, an information-theoretic approach to the clustering of observations, and to investigate dynamic imbalance from coarse-grained localization for operational applications.

## Particle Adjustment Matrix *A*

We describe how to find the adjustment matrix $A$ in Eq. **9**. Because each observation affects only the state variables in its own cluster, it suffices to describe the adjustment for a single cluster with observation operator $G$ and observation error covariance $R$.

From the prior particles $\{u^{(i)}, i = 1, 2, \ldots, L\}$ we form the prior mean $\bar{u} = L^{-1}\sum_{i} u^{(i)}$ and the particle perturbation matrix $U$ whose $i$th column is $(u^{(i)} - \bar{u})/\sqrt{L-1}$, so that the prior covariance is $P = UU^{T}$.

Now our goal is to find a matrix $A$ such that the adjusted perturbations $UA$ reproduce the Kalman posterior covariance,

$$(UA)(UA)^{T} = P^{a} = P - PG^{T}\left(GPG^{T} + R\right)^{-1}GP.$$

First of all, using the particle perturbation matrix $U$, we write $P = UU^{T}$ and $GP = (GU)U^{T}$.

Now we use this representation of $G$ for the posterior covariance:

$$P^{a} = U\left(I + (GU)^{T}R^{-1}(GU)\right)^{-1}U^{T},$$

which follows from the Sherman–Morrison–Woodbury formula.

Because the prior covariance and observation covariance matrices are symmetric and positive definite, we have the following eigenvalue decomposition:

$$I + (GU)^{T}R^{-1}(GU) = Q\Lambda Q^{T},$$

with orthogonal $Q$ and positive diagonal $\Lambda$, so the symmetric square root $A = Q\Lambda^{-1/2}Q^{T}$ satisfies $(UA)(UA)^{T} = P^{a}$.
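The square-root construction of this kind can be checked numerically: the sketch below builds an adjustment matrix from the eigendecomposition and verifies that the adjusted perturbations reproduce the Kalman posterior covariance. The dimensions, variable names, and test matrices are illustrative choices of ours:

```python
import numpy as np

rng = np.random.default_rng(2)

def adjustment_matrix(U, G, R):
    """Symmetric square root A of (I + (GU)^T R^{-1} (GU))^{-1}, so that
    the adjusted perturbations U A carry the Kalman posterior covariance."""
    GU = G @ U
    M = np.eye(U.shape[1]) + GU.T @ np.linalg.solve(R, GU)
    lam, Q = np.linalg.eigh(M)               # M is symmetric positive definite
    return Q @ np.diag(lam ** -0.5) @ Q.T

# Verify against the standard Kalman posterior covariance on a random example.
n, L, d = 6, 10, 3
U = rng.standard_normal((n, L)) / np.sqrt(L - 1)  # particle perturbation matrix
G = np.eye(d, n)                                  # observe the first d components
R = 0.05 * np.eye(d)                              # observation error covariance
P = U @ U.T
Pa = P - P @ G.T @ np.linalg.solve(G @ P @ G.T + R, G @ P)
A = adjustment_matrix(U, G, R)
assert np.allclose((U @ A) @ (U @ A).T, Pa)
```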

## Soft Threshold Version CPF

In addition to the hard threshold version, we can also use a soft threshold CPF based on innovation statistics. The soft version uses the statistics of the innovation, the difference between the observation and the predicted observation of the prior particles.

The soft version criterion for particle adjustment is as follows: the particle adjustment is triggered when the squared innovation is large compared with the innovation variance predicted by the prior particles (the particle spread in observation space plus the observation error variance).

In addition to the new criterion for the particle adjustment, the soft version of the CPF uses a multiplicative covariance inflation of the particle spread to stabilize the filter.
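An innovation-based trigger of this kind can be sketched as a simple variance-consistency test; the threshold constant `c` below is an illustrative free parameter of our sketch, not a value from the paper:

```python
import numpy as np

def soft_threshold_trigger(v, predicted_obs, r_o, c=2.0):
    """Trigger particle adjustment when the squared innovation exceeds
    c times its expected variance under the prior particles (particle
    spread in observation space plus observation error variance)."""
    mean = predicted_obs.mean()
    var = predicted_obs.var(ddof=1) + r_o   # predicted innovation variance
    return bool((v - mean) ** 2 > c * var)
```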

## Acknowledgments

This work was partially supported by Office of Naval Research Grants ONR MURI N00014-12-1-0912 and DARPA 25-74200-F4414 (to A.J.M.). Y.L. is supported as a postdoctoral fellow by these grants.

## Footnotes

1. To whom correspondence may be addressed. Email: ylee{at}cims.nyu.edu or jonjon{at}cims.nyu.edu.

Author contributions: Y.L. and A.J.M. designed research, performed research, analyzed data, and wrote the paper.

Reviewers: J.A., National Center for Atmospheric Research; and S.R., Universität Potsdam.

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1617398113/-/DCSupplemental.

Freely available online through the PNAS open access option.

## References

- Evensen G
- (entry not recovered)
- Kalnay E
- Buehner M
- Doucet A, de Freitas N, Gordon N
- (entry not recovered)
- Bain A, Crisan D
- Moral PD, Jacod J
- Bengtsson T, Bickel P, Li B
- (entry not recovered)
- Majda AJ, Qi D, Sapsis TP
- Qi D, Majda AJ
- Majda AJ, Harlim J
- Reich S
- Chustagulprom N, Reich S, Reinhardt M
- (entry not recovered)
- Hamill TM, Whitaker JS, Snyder C
- (entry not recovered)
- Cheng Y, Reich S
- Poterjoy J
- Rebeschini P, van Handel R
- Lorenz EN
- Anderson JL
- Lee Y, Majda AJ
- Daley R
- Janjic T, Cohn SE
- Grooms I, Lee Y, Majda AJ
- Grooms I, Lee Y, Majda AJ
- (entry not recovered)
- Cai D, Majda AJ, McLaughlin DW, Tabak EG
- Majda AJ, Grooms I
- Grooms I, Majda AJ
- Tong XT, Majda AJ, Kelly D
- Luo X, Hoteit I