## New Research In

### Physical Sciences

### Social Sciences

#### Featured Portals

#### Articles by Topic

### Biological Sciences

#### Featured Portals

#### Articles by Topic

- Agricultural Sciences
- Anthropology
- Applied Biological Sciences
- Biochemistry
- Biophysics and Computational Biology
- Cell Biology
- Developmental Biology
- Ecology
- Environmental Sciences
- Evolution
- Genetics
- Immunology and Inflammation
- Medical Sciences
- Microbiology
- Neuroscience
- Pharmacology
- Physiology
- Plant Biology
- Population Biology
- Psychological and Cognitive Sciences
- Sustainability Science
- Systems Biology

# Sparse dynamics for partial differential equations

Contributed by Stanley Osher, February 13, 2013 (sent for review November 30, 2012)

## Abstract

We investigate the approximate dynamics of several differential equations when the solutions are restricted to a sparse subset of a given basis. The restriction is enforced at every time step by simply applying soft thresholding to the coefficients of the basis approximation. By reducing or compressing the information needed to represent the solution at every step, only the essential dynamics are represented. In many cases, there are natural bases derived from the differential equations, which promote sparsity. We find that our method successfully reduces the dynamics of convection equations, diffusion equations, weak shocks, and vorticity equations with high-frequency source terms.

In this work, we investigate the approximate dynamics of various partial differential equations (PDEs) whose solutions exhibit behaviors on multiple spatial scales. These scales may interact with one another in a nonlinear manner as they evolve. Many physical equations contain multiscale (as well as multiphysics) phenomena, such as the homogenization problems from material science and chemistry and multiscale systems in biology, computational electrodynamics, fluid dynamics, and atmospheric and oceanic sciences. In some cases, the physical laws used in the model can range from molecular dynamics on the fine scale to classical mechanics on the large scale. In other cases, the equations themselves contain high-wavenumber oscillations that separate into discrete scales, on top of the smooth underlying behavior of the system.

The main source of difficulty in multiscale computation is that accurate simulation of the system requires all phenomena to be fully resolved. The smaller spatial scales influence the global solutions; thus, they cannot be ignored in the numerical computation. In some cases, it is possible to derive an analytical equation for the effect of small scales on the solution (1, 2). In practice, however, it may not be possible to derive a simple expression that represents the fine-scale behavior. Many problem-dependent methods have been proposed in the literature, whereas a few provide a general methodology for modeling the macroscopic and microscopic processes that yield multiscale models. For example, some general methods include the heterogeneous multiscale method (3), the equation-free method (4), multiscale methods for elliptical problems (5), multiscale finite element methods (6, 7), and the sparse transform method (8). An overview of general multiscale approaches is provided in ref. 9. A key difference between our method and other methods (3⇓–5) is that we are directly resolving all the significant scales in the solution. By contrast, the other methods (3⇓–5) directly resolve only the coarse scales of the solution, and they separately “reconstruct” the fine-scale solution (as well as its effect on the coarse scales).

From the perspective of mathematics, multiscale methods began with representation of a function using a global basis, such as a Taylor series or Fourier series. More sophisticated bases have appeared, for example, any one of the many wavelet bases used in imaging and computational physics. The key to the basis approximation is that each basis element represents behavior on a specific scale; therefore, the coefficients of the basis provide complete information about the underlying function. This is also the principle behind multiresolution and decomposition methods.

As the methods of multiscale and multiphysics modeling developed over the past few decades, so did corresponding methods in imaging and information science. One of the fundamental ideas in imaging is that of sparsity. Sparse data representation is used throughout imaging from compression to reconstruction. Early advances in sparse techniques (e.g., refs. 10, 11) presented a convex minimization approach to the computationally challenging sparse basis pursuit problem. Many models that use sparsity to produce both more efficient numerical methods and better quality solutions have been proposed. Some applications of sparsity to imaging include compressive sensing, reconstruction of images from sparse data (12, 13), and recovery of images using sparse regularization (14, 15). The underlying principle of sparsity is that images can be approximately represented by a small number of terms with respect to some basis. Inducing sparsity, creating effective bases, and developing efficient computational algorithms have been intensely active fields in information science.

For imaging and information science, one of the reasons for the success of sparse methods is their ability to resolve drastically different phenomena with a small amount of information. This is also a principal goal of multiscale modeling. In this work, we transfer sparsity methodology, which was developed for information science, to multiscale nonlinear differential equations and show that it can be an effective tool for accurately computing solutions using less information.

In particular, we propose solving PDEs with the constraint that the approximate solution resides on a sparse subspace of a basis. In this way, the complexity of the method will depend on the number of basis terms retained and will be (nearly) independent of the grid size. In the following sections, we will discuss the general problem and the optimization method used to induce sparsity in the solution. The general numerical method will also be explained, as well as results of numerical experiments. The method is tested on an advection equation with oscillatory velocity, a parabolic equation with oscillatory coefficients, a conservation law with oscillatory diffusion, and the vorticity equations with high-frequency source terms. We conclude with a discussion on the proposed work and implications.

## Sparsity

### Problem Statement.

In general, the problem can be stated as follows. Assuming *x* ∈ ℝ^{n} and *t* > 0, let *u*(*x*, *t*): Ω → ℝ be the approximate solution ofsubject to the constraint: , where the number of nonzero *c*_{j} at a given time step is sparse. The operator *F*(⋅) can be nonlinear, nonlocal, and dependent on the derivative of *u*. The basis terms *ϕ*_{j} are assumed to exist on separate scales, which is true of most bases (e.g., Legendre polynomials, Fourier wavelets). In this way, the basis terms represent different global behaviors. The method involves two steps: Evolve the PDE forward in time, and project the updated solution onto a sparse subset. We first address sparsity induction through soft thresholding.

### Sparsity via Optimization: Soft Thresholding.

At a given time step, the problem of projecting the updated solution onto a sparse subset is equivalent to fitting a solution *u*^{n} with corresponding coefficients at the *n*th time step to a solution *u* whose corresponding coefficients are sparse. This can be written as a constrained least squares fit as follows:

Expanding with respect to the basis and assuming that the basis is orthonormal, this constrained optimization problem is related to the following unconstrained problem:where is the vector of coefficients. The “norm” ∥⋅∥_{0} is the number of nonzero coefficients in Eq. **2**. This makes Eq. **3** both nonconvex and difficult to solve. By replacing the *L*^{0} norm with the *L*^{1} norm, we get the following convex relaxation of Eq. **3**:

Note that because , the *L*^{1} norm is , where . The solution of Eq. **4** is given by the following equation:

In general, this can be computed for a nonorthonormal basis, which is equivalent to a basis pursuit problem with the *L*^{1} norm as a sparse regularizer. In that case, the solution must be found by an iterative method rather than the simple shrinkage provided here as an example. The resulting minimizer is a proximal solution that lies on a sparse subset of the original coefficient domain (16). This can be used to show that the solutions form a contraction map in the *L*^{2} norm. Alternatively, we can simply apply the soft thresholding on the coefficients directly in order to induce sparsity in this way.

## Numerical Method

Assuming *u*(*x*, *t*) is periodic in the domain Ω ⊂ ℝ^{n}, one natural basis is the Fourier basis, whose coefficients are the Fourier transform of *u*(*x*, *t*). This is appropriate for the examples shown here. For the rest of this work, we will use the Fourier basis; however, the overall methodology presented here is independent of the corresponding basis.

Taking the Fourier transform of the PDE from Eq. **1** and discretizing the resulting differential equation in time yield a multistep scheme. Because our method does not depend on the choice of numerical updating, we can assume that the scheme takes the following form:

The updated solution may be sparse depending on both the PDE and the update operator *Q*; however, in general, it will have nontrivial values everywhere depending on the approximation and implementation. The auxiliary variable *v* is projected onto a sparse subspace by the shrinkage operator:

Altogether, the update in the spatial domain is simply:

Unlike traditional projections, this is nonlinear and adaptive. Rather than sorting the coefficients and retaining a fixed number of large-amplitude terms or keeping terms whose wavenumbers are below some cutoff, the shrinkage allows the number and choice of nonzero coefficients to evolve over time. Also, this is not the same as hard-thresholding the solution at every step (i.e. keeping only the terms larger than a fixed value) because the coefficients that remain have decreased their magnitude by *λ*. Most importantly, the projection does not favor any particular part of the spectrum; instead, the amplitude of the coefficient determines if it remains. In terms of the Fourier basis, the importance is placed on the amplitude rather than the wavenumber.

For general convergence, as long as *λ* = *Cdt*^{p} for *p* larger than the accuracy of the scheme used to update the variable in time, the shrinkage operation does not change the spatial accuracy of the original method and the method will still converge as *dt* → 0. For example, discretize using the forward Euler method, and then expand the shrinkage operator to get

Therefore, to have convergence as *dt* → 0, the shrinkage parameter must be *λ* = *Cdt*^{1+α}. For linear PDEs, convergence can easily be shown. In general, the shrinkage operator is nonexpansive in each coefficient; hence, it is nonexpansive in coefficient norms. This may help with obtaining a general convergence result.

## Numerical Results

In this section, we discuss the application of the proposed sparse method to several equations with different numerical schemes.

### Convection.

The convection equation we consider is the following:where the coefficient *a*(*x*) is highly oscillatory.

Let *k* be the wavenumber, and use spectral Leap Frog as the updating for Eq. **10** to obtainin which ∗ is the convolution operator over frequency.

The time step is (*dx*) to preserve the stability condition in Eq. **10**. In Fig. 1, the coefficient is chosen as follows:

This choice of *a*(*x*) exhibits both fast and slow modes, but the particular structure is not directly needed.

Fig. 1 illustrates the performance of the sparse solution method on this example by comparison of the sparse solution, the true solution produced using a standard fully resolved method, and a “low-frequency solution” produced by solving Eq. **11** for wavenumbers *k* in the interval |*k*| ≤ *K*/2 in which *K* is the number of modes in sparse solution. In Fig. 1 (*Upper Left* and *Upper Right*), the sparse solution produced by our method and the true solution at *t* = 1 are plotted in the spatial domain at a given time. In Fig. 1 (*Lower Left*), the true and low-frequency solutions are displayed. The low-frequency solution is unable to capture the correct speed. In Fig. 1 (*Lower Right*), the sparse and true spectra are plotted. The sparse spectrum captures the largest amplitude coefficients throughout the domain. In fact, of the 512 coefficients used in the true solution, only 27 are retained in the sparse solution (about 5.3%).

### Parabolic.

The parabolic equation we consider is the following:where the diffusion coefficient *a*(*x*) is highly oscillatory. The coefficient is assumed to be bounded [i.e., *A*_{M} ≥ *a*(*x*) ≥ *A*_{m} > 0]. This is also related to the elliptical case ∂_{x}(*a*(*x*)∂_{x}*u*) = *f*, because an elliptical equation can be solved by taking a parabolic scheme to steady state. Alternatively, the corresponding parabolic scheme can be iterated forward for a small number of time steps in order to find a partial solution to the elliptical problem. Then, by using the partial solution, the locations of the nonzero coefficients can be extracted and the elliptical problem can be solved by a Galerkin method on these coefficients (8).

The updated scheme we use for Eq. **12** is forward Euler method:

The time step is (*dx*^{2}) to preserve the stability condition as well as the highly oscillatory nature of the coefficient *a*(*x*) in Eq. **12**. In Fig. 2, the coefficient is chosen as follows:

In Fig. 2 (*Lower Right*), the highly oscillatory diffusion coefficient *a*(*x*) is plotted in space. In Fig. 2 (*Upper Left* and *Upper Right*), the sparse solution produced by our method and the true solution at *t* = 1 are plotted in the spatial domain at a given time and are nearly indistinguishable. The high-frequency information is near the scale of the grid size, which can be seen in the zoomed-in plot. In Fig. 2 (*Lower Left*), the true and sparse spectra are displayed. The sparse spectrum captures the largest coefficients throughout the domain and not just the low wavenumbers. In fact, of the 2,048 coefficients used in the true solution, only 53 are retained in the sparse solution (about 2.6%). In time, the number of nonzero coefficients, as well as the identities of the nonzero coefficients, will change in order to capture various behaviors.

### Viscous Burgers.

To investigate the sparse dynamics of conservative laws with diffusion, we use the viscous Burgers-type equation:

The left-hand side (LHS) of Eq. **14** is the standard Burgers advection term, and the right-hand side (RHS) is diffusion related to Eq. **12**. The equation exhibits three separate phenomena: (1) smooth large-scale behavior from the diffusion, (2) small-scale oscillations induced from the high frequencies in the coefficient *a*(*x*), and (3) nonlinear interactions between frequencies from the advection term. The update scheme in time is the standard total variational diminishing Runge–Kutta 2 method:where . As before, we have the stability condition in which *dt* is (*dx*^{2}).

For Fig. 3, the diffusion coefficient is chosen as:

The convolutions in the diffusion and nonlinear terms are performed in the spectral domain rather than by other methods, such as the pseudospectral method. The various dynamics can be seen in the spatial and spectral plots (Fig. 3). The true, sparse, and low-frequency solutions are plotted in the space in Fig. 3 (*Upper Left* and *Upper Right*). The low-frequency projection is done by thresholding any coefficients outside of a particular range. Specifically, the number of low wavenumbers retained is the same as in the sparse solution, although their identities are dramatically different. The sparse solution captures the local and global behaviors of the solution more accurately than the low-frequency projected solution. In Fig. 3 (*Lower Left* and *Lower Right*), the spectrum of the true solution is compared with the sparse and low-frequency spectra, respectively. The local peaks in the spectra are related to the wavenumbers in the diffusion coefficient *a*(*x*) and the harmonics induced by the nonlinear advection term. Notice that in this case, each of the distributions in the spectral domain does not decay as rapidly as in the parabolic case. The sparse solution contains 130 of a total 1,024 possible coefficients, which is about 12.7%.

### Vorticity Equations.

The vorticity equation we consider is derived from the (2D) incompressible Navier–Stokes equation (17):where *u* is the vorticity (not the velocity). Similar to Eq. **14**, Eq. **15** exhibits three separate phenomena: (1) smoothness from the diffusion term on the RHS, (2) small-scale oscillations induced from the high frequencies in the source term *f*, and (3) nonlinear interactions between frequencies from the advection term on the LHS. However, because the operator ∇^{⊥}Δ^{−1} is smoothing (in some sense), the advection term can be viewed as less nonlinear than the one found in Eq. **14**. In terms of the numerical method, the operator ∇^{⊥}Δ^{−1} dampens the coefficients by a factor that acts as |*k*|^{−1}.

For the numerical implementation, the diffusion term is discretized using the Crank–Nicolson method, whereas the advection term is lagged. Because the operators are diagonalized in the coefficient basis, the steps can be invertible and lead to a simple updating scheme:

For Fig. 4, the forcing term is chosen to be:

The standard stability condition is used for choosing the time steps in order to ensure capture of all small-scale behaviors. In Fig. 4 (*Upper Left* and *Upper Center*), the true and sparse solutions are plotted in the spatial domain. Notice that the oscillations introduced by the source term interact with the two vortex patches, and thus contribute to the global behavior of the solution. The spectra of the true and sparse solutions are plotted in Fig. 4 (*Lower Left* and *Lower Center*), where the low wavenumbers are located in the middle of the domain. The sparse solution retains about 4.28% of the coefficients. In the sparse spectrum, the coefficients are located throughout the domain, including the highest frequency itself (seen on the boundary of the spectral domain). In Fig. 4 (*Lower Right*), the *L*^{2} and *L*^{∞} errors are shown to decrease as the resolution increases. This sparse solution, as well as the other examples presented here, converges as the spatial discretization goes to zero.

It was observed that as the dimension increases, the sparsity of the solution also increases (because it is proportional to the product of the sparsities in each dimension). Thus, the method scales well with dimension.

It is worth noting that in a related work, wavelet hard thresholding was used to separate coherent and incoherent structures in turbulent flows (18).

In the examples from the previous section, the PDEs contained a mixture of multiscale properties with a diffusion term. The combination of nonlinear and oscillatory terms created a large range of wavenumbers in the solution, whereas the dynamics produced a range of amplitudes. This gives the necessary structure for sparsity with respect to the Fourier basis. In general, the highest order derivative will determine the appropriate basis in which the solutions could be sparse.

If the spectrum is more localized (i.e. nonzero regions in the low-frequency regime), the proposed model can better condition the numerical method. Empirically, the shrinkage operator acts as a nonlinear filter on the coefficients. It was observed that for a fixed *C* and *p*, where *λ* = *Cdt*^{p}, the numerical updates presented here with *dt* larger than theoretically and numerically possible in the original scheme will converge. In the case of the vorticity example, *dt* can be taken much larger when soft thresholding is applied than in the standard scheme. Also, the nonlinear filter seems to reduce parasitic modes and spurious oscillations found in spectral approximations for linear and nonlinear slightly viscous hyperbolic equations.

One key point is that our method works by fully resolving the solution. Its efficiency is gained by omitting modes that are insignificant. This requires that *λ* be small enough so that the filter does not annihilate essential features. For example, if the initial data are smaller than *λ* for a particular unstable mode, our approximate solution will not match the true dynamics. As the grid is refined, the mode will be captured (because *λ* decreases as Δ*x* decreases).

In terms of complexity, each iteration is dominated by the convolution step. The convolution in the coefficient domain (spectral domain) can be performed explicitly over the *n*_{s}(*t*)-sparse vectors rather than transforming back onto the spatial grid, which is (*n*_{s}(*t*)^{2}) at each step. When *n*_{s}(*t*)^{2} ≪ *N*log*N*, convolving in the spectral domain rather than transforming back and forth between domains decreases the computational cost of the algorithm. Knowing *a priori* the maximum sparsity [i.e., *n*_{s,max} = max_{t} *n*_{s}(*t*)], faster routines and transforms could be optimized for specific problems and applications. For example, one can optimize the transform between the spatial and coefficient domains knowing the given sparsity at the current step and the nontrivial coefficients’ identities. In the linear cases, as in Eq. **13**, the operation can be stored as a large but sparse matrix, reducing the updates to a sparse matrix–sparse vector operation at every iteration. Our goal in this work is to formulate a PDE solver that promotes sparsity. In future work, we will present a study of the computational complexity and speed.

When the dynamics are dominated by a linear term, for example, high viscosity, the identities of the nontrivial coefficients settle over time. This was also observed in the nonlinear cases, but over a longer time period. This would enable creation of a sparse basis for elliptical equations (e.g., with oscillatory coefficients). Also, after this settling phase, the shrinkage operator can be replaced by a projection onto the known identities of the nontrivial coefficients.

In many of the cases here, it is possible to obtain hypersparse solutions (those with 1% or fewer coefficients) at the cost of accuracy.

## Conclusion

In this work, we have proposed a method to resolve fully the solutions of multiscale PDEs while only retaining important modes. The reduced dynamics created by the sparse projection properly capture the true phenomena exhibited by the solution. The sparse projection amounts to a shrinkage of the coefficients of the updated solution at every time step. There exist many possibilities for using the sparsity to optimize individual algorithms and to create faster, more efficient computational routines.

## Acknowledgments

The research of H.S. was supported by the Department of Defense through the National Defense Science and Engineering Graduate Fellowship Program. The research of S.O. was supported by the Office of Naval Research (Grant N00014-11-1-719). The research of R.C. was supported by the Department of Energy (DOE; Grant DE-FG02-05ER25710). The research of C.D.H. was sponsored by the Office of Advanced Scientific Computing Research, DOE. C.D.H.'s work was performed at the Oak Ridge National Laboratory, which is managed by the University of Tennessee-Battelle under Contract DE-AC05-00OR22725.

## Footnotes

- ↵
^{1}To whom correspondence should be addressed. E-mail: sjo{at}math.ucla.edu.

Author contributions: H.S., R.C., C.D.H., and S.O. performed research; and H.S. wrote the paper.

The authors declare no conflict of interest.

## References

- ↵
- Arnold VI

- ↵
- Bensoussan A,
- Lions JL,
- Papanicolaou G

- ↵
- W E,
- Engquist B,
- Huang Z

- ↵
- Kevrekidis IG,
- et al.

- ↵
- ↵
- Efendiev Y,
- Hou TY

- ↵
- ↵
- ↵
- W E,
- Engquist B

- ↵
- ↵
- ↵
- ↵
- Bengio Y,
- Schuurmans D,
- Lafferty J,
- Williams CKI,
- Culotta A

- Zhou M,
- et al.

- ↵
- ↵
- ↵
- Cai JF,
- Osher S,
- Shen Z

- ↵
- Majda AJ,
- Bertozzi AL

- ↵

## Citation Manager Formats

## Sign up for Article Alerts

## Article Classifications

- Physical Sciences
- Applied Mathematics