## New Research In

### Physical Sciences

### Social Sciences

#### Featured Portals

#### Articles by Topic

### Biological Sciences

#### Featured Portals

#### Articles by Topic

- Agricultural Sciences
- Anthropology
- Applied Biological Sciences
- Biochemistry
- Biophysics and Computational Biology
- Cell Biology
- Developmental Biology
- Ecology
- Environmental Sciences
- Evolution
- Genetics
- Immunology and Inflammation
- Medical Sciences
- Microbiology
- Neuroscience
- Pharmacology
- Physiology
- Plant Biology
- Population Biology
- Psychological and Cognitive Sciences
- Sustainability Science
- Systems Biology

# Stochastic geometry and topology of non-Gaussian fields

Edited by David R. Nelson, Harvard University, Cambridge, MA, and approved October 12, 2012 (received for review July 16, 2012)

## Abstract

Gaussian random fields pervade all areas of science. However, it is often the departures from Gaussianity that carry the crucial signature of the nonlinear mechanisms at the heart of diverse phenomena, ranging from structure formation in condensed matter and cosmology to biomedical imaging. The standard test of non-Gaussianity is to measure higher-order correlation functions. In the present work, we take a different route. We show how geometric and topological properties of Gaussian fields, such as the statistics of extrema, are modified by the presence of a non-Gaussian perturbation. The resulting discrepancies give an independent way to detect and quantify non-Gaussianities. In our treatment, we consider both local and nonlocal mechanisms that generate non-Gaussian fields, both statically and dynamically through nonlinear diffusion.

Random fields are ubiquitous. A disparate class of phenomena—ranging from the cosmic background radiation (1) and surface roughness (2) to medical images of brain activity (3) and optical speckle patterns (4)—produce data that can be regarded as random fields. The statistics of geometrical features of these fields, such as the density of extrema of various types, can be used to characterize them (5, 6). When the fields can be approximated as Gaussian fields, the physical meaning of these statistical properties is generally well understood (7, 8); the statistics of extrema, for instance, reflects the amount of field fluctuations at short distances.

Although analytical investigations are often restricted to Gaussian fields, phenomena described by nonlinear laws (such as the dynamics of inflation that produced the cosmic background radiation) produce non-Gaussian signals. Quite often, the observable signal is averaged over a large scale, producing approximately Gaussian statistics on account of the central limit theorem, thus masking the nonlinearity. Nevertheless, the surviving tiny departures from Gaussianity can carry a crucial signature of the nonlinear microscopic mechanisms at the heart of the phenomena. As an illustration, consider a low-resolution measurement of the spatial magnetization of a material well above the critical temperature. The magnetization fluctuates like a Gaussian random variable—each region contains many domains oriented up or down in arbitrary proportion. However, a small non-Gaussian contribution remains, because there is a maximum possible magnetization per unit area that can be traced all of the way down to the quantization of the spin of the electrons, and hence the probability distribution cannot exhibit Gaussian tails.

To unveil such elusive effects, one needs an indicator that is sensitive to both short distances and small signals. The most common tool used to probe the statistics of a random field is to measure its correlation functions. For example, the statistical properties of a random scalar field, , with Gaussian statistics, are entirely determined by its two-point correlation function , and its higher-order correlation functions can be written simply as the sum of products of two-point correlation functions. The nonfactorizability of these higher-order correlation functions is one of the standard indicators of non-Gaussian statistics.

Here we focus on a more geometric approach: view the scalar field as the height of a surface and study its random topography to infer the statistical properties of the signal (Fig. 1, *Inset*). The densities of peaks and troughs, or of topological defects in the curvature lines known as umbilics (Fig. 1), are sensitive indicators of how jagged the height field is at short distances; as we shall see, they provide an independent pipeline to detect non-Gaussianities, distinct from multiple-point correlation functions. This geometric approach has been applied successfully to track the power spectrum of a Gaussian field, and it has been the subject of extensive theoretical and experimental studies (6⇓⇓⇓–10).

In this paper, we introduce the key physical concepts and mathematical techniques necessary to study the stochastic geometry of signals that can be described as a Gaussian random field plus a perturbation that we wish to track. We first show how to treat non-Gaussianities within a local approximation and calculate how the statistics of extrema change when a nonlinear transformation is applied locally to a Gaussian field . Then we consider the case of fields that cannot be probed directly, by calculating the statistics of umbilical points, which are topological defects of the lines of principal curvature (11). Finally, we turn to a class of nonlinear diffusion (Eq. **2**) and go beyond the local approximation by considering the effects of spatial gradients that couple values of the field at different locations. As an illustration, we solve explicitly for the nonlocal non-Gaussianities generated dynamically by the deterministic Kardar-Parisi-Zhang (KPZ) equation, which models surface growth (2).

## Critical Points

To gain some insights into the physical mechanisms that generate non-Gaussianities, consider first how an isotropic Gaussian field arises from the random superposition of waves (or equivalently, Fourier modes):with an amplitude spectrum that depends only on the magnitude of the wave vectors, . [The power spectrum is the Fourier transform of the two-point correlation function.] The phases are uncorrelated random variables uniformly distributed in the range . The statistical properties of the Gaussian field are entirely encoded by the function or, equivalently, its moments .

The most basic difference between Gaussian and non-Gaussian variables is that Gaussian ones are always symmetric about their mean. As a consequence, irrespective of its power spectrum, a Gaussian field has equal densities of maxima and minima. Hence, a nonvanishing imbalance, , between these two types of extrema serves as a probe to detect and quantify the non-Gaussian component of a signal, provided that it can be measured directly. For example, the statistics of peaks and troughs have been used to test the Gaussianity of the temperature fluctuations in the cosmic microwave background (12, 13).

Consider the primordial curvature perturbation field, , a nearly Gaussian field of central interest to modern cosmological studies (1). Within a local approximation, the primordial field is obtained from a Gaussian field via a nonlinear relation . Determining the parameters and is one of the central tasks in the study of cosmological non-Gaussianities. As we shall see, the quadratic coefficient can be determined from the imbalance between maxima and minima of .

The imbalance can be derived in the more general context of a non-Gaussian field *h* that is obtained from a Gaussian field via any nonlinear deformation . If is a monotonic function, the maxima and minima do not change—only a nonmonotonic behavior of can alter this balance.

The critical points of *h* are given by , where the prime indicates the derivative of with respect to . This condition shows that *h* and have the same critical points. Note however that, if at a critical point , then a maximum (minimum) at will be turned into a minimum (maximum) at . The number of saddle points does not change—it must be equal to the sum of maxima and minima according to Morse theory (14).

If the transformation has a bias toward converting minima into maxima, then *h* will have more maxima than minima; for example, reverses its slope at sufficiently negative values of , which are most likely to be minima. Following this logic, the first step toward calculating the imbalance between maxima and minima is to determine the probability that for a minimum of . The symmetry properties of imply that the analogous probability distribution for maxima is .

The fraction of minima of that become maxima of *h* is obtained by integrating over the range of *z* for which . Likewise, the fraction of maxima of that are turned into minima is given by the integral of over the same range. The overall imbalance in the densities of the maxima and minima of *h* can be readily obtained by adding these opposite contributions. The result reads

For two dimensions, the exact analytical expression for is explicitly derived in *Materials and Methods*—it depends only on the moments , , and . Rescaling does not affect the function , because it increases the density of maxima with any value of by the same proportion. Hence, only (which sets the scale of the distribution) and the dimensionless parameter can enter in the expression for (see the plot in Fig. 2, *Inset*).

As an illustration, apply Eq. **2** to the perturbed Gaussian field . Fig. 2 shows our theoretical formula as a continuous line, validated by numerical data (dots) obtained from computer-generated random surfaces with amplitude spectrum , having . The imbalance between maxima and minima is particularly useful to track large deviations from Gaussianity, and this can be seen explicitly for the quadratic perturbation considered above. Because itself has an equal number of maxima and minima, the imbalance in *h* is only created when one of these critical points is inverted. Whether has a high likelihood of having a negative is controlled by . The probability of exceeding 1 is exponentially small. Thus, rises roughly as , where . Note that our approach in deriving Eq. **2** and is nonperturbative, as demonstrated by the agreement between our formula and numerics over the entire ε-range probed in Fig. 2.

## Umbilical Points

The near-Gaussian fields under investigation are not always directly accessible experimentally. For example, the mass distribution along the line of sight responsible for weak gravitational lensing is believed to be mostly composed of dark matter, and hence it cannot be detected directly (15). If the projected gravitational potential over a flat patch of the sky is taken to be the height of a 2D surface (16), the measurable shear field is given by the lines of principal curvature (11), as shown in Fig. 1. At some special points called umbilics, the curvature is equal in all directions, so the shear field cannot be defined and it must vanish. More precisely, a point on a surface with height function is an umbilic if the second derivatives satisfy the two conditions and . The ratio between different types of umbilical points (which is a universal number for an isotropic Gaussian field) serves as an indicator of non-Gaussianities in lieu of the extrema, which cannot be detected. A similar reasoning can be applied to study polarization singularities in the cosmic microwave background (17⇓⇓–20) and topological defects in a nematic (21, 22) or superfluid near criticality (23, 24).

Inspection of Fig. 1 reveals that there are three types of umbilics: lemons, monstars, and stars. Note that these umbilics are topological defects in the curvature-line field. The topological index of any umbilic is equal to ±1/2, if the curvature-line field rotates counterclockwise (clockwise) by an angle π along any closed path encircling only that umbilic in the counterclockwise direction. A star has three curvature lines terminating at it and a topological index of . A lemon has only one line and index . A monstar has index , like a lemon, but three lines terminating at it, like a star. A striking feature of isotropic Gaussian fields is that the monstar fraction, the relative density of monstars with respect to all umbilics, equals ; this is a universal number independent of the power spectrum (8, 10). Any deviation from this special value is therefore a sure sign of non-Gaussian effects.

Consider a height field , with *f* a nonlinear perturbation and a small parameter controlling the size of the nonlinearity. We can express in terms of the joint probability distribution *p* of the second and third derivatives of *h*. To calculate *p*, we use the property that a probability distribution is determined by its moments of all orders (if the distribution is well-behaved). This task is most easily accomplished by calculating the generating function of the distribution, *χ*; its logarithm can be directly expressed in terms of the cumulants where are the stochastic variables given by the spatial derivatives of *h*. The probability distribution can then be obtained from the generating function by taking the inverse Fourier transform with respect to . The cumulants can be written in terms of expectation values, e.g., . In this context, these expectation values are called moments, which are not to be confused with the moments defined previously.

For Gaussian variables, only the second-order cumulants are nonzero, which gives rise to a generating function of the formOne can easily check that the inverse Fourier transform of *χ* in Eq. **4** precisely yields the probability distribution for a set of correlated Gaussian variables (assuming ; Eq. **12**). More generally, by determining all of the cumulants, one can construct the generating function and from that the probability distribution. We shall derive the monstar fraction up to first order in the perturbation only; consistently we need to determine all of the cumulants up to first order only.

The monstar fraction (even at order ε) could in principle depend on *f* in a complicated way if *f* is an arbitrary nonlinear function. In fact, quadratic terms in the function *f* produce degree 3 cumulants in the distribution function of the field *h* (i.e., skewness), cubic terms produce kurtosis (degree 4 cumulants), and in general degree *n* terms in *f* produce degree cumulants. However, the monstar fraction can be determined from just the distribution of a few derivatives of *h*, whose cumulants vanish beyond the fourth order due to symmetry, as shown in *SI Materials and Methods*. Consequently, the final result for the monstar fraction depends only on a single parameter, , where the primes indicate derivatives with respect to .

The calculation can be briefly summarized as follows. The monstar fraction is related to the distribution function of some of the second and third derivatives of *h*, which we write in terms of complex coordinates and : , , and . The definition of an umbilic point becomes , where is now complex. All of the cumulants of these variables and their conjugates may now be calculated (up to order ε). The complex coordinates allow for optimal use of rotational and translational symmetry. Only a few of the cumulants are nonzero, and these are evaluated in Table 1. With the aid of these cumulants, the generating function can be constructed to first order using Eq. **3**. Taking the Fourier transform leads to the probability distribution , which takes the form of a Gaussian perturbed by cubic and quartic terms in *h* and its derivatives (25). To obtain , we set and integrate over and , including the appropriate Jacobian factor. Integration over all gives the total density of umbilical points, whereas the density of monstars is obtained by integrating over a specific range of (*SI Materials and Methods*). The monstar fraction is then the ratio of these two densities. The resulting deviation from iswhere . When applied to the local model of the primordial field described previously, in Eq. **5** depends only on the cubic coefficient and not on . Hence, the leading-order perturbation that alters the monstar fraction is . The perturbation , like any odd and/or monotonic function of , does not have an effect on the density of maxima and minima. Note that , where is the kurtosis of the signal. [For a more general type of non-Gaussianity there would be additional terms in the expression, depending on correlation functions of partial derivatives of .]

Fig. 3 shows , as determined by Eq. **5** (continuous line), together with data from computer simulations (symbols), performed using the ring spectrum , for which . The agreement between theory and numerics is very good in the linear regime. The monstar fraction is very sensitive to small non-Gaussian perturbations: changes by when ε is just 0.01. For larger values of ε, nonlinear effects become important and prevent from becoming negative. In this regime, our perturbative result does not hold.

## Nonlocal Model and Evolution Equations

In the first section, we presented an exact expression for the imbalance between maxima and minima for a local perturbation of a Gaussian random field. This simple class of models may describe the local evolution of a system that starts out with a Gaussian distribution, such as the growth of a population of cells that are initially distributed on a dish and then divide without any significant migration from one region to another. However, many dynamical systems evolve in a nonlocal, nonlinear way. Non-Gaussianities are generated dynamically from the nonlinear equations of motion that the field obeys, even if the initial condition is Gaussian.

A broad class of nonlinear diffusion equations describes the necessary mixing between regions. Examples include several models of structure formation in both condensed matter (26) and cosmology (1), the Cahn–Hilliard equation for the development of order after a phase transition (27), and simplified models of surface growth (2). We illustrate our approach in the context of the deterministic KPZ Eq. **2**, which models the evolution of the height of a substrate, , as atoms accumulate on it:To first order, the surface grows at a constant rate that is simply subtracted out of Eq. **6**: the two terms on the right-hand side capture additional effects. The first term describes the diffusion of particles along the surface, and the second nonlinear term describes approximately how the growth rate varies with the local slope. The surface is assumed to grow at a constant rate perpendicular to itself, but because the height is measured vertically, depends on the slope; this gives rise to the term quadratic in .

An alternative interpretation of Eq. **6** is obtained by taking the gradient on both sides; one thus derives a Burger’s equation that describes the velocity field . The saddles, maxima, and minima of *h* correspond to stagnation points, sources, and sinks of the potential flow field . If the term quadratic in the gradient in Eq. **6** is substituted by a term quadratic in the field, , one obtains the Fisher equation, which describes the growth and saturation of a population. In all of these cases, we can study the time evolution of an initially Gaussian height profile. Upon setting the coefficient of the nonlinear term *λ* equal to zero, we always retrieve the heat equation, which preserves the Gaussianity of *h* for all later times. However, for , *h* attains a non-Gaussian statistics.

For concreteness, concentrate on how an imbalance between maxima and minima is generated by nonlocal non-Gaussianities in Eq. **6**. The nonlinear term breaks the symmetry between positive and negative values of *h*, which is a necessary condition to generate an imbalance. Note however that in the case of a local evolution, this imbalance grows exponentially slowly as , where ε is the coefficient in front of the nonlinear term and *α* a constant. Local evolution cannot create new extrema; it can only convert a maximum into a minimum whenever *h* happens to have a sufficiently large fluctuation. It is the presence of the diffusion term that is able to create new maxima and minima, even though, on its own, it would not be able to generate any imbalance, because of the symmetry . The two terms on the right-hand side of Eq. **6** conspire together to change the number of maxima and minima asymmetrically. This mechanism causes the imbalance between maxima and minima, , to exhibit a power law increase in λ that can be calculated within perturbation theory.

We use the same approach adopted for the umbilical points and express the joint distribution of , , and in terms of the relevant cumulants, listed in Table 2 up to third order in *λ*. The resulting expression for reads (28)

Eq. **7** is a general result. To apply it to the KPZ equation, we first change variables toNote that *u* is a monotonic function of *h*, so *u* has the same profile of maxima and minima as *h*. This field satisfies the heat equation whose general solution iswhere denotes the Green’s function.

The correlations listed in Table 2 can now be determined from the distribution of , leading to an expression for that is valid over an arbitrary time span, provided that *λ* is small. Analytical results can be obtained for a few convenient choices of the power spectrum of . For example, if we take a Gaussian spectrum, , we findwhere . The validity of this equation is illustrated in Fig. 4, which shows an excellent agreement between theory and numerics. The imbalance starts out at zero because the initial choice for *h* is Gaussian. However, after long times, this expression decays back to zero. The reason for the decay is that, at long times, involves an average over a larger and larger window, so by the central limit theorem, it starts to acquire Gaussian statistics characterized by a vanishingly small imbalance between maxima and minima.

For early times, we can make an expansion in *t*, valid for an arbitrary power spectrum, which gives

The second-order term in Eq. **11** happens to vanish for the Gaussian power spectrum featured in Eq. **10**, but not in general.

The agreement over the entire range of times in Fig. 4 is a peculiar feature of the KPZ equation. The perturbative formula in Eq. **11** is not expected to hold at sufficiently late times, because the nonlinearities eventually grow exponentially. By contrast, the Cole–Hopf transformation, which exactly maps the KPZ equation to a linear diffusion equation, guarantees that the nonlinearities remain bounded in this case.

To sum up, we have illustrated a geometric approach to track the non-Gaussian component of a field that relies on the statistics of topological defects and extrema rather than multiple-point correlation functions. Because these measures are topological in nature, we expect them to be less sensitive to noise than standard correlation-function methods. The nonperturbative formula for the imbalance between maxima and minima is particularly suitable for detecting local non-Gaussianities in the limit of large perturbations. Applications may range from the analysis of images of cerebral activation (3) to peaks and troughs statistics in cosmological structure formation (29).

The statistics of umbilics are a sensitive probe to measure very weak deviations from Gaussianity. It would be interesting to apply our analysis of umbilics to the polarization field of the cosmic microwave background (1, 17⇓⇓–20) or weak gravitational lensing shear maps (15, 16). More controlled experiments can be realized by studying the polarization of a light beam propagating through a nonlinear medium.

Finally, we generalized our study of static local non-Gaussianities to fields that obey nonlinear diffusion equations that allow for some field mixing between different spatial locations. In contrast to local models, the imbalance between peaks and troughs proves a suitable measure of nonlocal non-Gaussian perturbations also in the limit of weak deviations from Gaussianity. The most obvious application of our dynamical analysis would be to analyze the distribution of peaks and valleys in surface growth experiments, as well as more detailed studies of the peak statistics in the temperature fluctuation field of the cosmic microwave background (12, 13), which is affected nonlocally by the mixing of sound waves (1).

## Materials and Methods

The probability distribution can be derived using a method similar to the one outlined in Longuet-Higgins (9). Consider a fixed point . We wish to know the probability density that at this point we have (to avoid confusion with the derivatives of , we shall write *H* from now on) given that it is a minimum. The conditions for this can be written in terms of derivatives of *H*, namely, defines a critical point, whereas and distinguishes a local minimum from a saddle or maximum. First, we determine the joint distribution of these six variables (*H* and its derivatives), which form a set of correlated Gaussian variables. The joint probability distribution for any such set is completely determined by the correlations between the variables

where is the matrix of correlations between the variables.

Correlations between *H* and its first and second derivatives can be expressed in terms of the first three moments () of its amplitude spectrum. By differentiating the Fourier expansion of *H* we find that , and likewise that the variances of the second derivatives are proportional to . The only variables among the six that are correlated to one another turn out to be *H*, , and , with and . After retrieving the probability distribution, we set , and integrate out , , and over the domain defining a minimum. The Jacobian determinant must be added (9, 30). The probability density thus calculated reflects the chance that and are close to zero at the point (there is a vanishing chance that they are exactly zero). To determine the distribution of extrema in the plane, we need the probability of the reverse situation—namely, that exactly at a point within a small range of . The ratio of the two probabilities is given by the Jacobian determinant. The final answer reads

where .

## Acknowledgments

We thank T. Lubensky, R.D. Kamien, B. Jain, A. Boyarsky, L. Mahadevan, B. Chen and W. van Saarloos for stimulating discussions. This work was supported by the Dutch Foundation for Fundamental Research on Matter and the European Research Council.

## Footnotes

- ↵
^{1}To whom correspondence should be addressed. E-mail: vitelli{at}lorentz.leidenuniv.nl.

Author contributions: A.M.T. and V.V. designed research; T.H.B., A.M.T., and V.V. performed research; and T.H.B., A.M.T., and V.V. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1212028109/-/DCSupplemental.

## References

- ↵
- Dodelson S

- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- Longuet-Higgins MS

- ↵
- ↵
- Longuet-Higgins MS

- ↵
- ↵
- ↵
- ↵
- ↵
- Milnor JW

- ↵
- ↵
- Vitelli V,
- Jain B,
- Kamien RD

- ↵
- Dennis MR,
- Land K

- ↵
- ↵
- ↵
- ↵
- ↵
- Dzyaloshinskii IE

- ↵
- Halperin BI

*Proceedings of Les Houches, Session XXXV NATO ASI*(North Holland, Amsterdam). - ↵
- ↵Turner AM, Beuman TH, Vitelli V (2012) Umbilical points of a non-Gaussian random field. arXiv:1211.0643.
- ↵
- Chaikin PM,
- Lubensky TC

- ↵
- ↵Beuman TH, Turner AM, Vitelli V (2012) Extrema statistics in the dynamics of a non-Gaussian random field. arXiv:1211:0993.
- ↵
- ↵Beuman TH, Turner AM, Vitelli V (2012) Critical points of a non-Gaussian random field. arXiv:1210.6871.

## Citation Manager Formats

## Sign up for Article Alerts

## Article Classifications

- Physical Sciences
- Physics