# Nonequilibrium candidate Monte Carlo is an efficient tool for equilibrium simulation

Edited by Michele Parrinello, ETH Zurich, Lugano, Switzerland, and approved August 30, 2011 (received for review April 21, 2011)

*This article has a correction.*

## Abstract

Metropolis Monte Carlo simulation is a powerful tool for studying the equilibrium properties of matter. In complex condensed-phase systems, however, it is difficult to design Monte Carlo moves with high acceptance probabilities that also rapidly sample uncorrelated configurations. Here, we introduce a new class of moves based on nonequilibrium dynamics: Candidate configurations are generated through a finite-time process in which a system is actively driven out of equilibrium, and accepted with criteria that preserve the equilibrium distribution. The acceptance rule is similar to the Metropolis acceptance probability, but related to the nonequilibrium work rather than the instantaneous energy difference. Our method is applicable to sampling from either a single thermodynamic state or a mixture of thermodynamic states, and allows both coordinates and thermodynamic parameters to be driven in nonequilibrium proposals. Whereas generating finite-time switching trajectories incurs an additional cost, driving some degrees of freedom while allowing others to evolve naturally can lead to large enhancements in acceptance probabilities, greatly reducing structural correlation times. Using nonequilibrium driven processes vastly expands the repertoire of useful Monte Carlo proposals in simulations of dense solvated systems.

In this paper, we describe a new technique for constructing efficient Markov chain Monte Carlo (MCMC) (1) moves that have high acceptance rates and allow rapid transit through configuration space, greatly enhancing convergence rates in simulations of dense solvated systems. The Metropolis Monte Carlo (MC) (2, 3) sampling procedure is generalized by using nonequilibrium processes to generate candidates for equilibrium simulations. Within this framework, moves that are efficient for an isolated part of a system but lead to near-universal rejection in standard Monte Carlo simulations of dense mixtures can be converted to nonequilibrium processes that generate candidates with higher acceptance probabilities. In this new procedure, the acceptance criterion is related to the nonequilibrium work, rather than the potential energy difference used in traditional Monte Carlo moves.

Since their introduction in the mid-twentieth century, MC (2, 3) and molecular dynamics (MD) (4) simulations have become favored tools for sampling from complex multidimensional distributions, such as configurations of microscopic physical systems in thermodynamic ensembles. However, these methods can produce highly correlated samples, leading to slow convergence of estimated expectations. Whereas MD requires the use of small timesteps for numerical stability and to approximate sampling from the desired distribution, MC simulations can, in principle, make use of nonlocal moves that accelerate mixing of the Markov chain. Indeed, vast improvements in efficiency have been obtained by applying cleverly constructed move sets that exploit physical intuition about the system under study, such as cluster moves in Potts and Ising model simulations (5, 6).

Designing efficient moves requires striking a balance between rapid traversal of phase space and ensuring reasonable acceptance probabilities. For complex heterogeneous systems such as solvated biomolecules, achieving this balance remains challenging. Typically, efficient moves exploit physical insight into kinetically slow processes and energetically favorable configurations. Often, the experimenter may possess physical insight about one component in the system (e.g., a biomolecule) that permits the design of moves that would be efficient in the absence of other components (e.g., solvent), but encounter energetically unfavorable interactions in their presence, reducing acceptance rates to levels where standard MC provides no benefit. As an illustrative example, consider a bistable dimer—a pair of particles interacting with a potential with minima in compact or extended configurations, separated by a high barrier (see Fig. 1). For simulations of this system in a vacuum, a simple and effective standard MC move is to instantaneously increase the interparticle distance from a compact to extended configuration (or conversely, to decrease the distance from an extended to compact configuration). When the dimer is immersed in a dense solvent, however, this move is met with near-universal rejection because solvent molecules overlap with proposed configurations.

One approach that can allow unperturbed degrees of freedom to relax, and hence maintain a reasonable acceptance rate, is to use a nonequilibrium process to generate candidate configurations. Using the appropriate acceptance criterion for the final configuration will preserve the equilibrium distribution. In the case of the bistable dimer immersed in dense solvent, the extension (or contraction) may be carried out over a finite number of increments interleaved with standard Metropolis Monte Carlo or molecular dynamics steps that allow the solvent to reorganize to avoid overlap with the dimer particles.

The basic idea of using nonequilibrium driven processes as Monte Carlo moves has precedents in both the statistical (7, 8) and chemical (9–12) literature. Among the latter, Athènes developed “work-bias Monte Carlo” to enhance acceptance rates in grand canonical Monte Carlo simulations (9), Stern presented a scheme to sample an equilibrium mixture of protonation states at constant pH in explicit solvent (11) [although an inexact variant was proposed earlier (10)], and Nilmeier et al. (12) proposed the driving of a subset of degrees of freedom to enhance acceptance rates (using an approximate acceptance criterion). Nonequilibrium processes have also been used to generate configurations for parallel tempering simulations (13–15).

Here, we unify these ideas and significantly extend the application of nonequilibrium moves in physical simulations. We present a theoretical framework, nonequilibrium candidate Monte Carlo (NCMC), that is applicable to both single thermodynamic states (e.g., NVT, NpT, μVT ensembles) as well as mixtures of thermodynamic states [e.g., expanded ensemble (16, 17) simulations]. Nonequilibrium proposals may drive a subset of degrees of freedom, the thermodynamic parameters characterizing the equilibrium distribution, or both, significantly expanding the repertoire of Monte Carlo moves that lead to high acceptance and efficient mixing in dense condensed-phase systems.

## Equilibrium and Expanded Thermodynamic Ensembles

For physical systems in equilibrium, the probability of observing a microstate is given by the Boltzmann distribution,

$$\pi_\lambda(x) = Z_\lambda^{-1}\, e^{-u_\lambda(x)}, \qquad Z_\lambda \equiv \int_\Gamma \mathrm{d}x\, e^{-u_\lambda(x)}, \tag{1}$$

where *x*∈Γ denotes a microstate of the system (which may include coordinates, momenta, and other dynamical variables, such as simulation box dimensions), *λ* denotes a set of thermodynamic parameters whose values define a thermodynamic state, and *Z*_{λ} is a normalizing constant known as the partition function.

The reduced potential *u*_{λ}(*x*) depends on the thermodynamic ensemble under consideration (18). For instance, in an isothermal-isobaric (*NpT*) ensemble, the reduced potential will assume the form,

$$u_\lambda(x) = \beta\left[H(x) + p\,V(x)\right], \tag{2}$$

which depends on the Hamiltonian *H*(*x*) (which may include an external biasing potential, and is presumed to be invariant under momentum inversion) and the system volume *V*(*x*). In this ensemble, the vector of controllable thermodynamic parameters *λ* ≡ {*β*,*H*,*p*} includes the inverse temperature *β*, the Hamiltonian *H*(*x*), and external pressure *p*. Other thermodynamic parameters and their conjugate variables can be included or excluded to generate alternative physical (or unphysical) ensembles.

To allow sampling from multiple thermodynamic states within a single simulation, we also define an expanded ensemble (16, 17), which specifies a joint distribution for (*x*,*λ*) in a weighted mixture of thermodynamic states,

$$\pi(x,\lambda) = \Xi^{-1}\, \omega_\lambda\, e^{-u_\lambda(x)}, \qquad \Xi \equiv \sum_{\lambda'} \omega_{\lambda'}\, Z_{\lambda'}, \tag{3}$$

where *ω*_{λ} > 0 specifies an externally imposed weight for state *λ*. Here, *λ* may assume values in a discrete or continuous space. If this set consists of a single value of *λ*, a single thermodynamic state is sampled, and *π*(*x*,*λ*) = *π*_{λ}(*x*). These thermodynamic states may correspond to a variety of different states of interest, such as temperatures in a simulated tempering simulation (19), alchemical states in a simulated scaling simulation (20), or protonation states in a constant-pH simulation (21).

## Nonequilibrium Candidate Monte Carlo

We first describe the general form of NCMC. At the start of an iteration, the current sample in the Markov chain, (*x*^{(n)},*λ*^{(n)}), which is assumed to be drawn from *π*(*x*,*λ*), is used to initialize a trajectory, (*x*_{0},*λ*_{0}) = (*x*^{(n)},*λ*^{(n)}). A candidate configuration (*x*_{T},*λ*_{T}) is then proposed through a nonequilibrium process in which a set of degrees of freedom and/or thermodynamic parameters may be driven according to some protocol (22) selected with a probability dependent only on (*x*_{0},*λ*_{0}). Even if we only wish to sample from a single thermodynamic state *λ*, we may use a protocol that transiently drives the thermodynamic parameters away from *λ* and back again (as in ref. 14). Finally, an acceptance probability is computed and used to decide whether the next sample in the Markov chain, (*x*^{(n+1)},*λ*^{(n+1)}), is the candidate, (*x*_{T},*λ*_{T}), or the momentum reversal of the initial sample.

An NCMC move begins by selecting a protocol Λ from a set of possible protocols with probability *P*(Λ|*x*_{0},*λ*_{0}), such that there exists a corresponding reverse protocol Λ̃ (to be defined momentarily) that can be selected from the proposal’s final state with nonzero probability. A protocol Λ specifies a series of *T* perturbation kernels *α*_{t}(*x*,*y*) and propagation kernels *K*_{t}(*x*,*y*), arranged in an alternating pattern such that Λ ≡ {*α*_{1},*K*_{1},*α*_{2},*K*_{2},…,*α*_{T},*K*_{T}}. Both *α*_{t}(*x*,*y*) and *K*_{t}(*x*,*y*) are conditional probabilities of *y*∈Γ given any *x*∈Γ, and must satisfy the requirement that if *p*(*x*,*y*) > 0, then *p*(*y*,*x*) > 0, for *p* substituted by *α*_{t} and *K*_{t}.

Each perturbation kernel *α*_{t} drives some or all of the degrees of freedom *x* in a stochastic or deterministic way (e.g., by driving a torsion angle, a distance between two atoms, or the volume of the simulation cell). Similarly, each propagation kernel *K*_{t} propagates some or all of the coordinates of the system at fixed *λ*_{t} according to some form of MCMC or MD [e.g., Metropolis Monte Carlo (2, 3), velocity Verlet (23) deterministic dynamics, or overdamped Langevin stochastic dynamics (24, 25)] that may also depend on the time index *t*. Interleaving perturbation and propagation allows for energetically unfavorable interactions introduced by perturbation to be relaxed during propagation, potentially increasing acceptance rates relative to the instantaneous perturbations of standard Metropolis Monte Carlo.

The procedure by which a trajectory *X* ≡ (*x*_{0},*x*_{1},…,*x*_{T}) is generated from an initial microstate *x*_{0} according to a protocol Λ can be illustrated by the scheme,

$$x_0 \xrightarrow{\;\alpha_1\;} x_1' \xrightarrow{\;K_1\;} x_1 \xrightarrow{\;\alpha_2\;} x_2' \xrightarrow{\;K_2\;} x_2 \;\cdots\; x_{T-1} \xrightarrow{\;\alpha_T\;} x_T' \xrightarrow{\;K_T\;} x_T. \tag{4}$$

Application of the perturbation *α*_{t} to *x*_{t-1} generates a perturbed configuration *x*′_{t}, which is then propagated by the kernel *K*_{t} to obtain *x*_{t}.
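The alternating perturbation/propagation scheme above can be sketched in a few lines of Python. This is a minimal illustration with hypothetical kernel callables, not the authors' implementation; the kernels here act on a bare scalar coordinate for clarity:

```python
def ncmc_trajectory(x0, perturb_kernels, propagate_kernels):
    """Generate x_0 -> x_T by alternating perturbation (alpha_t) and
    propagation (K_t) kernels, following the scheme of Eq. 4."""
    assert len(perturb_kernels) == len(propagate_kernels)
    x, trajectory = x0, [x0]
    for alpha_t, K_t in zip(perturb_kernels, propagate_kernels):
        x_perturbed = alpha_t(x)   # drive some degrees of freedom
        x = K_t(x_perturbed)       # let the others relax at fixed lambda_t
        trajectory.append(x)
    return trajectory

# toy usage: drive a scalar coordinate in T equal increments, no relaxation
T = 4
alphas = [lambda x: x + 1.0 / T] * T
Ks = [lambda x: x] * T
traj = ncmc_trajectory(0.0, alphas, Ks)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

In a molecular simulation, `alpha_t` would instead rotate a torsion or stretch a bond, and `K_t` would run a few steps of MD or MCMC on the remaining degrees of freedom.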

The reverse protocol Λ̃ ≡ {K̃_{T},α̃_{T},…,K̃_{1},α̃_{1}} (with α̃_{t} and K̃_{t} denoting the reversals of the corresponding forward kernels) reverses the order in which the perturbation and propagation steps are applied, generating the time-reversed trajectory X̃ ≡ (x̄_{T},x̄_{T-1},…,x̄_{0}), where x̄ denotes *x* with inverted momenta,

$$\bar{x}_T \xrightarrow{\;\tilde{K}_T\;} \bar{x}_T' \xrightarrow{\;\tilde{\alpha}_T\;} \bar{x}_{T-1} \xrightarrow{\;\tilde{K}_{T-1}\;} \cdots \xrightarrow{\;\tilde{\alpha}_2\;} \bar{x}_1 \xrightarrow{\;\tilde{K}_1\;} \bar{x}_1' \xrightarrow{\;\tilde{\alpha}_1\;} \bar{x}_0. \tag{5}$$

The next step in NCMC is to accept or reject (*x*_{T},*λ*_{T}) as the next sample in the chain. To ensure that the stationary distribution *π*(*x*,*λ*) is preserved, we enforce a strict pathwise form of detailed balance*,

$$\pi(x_0,\lambda_0)\, P(\Lambda\,|\,x_0,\lambda_0)\, \Pi(X\,|\,x_0,\Lambda)\, A(X\,|\,\Lambda) = \pi(\bar{x}_T,\lambda_T)\, P(\tilde{\Lambda}\,|\,\bar{x}_T,\lambda_T)\, \Pi(\tilde{X}\,|\,\bar{x}_T,\tilde{\Lambda})\, A(\tilde{X}\,|\,\tilde{\Lambda}). \tag{6}$$

The quantity *A*(*X*|Λ) is the NCMC acceptance probability, whereas Π(*X*|*x*_{0},Λ) and Π(X̃|x̄_{T},Λ̃) denote the probability of generating trajectory *X* from initial configuration *x*_{0} using protocol Λ, or the time-reversed trajectory X̃ from final configuration x̄_{T} with protocol Λ̃, respectively,

$$\Pi(X\,|\,x_0,\Lambda) = \prod_{t=1}^{T} \alpha_t(x_{t-1}, x_t')\, K_t(x_t', x_t), \tag{7}$$

$$\Pi(\tilde{X}\,|\,\bar{x}_T,\tilde{\Lambda}) = \prod_{t=1}^{T} \tilde{K}_t(\bar{x}_t, \bar{x}_t')\, \tilde{\alpha}_t(\bar{x}_t', \bar{x}_{t-1}). \tag{8}$$

Summation of Eq. **6** over all trajectories starting with *x*_{0} and ending with *x*_{T} recovers the standard detailed balance condition (see *Appendix* for proof).

We define the ratio of perturbation kernels as

$$\rho_\alpha(X\,|\,\Lambda) \equiv \prod_{t=1}^{T} \frac{\tilde{\alpha}_t(\bar{x}_t', \bar{x}_{t-1})}{\alpha_t(x_{t-1}, x_t')}, \tag{9}$$

and the ratio of propagation kernels as the exponentiated difference of the forward and backward conditional path actions,

$$\rho_K(X\,|\,\Lambda) \equiv \prod_{t=1}^{T} \frac{\tilde{K}_t(\bar{x}_t, \bar{x}_t')}{K_t(x_t', x_t)} = e^{\,\mathcal{S}(X|\Lambda) - \tilde{\mathcal{S}}(\tilde{X}|\tilde{\Lambda})}, \qquad \mathcal{S}(X\,|\,\Lambda) \equiv -\sum_{t=1}^{T} \ln K_t(x_t', x_t), \tag{10}$$

with the backward action defined analogously over the reverse kernels. Using the above expressions and the momentum invariance property *u*_{λ}(x̄) = *u*_{λ}(*x*), we may write the ratio of acceptance probabilities as

$$\frac{A(X\,|\,\Lambda)}{A(\tilde{X}\,|\,\tilde{\Lambda})} = \frac{\omega_{\lambda_T}}{\omega_{\lambda_0}}\, \frac{P(\tilde{\Lambda}\,|\,\bar{x}_T,\lambda_T)}{P(\Lambda\,|\,x_0,\lambda_0)}\, \rho_\alpha(X\,|\,\Lambda)\, \rho_K(X\,|\,\Lambda)\, e^{-\Delta u(X\,|\,\Lambda)}, \tag{11}$$

where Δ*u*(*X*|Λ) ≡ *u*_{T}(*x*_{T}) - *u*_{0}(*x*_{0}) is the energy difference. Eq. **11** is the main result of this paper, and is highly general with regard to the choice of protocols for both perturbation and propagation. In subsequent sections, we discuss specific choices for these protocols that lead to particularly simple acceptance criteria.

Many choices of acceptance probabilities *A*(*X*|Λ) that satisfy Eq. **11** are possible, including the well-known Metropolis–Hastings criterion (2, 3),

$$A(X\,|\,\Lambda) = \min\left\{1,\; \frac{\omega_{\lambda_T}}{\omega_{\lambda_0}}\, \frac{P(\tilde{\Lambda}\,|\,\bar{x}_T,\lambda_T)}{P(\Lambda\,|\,x_0,\lambda_0)}\, \prod_{t=1}^{T}\frac{\tilde{\alpha}_t(\bar{x}_t',\bar{x}_{t-1})\,\tilde{K}_t(\bar{x}_t,\bar{x}_t')}{\alpha_t(x_{t-1},x_t')\,K_t(x_t',x_t)}\; e^{-\Delta u(X\,|\,\Lambda)}\right\}. \tag{12}$$

After generating (*x*_{T},*λ*_{T}) and evaluating *A*(*X*|Λ), we generate a uniform random variate *U* on [0,1). If *A*(*X*|Λ) > *U*, then the candidate becomes the next value in the chain, (*x*^{(n+1)},*λ*^{(n+1)}) = (*x*_{T},*λ*_{T}). Otherwise, it is rejected, we perform a momentum flip, and the next value becomes (x̄_{0},*λ*_{0}). Alternatively, we may perform a momentum flip upon acceptance, and preserve the momentum upon rejection, (*x*^{(n+1)},*λ*^{(n+1)}) = (*x*_{0},*λ*_{0}). We cannot, however, ignore the momentum flip completely; as explained in *Appendix*, this flip is necessary to preserve the equilibrium distribution.
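In code, the accept/reject step with the required momentum reversal on rejection might look like the following sketch. The state layout (positions, momenta) and the function name are illustrative, not taken from the paper:

```python
import math
import random

def ncmc_accept_step(initial, candidate, log_acceptance_ratio, rng):
    """Metropolis-style test: accept the candidate with probability
    min(1, exp(log_acceptance_ratio)); on rejection, return the initial
    state with its momenta reversed, which preserves pi(x, lambda)."""
    positions, momenta = initial
    if rng.random() < math.exp(min(0.0, log_acceptance_ratio)):
        return candidate
    return (positions, [-p for p in momenta])  # momentum flip on rejection

rng = random.Random(0)
state = ([1.0, 2.0], [0.5, -0.5])
cand = ([1.1, 2.1], [0.4, -0.6])
accepted = ncmc_accept_step(state, cand, 10.0, rng)   # ratio >> 1: accept
rejected = ncmc_accept_step(state, cand, -1e6, rng)   # ratio ~ 0: reject
```

Clamping the exponent at zero before exponentiating avoids overflow for strongly favorable proposals; very unfavorable proposals simply underflow to an acceptance probability of zero.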

We note that NCMC need not be used exclusively to sample from *π*(*x*,*λ*), but can be mixed with other MCMC moves or with MD (1). For example, one may reinitialize velocities from the Maxwell-Boltzmann distribution after each NCMC step; this is a Gibbs sampling MCMC move using the marginal distribution for velocities.

## Perturbation Kernels

A large variety of choices are available for the perturbation kernels *α*_{t}(*x*,*y*). Through judicious selection of these kernels, a practitioner can design nonequilibrium proposals that carry some component of the system from one high-probability region to another with high acceptance rates. We briefly describe a few possibilities.

### Stochastically Driven Degrees of Freedom.

Suppose we wish to drive a torsion angle *ϕ* (an angle subtended by four bonded atoms) stochastically by rotating it to a new torsion angle *ϕ*^{′} (holding angles and bonds fixed) according to some probability, such as the von Mises circular distribution centered on *ϕ*,

$$p(\phi' \,|\, \phi) = \frac{e^{\kappa \cos(\phi' - \phi)}}{2\pi\, I_0(\kappa)}, \tag{13}$$

with *I*_{0}(*κ*) denoting the modified Bessel function of order zero and *κ* > 0 taking the role of a dimensionless force constant. Because the stochastic perturbation is made in a non-Cartesian coordinate, a Jacobian *J*(*ϕ*) must be included to compute *α*(*x*,*y*) in Cartesian coordinates, resulting in the ratio,

$$\frac{\alpha(y,x)}{\alpha(x,y)} = \frac{p(\phi \,|\, \phi')\, J(\phi)}{p(\phi' \,|\, \phi)\, J(\phi')} = 1, \tag{14}$$

where the von Mises density is symmetric, *p*(*ϕ*|*ϕ*^{′}) = *p*(*ϕ*^{′}|*ϕ*), and *J*(*ϕ*^{′}) = *J*(*ϕ*) = 1 because the transformation (a rotation about a bond vector) preserves the Cartesian phase space volume.
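Because the von Mises density of Eq. **13** is symmetric in *ϕ* and *ϕ*′ and the Jacobian is unity, the proposal-ratio contribution to the acceptance probability vanishes. A short Python check, using the standard library's von Mises sampler (the value of *κ* is illustrative):

```python
import math
import random

KAPPA = 5.0  # dimensionless "force constant" (illustrative value)

def log_proposal_ratio(phi_old, phi_new, kappa=KAPPA):
    """log[p(phi_old|phi_new) / p(phi_new|phi_old)] for the von Mises
    kernel of Eq. 13; the normalization 2*pi*I0(kappa) cancels in the ratio."""
    return kappa * (math.cos(phi_old - phi_new) - math.cos(phi_new - phi_old))

rng = random.Random(1)
phi = 0.3
phi_new = rng.vonmisesvariate(phi, KAPPA)  # stdlib von Mises sampler
ratio = log_proposal_ratio(phi, phi_new)   # zero: symmetric proposal
```

Working with the log ratio means the Bessel-function normalization never needs to be evaluated at all.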

### Deterministically Driven Degrees of Freedom.

Instead of perturbing the torsion angle stochastically, we can deterministically drive it in small, fixed increments. In this case, we effectively define an invertible map *M*_{t} that takes *x* → *y* = *M*_{t}(*x*), such that *y* differs from *x* only in the rotation of the specified torsion *ϕ*. To implement this, we may choose a perturbation Δ*ϕ* from a distribution where ± Δ*ϕ* have equal probability, and drive *ϕ*(*x*) from its current value *ϕ*_{0} to a final value *ϕ*_{T} = *ϕ*_{0} + Δ*ϕ* over *T* steps in equal increments, such that *ϕ*(*x*_{t}) is constrained to *ϕ*_{t} ≡ (1 - *t*/*T*)*ϕ*_{0} + (*t*/*T*)*ϕ*_{T}. In this case, *α*_{t}(*x*,*y*) = *δ*(*y* - *M*_{t}(*x*)), where the Jacobian *J*(*x*), the factor by which Cartesian phase space is compressed on application of the map *M*_{t}, is again unity for rotation about a torsion angle, and, due to the invertibility of the map, the ratio of reverse to forward perturbation kernels is unity.

### Simulation Box Scaling.

Another possible deterministic perturbation kernel is simulation box scaling. A barostat can be implemented by proposing perturbation kernels that scale the molecular centers and box geometry by a factor *s* = [(*V*(*x*) + Δ*V*)/*V*(*x*)]^{1/3}, with Δ*V* chosen uniformly from [-Δ*V*_{0}, +Δ*V*_{0}], applied as a factor of *s*^{1/T} at each of the *T* steps. In this case, the perturbation kernel *α*_{t}(*x*,*y*) is a delta function that compresses or expands the molecular centers and box geometry. Because the Jacobian is the ratio of infinitesimal volumes upon scaling, the overall ratio of perturbation kernels is [(*V*(*x*) + Δ*V*)/*V*(*x*)]^{N}, where *N* denotes the number of molecular centers.
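The bookkeeping for this move can be sketched as follows. The uniform volume proposal and all names are illustrative assumptions, not the paper's implementation:

```python
import math
import random

def box_scaling_move(volume, dV_max, n_molecules, n_steps, rng):
    """Draw a volume change dV uniformly, convert it to the per-step linear
    scale factor s**(1/T), and compute the log Jacobian N*ln[(V+dV)/V]
    accumulated over the T scaling increments (sketch)."""
    dV = rng.uniform(-dV_max, dV_max)
    s = ((volume + dV) / volume) ** (1.0 / 3.0)   # total linear scale factor
    s_step = s ** (1.0 / n_steps)                 # factor applied each step
    log_jacobian = n_molecules * math.log((volume + dV) / volume)
    return dV, s_step, log_jacobian

rng = random.Random(3)
dV, s_step, log_jac = box_scaling_move(1000.0, 50.0, 216, 8, rng)
# after T steps the box volume is V * (s_step**3)**T = V + dV
```

The log Jacobian enters the acceptance probability directly, so there is no need to form the (potentially huge) ratio of phase-space volumes itself.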

### Thermodynamic Perturbation.

In many driven nonequilibrium processes, there is no direct perturbation to the coordinates, such that *α*_{t}(*x*,*y*) = *δ*(*x* - *y*) and the ratio of reverse to forward perturbation kernels is unity. Instead, only the thermodynamic parameters *λ* are varied in time, carrying the system out of equilibrium through action of the *K*_{t} propagation kernels. We recover Neal’s method (7) if the reduced potential *u*_{t} is a simple linear interpolation such that *u*_{t}(*x*) = (1 - *t*/*T*)*u*_{0}(*x*) + (*t*/*T*)*u*_{T}(*x*), the probability of choosing protocol Λ is symmetric between forward and reverse protocols, and MC (2, 3) is used for the propagation kernel *K*_{t}.

## Propagation Kernels

The choice of propagation kernels available is also very broad. If strong driving is performed in selection of *α*, one may elect to choose a time-independent propagation kernel *K*_{t}(*x*,*y*) ≡ *K*(*x*,*y*) that samples from a stationary distribution *π*(*x*) of interest. Alternatively, a strongly time-dependent *K*_{t} could be selected to transiently drive the system out of equilibrium, or from the equilibrium distribution at one thermodynamic state to another. Some possible choices are described below.

### Reversible Markov Chain Monte Carlo.

One may propagate some or all of a system’s degrees of freedom (e.g., those not affected by the perturbation kernel *α*_{t}) by a method that satisfies detailed balance in *π*_{t},

$$\pi_t(x)\, K_t(x,y) = \pi_t(y)\, K_t(\bar{y},\bar{x}), \tag{15}$$

where *π*_{t}(*x*) ∝ *e*^{-u_t(x)} for a specified *u*_{t}(*x*). Many MCMC methods (1), including Metropolis (2, 3) and various hybrid Monte Carlo (HMC) algorithms that combine discrete-timestep integrators with Monte Carlo acceptance/rejection steps (27, 28), obey detailed balance and are commonly used for the simulation of physical systems.

By analogy with Crooks (29), we define a work *w* and heat *q* for the nonequilibrium driven process,

$$w(X\,|\,\Lambda) \equiv \sum_{t=1}^{T} \left[ u_t(x_t') - u_{t-1}(x_{t-1}) \right], \tag{16}$$

$$q(X\,|\,\Lambda) \equiv \sum_{t=1}^{T} \left[ u_t(x_t) - u_t(x_t') \right], \tag{17}$$

such that *w*(*X*|Λ) + *q*(*X*|Λ) = Δ*u*(*X*|Λ), a restatement of the first law of thermodynamics.
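The first-law bookkeeping of Eqs. **16**–**17** can be checked numerically for a purely thermodynamic perturbation. The harmonic reduced potential and the Gaussian "propagation" kernel below are illustrative toys, not the paper's system; the point is only that the two running sums telescope to the total reduced-potential difference:

```python
import random

def u(lam, x):
    # reduced potential: a harmonic well whose center is driven by lambda
    return 0.5 * (x - lam) ** 2

def driven_work_and_heat(x0, n_steps, rng):
    """Accumulate work w (lambda updates at fixed x) and heat q (x updates
    at fixed lambda) along a driven trajectory, per Eqs. 16-17."""
    x, w, q = x0, 0.0, 0.0
    lam = 0.0
    for t in range(1, n_steps + 1):
        lam_new = t / n_steps              # perturbation: switch lambda
        w += u(lam_new, x) - u(lam, x)
        x_new = x + rng.gauss(0.0, 0.3)    # propagation: move x (toy kernel)
        q += u(lam_new, x_new) - u(lam_new, x)
        x, lam = x_new, lam_new
    return x, w, q

rng = random.Random(42)
xT, w, q = driven_work_and_heat(0.0, 10, rng)
delta_u = u(1.0, xT) - u(0.0, 0.0)  # w + q recovers this total difference
```

Whatever the propagation kernel does, the identity w + q = Δu holds by construction, which is why the work alone carries the nonequilibrium content of the acceptance criterion.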

The conditional path action difference can then be written in terms of the heat of the process, *q*(*X*|Λ),

$$\sum_{t=1}^{T} \ln \frac{\tilde{K}_t(\bar{x}_t,\bar{x}_t')}{K_t(x_t',x_t)} = q(X\,|\,\Lambda), \tag{18}$$

leading to an acceptance probability similar to standard MC, except that the work, *w*(*X*|Λ), replaces the instantaneous potential energy difference,

$$A(X\,|\,\Lambda) = \min\left\{1,\; \frac{\omega_{\lambda_T}}{\omega_{\lambda_0}}\, \frac{P(\tilde{\Lambda}\,|\,\bar{x}_T,\lambda_T)}{P(\Lambda\,|\,x_0,\lambda_0)}\, \prod_{t=1}^{T}\frac{\tilde{\alpha}_t(\bar{x}_t',\bar{x}_{t-1})}{\alpha_t(x_{t-1},x_t')}\; e^{-w(X\,|\,\Lambda)}\right\}. \tag{19}$$

### Deterministic Dynamics.

When an isolated system is propagated by a symplectic integrator—a reversible, deterministic integrator that preserves phase space volume—the propagation kernels satisfy K̃_{t}(x̄_{t},x̄′_{t}) = *K*_{t}(*x*′_{t},*x*_{t}). Hence, the ratio of propagation kernels is unity, and the acceptance probability becomes

$$A(X\,|\,\Lambda) = \min\left\{1,\; \frac{\omega_{\lambda_T}}{\omega_{\lambda_0}}\, \frac{P(\tilde{\Lambda}\,|\,\bar{x}_T,\lambda_T)}{P(\Lambda\,|\,x_0,\lambda_0)}\, \prod_{t=1}^{T}\frac{\tilde{\alpha}_t(\bar{x}_t',\bar{x}_{t-1})}{\alpha_t(x_{t-1},x_t')}\; e^{-\Delta u(X\,|\,\Lambda)}\right\}, \tag{20}$$

where Δ*u*(*X*|Λ) ≡ *u*_{T}(*x*_{T}) - *u*_{0}(*x*_{0}) is the energy difference. The equivalence of the work and energy difference for volume-preserving integrators was previously recognized in the context of fluctuation theorem calculations (30, 31).

Symplectic integrators include velocity Verlet (23). These integrators are also symplectic when utilizing constraints through the application of algorithms such as RATTLE (32), provided that the constraints are iterated to convergence each timestep (33).

### Stochastic Dynamics.

Stochastic integrators such as velocity Verlet discretizations of Langevin dynamics (34, 35) sample a modified distribution that differs from the desired equilibrium distribution *π*_{t} in a timestep-dependent manner (36). Whereas this modified distribution may be difficult or impossible to compute to recover equilibrium properties by reweighting, computation of the relative action is relatively straightforward, and the NCMC acceptance criterion ensures that the NCMC-sampled configurations are distributed according to the desired equilibrium ensemble^{†}. As examples, we compute the ratio of propagation kernels for the overdamped Langevin (Brownian) dynamics integrator of Ermak and Yeh (24, 25) and the Brünger–Brooks–Karplus (BBK) Langevin integrator (38–40) in *Appendix*.

## Illustrative Application: Bistable Dimer in a WCA Fluid

To demonstrate NCMC, we ran simulations of a bistable dimer (adapted from section 1.3.2.4 of ref. 36) in vacuum as well as a dense fluid. The dimer consists of a pair of “bonded” particles interacting via a double-well potential, with minima at *r* = *r*_{0} (compact) and *r* = 2*r*_{0} (extended), and a 5*k*_{B}*T* barrier (see Fig. 1). In the solvated simulations, the dimer was immersed in a dense bath (reduced density *ρσ*^{3} = 0.96) of particles that interact with the bonded particles and each other via the Weeks–Chandler–Andersen (WCA) soft repulsive potential (41). Each simulation “iteration” consisted of velocity reassignment from the Maxwell–Boltzmann distribution, 500 steps of generalized hybrid Monte Carlo (GHMC) dynamics (1, 28, 36, 37) (essentially, a Metropolis-corrected form of Langevin dynamics, henceforth referred to here as MD), optionally followed by either an instantaneous MC move or an NCMC move.

The rate at which effectively uncorrelated samples are generated can be quantified in terms of the correlation time *τ* for the dimer extension *r*(*t*) (shown in Fig. 2). This time represents the asymptotic decay time constant for the normalized fluctuation autocorrelation function *C*(*t*), which will behave like

$$C(t) \equiv \frac{\left\langle \delta r(0)\, \delta r(t) \right\rangle}{\left\langle \delta r^2 \right\rangle} \sim e^{-t/\tau} \tag{21}$$

for large *t*, where *δr*(*t*) ≡ *r*(*t*) - ⟨*r*⟩ and ⟨·⟩ denotes an equilibrium average. The correlation time is related to the statistical inefficiency, *g* = 1 + 2*τ*, a factor that describes the number of iterations necessary to generate an effectively uncorrelated sample (42).
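As a concrete illustration of the statistical inefficiency *g* = 1 + 2*τ*, the following simple estimator truncates the autocorrelation sum at its first nonpositive value. This is a common heuristic sketch, not the estimator of ref. 42 used in the paper:

```python
import random

def statistical_inefficiency(series):
    """Estimate g = 1 + 2*tau, where tau is the integrated autocorrelation
    time of the series, truncating at the first nonpositive C(t)."""
    n = len(series)
    mean = sum(series) / n
    dev = [s - mean for s in series]
    var = sum(d * d for d in dev) / n
    tau = 0.0
    for t in range(1, n // 2):
        c = sum(dev[i] * dev[i + t] for i in range(n - t)) / ((n - t) * var)
        if c <= 0.0:
            break
        tau += c
    return 1.0 + 2.0 * tau

rng = random.Random(7)
iid = [rng.random() for _ in range(2000)]
g = statistical_inefficiency(iid)  # near 1 for uncorrelated samples
```

Dividing the number of iterations by *g* gives the effective number of uncorrelated samples, which is the quantity that actually controls the statistical error of estimated averages.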

For the MD simulation in vacuum (Fig. 2, top trace), we observe slow hopping between compact and extended configurations with a correlation time *τ* = 59.2 iterations, resulting in slow convergence of the histogram. Introducing instantaneous MC dimer extension/contraction moves that modify the dimer extension by Δ*r* = ± *r*_{0} reduces the correlation time to *τ* ≈ 0.0, such that an uncorrelated configuration is generated after each iteration of 500 MD steps and one instantaneous MC step (Fig. 2, second trace from top).

When the dimer is immersed in a dense fluid of WCA particles, however, iterations consisting of 500 MD steps alone result in extremely slow barrier crossings, requiring g ≈ 600 iterations to produce an uncorrelated sample (Fig. 2, middle trace). Unlike in vacuum, the introduction of instantaneous MC moves does not significantly reduce the correlation time *τ* (Fig. 2, second trace from bottom). However, performing these same dimer expansion and contraction moves over 2,048-step NCMC moves (Fig. 2, bottom trace) allows the system to rapidly mix between both compact and extended states with a correlation time of *τ* = 4.0 iterations. Whereas each iteration requires a fivefold increase in computational effort (500 MD steps + 2,048 NCMC switching steps = 2,548 force evaluations, versus 500 force evaluations for MD alone), a 67-fold reduction in correlation time is achieved, resulting in a remarkable order-of-magnitude gain in overall efficiency.

The length of the NCMC switching process is a free parameter that may be tuned to further improve efficiency. Toward this end, we estimated the acceptance probability of the extension/contraction moves in dense solvent as a function of switching length (Fig. 3). Whereas instantaneous MC proposals of ± *r*_{0} are only accepted with probability ≈10^{-27} (the error in this quantity is likely underestimated due to its extremity), dividing the move into smaller steps boosts the acceptance rate to a level useful in condensed-phase simulation. If the move is divided into a small number of steps (1–8), there is little to no increase in acceptance rate, but for an intermediate number of steps (16–1,024), there is a superlinear boost in the acceptance probability relative to the length of the switching process. The acceptance probability finally reaches useful levels around 2,000 steps, achieving an acceptance rate of 12% using nonequilibrium proposal trajectories of 2,048 steps, or 38% for 8,192 steps.

In general, there is no direct relationship between acceptance probability and efficiency. Under certain assumptions relevant to the bistable dimer, however, it is possible to link the NCMC acceptance probability to *τ*_{eff}, an indirect estimate of the correlation time,

$$\tau_{\mathrm{eff}}^{-1} = \tau_{\mathrm{MD}}^{-1} + \tau_{\mathrm{NCMC}}^{-1}, \tag{22}$$

where *τ*_{MD} and *τ*_{NCMC} are correlation times for iterations consisting solely of MD or NCMC moves, respectively. The latter correlation time may be estimated from the average acceptance probability *γ* by *τ*_{NCMC} ≈ -1/ ln(1 - 2*γ*) (see *Appendix* for derivation).
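The relation *τ*_{NCMC} ≈ -1/ln(1 - 2*γ*) is easy to evaluate numerically; the acceptance values used below are illustrative:

```python
import math

def tau_ncmc(gamma):
    """Correlation time (in iterations) for NCMC-only moves from the mean
    acceptance probability gamma, via tau = -1/ln(1 - 2*gamma), gamma < 1/2."""
    return -1.0 / math.log(1.0 - 2.0 * gamma)

tau_12 = tau_ncmc(0.12)  # ~12% acceptance: a few iterations per uncorrelated sample
tau_01 = tau_ncmc(0.01)  # 1% acceptance: dozens of iterations
```

The formula makes the qualitative behavior explicit: for small *γ* it reduces to *τ*_{NCMC} ≈ 1/(2*γ*), so halving an already-small acceptance probability roughly doubles the correlation time.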

As shown in Fig. 4, the effective correlation time *τ*_{eff} is only diminished when the NCMC acceptance probability is large enough such that *τ*_{NCMC} ≈ *τ*_{MD}, which occurs when *γ*≥0.13% (about 256 switching steps or more). For shorter switching times, even though the acceptance probability is high relative to instantaneous MC, it is still too small to significantly reduce *τ*_{eff}.

When comparing efficiency, we are most interested in the rate at which uncorrelated configurations are generated for a given amount of computational effort. Relative to MD alone, this rate is described by the efficiency gain,

$$\text{efficiency gain} = \frac{\tau_{\mathrm{MD}}\, T_{\mathrm{MD}}}{\tau_{\mathrm{eff}}\, \left(T_{\mathrm{MD}} + T_{\mathrm{NCMC}}\right)}. \tag{23}$$

Here, *T*_{MD} = 500 steps per iteration, and *T*_{NCMC} is again varied from 1–8,192 steps. The results are shown in the bottom panel of Fig. 4. Surprisingly, there is actually a slight loss in efficiency at short switching times—dropping to a minimum of 86.9% of the efficiency of MD alone at 128 steps—followed by a rapid gain, plateauing at approximately 13× the efficiency of MD alone for 2,048–4,096-step NCMC proposals. (A similar plateau behavior is observed in the modified parallel tempering protocol of ref. 15.) Beyond this point, longer switching times do not achieve as high an efficiency gain; even though the acceptance rate continues to increase as the number of NCMC switching steps is doubled again to 8,192, the reduction in correlation time is not sufficient to offset the additional cost of these moves.

## Epilogue

We have described a procedure—nonequilibrium candidate Monte Carlo (NCMC)—by which nonequilibrium proposals can be used within MCMC simulations to enhance acceptance rates. In our illustration, we have demonstrated how its use can lead to large improvements in statistical efficiency—the rate at which uncorrelated samples are generated for a fixed amount of computational effort. In other applications, whether similarly large efficiency gains are achieved will depend on the precise nature of the system under study and the nonequilibrium proposals introduced. The most straightforward approach—borrowing Metropolis Monte Carlo proposals that are reasonable for one component of the system in isolation, and converting them to nonequilibrium proposals—is likely to be a fruitful avenue for generating efficient schemes, as was demonstrated here for a simple system.

More generally, the problem of selecting efficient nonequilibrium proposals is similar to the problem of choosing good reaction coordinates, in that it is desirable to drive the system along (possibly complex) slow collective coordinates where orthogonal degrees of freedom relax quickly. The search for such collective coordinates is a topic of active research (43–49). Given an initial nonequilibrium protocol, the issue of optimizing such a protocol to minimize dissipation (and maximize acceptance) is also a topic of active study, led by forays into the world of single-molecule measurement (50–52). Recent work has also suggested that estimating the thermodynamic metric tensor along the nonequilibrium parameter switching path (53–56) could prove useful in adaptively optimizing the switching protocol (57).

We note that switching trajectories contain potentially useful information. Indeed, several methods (56, 58, 59) have recently been developed to estimate equilibrium properties from nonequilibrium samples through the application of statistical estimator theory to nonequilibrium fluctuation theorems (30, 60, 61); these are particularly relevant to switching between multiple thermodynamic states. Though the development of efficient estimators that utilize both nonequilibrium switching trials and sampled equilibrium data generated in NCMC simulations remains an open challenge, it is at least straightforward to incorporate information from rejected NCMC proposals in the estimation of equilibrium averages (26, 62).

## Materials and Methods

The dimer system considered here consists of two particles that interact via a double-well bonded potential in the interatomic distance *r*,

$$U_{\mathrm{dimer}}(r) = h\left[1 - \frac{(r - r_0 - s)^2}{s^2}\right]^2, \tag{24}$$

with *h* = 5*k*_{B}*T*, *r*_{0} = *r*_{WCA}, and *s* = *r*_{WCA}/2, where *r*_{WCA} ≡ 2^{1/6}*σ*. Simulations denoted as “vacuum” contain only these two particles, whereas simulations denoted as “solvated” also include a dense bath of particles that interact with the bonded particles and each other via the WCA nonbonded potential (41),

$$U_{\mathrm{WCA}}(r) = \begin{cases} 4\epsilon\left[\left(\dfrac{\sigma}{r}\right)^{12} - \left(\dfrac{\sigma}{r}\right)^{6}\right] + \epsilon, & r \le r_{\mathrm{WCA}} \\[2pt] 0, & r > r_{\mathrm{WCA}}, \end{cases} \tag{25}$$

with mass *m* = 39.9 amu, *σ* = 3.4 Å, and *ϵ*/*k*_{B} = 120 K. The nonbonded WCA interaction is excluded between the two bonded particles. The solvated system contains a total of 216 WCA particles at a reduced density of *ρσ*^{3} = 0.96. For all simulations, the reduced temperature is *k*_{B}*T*/*ϵ* = 0.824. A custom Python code making use of the graphics processing unit (GPU)-accelerated OpenMM package (63–65) and the PyOpenMM Python wrapper (66) was used to conduct the simulations. All scripts are available for download from http://simtk.org/home/ncmc.
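For concreteness, the double-well and WCA potentials can be written out in reduced units (*σ* = 1, energies in units of *k*_{B}*T*). This is an illustrative sketch of the functional forms in Eqs. **24**–**25**, not the paper's OpenMM implementation:

```python
R_WCA = 2.0 ** (1.0 / 6.0)   # in units of sigma
R0 = R_WCA                   # compact minimum of the dimer potential
S = R_WCA / 2.0              # half-separation of the two minima
H = 5.0                      # barrier height, in units of kB*T

def u_dimer(r):
    """Double-well bonded potential (Eq. 24): minima at r = r0 and
    r = 2*r0, with a barrier of height h at r = 1.5*r0."""
    z = (r - R0 - S) / S
    return H * (1.0 - z * z) ** 2

def u_wca(r, eps=1.0, sigma=1.0):
    """WCA soft repulsive potential (Eq. 25): Lennard-Jones truncated at
    its minimum 2**(1/6)*sigma and shifted so it vanishes there."""
    if r >= 2.0 ** (1.0 / 6.0) * sigma:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6) + eps
```

The shift by +*ϵ* makes the WCA potential continuous at the cutoff, so the bath exerts purely repulsive forces with no attractive tail.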

To ensure that observed differences were not due to changes in the stationary distribution of the integrator, we used GHMC (1, 28, 36, 37) for all our simulations. GHMC is based on a velocity Verlet discretization (23) of Langevin dynamics—the two are equivalent in the limit of small timesteps—but includes an acceptance/rejection step to correct for errors introduced by finite timesteps so that the stationary distribution is the exact equilibrium distribution. We used a timestep of 0.002*τ*, where *τ* ≡ (*mσ*^{2}/*ϵ*)^{1/2}, and the collision rate was set to *τ*^{-1}. With this timestep, the acceptance probability is 99.929 ± 0.001%; the resulting dynamics closely approximates Langevin dynamics.

In simulations employing instantaneous Monte Carlo moves, a perturbation Δ*r* to the interatomic distance *r* of the dimer was chosen according to

$$p(\Delta r) = \tfrac{1}{2}\,\delta(\Delta r - r_0) + \tfrac{1}{2}\,\delta(\Delta r + r_0). \tag{26}$$

The dimer was contracted or expanded about the bond midpoint to generate a new configuration *x*_{new} with dimer extension *r*_{new} from the old configuration *x*_{old} with dimer extension *r*_{old}, and the move accepted or rejected with the Metropolis–Hastings criterion (3),

$$A(x_{\mathrm{old}} \rightarrow x_{\mathrm{new}}) = \min\left\{1,\; J_r(x_{\mathrm{old}}, x_{\mathrm{new}})\, e^{-\left[u(x_{\mathrm{new}}) - u(x_{\mathrm{old}})\right]}\right\}, \tag{27}$$

where the Jacobian ratio term *J*_{r}(*x*_{old},*x*_{new}) = (*r*_{new}/*r*_{old})^{2} accounts for the expansion and contraction of phase space due to the Monte Carlo proposals.
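A minimal sketch of this instantaneous accept/reject test follows; the reduced-potential values in the usage lines are hypothetical, and the Jacobian factor is the (*r*_{new}/*r*_{old})² of Eq. **27**:

```python
import math
import random

def accept_dimer_move(u_old, u_new, r_old, r_new, rng):
    """Metropolis-Hastings test with the radial Jacobian factor
    (r_new/r_old)**2 (sketch; u values are reduced potentials)."""
    log_acc = 2.0 * math.log(r_new / r_old) - (u_new - u_old)
    return rng.random() < math.exp(min(0.0, log_acc))

rng = random.Random(5)
downhill = accept_dimer_move(8.0, 1.0, 1.0, 2.0, rng)   # favorable: accepted
uphill = accept_dimer_move(1.0, 900.0, 2.0, 1.0, rng)   # essentially never accepted
```

In dense solvent, proposed configurations overlap with bath particles, making u_new enormous; the `uphill` branch is why these instantaneous moves are rejected nearly always.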

For simulations employing *T*-step NCMC moves, proposals were made by selecting a new velocity vector from the Maxwell–Boltzmann distribution, integrating *T* steps of velocity Verlet dynamics (23) for all bath atoms as the dimer extension was driven from *r*_{old} to *r*_{new} in equal steps of size Δ*r*/*T*, and accepting or rejecting based on the modified Metropolis criterion for symplectic integrators (Eqs. **12**–**20**),

$$A(X) = \min\left\{1,\; J_{r}(x_{\text{old}},x_{\text{new}})\, e^{-\beta W(X)}\right\}, \qquad [28]$$

where *W*(*X*) is the nonequilibrium work accumulated over the switching trajectory *X*. The Jacobian ratio is again (*r*_{new}/*r*_{old})^{2}. This scheme corresponds to a choice for the perturbation kernel of [29], where *r*(*x*) denotes the dimer separation of configuration *x*. The propagation kernel *K*_{t}(*x*,*y*) corresponds to velocity Verlet dynamics in which the dimer atoms are held fixed in space. The final configuration after the MC or NCMC rejection procedure was stored and plotted to generate Fig. 2.
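
The structure of a *T*-step NCMC trial can be illustrated with a toy sketch: drive the extension in equal increments, accumulate the protocol work from each increment, propagate an undriven degree of freedom in between, and accept on the accumulated work plus the Jacobian term. For brevity, the sketch propagates a single stand-in "bath" coordinate with an overdamped step and uses the simple work-based criterion, rather than the paper's velocity Verlet scheme and exact criterion (Eqs. **12**–**20**); all potentials and parameters are illustrative:

```python
import math
import random

# Toy T-step NCMC trial. The dimer extension r is driven from r_old to
# r_new in T equal increments; the protocol work w accumulates the energy
# change caused by each driving step, while a single bath coordinate y
# relaxes under an overdamped Langevin step between increments.

def ncmc_trial(r_old, r_new, y, u, T=32, beta=1.0, dt=0.01, rng=random):
    """One NCMC trial. u(r, y) is the total potential; returns (r, y, accepted)."""
    w = 0.0
    r = r_old
    for t in range(1, T + 1):
        r_next = r_old + (r_new - r_old) * t / T
        w += u(r_next, y) - u(r, y)          # work done by this driving step
        r = r_next
        # unperturbed propagation of the bath coordinate (overdamped step)
        force = -(u(r, y + 1e-6) - u(r, y - 1e-6)) / 2e-6
        y = y + dt * force + math.sqrt(2.0 * dt / beta) * rng.gauss(0.0, 1.0)
    log_accept = -beta * w + 2.0 * math.log(r_new / r_old)
    if math.log(rng.random()) < log_accept:
        return r_new, y, True
    return r_old, y, False
```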

The mean acceptance probability ⟨*A*⟩ for each switching time *τ* can be estimated via the sample mean

$$\widehat{\langle A \rangle} = \frac{1}{N} \sum_{n=1}^{N} A(X_{n}). \qquad [30]$$

For numerical stability, logarithms of *A*(*X*_{n}) were stored, as *a*_{n} ≡ ln *A*(*X*_{n}). We then estimated ln ⟨*A*⟩ (shown in Fig. 4) as

$$\ln \widehat{\langle A \rangle} = b + \ln \left[ \frac{1}{N} \sum_{n=1}^{N} e^{a_{n} - b} \right], \qquad [31]$$

where *b* ≡ max_{n} *a*_{n}.
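
This max-shift (log-sum-exp) estimate is a few lines of Python:

```python
import math

# Numerically stable estimate of ln<A> from stored log-acceptance values
# a_n = ln A(X_n): shifting by b = max_n a_n before exponentiating avoids
# underflow when all acceptance probabilities are tiny.

def log_mean_acceptance(log_accepts):
    """Return ln of the sample-mean acceptance probability."""
    b = max(log_accepts)
    n = len(log_accepts)
    return b + math.log(sum(math.exp(a - b) for a in log_accepts) / n)
```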

Integrated autocorrelation times were estimated using the rapid scheme described in section 5.2 of ref. 42.
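
The reference scheme is that of ref. 42; as a rough stand-in for readers who want something self-contained, a simple truncated-sum estimator of the integrated autocorrelation time can be sketched as follows (a variant in the spirit of, but not identical to, that scheme):

```python
# Simple integrated autocorrelation time estimate,
#   tau = 1 + 2 * sum_t C(t)/C(0),
# truncating the sum at the first non-positive empirical autocovariance
# to control the noise in the tail.

def integrated_autocorrelation_time(x):
    n = len(x)
    mean = sum(x) / n
    c0 = sum((xi - mean) ** 2 for xi in x) / n
    tau = 1.0
    for t in range(1, n):
        ct = sum((x[i] - mean) * (x[i + t] - mean) for i in range(n - t)) / n
        if ct <= 0.0:
            break
        tau += 2.0 * ct / c0
    return tau
```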

The acceptance probabilities plotted in Fig. 4 were estimated from a trajectory consisting of 10,000 iterations of 2,048-step NCMC, with 500 steps of GHMC dynamics between each NCMC trial to ensure equilibrium sampling. Prior to each 2,048-step NCMC iteration, a velocity vector was drawn from the Maxwell–Boltzmann distribution, and NCMC trial moves with varying switching times were conducted solely to accumulate statistics. The statistical error in the estimate of ⟨*A*⟩ and the computed relative efficiency over instantaneous Monte Carlo were estimated by 1,000 bootstrap trials, in which the dataset of 10,000 work samples was resampled with replacement in each bootstrap trial and 95% confidence intervals were computed from the distribution over bootstrap replicates.
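
The bootstrap procedure described above can be sketched generically; the statistic to be bootstrapped is passed in as a function, and the percentile bounds are an illustrative choice:

```python
import random

# Percentile bootstrap: resample the dataset with replacement n_boot times,
# recompute the statistic on each replicate, and take the alpha/2 and
# 1 - alpha/2 percentiles of the replicate distribution as the confidence
# interval (alpha = 0.05 gives a 95% interval, as in the text).

def bootstrap_ci(data, statistic, n_boot=1000, alpha=0.05, rng=random):
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    replicates.sort()
    lo = replicates[int((alpha / 2) * n_boot)]
    hi = replicates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```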

The reference distribution of the interparticle distance plotted in red on the right side of Fig. 2 was computed analytically for the vacuum system,

$$p(r) = Z^{-1} r^{2} e^{-\beta U_{\text{dimer}}(r)}, \quad Z = \int_{0}^{\infty} dr\, r^{2} e^{-\beta U_{\text{dimer}}(r)}. \qquad [32]$$

For the solvated system, this distribution was estimated from an umbrella sampling simulation (Fig. 5) employing a modified bonded potential intended to remove the barrier between compact and extended states,

$$U'(r) = K (r - r_{\min})^{2}\, \theta(r_{\min} - r) + K (r - r_{\max})^{2}\, \theta(r - r_{\max}), \qquad [33]$$

where *r*_{min} = *r*_{0}, *r*_{max} = 2.05*r*_{0}, and *K* = *k*_{B}*T*/*η*^{2}, with *η* = 0.3 Å, and *θ*(*r*) is the Heaviside function, which assumes a value of unity for *r* ≥ 0 and zero otherwise. The true solvated interparticle distribution *p*(*r*) was estimated by reweighting the data produced from this simulation, using the relationship

$$p(r) \approx \frac{\sum_{n=1}^{N} \delta(r - r_{n})\, e^{\beta\left[U'(r_{n}) - U_{\text{dimer}}(r_{n})\right]}}{\sum_{n=1}^{N} e^{\beta\left[U'(r_{n}) - U_{\text{dimer}}(r_{n})\right]}}, \qquad [34]$$

where *r*_{n} denotes the bond separation for sample *n*, and a finite-width histogram bin was used instead of the delta function *δ*(*r*).
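
The reweighting-with-finite-bins step can be sketched as follows; `u_orig` and `u_mod` are illustrative stand-ins for the original and modified bonded potentials, and the normalization convention (probability density over the binned range) is an assumption:

```python
import math

# Umbrella-sampling reweighting sketch: samples r_n drawn under the
# modified (barrier-free) potential u_mod are reweighted by
# exp(beta * [u_mod(r_n) - u_orig(r_n)]) and accumulated into finite-width
# histogram bins in place of the delta function.

def reweighted_histogram(samples, u_orig, u_mod, beta, bins, r_range):
    """Return bin centers and the reweighted, normalized density p(r)."""
    r_lo, r_hi = r_range
    width = (r_hi - r_lo) / bins
    hist = [0.0] * bins
    total = 0.0
    for r in samples:
        w = math.exp(beta * (u_mod(r) - u_orig(r)))
        total += w
        i = int((r - r_lo) / width)
        if 0 <= i < bins:
            hist[i] += w
    centers = [r_lo + (i + 0.5) * width for i in range(bins)]
    p = [h / (total * width) for h in hist]
    return centers, p
```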

## Appendix

### Proof that NCMC Preserves the Equilibrium Distribution.

Following the proof for GHMC in ref. 36, here we show that NCMC preserves the equilibrium distribution. The expected acceptance rate for NCMC moves initiated from (*x*,*λ*) is [35]

Suppose that we have a variate (*x*^{(n)},*λ*^{(n)}) drawn from the equilibrium distribution *π*(*x*,*λ*). The probability density of the next value in the chain, *p*(*x*^{(n+1)},*λ*^{(n+1)}), has contributions from two scenarios: when the candidate variate is rejected and when it is accepted. The contribution from rejecting the candidate and flipping the momentum, such that (*x*^{(n+1)},*λ*^{(n+1)}) = (*x̃*^{(n)},*λ*^{(n)}), is [36]. The contribution from accepting the candidate, such that (*x*^{(n+1)},*λ*^{(n+1)}) = (*x*_{T},*λ*_{T}), is [37], where *ρ*(*X*,Λ|*x*_{0},*λ*_{0}) ≡ Π(*X*|*x*_{0},Λ)*P*(Λ|*x*_{0},*λ*_{0}) is the probability of generating the trajectory-protocol pair (*X*,Λ) from (*x*_{0},*λ*_{0}), and the pathwise detailed balance condition (Eq. **6**) is used to produce the quantity in brackets.

Taking the sum of Eqs. **36** and **37**, we find that the equilibrium distribution is preserved [38]

By analogous reasoning, maintaining the momentum upon rejection, (*x*^{(n+1)},*λ*^{(n+1)}) = (*x*^{(n)},*λ*^{(n)}), and flipping it upon acceptance, will also preserve the equilibrium distribution.

### Acceptance Criteria for Overdamped Langevin (Brownian) Integrator of Ermak and Yeh.

A common integrator for Brownian dynamics (the overdamped regime of Langevin dynamics), in which only the coordinates *x* are explicitly integrated, is given by Ermak and Yeh (24, 25). In our notation, where the perturbed coordinates are propagated by one step of the stochastic integrator to obtain *x*_{t}, application of the propagation kernel can be written as [39], where *m* is the particle mass, *F*_{t}(*x*) ≡ -(∂/∂*x*)*H*_{t}(*x*) is the (potentially time-dependent) systematic force, and *γ* is an effective collision frequency or friction coefficient with units of inverse time. The noise history *ξ*_{t} for each degree of freedom is a normal random variate with zero mean and variance *β*^{-1}, drawn from the distribution

$$p(\xi_{t}) = \left(\frac{\beta}{2\pi}\right)^{1/2} \exp\left(-\frac{\beta \xi_{t}^{2}}{2}\right). \qquad [40]$$

In NCMC, every application of the propagation kernel *K*_{t} produces a transition *x*_{t-1} → *x*_{t} determined by the noise history variable *ξ*_{t}; there is a corresponding noise value *ξ̃*_{t} that generates the opposite step, *x*_{t} → *x*_{t-1}. By noting [41], we can compute the relationship [42].

Then, the ratio of transition kernels appearing in Eq. **10** can be written in terms of the noise history *ξ*_{t} and the computed reverse noise history *ξ̃*_{t}, [43], where the tildes are dropped because the microstate *x* contains no momenta. The quantity |∂*x*_{t}/∂*ξ*_{t}| represents the Jacobian for the change of variables from *ξ*_{t} to *x*_{t}, and the Jacobians in the numerator and denominator cancel. The quantity in Eq. **43** can easily be computed during integration.
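
The reverse-noise construction can be checked numerically. Assuming an Ermak-style update of the illustrative form *x*_{t} = *x*_{t-1} + (Δ*t*/*mγ*)*F*(*x*_{t-1}) + *ξ*_{t} (a stand-in consistent with the text, not the paper's exact Eqs. **39**–**42**), the noise value that carries *x*_{t} back to *x*_{t-1} is *ξ̃*_{t} = -*ξ*_{t} - (Δ*t*/*mγ*)[*F*(*x*_{t-1}) + *F*(*x*_{t})]:

```python
# Numerical check that the reconstructed reverse noise exactly undoes a
# Brownian step. dt_over_mgamma bundles the prefactor dt/(m*gamma).

def brownian_step(x, xi, force, dt_over_mgamma):
    """One overdamped update: x -> x + (dt/(m*gamma)) F(x) + xi."""
    return x + dt_over_mgamma * force(x) + xi

def reverse_noise(x_prev, x_next, xi, force, dt_over_mgamma):
    """Noise value that carries x_next back to x_prev under the same kernel."""
    return -xi - dt_over_mgamma * (force(x_prev) + force(x_next))
```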

### Acceptance Criteria for Langevin Integrator of Brooks, Brünger, and Karplus (BBK).

The Brünger–Brooks–Karplus (BBK) stochastic integrator (38, 39) is a popular integrator for simulating Langevin dynamics. In our notation, where the perturbed coordinates are propagated by one step of the stochastic integrator to obtain *x*_{t}, application of the propagation kernel can be written as [44], where we have used a velocity Verlet discretization of the BBK integrator (36, 40). Here *r*_{t} and *v*_{t} denote the respective Cartesian position and velocity components of the microstate *x*_{t}, *γ* the effective collision frequency with units of inverse time, and *m* the particle mass. An auxiliary half-step velocity variable is used only to simplify the mathematical representation of the integration scheme. Note that we require two random variates, *ξ*_{t} and *ξ*′_{t}, per degree of freedom per timestep in order for this scheme to be able to generate both the forward trajectory *X* and its time-reverse (e.g., section 2.2.3.2 of ref. 36).

The noise history terms *ξ*_{t} and *ξ*′_{t} are normal random variates with zero mean and variance *β*^{-1}. Their joint distribution can therefore be written as [45]. For every step *x*_{t-1} → *x*_{t}, the positions and velocities undergo a transition determined by the noise variables (*ξ*_{t},*ξ*′_{t}). A corresponding choice of noise variables (*ξ̃*_{t},*ξ̃*′_{t}) will generate the reverse step, carrying *x*_{t} back to *x*_{t-1}. With a little algebra, it is seen that [46]. To write the ratio of transition kernels appearing in Eq. **10** in terms of the noise variables and the computed reverse noise variables, we must first compute the Jacobian, because the random variates are not in Cartesian space, [47], which can be shown to be independent of *ξ*_{t} and *ξ*′_{t}. The conditional path action difference can now be computed as [48], where the ratio of Jacobians cancels because the Jacobians are independent of the noise variates.

### Derivation of Effective Correlation Time for Mixed MD/NCMC Sampling.

For simplicity, we make the assumption that the system of interest has two long-lived conformational states of equal population, with dimer extensions *r*_{c} and *r*_{e}. This assumption holds to good approximation for the WCA dimer example considered here, and may apply to other systems of interest as well. We assume that the correlation time for a fixed number of MD simulation steps per iteration is given by *τ*_{MD}, and describe the probability that the system ends up in a given conformational state after one iteration by a 2 × 2 column-stochastic transition matrix **T**_{MD},

$$\mathbf{T}_{\text{MD}} = \begin{bmatrix} 1 - \alpha & \alpha \\ \alpha & 1 - \alpha \end{bmatrix}, \qquad [49]$$

where *α* is the probability that one iteration of MD carries the system from one conformational state to the other.

For a 2 × 2 system whose time evolution is governed by the column-stochastic transition matrix **T**, we can write the autocorrelation function for the dimer extension *r* as

$$C(t) \equiv \langle r(0)\, r(t) \rangle = \Delta C\, \mu^{|t|} + C_{\infty}, \qquad [50]$$

where the transition matrix **T** has unitary eigenvalue decomposition **UMU**^{-1}, and *μ* is the nonunit eigenvalue of **T**. The constants are Δ*C* = (1/4)(*r*_{c} - *r*_{e})^{2} and *C*_{∞} = (1/4)(*r*_{c} + *r*_{e})^{2}.

Relating this to the autocorrelation time *τ* estimated from a timeseries, which reflects a fit to

$$C(t) = \Delta C\, e^{-|t|/\tau} + C_{\infty}, \qquad [51]$$

we see that *τ* = -1/ln *μ*. We then determine that the correlation time *τ*_{MD} = -1/ln *μ*_{MD}, with *μ*_{MD} = 1 - 2*α*.

Similarly, we can write the probability that an NCMC step carries the system from one conformational state to the other in terms of the acceptance probability *γ*, which we assume to be symmetric,

$$\mathbf{T}_{\text{NCMC}} = \begin{bmatrix} 1 - \gamma & \gamma \\ \gamma & 1 - \gamma \end{bmatrix}, \qquad [52]$$

with correlation time *τ*_{NCMC} = -1/ln *μ*_{NCMC} and *μ*_{NCMC} = 1 - 2*γ*.

The effective transition matrix **T**_{eff} for iterations consisting of MD simulation steps followed by an NCMC trial is given by

$$\mathbf{T}_{\text{eff}} = \mathbf{T}_{\text{NCMC}}\, \mathbf{T}_{\text{MD}}, \qquad [53]$$

where the effective correlation time *τ*_{eff} = -1/ln *μ*_{eff}, with *μ*_{eff} = 1 - 2[(*α* + *γ*) - 2*αγ*] = (1 - 2*α*)(1 - 2*γ*). Substituting in *α* = (1 - *e*^{-1/*τ*_{MD}})/2 and *γ* = (1 - *e*^{-1/*τ*_{NCMC}})/2, we obtain

$$\tau_{\text{eff}} = \left[\tau_{\text{MD}}^{-1} + \tau_{\text{NCMC}}^{-1}\right]^{-1}. \qquad [54]$$

As a check of the accuracy of Eq. **54**, we note that for MD with 2,048-step NCMC switching, we compute *τ*_{eff} ≈ 4.0 iterations, using only *τ*_{MD} = 299.8 iterations and the NCMC acceptance probability *γ* = 12.1%. The actual correlation time measured from a 10,000-iteration simulation is computed to be *τ*_{eff} = 4.0.
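
The two-state algebra above reduces to two small helper functions; the numerical check below verifies that the nonunit eigenvalue of a product of symmetric 2 × 2 stochastic matrices is the product of the individual nonunit eigenvalues, so the inverse correlation times add:

```python
import math

# Two-state analysis helpers: for a symmetric 2x2 column-stochastic matrix
# with off-diagonal switching probability p, the nonunit eigenvalue is
# mu = 1 - 2p and the correlation time is tau = -1/ln(mu). For a product of
# two such matrices, the nonunit eigenvalues multiply, so 1/tau_eff =
# 1/tau_MD + 1/tau_NCMC.

def tau_from_switch_prob(p):
    """Correlation time tau = -1/ln(1 - 2p) for switching probability p."""
    return -1.0 / math.log(1.0 - 2.0 * p)

def tau_effective(tau_md, tau_ncmc):
    """Effective correlation time when the inverse times add."""
    return 1.0 / (1.0 / tau_md + 1.0 / tau_ncmc)
```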

## Acknowledgments

We thank Gabriel Stoltz (CERMICS, École des Ponts ParisTech); Jed W. Pitera (IBM Almaden Research Center); Manuel Athenès (CEA Saclay); Firas Hamze (D-Wave Systems); Yael Elmatad, Anna Schneider, Paul Nerenberg, Todd Gingrich, David Chandler, David Sivak, Phillip Geissler, Michael Grünwald, and Ulf Rørbæk Pedersen (University of California, Berkeley); Vijay S. Pande (Stanford University); Suriyanarayanan Vaikuntanathan, Andrew J. Ballard, and Christopher Jarzynski (University of Maryland); and Huafeng Xu (D. E. Shaw Research) for enlightening discussions on this topic and constructive feedback on this manuscript, as well as the two anonymous referees for their helpful suggestions for improving clarity. J.P.N. was supported by the US Department of Energy (DOE) by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. G.E.C. was funded by the Helios Solar Energy Research Center, which is supported by the Director, Office of Science, Office of Basic Energy Sciences of the DOE under Contract DE-AC02-05CH11231. D.D.L.M. was funded by a Director’s Postdoctoral Fellowship from Argonne National Laboratory. J.D.C. was supported through a distinguished postdoctoral fellowship from the California Institute for Quantitative Biosciences (QB3) at the University of California, Berkeley. Additionally, the authors are grateful to OpenMM developers Peter Eastman, Mark Friedrichs, Randy Radmer, and Christopher Bruns for their generous help with the OpenMM GPU-accelerated computing platform and associated PyOpenMM Python wrappers. This research was performed under the auspices of the DOE by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 and by University of Chicago Argonne, LLC, Operator of Argonne National Laboratory (Argonne). Argonne, a DOE Office of Science laboratory, is operated under Contract DE-AC02-06CH11357.

## Footnotes

- ^{1}To whom correspondence should be addressed. E-mail: jchodera{at}berkeley.edu.

Author contributions: J.P.N., G.E.C., D.D.L.M., and J.D.C. designed research; D.D.L.M. and J.D.C. performed research; J.P.N., G.E.C., D.D.L.M., and J.D.C. contributed new analytic tools; J.D.C. analyzed data; and J.P.N., G.E.C., D.D.L.M., and J.D.C. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

See Author Summary on page 18199 (volume 108, number 45).

Data deposition: Simulation and analysis scripts, as well as simulation data generated for this paper, are available online at http://simtk.org/home/ncmc/; simtk.org is an online repository for exchanging scientific data produced by SimBios, a National Institutes of Health Center for Biomedical Computing.

^{*}The described pathwise detailed balance condition is closely related to “super-detailed balance” (e.g., ref. 26), but also accounts for momentum reversal to extend its definition to include molecular dynamics integrators.

^{†}Alternatives to using NCMC to correct stochastic integration include introducing a Metropolization correction after each step, as in the generalized hybrid Monte Carlo (GHMC) integrator we use in the example (1, 28, 36, 37).

## References

- (1) Liu JS
- (3) Hastings WK
- (7) Neal RM
- (12) Nilmeier J, Jacobson MP
- (15) Ballard AJ, Jarzynski C
- (20) Li H, Fajer M, Yang W
- (26) Frenkel D
- (28) Lelièvre T, Stoltz G, Rousset M
- (35) Izaguirre JA, Sweet CR, Pande VS
- (36) Lelièvre T, Stoltz G, Rousset M
- (40) Schlick T
- (44) Best RB, Hummer G
- (62) Athènes M, Marinica M-C
- (64) Eastman P, Pande VS
## Article Classifications

- Physical Sciences
- Chemistry

- Biological Sciences
- Biophysics and Computational Biology