# Enhanced, targeted sampling of high-dimensional free-energy landscapes using variationally enhanced sampling, with an application to chignolin

^{a}Department of Chemistry and Applied Biosciences, Eidgenössische Technische Hochschule Zurich and Facoltà di Informatica, Instituto di Scienze Computazionali, Università della Svizzera Italiana, CH-6900 Lugano, Switzerland;^{b}National Center for Computational Design and Discovery of Novel Materials MARVEL, Università della Svizzera Italiana, CH-6900 Lugano, Switzerland

See allHide authors and affiliations

Contributed by Michele Parrinello, December 17, 2015 (sent for review October 5, 2015; reviewed by Aaron R. Dinner and Jim Pfaendtner)

## Significance

The problem of sampling complex systems characterized by metastable states separated by kinetic bottlenecks is a universal one that has received much attention. Popular strategies involve applying a bias to a small number of collective variables. This strategy fails when the choice of collective variable is not clear or when the system’s free energy cannot be projected meaningfully on a low-dimensional manifold. We use a variational principle that allows one to construct approximate but effective multidimensional biases. We show in a practical case that the method is much more efficient than alternative methods. We expect this method to be useful for a wide variety of sampling problems in chemistry, biology, and materials science.

## Abstract

The capabilities of molecular simulations have been greatly extended by a number of widely used enhanced sampling methods that facilitate escaping from metastable states and crossing large barriers. Despite these developments there are still many problems which remain out of reach for these methods which has led to a vigorous effort in this area. One of the most important problems that remains unsolved is sampling high-dimensional free-energy landscapes and systems that are not easily described by a small number of collective variables. In this work we demonstrate a new way to compute free-energy landscapes of high dimensionality based on the previously introduced variationally enhanced sampling, and we apply it to the miniprotein chignolin.

Molecular simulation has become an increasingly useful tool for studying complex transformations in chemistry, biology, and materials science. Such simulations provide extremely high spatial and temporal resolution but suffer from inherent time-scale limitations. Molecular dynamics simulations of biomolecules performed on standard hardware struggle to exceed the microsecond time scale. Modern, purpose-built hardware can reach the millisecond time scale (1), but these tools are not generally available and they still prove insufficient for many important problems (2). Widely used enhanced sampling methods also allow one to simulate processes with a time scale of milliseconds (3) and there is significant interest in extending these methods. One important class of enhanced sampling methods involves applying an external bias to one or more collective variables (CVs) of the system. Umbrella sampling and metadynamics fall into this category (4, 5), but there are many other examples (6⇓⇓⇓⇓⇓⇓–13). These methods are most effective when the transformations of interest are in some sense collective and can be described by a very small number of CVs.

When the choice of CV is not obvious or the free-energy landscape cannot be projected meaningfully on a low-dimensional manifold, new strategies are needed. Here we describe an enhanced sampling strategy that allows one to bias high-dimensional free-energy landscapes by exploiting the flexibility inherent in the recently introduced variationally enhanced sampling (VES) method (14). This method casts free-energy computation as a functional minimization problem, and it solves two important problems associated with high-dimensional free-energy computation. The first problem is that when exploring a high-dimensional space a system might never visit the regions of interest, but the variational principle introduced in ref. 14 permits us to specifically target the region we are interested in through the use of a so-called “target distribution.” The second problem is that one must adopt an approximation for the high-dimensional bias and the variational principle allows us to do this in an optimal way. Furthermore, as we show here, for the first time to our knowledge, choosing an intelligent target distribution, which includes any available a priori information, allows us to compensate for some of the variational flexibility that is lost when we adopt an approximate form for the bias. We apply this method to study the folding and unfolding of chignolin and show that we can accurately estimate the folding free energy in a small fraction of the computational time used by other methods.

## Background

To put our method in context we will provide some background on the methods that have been explored for overcoming barriers in high-dimensional CV spaces. Most of these methods involve applying an external bias to some CVs of the system. Standard methods like metadynamics and umbrella sampling are effective for low-dimensional CV spaces and they achieve the dual goal of allowing the system to explore the CV space and reconstructing the free-energy surface accurately. Methods that attempt to sample high-dimensional CV spaces often sacrifice the second goal. Bias-exchange metadynamics (15), for example, biases a different CV in each of several replicas and then allows exchange moves between replicas. The output of such a calculation is several one-dimensional representations of the free-energy surface on each of the biased CVs. A related method is the recently introduced parallel bias metadynamics, which also produces many low-dimensional free-energy surfaces but eliminates the exchange moves (16). Methods like replica exchange with CV tempering (17) and biasing potential replica exchange molecular dynamics (18) rely on a replica exchange scheme to collect statistics from an unbiased replica, and they use either a predefined or an iteratively constructed bias to help the system explore in biased replicas which can exchange with the unbiased replica. An alternative approach is to use some machine learning procedure to identify the important basins in a high-dimensional space, as is done in reconnaissance metadynamics (19) or in ref. 20. Other methods do without an external bias altogether, like temperature-accelerated molecular dynamics (MD) (21), in which an artificially large friction is applied to a set of CVs to ensure that they evolve slowly compared with the atomic motions. This list is by no means exhaustive; for a more thorough review see for example ref. 22, but the sheer number of attempts to solve this problem testifies to its importance.

## Algorithm

The VES approach was introduced by Valsson and Parrinello in ref. 14 and extensions have been described in refs. 23, 24. In this approach the system is biased by a potential *C* is an unimportant constant. In an ergodic simulation biased by **1** and **2** define the variational principle that we make use of in this paper.

To perform the minimization of *Supporting Information*.

As a prototypical example of a high-dimensional free-energy surface, we will consider the backbone dihedral angles of a protein. In particular, we will study chignolin, a small 10-residue-long protein. It must be said, however, that our method is completely general and after appropriate adaptation it could be applied to many other problems such as, for instance, nucleation and ligand binding. The advantage of studying chignolin is that its experimental structure is known (30) and it has been simulated for 100 μs using a special purpose machine (1), in addition to many other simulation studies (e.g., refs. 31⇓–33), which provides us with good reference data. In ref. 1 chignolin was studied very close to the folding temperature, and a folding time of

A complete characterization of the ensemble of conformations of chignolin would require calculating the free energy as a function of all of its backbone dihedral angles. In general there are two highly flexible dihedral angles per residue, *ϕ* and *ψ*, with a few exceptions like proline in which only the *ψ* angle can truly rotate, and the N-terminal and C-terminal residues which only have one flexible dihedral angle. For simplicity we will refer to the set of *N* biased dihedral angles as

In our case however we can add to the power of the variational principle the flexibility that comes from the freedom of choosing the target distribution **2** but it has been recast in a form that allows one to view the bias

Let us illustrate this point on chignolin. This small protein free-energy landscape is characterized by two basins, the folded state and the unfolded one. In the first, the protein is folded in a structure that can be described by a set of dihedral angles (**S1**. Possibly the simplest

The next step is to simplify the expression for the multidimensional **2**. This form is inspired by the Ising model and takes advantage of the linear structure of the protein. The long-range correlations that are not explicitly taken into account are introduced indirectly through our choice of *M* be the size of the basis set that is needed to express the angular dependence of each individual dihedral angle, then the number of basis functions required to represent the left-hand side above is

## Results

Fig. 1 shows a time series of the root-mean-square displacement (RMSD) from the folded state reference structure for chignolin. This time series was taken from a simulation in which the bias was being optimized to sample the target distribution described above and it shows the protein going back and forth between folded and unfolded states. The protein was initiated in an extended state but in less than 300 ns we see at least four transitions between the unfolded state and the folded state. The ability to generate these transitions underlies any study of the folding transition, whether they be studies of thermodynamics or kinetics. In the *Supporting Information* we demonstrate that the bias converges over this time scale, indicating that the functional *Supporting Information* we show a time series for

This approach gives us various ways to study the physics of chignolin. One interesting possibility involves examining the high-dimensional free-energy surface itself. In the present study, however, we will focus on extracting information from the unbiased ensemble on CVs that we have not directly biased. Recovering this information directly from the optimization simulations can be a difficult reweighting problem. One simple way to sidestep this problem is to use a replica exchange (RE) scheme. RE, in particular parallel tempering, is an extremely popular method for enhancing sampling in simulations (36, 37). Conventional parallel tempering is plagued by a few important problems that limit its applicability. The goal in parallel tempering is to have a sufficiently high-temperature replica where the system can explore ergodically and easily surmount the important barriers, and then have a ladder of replicas that allow exchanges between low-temperature and high-temperature systems. If the barriers between the states of interest are primarily energetic this can be a very useful approach, but if the barriers are largely entropic then the height of these barriers can increase as the temperature increases and sampling can become even less efficient than it is at the reference temperature; furthermore, some of the basins may diminish or even disappear at the higher temperature. This problem of interesting basins getting washed out is also a potential problem for RE with CV tempering (17) and biasing potential RE MD (18), which we discussed above. Ultimately, for an RE procedure to truly improve upon the efficiency of a standard simulation with comparable computational resources, it must lower the roundtrip time for exploring the basins of interest; this is a criterion which is often not demonstrated for such simulations (38). In the VES-RE approach, this is not a problem because under the combined action of

The details of the RE we use are described in *Materials and Methods*. We will refer to this approach as VES-RE. In Fig. 2 we show how an estimate of the folding free energy obtained from the lowest, unbiased replica converges over simulation time. The way we obtain this estimate is described in the *Supporting Information*. For comparison we also show the same quantity calculated from the 100-μs unbiased MD trajectory from ref. 1. Results from VES-RE are shown for a set of replicas which were initiated in the folded state. Within 300 ns per replica the folding free energy has converged almost exactly to the result obtained from the unbiased MD trajectory, with small fluctuations on the order of *A*. For comparison with a standard method, we have also performed a parallel tempering simulation in the well-tempered ensemble (PT-WTE) (39, 40), which we describe in the *Supporting Information*. These results are shown in Fig. 3*B*, where it is clear that 300 ns per replica is not sufficient to obtain a converged estimate of the folding free energy from PT-WTE. In the *Supporting Information* we show that transitions between the folded and the unfolded state are extremely rare for continuous trajectories from PT-WTE, i.e., those that are reconstructed from jumps between the different replicas, which is crucial for efficient RE simulations.

We can also measure free-energy surfaces of any order parameter we like from the unbiased replica in VES-RE. These are calculated by using a histogram to estimate the probability distribution of the order parameter (

## Conclusions

We have presented results that illustrate some of the unique capabilities of the VES method introduced in ref. 14. We have shown that we can bias high-dimensional systems by choosing an appropriate, simplified representation of the bias and by making an intelligent choice of target distribution. This makes it possible to generate biasing potentials that efficiently and reversibly drive the system back and forth between the folded state and a disordered state of a small protein. This capability is extremely important for any study of the folding transition, and we have shown that it forms the basis for an efficient RE procedure that allows one to obtain statistics of the unbiased ensemble. Statistics generated in this way reproduce those obtained from 100-μs MD simulations on specialized hardware with extremely high accuracy at a fraction of the computational cost. The tradeoff for this dramatic reduction in computational resources is that we must make use of some information about the equilibrium structure in designing the target distribution, and that all dynamical information is lost in the RE procedure. Recent improvements in extracting rate constants from biased simulations (24, 41) may make it possible to overcome the latter limitation.

Foreseeable extensions of the methods we have described here have many exciting and interesting applications. In particular, the idea of using the target distribution to sample transitions between a localized state and a disordered state could be extremely useful for problems like ligand binding and unbinding. Accurately estimating the binding free energy of ligands is an extremely important and still very difficult calculation. We also imagine combining this method with tools like Rosetta which can predict protein structure based on the primary sequence (42). Another avenue we have not explored in this work is studying the high-dimensional free-energy surface itself. Accurate, high-dimensional free-energy surfaces could be an ideal way to construct coarse-grained models, and they could form the basis for novel studies of protein physics. The free-energy landscape of a protein must have several unique properties so that the protein does not get trapped in metastable states, and so that it can find the folded state in minimal time. Much of our current understanding of protein-folding free-energy landscapes is informed by studies of simplified, lattice-based models (43), and the approach we describe here might complement such studies.

## Materials and Methods

### MD Simulations.

All simulations of chignolin (sequence TYR-TYR-ASP-PRO-GLU-THR-GLY-THR-TRP-TYR) were conducted with GROMACS 4.6.5 (44) with a modified version of the PLUMED 2 plugin (45). The CHARMM22* force field (46) and the three-site transferrable intermolecular potential (TIP3P) water model (47) were used to make direct comparisons with ref. 1. ASP and GLU residues were simulated in their charged states, as were the N- and C-terminal amino acids. A timestep of 2 fs was used for all systems and constant temperature of 340 K (in agreement with ref. 1) was maintained by the velocity rescaling thermostat of Bussi et al. (48). All bonds involving H atoms were constrained with the linear constraint solver (LINCS) algorithm (49). Electrostatic interactions were calculated with the particle mesh Ewald scheme using a Fourier spacing of 0.12 nm and a real space cutoff of 0.95 nm (50). The same real space cutoff was also applied to van der Waals interactions. Initial configurations were generated by running a short 1-ns constant pressure (NPT) simulation with the protein in the folded state to allow the box size to equilibrate using the Parrinello–Rahman barostat (51). Subsequent runs were all conducted with fixed box size. The final cubic box had a side length of 4.044 nm and it had 2,122 water molecules, along with 2 sodium ions to neutralize the charge.

### RE Procedure with VES Optimized Bias.

The RE procedure used is similar in spirit to Hamiltonian RE, solute tempering, and RE with CV tempering (17, 52, 53). In each replica the system is biased by the converged bias potential from the optimization run, scaled by a constant factor. The bias applied on replica *i* is

where

where *i* and *i*. In principle one could also use information extracted from the biased replicas to speed up the convergence of the free-energy surfaces shown in Fig. 4. One could also imagine optimizing the choices of

## Optimization of Biasing Potential

As mentioned in the main text, the optimization of the biasing potential proceeds by expanding the bias in a set of basis functions (Eq. **3**)**,** and treating the expansion coefficients as variational parameters. The gradient **S**4 and one can afford a much shorter time averaging between successive iterations. A mathematical justification of this procedure can be found in ref. 29. This stochastic optimization approach has proven to be an effective choice to perform the minimization of

The optimization procedure described above requires a choice of basis set. With the dihedral angles as CVs, the natural choice is a Fourier basis. Each pairwise term was expanded as**S3**, was chosen to be 0.001. Higher values of this parameter make the optimization less robust. Each iteration was 500 MD time steps, or 1 ps. The optimization was conducted with a multiple walkers scheme (55) in which six independent copies of the system are simulated simultaneously and statistics from each walker contribute to the averages in Eq. **S1**. It took about 200 ns to produce a converged bias potential. All of these parameters may be subject to refinement.

## Convergence of Biasing Potential

The simplest and most straightforward way to check for convergence of the bias potential (and thus minimization of the functional

## Folding of Chignolin with Different Values of w f

As mentioned in the main text,

## Components of Optimized Bias

For reference we present a few of the pairwise terms that make up the optimized bias. We present the inverses of the biases because these are most closely related to the free-energy surface. It is important to recall that these surfaces do not have a direct interpretation as components of the free-energy surface because they do not include the

## Estimate of the Folding Free Energy from the Unbiased Replica

The estimate of the folding free energy for chignolin was calculated as follows: every 100 ps the configuration of the protein in the unbiased replica was designated as either folded or unfolded based on RMSD and then the number of folded configurations,

## PT-WTE

Reference data were generated from PT-WTE. This method uses the WTE to increase fluctuations of the potential energy (39), which allows one to greatly reduce the number of replicas required to span a given temperature range (40). The temperatures of the 12 replicas, which were distributed between 340 and 681 K, were chosen in an optimal way according to ref. 56. This temperature range is chosen to be similar to that used in previous studies (33), where the highest temperature seems to allow thorough exploration of the unfolded state. To construct the history-dependent bias, Gaussians were deposited with a width of 140 kJ/mol and an initial hill height of 1.2 kJ/mol every 0.6 ps. The bias factor γ, which controls how much the fluctuations of the potential energy are enhanced, was chosen to be 20. Exchange moves between replicas were attempted every 10 ps.

## Slow Convergence in PT-WTE

PT-WTE and standard PT simulations of proteins are very difficult to converge because transitions between the folded and the unfolded state are extremely rare in continuous trajectories. In Fig. S6 we show how the RMSD evolves over time for a single continuous trajectory (reconstructed from exchange moves) from the PT-WTE simulations described in the main text. We see that this trajectory has a single transition from unfolded to folded. Of the 12 continuous trajectories that comprised the PT-WTE run, 5 have no transitions between folded and unfolded, 4 have 1 transition, and 3 have 2 transitions for a total of 10 transitions. By contrast, transitions are much more common in the continuous trajectories from VES-RE. Of the 12 continuous trajectories from VES-RE, 9 have between 3 and 1 transitions, whereas the remaining 3 have 4, 5, and 6 transitions, amounting to 33 transitions in total.

## Reference Structures and CVs Used for Analysis

The reference angles used in the target distribution (

## Acknowledgments

We acknowledge D. E. Shaw Research for sharing data from the simulations of chignolin. P.S., O.V., and M.P. acknowledge funding from the National Center for Computational Design and Discovery of Novel Materials MARVEL and the European Union Grant ERC-2009-AdG-247075.

## Footnotes

- ↵
^{1}To whom correspondence should be addressed. Email: parrinello{at}phys.chem.ethz.ch.

Author contributions: P.S., O.V., and M.P. designed research; P.S. performed research; P.S. analyzed data; and P.S., O.V., and M.P. wrote the paper.

Reviewers: A.R.D., The University of Chicago; and J.P., University of Washington.

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1519712113/-/DCSupplemental.

## References

- ↵.
- Lindorff-Larsen K,
- Piana S,
- Dror RO,
- Shaw DE

- ↵
- ↵.
- Tiwary P,
- Limongelli V,
- Salvalaglio M,
- Parrinello M

- ↵.
- Laio A,
- Parrinello M

- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵.
- Tribello GA,
- Ceriotti M,
- Parrinello M

- ↵.
- Chen M,
- Yu TQ,
- Tuckerman ME

- ↵.
- Abrams CF,
- Vanden-Eijnden E

- ↵.
- Abrams C,
- Bussi G

- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵.
- Bach F,
- Moulines E

*O*(*1/n*).*Advances in Neural Information Processing Systems 26 (NIPS 2013)*, eds Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (Curran Associates, Inc., Red Hook, NY), pp 773–781 - ↵
- ↵
- ↵.
- Zhang T,
- Nguyen PH,
- Nasica-Labouze J,
- Mu Y,
- Derreumaux P

- ↵
- ↵.
- Szabo A,
- Ostlund N

- ↵.
- Gerber S,
- Horenko I

- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵.
- Liu P,
- Kim B,
- Friesner RA,
- Berne BJ

- ↵
- ↵
- ↵
- ↵

## Citation Manager Formats

## Article Classifications

- Physical Sciences
- Chemistry

- Biological Sciences
- Biophysics and Computational Biology

## Sign up for Article Alerts

## Jump to section

- Article
- Abstract
- Background
- Algorithm
- Results
- Conclusions
- Materials and Methods
- Optimization of Biasing Potential
- Convergence of Biasing Potential
- Folding of Chignolin with Different Values of wf
- Components of Optimized Bias
- Estimate of the Folding Free Energy from the Unbiased Replica
- PT-WTE
- Slow Convergence in PT-WTE
- Reference Structures and CVs Used for Analysis
- Acknowledgments
- Footnotes
- References

- Figures & SI
- Info & Metrics