Previous Article |
Table of Contents
| Next Article
PHYSICAL SCIENCES / BIOLOGICAL SCIENCES / COMPUTER SCIENCES / BIOPHYSICS
A consensus view of protein dynamics


,

,
,
,
,
,
,¶
*Molecular Modelling and Bioinformatics Unit and
Structural Biology Node, Institut de Recerca Biomèdica, Parc Científic de Barcelona, Josep Samitier 1-5, 08028 Barcelona, Spain;
Computational Biology Program, Barcelona Supercomputing Center, Jordi Girona 31, Edifici Nexus II, 08028 Barcelona, Spain; and
Departament de Bioquímica i Biologia Molecular, Facultat de Biologia, Universitat de Barcelona, Avgda Diagonal 645, 08028 Barcelona, Spain
Edited by Harold A. Scheraga, Cornell University, Ithaca, NY, and approved November 10, 2006 (received for review July 6, 2006)
| Abstract |
|---|
|
|
|---|
50 years of CPU) allowed us to obtain a robust-consensus picture of protein dynamics in aqueous solution.
force field | molecular dynamics | molecular modeling | protein structure
| Results and Discussion |
|---|
|
|
|---|
Force Field Convergence.
Previous to any dynamic study, we need to determine whether force fields are providing a similar picture of protein structure and whether such a picture is similar to that derived experimentally. Analysis of collected samplings indicates an average divergence (
in Eq. 1) of
2 Å between force fields (slightly larger deviations are found in G simulations), which, considering the thermal noise of the simulations (
in Eq. 2), suggests a similarity of
70% between the four samplings and an average "effective distance (
1) between them of only 1.4 Å (slightly larger values are always obtained for G-simulations; see Table 2). This finding indicates that all simulations are, in fact, sampling a similar region of the conformational space. This suggestion is supported by the analysis of the radii of gyration and solvent accessible surface (differences of
1% in radii of gyration and 2% in solvent-accessible surface between the four force fields; see Fig. 1).
|
5%. The average backbone rmsds between simulated and experimental structures are
2.0 Å (A, 1.9; C, 2.0; O, 1.9; G, 2.5 Å), close to the thermal noise of MD simulations. Quite surprisingly, the presence or absence of disulfide bridges does not modify the deviation of our samplings from experimental structures, which is, however, dependent on the origin of the experimental structure. Thus, proteins showing the largest displacements with experiment correspond always to structures solved by NMR spectroscopy (see Fig. 1 and Table 1). The deviations in these cases are due to displacements in regions of large flexibility, where NMR structures are in general poorly defined. This is clear when calculations are repeated using the TM-score, a more robust measure that reduces the weight of flexible entities in the rms-fit. In this case, no system is found for which deviation from experimental data are >2.5 Å (see Fig. 1). Moderate deviations between MD trajectories and experimental model might then reflect either lack of quality in some parts of the experimental model, or just the intrinsic differences between simulation and experimental conditions. Clearly, the commonly made assumption that any deviation from experimental structure of MD simulations is always due to simulation errors is not correct and can ignore important physical characteristics of proteins.
|
-helices and
-sheets). In summary: (i) different force fields provide similar picture of protein structure, and (ii) MD simulations sample regions close to experimental structures. Divergences between simulations and experimental structure signal proteins with very flexible moieties whose exact conformation is often not well defined. The two prerequisites noted above are then fulfilled, and we can then safely analyze the dynamics properties of proteins.
|
-sheet segments are the most rigid, whereas turns are in general the most flexible ones, in good agreement to what is deduced from experimental data. In fact, the only significant difference between MD and x-ray B-factors is that, although experimental B-factors above 60 Å2 are rare, they are not so uncommon in MD simulations. Inspection of B-factor distribution in the different proteins (see example in Fig. 3 and SI Fig. 10) show a good correlation (average Spearman's rank correlation of 0.7 between experimental and MD distributions). This means that proteins residues that appear as the most flexible ones in MD-simulations are those with the largest mobility in the x-ray structures. It seems that MD-simulation is just allowing the movement of flexible residues that were "frozen" by the crystal lattice. This hypothesis can be tested by computation of crystal effective temperatures (see Eq. 3). Considering harmonic residue oscillations as those with B-factors below 60 Å2 (see Fig. 3) we obtain an "effective temperature" of 290 K for the crystal. This demonstrates that the effect of crystal lattice is not, as often assumed, a general "effective cooling" of the general protein. The reduction in kinetic energy (i.e., the reduction of "effective temperature") induced by the lattice is localized in a few very flexible residues, whose mobility is severely reduced with respect to the situation found in diluted aqueous solution.
|
L=0.28 ± 0.06). However, this is the combination of the "solid-like" properties of the buried interior
L=0.18 ± 0.04 (0.16 ± 0.04 if only buried backbones are considered) and of pure liquid properties of the exposed side-chains
L=0.38 ± 0.07. In summary, as suggested by experimental studies in particular systems (26, 27), proteins are "melted-solids" with strong differences between the near-solid interior and the full-liquid exterior.
Essential dynamics provides and additional test of the global flexibility of proteins. Interestingly, all of the force fields provide a similar picture of the type of movements that are more important in describing protein similarity (
85% see Table 2 and SI Figs. 615). Space dimensionality (i.e., the number of relevant deformation nodes, ref. 28) depends linearly with the number of residues of the protein (r2 > 0.94; see Fig. 4), with no clear changes induced by the presence of disulfide bridges. Given A, C, and O simulations, we can derive a consensus regression equation (dim = 32.5 + 0.58 x Nres; r2 = 0.93), which indicates that each residue adds only 0.6 dimensions to the conformational space, indicating that the global structure of the protein strongly limits, in a noncooperative manner, the number of accessible conformations of the constituting peptides.
|
|
space) indicating that significant deformations are expected at room temperature along these modes. Entropies derived from these frequencies (see Materials and Methods) depend in all cases linearly with the number of residues (r2 > 0.98; see Fig. 4), confirming the lack of cooperative effects in determining the accessible conformational space of mono-domain proteins. Perfect match is found between A, O, and C simulations, whereas G trajectories lead to more disordered structures. Interestingly, the presence of disulfide bridges does not introduce any sizeable effect in determining the entropy of the folded protein. Thus, we can conclude that, in contrast with the general believe, disulfide bridges do not especially restrict the conformational space of native proteins.
A single experimental structure presents a clearly defined pattern of stabilizing interactions (salt bridges, hydrogen bonds, and hydrophobic interactions). Obviously, when flexibility is allowed, such a pattern becomes more diffuse and interacting groups are more promiscuous. Thus, despite the total number of stabilizing interactions preserved in MD, there is, in all cases, a reduction in the number of "permanent" interactions (defined as those occurring for >80% of the trajectory) with respect to those in the experimental structure. Thus, the number of permanent hydrogen bonds and hydrophobic interactions are
68 and 81% of those existing in experimental structures. The number of "permanent" salt bridges is
77% the experimental ones for A, C, and O simulations, whereas a bigger loss of salt bridges is found in G simulations (see Table 3). Once "permanent" interactions are located, we can determine their contribution to the global stiffness of the protein by deriving the associated force-constant (see Eq. 6). Results in Table 2 show that, in agreement with chemical intuition and irrespective of the force field used, hydrogen bonds are individually stiffer (force constants in the range 2229 kcal/mol Å2) than hydrophobic contacts (force constants in the range 36 kcal/mol Å2). More surprising is the soft nature of salt bridges (29), especially of those involving lysine, which in fact behave similarly (in terms of stiffness) to hydrophobic contacts. MD strongly suggests that individually, hydrogen bonds are the main responsible of protein stiffness, whereas salt bridges (especially with Lys as donor) are quite labile and promiscuous. It is then clear that none of the theoretically more stabilizing interactions disulfide and salt bridges are controlling protein dynamics.
|
| Materials and Methods |
|---|
|
|
|---|
System Setup. As in our MODEL database (http://mmb.pcb.ub.es/MODEL), we used a common, automatic setup procedure designed to guarantee reasonable ionization states, no electrostatic unbalances, and a good hydration before simulation starts. Thus, in all cases, experimental structures (for NMR, the first one deposited in the Protein Data Bank) were titrated to define the major ionic state a neutral pH, neutralized by ions, minimized, heated, and hydrated (with special care in introducing structural waters). These systems were then preequilibrated for 0.5 ns with parm99-AMBER force field, and then equilibrated (0.5 ns) with each force field (see below).
Simulation Details.
Equilibrated structures were used as starting points for 10-ns production trajectories, performed at constant pressure (1 atm) and temperature (300 K) using standard coupling schemes (the same in all cases). Trajectories for three (Protein Data Bank ID codes 1CQY, 1OPC, and 1KTE) ultrarepresentative structures of
,
, and
/
folds (28) were extended to 100 (A, C, and O) or 50 (G) ns to evaluate the reliability of 10 ns trajectories. Results in SI Fig. 16 strongly suggest that 10 ns is long enough for simulations for many analysis purposes. Particle Mesh Ewald approaches were used to deal with long-range effects (36). Integration of motion equations was performed every 1 fs, the vibrations of bonds involving hydrogen being removed by SHAKE/RATTLE algorithm (37, 38). Four force fields were used in production runs: AMBER (17) (A), CHARMM (18, 19) (C), OPLS/AA (2023) (O), and GROMOS (24, 25) (G). TIP3P (39, 40) was used as water model for A, C, and O simulations, whereas G calculations were done using the SPC model (41) for coherence with the force field. Simulations were done using parallel versions of AMBER8 (42) (A-simulations), NAMD (43, 44) (C and O simulations), and single processor version of GROMACS (45, 46) (G simulations).
Control simulations were performed to determine whether force fields were able to recognize unfolding conditions and lead to the destabilization of the folded form. For this purpose, 30-ns simulations of 1CQY, 1OPC, and 1KTE were performed using 8 M urea and a temperature of 368 K. OPLS (47) parameters for urea were used for O and A simulations, whereas specific CHARMM parameters (48) were used in C simulations. Due to the lack of specific parameters, these control simulations were not performed using GROMOS force field. Results in SI Fig. 16 confirm that force fields are able to recognize strong denaturing conditions starting the unfolding of the proteins (which might take micro- to millisecond to complete) in the 0- to 30-ns simulation window. The final set with control simulations (10 ns; A, O, C, and G trajectories) was performed for three proteins (Protein Data Bank ID codes 2GB1, 1LYS, and 1UBQ) for which residual dipolar couplings and S2 distributions have been reported (see references in the legend of SI Fig. 17) from NMR experiments. Comparison of MD- and NMR-derived measures of flexibility is quite satisfactory (see SI Fig. 17) giving confidence in the quality of our theoretical estimates of flexibility (it is not possible to perform this comparison for representative proteins in SI Table 4 due to the lack of experimental data).
Analysis of Trajectories.
Trajectories were analyzed to obtain structural and dynamic properties. Structural descriptors include backbone rmsd, TM score (49), radii of gyration, solvent accessible surface for all heavy atoms (50), secondary structure (51), intramolecular hydrogen bonds, contact maps, Ramachandran plots (52, 53), solvent contact profiles, and others. The global similarity between the structures collected with the different force fields was obtained by computing the pair-cross rmsd (i.e., the rmsd between all of the snapshots collected in the two trajectories; see Eq. 1), and a related similarity index (see Eq. 2).
![]() |
where N is the number of atoms and M is the number of frames
|
|
Local dynamic properties are represented by the B factors, which are compared with x-ray values to obtain a measure of "crystal effective" temperatures (see Eq. 3). Global flexibility was represented by Lindemann's indexes (26) (see Eq. 4) and by essential dynamics protocols considering either Cartesian or mass-weighted covariance matrices (54), which provided us with measures as protein entropy (5557) or the dimensionality of the conformational space (28). The similarity between the nature of the essential deformation patterns of two trajectories (of the same protein) was determined by
index (see Eq. 5; ref. 58), considering a small set of 25 eigenvectors (which represent
8085% of protein variance).
|
|
![]() |
where a' is the most probable nonbonded near-neighbor distance, N is the number of atoms, and
ri stands for the fluctuation of atom i from its average position.
![]() |
where n is the number of size of the important space (n = 25 here) and
stands for one unitary eigenvector. Note that
= 1 indicates that both trajectories are sampling the same type of essential movements, whereas
= 0 means that they are orthogonal.
The flexibility related to key interactions (hydrogen bonds, hydrophobic contacts, and salt bridges) was computed by assuming the harmonic oscillator model (see Eq. 6). Because the use of harmonic model implies a Gaussian distribution of distances, we computed force constant (K in Eq. 6) for "stable" interactions considered as those found in >80% of the trajectory. A salt bridge was defined when the distance N
(K)/C
(R) C
(D)/C
(E) was <6.5 Å, hydrogen bonds were annotated by using standard PTRAJ criteria (42), and hydrophobic contacts were counted when two hydrophobic groups (A, V, P, F, M, I, L, W), which are not neighbors in the sequence, have their C
atoms separated by <10 Å.
|
|
where kb is the Boltzmann constant, T is the absolute temperature, and
X is the oscillation in the interaction distance from average values.
| Acknowledgements |
|---|
|
|
|---|
4,524 64-bit Myrinet-connected processors. | Footnotes |
|---|
Abbreviations: MD, molecular dynamics; O, OPLS; C, CHARMM; A, AMBER; G, GROMOS.
¶To whom correspondence should be addressed. E-mail: modesto{at}mmb.pcb.ub.es
Author contributions: M.R., C.F.-C., and T.M. contributed equally to this work; M.R., C.F.-C., A.H., J.L.G., and M.O. designed research; M.R., C.F.-C., T.M., A.P., and J.C. performed research; M.R., C.F.-C., T.M., and A.P. analyzed data; and M.O. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS direct submission.
This article contains supporting information online at www.pnas.org/cgi/content/full/0605534104/DC1.
© 2007 by The National Academy of Sciences of the USA
| References |
|---|
|
|
|---|
This article has been cited by other articles in HighWire Press-hosted journals:
![]() |
A. Perez, F. Lankas, F. J. Luque, and M. Orozco Towards a molecular dynamics consensus view of B-DNA flexibility Nucleic Acids Res., April 1, 2008; 36(7): 2379 - 2394. [Abstract] [Full Text] [PDF] |
||||
![]() |
X. Liu and H. A. Karimi High-throughput modeling and analysis of protein structural dynamics Brief Bioinform, November 1, 2007; 8(6): 432 - 445. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Fort, L. R. de la Ballina, H. E. Burghardt, C. Ferrer-Costa, J. Turnay, C. Ferrer-Orta, I. Uson, A. Zorzano, J. Fernandez-Recio, M. Orozco, et al. The Structure of Human 4F2hc Ectodomain Provides a Model for Homodimerization and Electrostatic Interaction with Plasma Membrane J. Biol. Chem., October 26, 2007; 282(43): 31444 - 31452. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Seeber, M. Cecchini, F. Rao, G. Settanni, and A. Caflisch Wordom: a program for efficient analysis of molecular dynamics simulations Bioinformatics, October 1, 2007; 23(19): 2625 - 2627. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. Holst, J. Mokrosinski, M. Lang, E. Brandt, R. Nygaard, T. M. Frimurer, A. G. Beck-Sickinger, and T. W. Schwartz Identification of an Efficacy Switch Region in the Ghrelin Receptor Responsible for Interchange between Agonism and Inverse Agonism J. Biol. Chem., May 25, 2007; 282(21): 15799 - 15811. [Abstract] [Full Text] [PDF] |
||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||