Conformational free-energy landscapes of a Na+/Ca2+ exchanger explain its alternating-access mechanism and functional specificity

Significance The class of membrane proteins known as secondary-active transporters mediate a wide range of critical cellular processes, including nutrient uptake, transmembrane signaling, and resistance to cytotoxic compounds, like human-made drugs. A detailed understanding of their molecular mechanisms is therefore of interest not only from a fundamental standpoint, but also because it will facilitate the design of inhibitors or stimulators that may be used as therapeutic agents. This study provides a conceptual mechanistic framework, grounded on statistical thermodynamics, that bridges the specific physiological function of these proteins and their molecular structure. While the study is focused on a particular subclass of transporters involved in cardiac physiology and cellular Ca2+ homeostasis, we envisage our conclusions will be broadly applicable.


Supporting Methods
Construction and simulation of repeat-swap model of NCX_Mj -To construct the repeat-swap model of NCX_Mj shown in Fig. 1, we first generated a reference sequence by swapping the residues of the two repeats, with the specific sequence alignment provided in Fig. S1, as described elsewhere (1).We then generated 250 models with MODELLER v9. 15 (2) using as template the OF X-ray structure of NCX_Mj bound to 3 Na + ions (3).To preserve the C2 pseudosymmetry of the binding site region with respect to an axis parallel to the membrane plane, we restrained the distances between the 3 Na + ions and their coordinating oxygens as well as the oxygen-oxygen distances.The lowest-energy model was selected for further refinement.First, sidechain rotamers were optimized using SCWRL4 (4).Then, the model was energy-minimized according to the CHARMM27/CMAP forcefield (5,6) and embedded in a solvated POPC bilayer using GRIFFIN (7).An MD simulation of 800 ns was then carried out to equilibrate the model at constant temperature (298 K), pressure (1 atm) and membrane surface area (∼69 Å 2 per lipid).This simulation was carried out with NAMD 2.12 (8) and the same force field.This initial equilibration involved a series of stages in which first position restraints and then RMSD restraints are gradually removed on protein sidechain and backbone atoms.Thereafter we performed additional 3-µs MD simulation without any restraint (see Fig. S5C for results).2+ and Sr 2+ -A statistical survey of the Ca 2+ -binding sites in the PDB (9) suggests that when Ca 2+ is bound to two carboxylate groups in a bidentate mode, the latter ion has a total coordination of eight.To fulfill this coordination geometry, in our previous simulation study of NCX_Mj, based on the OF X-ray structure (9), we placed Ca 2+ in the SCa site and introduced two water molecules coordinating Ca 2+ at near the Smid site.This ion coordination mode has also been used in this study.By contrast, in the original interpretation of the crystallographic data, Ca 2+ was assumed to partially occupy SCa and Smid sites, assisted by one water molecule (9).To validate or refute our previous interpretation, we carried out a new refinement of NCX_Mj bound to Ca 2+ based on the electrondensity map available in the Protein Data Bank (entry 5HXR).Model building was completed with COOT (10), and structure refinement was carried out with PHENIX (11).The results are shown in Fig. S2.It is clear that the geometry we propose for the Ca 2+ state is just as compatible with the electron-density data as the original model; it is however more plausible, statistically speaking, and is also supported by recent ATR-FTIR experiments (12), which show that Ca 2+ does not reside near residue D240 at the Smid site.To further support the proposed coordination geometry, we also carried out a new refinement of crystallographic data obtained for Sr 2+ (9), which is also transported by NCX_Mj (13).This analysis clearly reveals that in the published structural model (entry 5HXS) the coordination of the Sr 2+ ion is incomplete since additional coordinating atoms ought to be present near the Smid site (Fig. S2).Indeed, a new refinement adding two coordinating water molecules in the defective region significantly improves the agreement with the crystallographic data (Fig. S2).The conclusion from this analysis is that both Ca 2+ and Sr 2+ bind only at the SCa site and in a nearly identical mode, which includes two coordinating water molecules at the Smid site.The newly refined structures of NCX_Mj bound to Ca 2+ or Sr 2+ are available at https://github.com/Faraldo-Gomez-Lab-at-NIH/Download.

Improved refinement of crystallographic data for NCX_Mj bound to Ca
Enhanced-sampling simulations of NCX_Mj bound to 3 Na + or 1 Ca 2+ -The enhanced-sampling MD simulations used to calculate the free-energy landscapes shown in Fig. 2 used as only input the outward-facing, occluded structure of NCX_Mj with 3 Na + ions bound; the protein coordinates are those initially reported (PDB entry 3V5U (3), but the ion-binding site configuration reflects a subsequent correction ( 14)).A total of 18 reaction coordinates were used simultaneously in these calculations; collectively, these coordinates were designed to loosely guide the simulated trajectories to sample the vicinity of an idealized linear path between the OF structure and the repeat-swap (RS) model of the IF state.The functional form of each of these variables is: where  " (),  # () quantify the similarity of the configuration at simulation time  relative to either of the two reference structures.(Note that neither was assumed to be a free-energy minimum.)The smoothing parameter  is proportional to the inverse of  "# i.e. the similarly between the two end states (the values  for each variable are given in Table S1).The particular metric of similarity used in Eq. 1 is the mean-square difference (MSD) in the configuration of a specific atom selection.Each of the 18 different coordinates  ! was defined using a different atom selection.In each case, configurations close to the experimental OF structure correspond to  !~1, while those close to the RS model correspond to  !~2.All other configurations result in intermediate values of  ! .The specific atom selections that define each of the 18 pathcoordinates  ! are graphically depicted in Fig. S3.These selections include backbone and sidechains atoms, in both structured and unstructured fragments, both on the intracellular and the extracellular side of the protein; altogether, they describe global and local structural differences between the internal structures of the two topological repeats with the structure of the OF state (Fig. 1).Preliminary non-equilibrium simulations of the alternating-access transition using this set of path-coordinates and the minimum-mode Metadynamics method (18) indicated this selection was suitable for the problem at hand.It is important to note, however, that this selection did not result from an automated process or blind algorithm; such methodology would be very valuable, but to our knowledge it does not exist at present time.Instead, this selection was based on our accumulated experience (i.e. both successes and failures) evaluating conformational free-energy landscapes of proteins through enhanced-sampling simulation methods at quasi-equilibrium (9,(15)(16)(17).This experience tells us that to induce a conformational change in simulation, one must act on the elements that govern the energetics of the molecular system.Simple descriptors of the conformational state of a protein might appear intuitive but are oftentimes lacking in this regard.The backbone of a protein, for example, is a good descriptor for a graphical representation of a structure, but in reality, the tertiary structure of the protein is largely dictated by the interactions between sidechains and their environment.In particular, the re-organization of hydrophobic sidechains into alternate interaction patterns is typically a rate-limiting process that must be explicitly considered; rearrangements of sidechains in seemingly unstructured elements are also slow processes that can be highly consequential.
The BE-META simulations carried out to explore this 18-dimensional space included 32 replicas.In the simplest formulation of the BE-META method, each of the replicas includes a Metadynamics bias that fosters exploration of one reaction coordinate.In previous studies (16,17) we have observed that the performance of the method is much improved when each replica is designed to explore at least two coordinates, in different combinations (detailed in Table S1).To include a replica for which the sampling is totally unbiased also provides a useful metric of self-consistency.Monte-Carlo exchanges between replicas were attempted every 10-20 ps.
For state with 3 Na + ions bound, we carried out an initial equilibration of 1 µs for each replica; in this time window, all replicas reached uniform sampling in the corresponding reactioncoordinate space, indicating that the bias-potential developed in each replica approximately mirror the corresponding free-energy profile (or potentials-of-mean-force). We also observed that 7 trajectories visited independently an IF state distinct from the RS model.This state was also well populated in the unbiased replica, confirming that it is a low free-energy minimum.To initiate the calculations with 1 Ca 2+ ion bound, we extracted the final configurations from each of the replicas used for the Na + -bound state, and in each case, we reconstructed the binding site geometry of Ca 2+ -bound state (Fig. S2) through a series of restrained equilibrations totaling 50 ns, while sustaining the biasing potential previously developed in each replica.To fully converge the Metadynamics biasing potentials, we then duplicated the number of replicas to 64 and extended the sampling by 200 ns per replica for the Na + state and by 250 ns per replica for the Ca 2+ state.The free-energy landscapes shown in Fig. 2 derive from this 64-replica calculation, for the final 200-ns sampling window.
Enhanced-sampling simulations of NCX_Mj bound to 2 H + , 2 Na + or with no ions bound -Our previous simulation studies of OF NCX_Mj bound to 2 H + , 2 Na + or with no ions bound indicated that the alternating-access transition would be limited by the energetics of occlusion of the binding-site region (9), rather than a conformational intermediate between the OF and IF state.Thus, the enhanced-sampling simulations used to derive the free-energy landscapes in Fig. 5 focused instead on fostering the occlusion or opening of the binding site, by using an alternative set of reaction coordinates.The first type of coordinate quantifies the degree of hydration of each of the ion binding sites.For example, for the central site, known as SCa, this reaction coordinate is: where  & () denotes the time-dependent distance between the center of the site and each of the oxygen atoms in any of the water molecules in the system, denoted by the index ; β is 10 nm.
When the binding site is occupied by an ion, the ion defines its center; when the site is empty, its center is defined by the center of mass of the protein oxygen atoms coordinating the ion when bound.Analogous expressions apply to the other binding sites (except for Glu residues, for which the Cd atom was considered instead).The second type of reaction coordinate describes instead changes in the protein structure.For example, in the examination of the IF states,  '(",#*+ describes the movement of intracellular portions of TM1 (residues 18-26) and TM2 (residues 45-50) relative to TM7 (residues 211-217), using the same function as that in Eq. 1: In this case,  " () and  # () denote the MSD between the configuration at simulation time t (defined by Ca atoms of TM1, TM2 and TM7) and either the intracellularly closed or intracellularly open conformation, respectively, and  = 20 nm -2 .(Reference open and closed conformations were extracted from MD simulations in different ion-occupancy states.)An additional variable of this type describes the relative opening and closing movements of intracellular portions of TM2 (residues 45-50) and TM8 (residues 232-236), and is defined with three reference states: Here,  " (),  # () and  .() are the MSD between the current configuration at time t and three progressively more open reference structures, and  = 200 nm -2 .The BE-META simulations carried out to explore this alternative reaction-coordinate space included 16 replicas; 15 replicas included Metadynamics potentials on different combinations of reaction coordinates, one replica was unbiased.Additional details are provided in Table S2.Monte-Carlo exchanges were attempted every 5 ps.The total simulation time was 300 ns for each replica; equilibration of the biasing potentials was achieved after approximately 200 ns, and so only the last 100 ns were used for analysis.
Restraining potentials -In the bias-exchange Metadynamics (BE-META) simulations of NCX_Mj with 2 or 3 Na + ions bound, we applied restraining potentials on a coordination-number variable to preclude Na + dissociation (even though spontaneous dissociation was not observed in unbiased simulations).The coordination variable was defined as: where  / is the distance between ions and coordinating oxygen atoms and  0 = 0.24 nm.For this variable we set lower values of  0 = 4.3 for Na + ions bound to Sint and Sext sites (coordinated to 5 oxygens) and  0 = 4.75 for the ion bound to site SCa (coordinated to 6 oxygens; Fig. 2C and Fig. 5C), using a potential of the form  $ ( $ () −  0 ) 1 , applied when  $ () <  0 and for which  $ = 1,000 − 2,500 kJ/mol.The water molecule in the binding site (Fig. 2C and Fig. 5C) was confined with two half harmonic potentials,  2 (() −  0 ) # , on distances, (), between the water oxygen and the Cγ atoms of N81 and D240 respectively.This potential was set with  2 = 10,000 kJ/(mol nm 2 ) and applied when the above mentioned distances are larger than  0 = 0.46 nm.
A similar scheme was used in the BE-METAD simulations with Ca 2+ bound, but only for the SCa site.This potential was applied on the coordination variable defined by Eq. 5 in which  / is the distance between Ca 2+ and a coordinating oxygen atom, and  0 = 0.3 nm.For this variable we set a lower value of  0 = 7.4 using a potential of the form  $ ( $ () −  0 ) # , applied when  $ () <  0 and for which  $ = 400 kJ/mol.The two water molecules at the Smid site (Fig. 2C) were also confined with a half harmonic potential,  2 (() −  0 ) 1 , on the distances () between the Cγ atoms of either N81 or D240 residues with the oxygen atom of the adjacent water molecule.For this potential  0 = 0.41 nm and  2 = 10,000,000 kJ/(mol nm 4 ).To prevent spurious water penetration and facilitate convergence, we also applied a boundary potential on the coordination variable defined by Eq. 5 in which  / is the distance between Ca 2+ and all oxygen atoms of water molecules (excluding the those coordinating Ca 2+ ), and  0 = 0.3 nm.Distances,  / , beyond 0.5 nm were excluded.For this variable we set an upper value of  0 = 0.5 using a potential of the form  $ ( $ () −  0 ) # , applied when  $ () >  0 and for which  $ = 400 kJ/mol.To prevent unfolding of binding site portions of TM2 and TM7 we also applied harmonic distance restraints (  2 (() −  0 ) # , with  0 = 0.41 nm and  2 = 10,000 kJ/(mol nm 4 ) between: (1) carbonyl oxygen of A206 and side chain oxygen of S210; (2) carbonyl oxygen of A206 and amide nitrogen of S210; (3) carbonyl oxygen of V205 and amide nitrogen of T209; (4) homologous distance pairs based on inverted symmetry (A47-S51, M46-T50).

Derivation of free-energy landscapes from enhanced-sampling trajectories -
To translate the BE-META simulation data into 2D free-energy landscapes, we identified two intuitive structural descriptors of intra-and extracellular opening-closing motions.These descriptors are defined by the number of protein-protein contacts between 1) intracellular portions of TM1-TM2 (residues 17,20,21,24,25,42,43,46,47,48,50) 77).Note these sets of residues are in pseudo-symmetrical fragments of the two topological repeats.The number of protein-protein contacts for a given residue selection was defined by: where  !& denotes the Ca-Ca distance between residue  and  and  0 , the contact distance, is 0.7 nm.
The free-energy maps shown in Fig. 2A and Fig. 5A reflect a re-weighted histogram as a function of the two descriptors mentioned above.To ensure unbiased sampling, the weight of each simulation frame  was set as: where V 3(!) W is the free energy as a function of the reaction coordinates used in the BE-META simulations, () is the bin assigned to frame  and  3(!) is the total number of frames in that bin (19).The free energy, V 3(!) W, was derived using the FCAM method (19).Specifically, with either 3 Na + or 1 Ca 2+ bound, we defined V 3(!) W as a function of 4 variables, namely  7 ,  ". ,  "7 and  "+ (Fig. S3), and thus this quantity was deduced from the sampling collected in the BE-META replicas in which those variables were biased (Table S1).The same replicas were used to derive the free energy maps in Fig. 2 based on Eq. 7. The minimum free-energy paths in Fig. 4C and Fig. 4D were also derived using the FCAM software suite (https://github.com/FCAM-NIH/FCAM),again in the space of the 4 reaction coordinates mentioned above.

Maximum-entropy reweighting of simulated ensembles based on HDX-MS data -
In its original formulation, the HDXer (20) method assumes that the input datasets of hydrogen deuterium exchange coupled to mass spectrometry (HDX-MS) are corrected for deuterium back-exchange.While this is an ideal situation, not all datasets can be produced with this additional layer of information.For example, the HDX-MS data published for NCX_Mj (24) do not include this correction, to our knowledge.Therefore, it was necessary to develop an alternative strategy that allows for reweighting ensembles on data that have not been corrected for back-exchange levels.This new scheme assumes that two HDX-MS datasets are available, for which uncorrected deuterium fractions (with values ranging from 0 to 1) of peptide j measured at time t are denoted as  &,8 9:;" and  &,8 9:;# respectively.These datasets refer to experiments carried out under two different biochemical/physiological conditions (e.g., the presence or absence of ligands) or for two different mutants of the same protein.All HDX experiments are assumed to be performed with the same protocol so that the back-exchange contribution to a given peptide can be reasonably assumed to be the same across the two datasets.In case those datasets correspond to measurements carried out on two protein variants, only identical peptides must be considered for the analysis (i.e., peptides containing sequence variations should be excluded).
A key approximation in this approach is that back-exchange is accounted for by normalization of the deuterated fraction of a given peptide by a factor,  & , which retains approximately the same value for different timepoints (21,22) and, as previously mentioned, also across the two different HDX datasets.Provided a fully deuterated control were available, the factor  & could be experimentally determined using the relation: ⁄ , where  "00 ,  0 are the centroid masses of the fully-deuterated and non-deuterated peptide respectively,  % is the D/H fraction in the buffer and  & is the associated number of exchangeable amide hydrogens.However, in this case we assume that the experimental data are not corrected for back-exchange.Hence, the term  & is unknown and therefore is set as a free parameter during the reweighting procedure.The implication is that, rather than the absolute experimental deuteration levels, the target of the ensemble reweighting is primarily determined by the variations between the deuterium fractions of the two datasets.Hence, the sensitivity of the reweighting procedure will depend on the magnitude of these variations (in addition to the number of data points).
The proposed scheme involves two concurrent ensemble reweighting calculations, each targeting one of the HDX-MS datasets, and in which the factors  & are determined at each reweighting step by best fitting to both datasets.The overall likelihood optimization function is given by the sum of the two individual contributions for each dataset (1 and 2): where  %;; /?",# is the apparent work required to reweight the ensemble and  9>> is the error distribution of the experimental data, including both HDX datasets.
To compute the work requires a model of the exchange for a given frame in the ensemble.As in our previous work, we use the exchange model of Best and Vendruscolo (23): in which  !@!A/ is the predicted protection factor of residue i from reweighted ensemble k,  !B and  !$ are the number of H-bonds and heavy-atom contacts of the amide group of residue i, respectively, and 〈… 〉 / denotes a mean value over the reweighted ensemble k.Then, based on the maximum entropy principle, the apparent work can be expressed as (20): where  !B ( / ) and  !$ ( / ) are the number of H-bonds and heavy atom contacts, respectively, for the amide of residue i, at a molecular configuration,  / .We assume that the error distribution,  9>> , is Gaussian: The normalization parameters,  & , are determined by minimizing at each step the mean square deviation between simulated and experimental deuteration fractions (i.e. the exponent in Eq. 11), subject to the following constraints (enforced through Lagrange multipliers): The condition  & ≥ 1 stems from the consideration that, owing to the back-exchange,  &,8 9:;/ is a lower bound of the correct deuteration level.The other boundary constraint ( &  &,8 9:;/ ≤ 1) instead ensures that the corrected deuteration fraction has an upper boundary of 1.In the case that all boundary constraints are satisfied the expression for the normalization terms is given by: As in our previous work (20), the terms  / !B and  / !$ are iteratively determined in order to minimize the likelihood function of Eq. 8: where  denotes the iteration step,  is a parameter that sets the experimental uncertainty and  is a suitably chosen update rate.
Optional optimization of the model parameters -In the previous section we assumed that the model parameters,  B and  $ , have a fixed value, usually set to that derived by Best and Vendruscolo, which best fit a set of HDX measurements ( B = 2.0,  $ = 0.35) (23).To reduce model bias, we set  B and  $ as optimizable parameters during reweighting, which are sampled using a Monte Carlo algorithm and averaged out during reweighting.The parameter sampling stems from choosing an error probability distribution in which  B and  $ are integrated out: where ( B  $ ) is the prior error distribution of the model parameters.Applying a prior distribution can be useful for sparse and noisy experimental data, as it restricts the sampling to a reasonable range of values.Here, ( B  $ ) was set to mimic the trend of the square residuals as a function of  B and  $ (Fig. S6B) that was reported by Best and Vendruscolo (23).Specifically, those square residuals describe the discrepancy between predicted and experimental protection factors for a set of proteins.Based on Eq. 15, the update rule for  / !$/B becomes: where 〈… 〉 E ! ,E " denotes an average performed over  $ ,  B parameter space, and The term  C # was set as the minimum value (as a function of  $ ,  B ) of the mean square deviation, e , before reweighting, in which  is the number of data points.The average in Eq. 16 is performed using Monte Carlo with acceptance probability given by  s 9>> ( B ,  $ ) in Eq. 17.
Additional details on the application of the reweighting methodology to NCX_Mj -The original HDXer formulation (20) is based on reweighting single protein configurations, which could lead to different weights for very similar structures, for example due to a small structural variation that is sufficient to break a few amide H-bonds.To prevent this, instead of single configurations we reweight free-energy bins.The free-energy of bin  corrected according to the maximum entropy principle is defined as: where  H,! 3 and  C,! 3 are number of amide H-bonds and heavy atoms contacts averaged within bin .Fig. 3C in the main text was obtained by applying HDXer to the four-dimensional free energy landscape as a function of  7 ,  ". ,  "7 and  "+ (see above and Fig. S3), based on simulations with 3 Na + bound.To prevent a population bias we rescaled the landscape to have 50% probability for both OF and IF states (defined with a 2 kcal/mol cutoff from each minima).We removed intermediate and transition region of the landscape (Fig. For the calculation of mean values of a certain descriptor for each ensemble, we included only configurations within 1 kcal/mol from the lowest free-energy configuration in each case.These configurations were selected based on a Metropolis-Monte Carlo acceptance criteria, which accounts for the free energy and thus produces an unbiased sample.

Construction of alternative ion-occupancy states -
The states with 2 Na + ions bound to NCX_Mj were generated from the states with 3 Na + ions bound through a slow alchemical transformation that annihilates the ion bound at either the Sext or the Sint site, over a 40-ns MD trajectory.A Cl - ion in the bulk solution was concurrently annihilated to maintain neutrality.The apo states (i.e., no ions) were constructed from the equilibrated states with 2 Na + ions bound by gradually annihilating both ions, again over a 40-ns MD trajectory, together with two Cl -ions in the bulk solution.The states with 2 H + bound were generated from the states with 3 Na + bound by gradually annihilating the Na + ions from the binding sites, and a Cl -ion from the bulk, while creating protonated forms of E54 and E213, over a 100-ns MD simulation.Once initial configurations had been obtained, trajectories ranging from 1 to 2 µs were generated for all occupancy states before initiating the enhanced-sampling simulations.

Supporting Tables
Table S1.Gaussian widths (), path collective variables smoothing parameters () and combinations of reaction coordinates (Fig. S3) used in each of the BE-META simulations of the alternating-access transition of NCX_Mj with either 3 Na + ions or  a) Gaussian functions were added to the Metadynamics biasing potential on each variable in 5-ps intervals.
In the equilibration stage of Na + -bound NCX_Mj, we used 32 replicas, all started from the same conformation.To avoid a large initial bias, the strength of the history-dependent potential of Metadynamics was gradually raised during the first 500 ns.Specifically, the Gaussians height was set to 0.25 kJ.mol for the first 360 ns and then gradually incremented to 1 kJ/mol from 360 ns to 500 ns.After 1 µs (for each replica) we duplicated the number of replicas and lowered the Gaussians height to 0.5 kJ/mol for the remaining 200 ns of sampling.For Ca + -bound NCX_Mj, the Gaussians height was set to 0.5 kJ/mol.  a) Gaussian functions were added to the biasing potential acting on each variable at 5 ps intervals and were initially 2 kJ/mol in height.The height of these Gaussians functions was progressively scaled down as prescribed by the well-tempered Metadynamics method (25).To avoid a large initial bias, the historydependent potential of well-tempered Metadynamics was reset and restarted at ~60 ns (when several simulation replicas had significantly deviated from the initial protein configuration).Note the specified 'temperature bias' is not the simulation temperature, which was in all cases 298 K. Following a previous study (9), we used this scheme for an efficient exploration of the multidimensional collective-variable space: replicas with larger values of the temperature-bias and Gaussian widths are able to access high free-energy regions, albeit at low spatial resolution.Replicas with smaller values primarily sample low free-energy regions but at high spatial resolution.S1).The selections include Cα atoms (green spheres), backbone N and O atoms (blue and red spheres), and sidechain carbon atoms (green sticks).Note these selections are approximately symmetric, i.e. they include equivalent residues in both topological repeats.0 "# ()/ !2] %& , where  "# is the distance between any pair of atoms and  ! is a constant that depends on the type of interaction; the values used for water-protein and protein-protein contacts were 4.2 and 5.2 Å, respectively.(C) Persistence of H-bonding interactions between residues on the extracellular side of the protein, for each state; only interactions that change significantly during the transition are examined.(D) Same as (C), for residues on the intracellular side.(E) Analysis of the contribution of electrostatic interactions to the free-energy difference between the OF state and either the IF or the intermediate state, evaluated after scaling down the charges of selected atom groups (via a multiplication factor).To quantify the contribution of H-bonds and salt bridges, we scaled down the atomic charges in the polar groups of residues D21, E28, Y60, E180, K187, D197, K198, R223 and E257 (specifically, to the atoms in each of the 'charge groups' bearing either a neutral or finite integer charge e.g. the -CH2-COO -group of glutamate or aspartate).To assess the contribution of polar residues changing accessibility during the transition, we scaled down the charges of the polar groups of T44, T57 and T203.The contribution of the binding site was evaluated by scaling down the charges of the polar groups of residues coordinating the bound Na + and water molecules, as well the ions and water molecules.To estimate the change in these free-energy differences, we resampled the BE-META trajectories (with 3 Na + bound, for the replicas with bias on  & ,  %' ,  %& or  %( ; see Table S1), introducing a reweighing factor for each configuration, i, given by  " =  {− ( " ) −  " )/ B }, where the term ( " ) −  " ) is the difference between the potentialenergy functions with modified and conventional charges, respectively.To quantify this deformation, we evaluated the mean Z-coordinate of the ester layer across the XY plane relative to the mean Z-coordinate of both layers, in the bulk region (at a distance larger that 45 Å from the ion-binding sites center, where the membrane is approximately flat).(B) Hydrophobic thickness of the lipid bilayer in the OF, intermediate and IF states.The thickness was evaluated as the difference between the mean Z-coordinates of the outer and inner ester layers.Contours are shown in intervals of 2 Å.All maps were calculated after first aligning the protein onto a reference frame, using the portion of the structure that is invariant when OF and IF states are compared (Fig. 4AB).To prevent tilting of the membrane during this process, this alignment was two-dimensional (i.e.only X and Y components of the atomic coordinates are considered).After structural alignment we calculated for each state a threedimensional occupancy map for the ester oxygen atoms of all POPC lipids in either the outer or the inner leaflet, using the Volmap tool of VMD.The mean Z-coordinate was calculated as the weighted average of the Z-coordinate for each value of X and Y, with a weight provided by the occupancy at that position.Values of X and Y featuring less than 1% of the overall occupancy are excluded (and correspond to the region occupied by the protein in the center of each map).

Supporting Figures
Figure S1.Alignment of the amino-acid sequences of the two inverted topological repeats in NCX_Mj.The sequence identity is 32%.The transmembrane regions in each repeat are indicated and color-coded as in Fig. 1.Residues directly involved in ion-binding are marked with asterisks.The repeat-swap model shown in Fig. 1 was constructed using this alignment.

Figure S2 .
Figure S2.Improved molecular refinement of the Ca 2+ and Sr 2+ coordination geometries in outward-facing NCX_Mj based on previously published crystallographic data (9).The original models (PDB entries 5HXR and 5HXS), shown in the left, are contrasted with our proposed interpretation, shown on the right.2Fo−Fc maps contoured at 1s is shown as a grey mesh and Fo−Fc omit maps at 3s is shown in magenta.See Supplementary Methods for further details.

Figure S3 .
Figure S3.Atom selections used to define the path-collective variables employed in the bias-exchange Metadynamics simulations of the alternating-access transition (see Supplementary Methods).The specific reaction coordinate associated with each selection is indicated (see also TableS1).The selections include Cα atoms (green spheres), backbone N and O atoms (blue and red spheres), and sidechain carbon atoms (green sticks).Note these selections are approximately symmetric, i.e. they include equivalent residues in both topological repeats.

Figure S4 .
Figure S4.Molecular simulation system and error estimates.(A) All-atom simulation system used to calculate the free-energy landscapes shown in Fig. 2 and Fig. 5.The transporter (orange/blue) is embedded in a hydrated POPC lipid bilayer; electrolyte ions are omitted for clarity.(B) Estimated error of the free-energy maps shown in Fig. 2, based on block analysis.(C) Estimated error of the free-energy maps shown in Fig. 5, also from block analysis.

Figure S5 .
Figure S5.Evaluation of the predicted inward-facing (IF) state of NCX_Mj.(A) Conventional MD simulation trajectories initiated in the IF occluded state identified in the free-energy landscape shown in Fig. 2, with 3 Na + ions bound.The plot quantifies the average root-mean-square (RMS) difference between the Ca trace at a given simulation time and the Ca traces in an ensemble of structures extracted from each of the two main free-energy basins in that landscape (within 2 kcal/mol of minimum), i.e. the original IF state, and the outward-facing (OF) occluded state.Unstructured loop regions were omitted in this analysis.(B) Same as (A), for trajectories initiated instead in the OF occluded state.(C) Same as (A), for a trajectory initiated in the repeat-swap model shown in Fig. 1. (D) Overlay of the repeat-swap model shown in Fig. 1 and the predicted IF occluded state identified in the free-energy landscape shown in Fig. 2, as noted.The RMS difference (backbone only) is about 2 Å.

Figure S6 .
Figure S6.Evaluation of the predicted conformational ensembles for OF and IF states, in reference to HDX-MS data for two forms of NCX_Mj with different propensities for the OF and IF states.See also Fig. 3C.(A)The plot quantifies the degree to which a conformational ensemble that initially consists of equally weighted populations OF and IF structures (extracted from simulation) must be re-reweighted to produce optimal agreement with the experimental data obtained for each protein construct (see Supplementary Methods).The degree of reweighting is quantified by an 'apparent work', and the agreement with experiment by the mean-squared deviation between calculated and measured deuteration fractions for a collection of protein fragments.(B) Prior probability distribution of the model parameters used to predict hydrogen-deuterium exchange protection factors based on the forward model of Best and Vendruscolo(23).

Figure S7 .
Figure S7.Changing interaction patterns in the alternating-access mechanism of NCX_Mj.Reported quantities are calculated for ensembles of either OF or IF structures, with 3 Na + ions bound, extracted from the free-energy basins revealed in the landscapes shown in Fig. 2. (A, B) The charts enumerate the set of residues in the extracellular or intracellular side of the protein, respectively, that show a significant change in hydrophobic lipid-tail contacts or protein-protein H-bonding and salt-bridges when OF and IF stats are compared.(C, D) The charts enumerate the set of residues in the extracellular or intracellular side of the protein, respectively, that show a change in hydration number of 2-fold or more; the hydration number was defined using a cutoff distance of 4.2 Å. (E) Representative structures of the OF and IF ensembles illustrating the changes in interaction patterns quantified in panels (A,B) and (C,D).The residues that exhibit changes in water accessibility and lipid contacts are shown as grey and brown volumes, respectively, with water and lipid molecules in their vicinity shown with density isosurfaces (blue and yellow mesh, respectively).Residues that exhibit changes in protein-protein H-bonding and salt bridges are shown as sticks.The number of lipid-tail and H-bond contacts between two groups of atoms in a given simulation snapshot was calculated using distance cut-off values of 5.2 and 3.2 Å, respectively.Hydrogen atoms were excluded from this analysis.

Figure S8 .
Figure S8.Changing interaction patterns in the alternating-access mechanism of NCX_Mj.(A) Extracellular and intracellular views of representative OF and IF conformations, with 3 Na + ions bound (green spheres), highlighting water molecules within 10 Å of the ion-binding sites (red spheres).The approximate distances between TM helices involved in the opening and closing of intra-and extracellular pathways are indicated in Å. (B) Same views as panel (A), showing phenylalanine sidechains (spheres) that occupy or become released from hydrophobic pockets during the alternating access transition (F182 and F202 on the extracellular side, and F23 and F39 on the intracellular side).Protein residues that are in contact with those phenylalanine sidechains when buried in a hydrophobic pocket are also highlighted, as are lipid molecules in contact with either F23 or F182 when disengaged (yellow).(C) Close-up of the backbone of N-terminal portions of TM2 (TM2a-TM2b) and TM7 (TM7a-TM7b) in the OF and IF state, highlighting selected intramolecular H-bonds (gray lines).

Figure S10 .
Figure S10.Changes in membrane shape and thickness during the alternating access transition.(A) Deformation of each of the lipid bilayer leaflets relative to a flat shape, for the OF, intermediate and IF states.To quantify this deformation, we evaluated the mean Z-coordinate of the ester layer across the XY plane relative to the mean Z-coordinate of both layers, in the bulk region (at a distance larger that 45 Å from the ion-binding sites center, where the membrane is approximately flat).(B) Hydrophobic thickness of the lipid bilayer in the OF, intermediate and IF states.The thickness was evaluated as the difference between the mean Z-coordinates of the outer and inner ester layers.Contours are shown in intervals of 2 Å.All maps were calculated after first aligning the protein onto a reference frame, using the portion of the structure that is invariant when OF and IF states are compared (Fig.4AB).To prevent tilting of the membrane during this process, this alignment was two-dimensional (i.e.only X and Y components of the atomic coordinates are considered).After structural alignment we calculated for each state a threedimensional occupancy map for the ester oxygen atoms of all POPC lipids in either the outer or the inner leaflet, using the Volmap tool of VMD.The mean Z-coordinate was calculated as the weighted average of the Z-coordinate for each value of X and Y, with a weight provided by the occupancy at that position.Values of X and Y featuring less than 1% of the overall occupancy are excluded (and correspond to the region occupied by the protein in the center of each map).

Table S2 .
Gaussian widths () and temperature bias (ΔT) used in each of the replicas in the bias-exchange well-tempered Metadynamics simulations of opening and occlusion of NCX_Mj a .