A complete description of thermodynamic stabilities of molecular crystals

Significance Predicting stable polymorphs of molecular crystals remains one of the grand challenges of computational science. Current methods invoke approximations to electronic structure and statistical mechanics and thus fail to consistently reproduce the delicate balance of physical effects determining thermodynamic stability. We compute the rigorous ab initio Gibbs free energies for competing polymorphs of paradigmatic compounds, using machine learning to mitigate costs. The accurate description of electronic structure and full treatment of quantum statistical mechanics allow us to predict the experimentally observed phase behavior. This constitutes a key step toward the first-principles design of functional materials for applications from photovoltaics to pharmaceuticals.

M olecular crystals are ubiquitous in the pharmaceutical industry (1) and show great promise for applications in organic photovoltaics (2); gas adsorption (3); and the food, pesticide, and fertilizer industries (4). Their tendency to exhibit polymorphism, i.e., to exist in multiple crystal structures, on one hand provides a mechanism to tune properties by controlling crystal structure (5) and on the other hand introduces the challenge of synthesizing and stabilizing crystal structures with desired properties (6). While thermodynamic stability at the temperature and pressure of interest is sufficient (although not necessary) to ensure long-term stability, simply understanding thermodynamic stability already poses a formidable challenge. * This is particularly true for pharmaceuticals, where free energy differences between drug polymorphs are often smaller than 1 kJ/mol (7), leading to the risk of the drug transforming into a less soluble and consequently less effective form during manufacturing, storage, or shelf life (8,9). Indeed, the problem of late-appearing drug polymorphs is widespread (10,11).
The pharmaceutical industry therefore spends considerable resources on high-throughput crystallization experiments to screen for polymorphs (12), into which the target structure may decay. However, crystallization experiments do not probe thermodynamic stability, and conclusive studies of the impact of temperature changes after crystallization on the stability of polymorphs (i.e., their monotropic or enantiotropic nature) (13) are often prevented by limited sample quantities. Hence, there is the appeal of theoretical crystal structure prediction (CSP) (14) based * Kinetics may protect thermodynamically metastable structures from decaying almost indefinitely.
on the thermodynamic stability, which promises to complement crystallization experiments (15) by exhaustively searching for competing polymorphs.
Despite the demonstrable value of CSP for many classes of materials (16)(17)(18)(19)(20)(21), and the continuing progress evidenced by a series of blind tests (15), the success of CSP for molecular crystals has been limited by the inability to routinely predict the relative stability of competing candidate structures (22). This is largely because the methods used for stability rankings typically ignore or approximate the subtle interplay of several effects, such as intricate intermolecular interactions (23), the (quantum) statistical mechanics of the nuclei (24) and the unit cell (25), and thermal expansion (26), thereby incurring errors larger than the free energy differences of interest. The importance of each of these effects has been demonstrated in isolation, but predictive stability rankings must also comprehensively account for their interplay.
Recent implementations of advanced path-integral (PI) approaches (27,28) allow exactly accounting for the quantum statistical mechanics of the nuclei and the unit cell (29,30) for arbitrary potential energy surfaces (PESs). At the same time, modern machine-learning potentials (MLPs) (31) permit accurately reproducing ab initio PESs and dramatically reduce the cost of performing simulations approaching ab initio accuracy (32).

Significance
Predicting stable polymorphs of molecular crystals remains one of the grand challenges of computational science. Current methods invoke approximations to electronic structure and statistical mechanics and thus fail to consistently reproduce the delicate balance of physical effects determining thermodynamic stability. We compute the rigorous ab initio Gibbs free energies for competing polymorphs of paradigmatic compounds, using machine learning to mitigate costs. The accurate description of electronic structure and full treatment of quantum statistical mechanics allow us to predict the experimentally observed phase behavior. This constitutes a key step toward the first-principles design of functional materials for applications from photovoltaics to pharmaceuticals.
Despite these advances, calculations of rigorous thermodynamic stabilities for general molecular materials have been complicated by the absence of an integrated framework, which facilitates both the rapid development of MLPs and free energy calculations including all physically relevant effects, while ensuring universal applicability to diverse systems.
In this work we present an efficient framework for ranking candidate structures of arbitrary compounds using rigorous ab initio Gibbs free energy calculations, based on the streamlined development of MLPs and their integration with PI methods. Our approach builds upon our previous work on combining PI approaches with MLPs for ice polymorphs (29,33), but greatly enhances its accuracy, efficiency, and robustness for out-of-thebox applications to general compounds. In particular, we simplify the development of MLPs using a straightforward and inexpensive protocol for compiling ab initio reference data, which is designed to work for general organic compounds and accounts for (the often-neglected) cell flexibility and quantum nuclear motion. Additionally, robust data-driven techniques minimize the human effort involved in training the MLPs. In contrast to previous CSP ranking methods that use MLPs (34,35), we exactly account for the quantum statistical mechanics of the nuclei and the cell and use MLPs only as a stepping stone for computing ab initio Gibbs free energies, eliminating all dependence on the MLPs and their limitations.
The reliability and general applicability of our approach are showcased by the rapid development of MLPs and correct stability predictions for crystal polymorphs of three prototypical compounds: benzene, glycine, and succinic acid. These bear the hallmarks of more complex biomolecular systems-molecular flexibility, competing polymorphs, and intermolecular interactions ranging from weak dispersive to hydrogen bonded and ionic. Importantly, the relative stability of their polymorphs is well established (36)(37)(38). We further assess the temperature and pressure dependence of relative stabilities based on gradients of Gibbs free energies, which correspond to indicators widely used by experimentalists to predict the monotropic or enantiotropic nature of the polymorphs.
Our work complements state-of-the-art CSP methods, which efficiently survey structural space to extract small sets of promising candidate structures using ab initio calculations and/or MLPs (34,35), but struggle to reliably resolve subtle differences in stability among them (22). Combining rigorous free energy calculations, as demonstrated here, with structure searching and inexpensive CSP ranking methods constitutes an avenue to predictive CSP for complex molecular crystals of industrial importance.

Computational Framework and Systems
To predict rigorous relative stabilities, we combine PI thermodynamic integration (29) (referred to as quantum thermodynamic integration [QTI]) in the constant pressure ensemble (thereby accounting for anharmonic quantum nuclear motion and the fluctuations and thermal expansion of the cell) with densityfunctional-theory (DFT) calculations with the hybrid PBE0 functional (39,40) and the many-body dispersion (MBD) correction of Tkatchenko et al. (41) and Tkatchenko and coworkers (42) (referred to as PBE0-MBD). PBE0-MBD provides an accurate description of intermolecular interactions, as benchmarked using experimental and coupled cluster theory with singlets, doublets, and perturbative triplets [CCSD(T)] lattice energies for various molecular crystals, including form I of benzene and α-glycine (43,44). Since direct calculation of Gibbs free energies using ab initio QTI is prevented by the cost of the required energy and force evaluations (29), ab initio Gibbs free energies are calculated in a four-step process, as depicted schematically in Fig. 1 and detailed further in SI Appendix. Fig. 1. Schematic representation of the workflow for computing ab initio, quantum anharmonic Gibbs free energies for candidate crystal structures. Upper section shows the main steps: 1) generating ab initio reference data on which to 2) train a combined MLP, which can then be used to 3) compute MLP Gibbs free energies, which one can finally 4) promote to ab initio Gibbs free energies. Lower section (shaded in blue) details the key aspects of how each of these steps is performed in practice.
First, we use a simple strategy to generate a minimal but exhaustive set of unit cell "training configurations," for which we then perform PBE0-MBD calculations: We perform PI simulations based on density-function tight binding (DFTB) (45) theory for unit cells with perturbed cell parameters. This allows us to gather a large number of configurations, which incorporate quantum nuclear fluctuations and cell flexibility and from which we can distill the most distinct ones using a data-driven approach (46). This strategy leverages the low cost of DFTB and its qualitative accuracy for diverse molecular crystals (47) to avoid the bottleneck that is PBE0-MBD-based configurational sampling. Due to the versatility of DFTB, it can be used to generate robust training data for almost any compound of interest.
The subsequent training of the MLPs hinges on identifying the most important "features" of the configurations, fed to the MLPs as input. These features are usually abstract functions quantifying the local density of atoms and require the careful tuning of multiple parameters (46). Here, we render training MLPs for general compounds accessible to nonexperts by automating this procedure using a "size-extensive" data-driven approach, which avoids the manual selection of features based on "prior experience." Combining these first two steps with a "tried and tested" neural network architecture (48-50) greatly simplifies and speeds up the generation of MLPs, while remaining agnostic to the system of study.
In a third step, we exploit the orders-of-magnitude lower cost of the resultant MLPs compared to the ab initio reference method, to compute Gibbs free energies for much larger simulation supercells using QTI (29). We account for anisotropic fluctuations of the simulation cell, which are important for flexible functional materials (51), and directly calculate the free energy difference between the harmonic reference systems and the physical, anharmonic system at the PI level, which substantially reduces the complexity and cost compared to the multistep integration performed in ref. 33. We note that the affordability of MLP free energies comes at the price of residual errors with respect to the ab initio reference values due to the imperfect reproduction of the reference PES. These may arise from the short-ranged nature of the MLPs (52), from information lost during the "featurization" of the configurations (53), or from insufficient training data. The typical errors in MLP predictions of configurational energies (Table 1) are small but comparable to the subtle free energy differences between polymorphs. Therefore, in a fourth and final step, we eliminate the associated errors to obtain true ab initio Gibbs free energies by computing the difference between the MLP and PBE0-MBD free energies using free energy perturbation (FEP) (33). All calculations and simulations are performed using readily available and well-documented software, and Jupyter notebooks for analysis are provided in SI Appendix.
As an exposé of the universal applicability of this scheme, we predict the relative stabilities of a set of prototypical systems, whose small number and size belie how representative they are of general organic molecular crystals: Benzene is the archetypal rigid, van der Waals bonded molecular crystal, while succinic acid represents general hydrogen-bonded systems, and glycine prototypes are flexible zwitter-ionic systems. This small, "irreducible" set of prototypical systems covers not only the three different types of bonding, but also the chemical space that includes pharmaceuticals such as aspirin and paracetamol. Moreover, molecular flexibility and the large-amplitude curvilinear motion of the amide group in glycine trigger the same pathologies of approximate free energy methods as more complex systems exhibiting free rotation of molecular units (24,29) and serve as a stringent test for stability predictions.
For each compound we compute the free energy differences between the stable ambient-pressure polymorph and its closest experimentally established competitor(s): We consider forms I and II of benzene (36) and α-and β-succinic acid (37) at 100 K and α-, β-, and γ-glycine (54) at 300 K to compare with available calorimetric data (38,55). The nearly orthorhombic simulation supercells shown in Fig. 2, which contain equivalent numbers of molecules for all polymorphs of the same compound, ensure near cancellation of center-of-mass free energies and suffice to converge stabilities with respect to finite-size effects to within 0.1 kJ/mol (SI Appendix).

Ab Initio Thermodynamic Stabilities
As shown in Fig. 3A, the final ab initio Gibbs free energies (shown in red) reproduce the greater stability of form I over form II of benzene and of β-over α-succinic acid, the metastability of β-glycine, and the near degeneracy of α-and γ-glycine (55). Moreover, our Gibbs free energy differences are in agreement with available calorimetry data (38,55) to within statistical and experimental uncertainties.
The QTI approach also yields gradients of Gibbs free energies, including the molar volume, entropy, and heat capacity, which provide indication regarding pressure-and temperaturedriven changes in relative stability and thus the monotropic or enantiotropic nature of compounds. For instance, since molar volumes are derivatives of the free energy with pressure, we can predict form II of benzene to become thermodynamically stable over the ambient pressure form I at 1.4 GPa (at 100 K), which is in good agreement with the experimentally determined transition pressure of 1.5 GPa (56). Similarly, we determine the entropy of β-succinic acid to be smaller than that of α-succinic acid, making the latter the preferred high-temperature polymorph, in agreement with the experimental phase behavior (57). While in the case of glycine we are able only to predict near degeneracy of α-and γ-glycine at ambient conditions, molar volumes suggest α-glycine to be the most stable phase at high pressures, which is in line with experiments showing that it remains stable up to 23 GPa (54).
By comparing rigorous free energies with estimates that exclude nuclear quantum effects (NQEs), anharmonicity, and cell expansion and flexibility, we are able to understand the extent to which these effects and their interplay contribute toward the stability of molecular crystals. Crucially, as shown in Fig. 3B, the size and sign of these effects depend entirely on the compound and the polymorphs at hand, highlighting that rigorous QTI is indispensable for predicting phase stabilities and that molecular crystals are typically stabilized by a nontrivial interplay of different physical effects, whose individual importance is belied by the subtle resultant free energy differences. For instance, the greater stability of form I of benzene hinges on an accurate description of the electronic structure, while NQEs and anharmonicity cancel out almost perfectly and thermal expansion affects both forms similarly. In contrast, in succinic acid NQEs and anharmonicity cooperatively stabilize the α form and thermal expansion differentiates the two polymorphs. In glycine NQEs and thermal expansion differently affect the stability of the α-and β-polymorphs with respect to the γ form, and neglecting any of the three effects would lead to large errors on the scale of the experimental free energy differences.
Meanwhile, the MLP-based stability predictions (shown in blue in Fig. 3A) are only limited by the accuracy, with which the MLPs reproduce the ab initio PES (Table 1), and consequently correctly reproduce the greater stability of form I of benzene compared to form II. At the same time, the incorrect MLP-based stability predictions for succinic acid and glycine highlight the critical importance of the final FEP step. Promoting MLP free energies to the ab initio level by FEP incurs only the cost of a few tens of ab initio energy and force evaluations for configurations sampled by the MLPs. We note that the cost of this step is comparable to that of common equation-of-state calculations and thus constitutes a reliable and computationally efficient means of predicting the relative stability of polymorphs.
Given that errors of 1 kJ/mol are often considered to be within "chemical accuracy," it is worth emphasizing that the compounds considered here are not hand-picked, "pathological" examples, but expected to be representative of many biomolecular compounds. The small free energy differences between polymorphs, which are smaller than k B T but can be resolved experimentally (38,55) due to the kinetic suppression of interconversion between polymorphs, constitute a very stringent test of our framework and its ability to accurately capture phase stability. By matching the subkilojoule per mole accuracy of calorimetry experiments, it provides a robust foundation for studying α-, β-, and γ-glycine (G-α, G-β, and G-γ); and αand β-succinic acid (S-α and S-β) calculated using PBE0-MBD-based MLPs (blue) with the QTI approach and corrected to the ab initio PBE0-MBD DFT level using free energy perturbation (red). Experimental data (38,55) are shown in green. (B) Contributions of quantum nuclei (olive), anharmonicity (gray), and cell expansion and flexibility (pink) to the relative stabilities of the said polymorphs. These have been respectively obtained by comparing Gibbs free energy differences to estimates from a classical thermodynamic integration, a harmonic approximation, and a quantum thermodynamic integration using a fixed 0-K optimized cell. transition temperatures, pressures, and rates and permits benchmarking sophisticated electronic structure theories against experiment.

Comparison with Approximate Approaches
To further highlight the advantages of the approach proposed here over established approximate methods for ranking stabilities in CSP, we assess the limitations of the most widely used approximate methods, prefaced by acknowledging their successes for a wide range of applications (58). We note that the MLPs reproduce the ab initio PESs with sufficient accuracy to assess the impact of approximations to nuclear motion and compare the respective approximate (free) energy differences between polymorphs to the corresponding exact MLP Gibbs free energies.
The current state of the art is to correct (free) energies on the basis of a single-point hybrid-functional DFT calculation for the structure relaxed using semilocal DFT (57). Thermal and quantum nuclear effects are included within a harmonic approximation (HA) (59), while thermal expansion is modeled by relaxing the cell within a quasi-harmonic approximation (QHA) (58). These corrections are generally computed at the semilocal DFT level. As shown in Fig. 4, these approaches neither universally predict the most stable form (as they exhibit errors larger than 1 kJ/mol) nor systematically converge to the full hybridfunctional QHA reference. This highlights the need to go beyond a single-point hybrid-functional DFT correction to semilocal configurational or (quasi-)harmonic free energies to consistently deliver correct stability orders and free energy differences with subkilojoule per mole accuracy. We further note that the above results benefit substantially from the fortuitous cancellation of errors (29), but the residual errors cannot be estimated and apparent physical insights may be misleading.
Since the hybrid-functional-based QHA seems to be competitive with the rigorous PI approach, is further worthwhile to put the cost of the calculations into perspective. For glycine, as the most costly example, the 4,000 PBE0-MBD calculations on unit cells constituting the reference data for the MLP, the MLP-based PI thermodynamic integration, and the 50 PBE0-MBD calculations on supercells required for the FEP contribute roughly equally to the total cost of around 148,000 core hours per polymorph. For comparison, computing PBE0-MBD HA free energies for the same simulation supercells using finite differences and nondiagonal supercells to probe individual k-points (60), but not leveraging the MLP, would require about three times the core hours. A PBE0-MBD QHA free energy calculation would be an order of magnitude higher in computational cost. Although (Q)HA free energies may also be computed inexpensively using MLPs, they cannot be promoted to their first-principles counterparts in a straightforward and cost-effective manner as exact MLP free energies. Despite a focus on universal applicability over efficiency, the cost of the above rigorous Gibbs free energies is thus small compared to the estimated cost of calculating free energies within the (Q)HA using hybrid-functional DFT.

Discussion
The ability of our approach to predict free energy differences with subkilojoule per mole accuracy renders it valuable in identifying "competing" polymorphs with similar lifetimes to the most stable form. It bridges the gap between theory and experiments by allowing direct comparison of free energy differences with calorimetric data-a significant improvement over current approaches, which require error-prone ad hoc extrapolations to 0 K (61). Moreover, 1) rigorous predictions of the entropy, molar volume, and heat capacity and 2) robust MLPs with ab initio accuracy are complements to our approach. The former are directly related to "thermodynamic rules of thumb," which are widely used by experimentalists to assess stability trends (13), while the latter enable structure determinations for experimental samples based on NMR (62) and vibrational spectra (63). Furthermore, a rigorous account of thermally induced phase transitions can be obtained without repeating the procedure at every state point. The combination of QTI with parallel tempering (64) can enhance the efficiency of performing single-temperature (or pressure) sweeps, yielding full phase diagrams, while also sampling "slow" degrees of freedom such as conformational transitions, inaccessible to approximate methods (29). All these features are highly sought after by the pharmaceutical industry, as they are made possible at a manageable computational cost.
Our protocol easily extends to stability predictions for other complex molecular crystals, as its data-driven nature accelerates MLP development, irrespective of the material under consideration. As a proof of this, we have developed MLPs for polymorphs of three complex pharmaceuticals-aspirin, paracetamol, and XXIII, the most complicated system (58) from the latest blind test of organic crystal structure prediction methods (22)-and tested them by performing PI simulations in the constant pressure ensemble, as required for QTI (SI Appendix). Although these MLPs have been trained on DFTB data as a proof of concept and consequently lack chemical accuracy, they remain robust and capture the molecular flexibility of these systems. Given that dynamic disorder, thermal expansion, conformational relaxation of the molecular units, and potential (dynamic) instabilities of candidate polymorphs are automatically accounted for within the QTI approach, we expect stability predictions to be very robust with respect to the nature of the candidate polymorphs and thus directly applicable to said pharmaceutical and blind-test systems.
In applications involving large numbers of polymorphs or polymorphs with large unit cells, suitable sets of reference configurations can be generated based on configurations of liquid or amorphous states at different pressures (65). This exploits that the accuracy of MLPs, which predict energies and forces on the basis of local contributions, rests on having reference data for all distinct local atomic environments (65), rather than for all polymorphs of interest. The computational cost of building the training set then remains largely independent of number and unit cell size of the polymorphs of interest. For large numbers of polymorphs the cost per polymorph thus effectively reduces to that of the MLP-based thermodynamic integration and of FEP. In practice, the computational cost of FEP can be reduced by running only as many ab initio calculations as required to reduce the statistical error to below the predicted free energy differences between polymorphs. For instance, fewer than a handful of PBE0-MBD calculations would have sufficed to conclusively establish that form I of benzene is more stable than form II. Indeed, subject to estimates of the uncertainty of the MLP predictions (66,67), it may be possible to omit FEP altogether. Recent work on the use of higher body-order correlations in atomistic representations (68) and on including long-ranged interactions (52) promises to enable subkilojoule per mole accuracy, eliminating the need for FEP even in applications involving subtle free energy differences.
Finally, the empiricism involved in selecting the exchangecorrelation functional and dispersion correction used in the DFT calculations can be removed by using PESs evaluated using beyond-DFT electronic structure theory. Crucially, our scheme extends naturally to predictions of Gibbs free energies based on quantum-chemical electronic structure methods (61,69) such as second-order Møller-Plesset perturbation theory, random-phase approximation, coupled cluster, or quantum Monte Carlo, some of which are systematically improvable and can thereby be rendered truly ab initio (70). While these come at an increased computational cost per calculation, recent developments in machine learning for materials science (71) promise to minimize the number of quantum-chemical calculations required to train accurate MLPs and thus to keep the overall costs in check. Indeed, recent work demonstrates the corresponding construction of robust and accurate MLPs for CCSD(T) reference data (70).
In conclusion, marrying state-of-the-art electronic structure, free energy, and machine-learning methods in a widely applicable framework enables rigorous, predictive free energy calculations for complex (organic) molecular crystals at general thermodynamic conditions. The unprecedented accuracy of our approach sets the stage for future studies of kinetic effects as well as full p-T phase diagrams in a reliable and computationally efficient manner, paving the way for guiding experimental synthesis of such materials. The protocol and the scripts provided in SI Appendix permit its application practically out of the box. Determining the relative stability of generic polymorphic compounds is a recurrent problem across different domains of science and engineering-from nucleation theory to the practical design of pharmaceuticals-and we hope that the robust and easy-to-use nature of our end-to-end protocol will facilitate reliable, accurate free energy calculations beyond those of the computational chemistry community.

Methods
Machine-Learning Potentials. We have constructed Behler-Parinello-type neural network potentials (48) for benzene, glycine, and succinic acid using the n2p2 code (72). In this framework, structures are encoded in terms of local atom-centered symmetry functions (SFs) (48). Initial sets of SFs were generated following the recipe of ref. 73. Based on the same reference structure-property data subsequently used for training, the 128 (benzene and succinic acid) and 256 (glycine) most informative SFs were extracted via principal covariates CUR selection (74).
Our data are based on Langevin-thermostated PI NVT simulations at 300 K, performed using the i-Pi force engine (28) coupled to DFTB+ (75) calculations with the 3ob parameterization (76). For each polymorph multiple cells were simulated, rescaling the experimental cell lengths and angles by up to 10 and 5%, respectively. The trajectories of PI replicas for all polymorphs of a given compound were concatenated and farthest-point sampled (77)(78)(79) to extract the most distinct configurations for feature selection and MLP training. Subsequently, ab initio reference energies and forces were evaluated for said configurations.
To minimize the computational cost of the reference calculations the MLPs are composed of a baseline potential trained to reproduce energies and forces from more affordable PBE-DFT (80) calculations with a Tkatchenko-Scheffler (TS) dispersion correction (81) (PBE-TS) and a Δ-learning (82) correction trained (on 10 times fewer training data) to reproduce the difference between the baseline and more expensive calculations with the hybrid PBE0 functional (39,40)  Free Energy Methods. For each polymorph the average cell was determined using MLP-based PI NST simulations (88) at the desired inverse temperature β, accounting for anharmonic quantum nuclear motion and anisotropic cell fluctuations. The difference between the Gibbs and Helmholtz free energies computed from an MLP-based PI NPT simulation based on its average cell is where ρ(V|P ext , β) is the probability of observing the cell volume V at external pressure P ext and inverse temperature β. A standard Kirkwood construction (89) that transforms the Hamiltonian from a harmonic to an anharmonic one provides the difference between the anharmonic and the harmonic quantum Helmholtz free energies: whereĤ λ is the Hamiltonian of the MLP alchemical system with the potential U λ ≡ λU MLP + (1 − λ)U har MLP , and · is the ensemble average computed from a PI NVT simulation. The reference absolute harmonic Helmholtz free energy is obtained from a harmonic approximation using where ω i is the frequency of the ith phonon mode. In a final step, the ab initio Gibbs free energy is obtained from its MLP counterpart by free energy perturbation using For systems exhibiting large-amplitude curvilinear motion, the harmonic-toanharmonic thermodynamic integration can be performed efficiently using a Padé interpolation formula (24).
Understanding the Role of Different Effects. We disentangle the role of anharmonicity directly from Eq. 2 and that of thermal expansion by comparing the Helmholtz free energies from Eq. 2 for the variable-cell geometryoptimized and mean PI NST cells. The role of the quantum nature of nuclei is quantified by comparing the classical and quantum Gibbs free energies. We calculate the former using the Helmholtz free energy of the classical harmonic oscillator as a reference and evaluating Eqs. 1 and 2 using classical molecular dynamics.
Free Energy Gradients. Volume and entropy are related to gradients of the free energy V = ∂G ∂P N,T , S = − ∂G ∂T N,P .

[4]
Differences between equilibrium (molar) volumes of polymorphs can directly be observed in PI NPT simulations. Meanwhile entropic differences can be computed from with G from Eq. 1 and the enthalpy H from the associated PI NPT simulation. Linear extrapolation then permits estimating whether and at which pressures Pc and temperatures Tc the Gibbs free energy difference between polymorphs will vanish and a phase transition should be expected: Data Availability. Anonymized (structures, scripts, and codes to reproduce all the results) data have been deposited in https://github.com/ venkatkapil24/data_molecular_fluctuations. All other study data are included in this article and/or SI Appendix.