Chinese Journal of Chemical Physics  2017, Vol. 30 Issue (6): 789-799

The article information

Peng-fei Li, Xiang-yu Jia, Mei-ting Wang, Ye Mei
李鹏飞, 贾相瑜, 王美婷, 梅晔
Comparison of Accuracy and Convergence Rate between Equilibrium and Nonequilibrium Alchemical Transformations for Calculation of Relative Binding Free Energy
平衡和非平衡方法计算相对结合自由能的精度和效率比较
Chinese Journal of Chemical Physics, 2017, 30(6): 789-799
化学物理学报, 2017, 30(6): 789-799
http://dx.doi.org/10.1063/1674-0068/30/cjcp1711204

Article history

Received on: November 7, 2017
Accepted on: December 27, 2017
Comparison of Accuracy and Convergence Rate between Equilibrium and Nonequilibrium Alchemical Transformations for Calculation of Relative Binding Free Energy
Peng-fei Lia, Xiang-yu Jiaa,b, Mei-ting Wanga, Ye Meia,b,c     
Dated: Received on November 7, 2017; Accepted on December 27, 2017
a. State Key Laboratory of Precision Spectroscopy, School of Physics and Materials Science, East China Normal University, Shanghai 200062, China;
b. NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China;
c. Department of Chemistry and Biochemistry, University of Oklahoma, Norman OK 73019, USA
*Author to whom correspondence should be addressed. Ye Mei, E-mail:ymei@phy.ecnu.edu.cn
Part of the special issue for "the Chinese Chemical Society's 15th National Chemical Dynamics Symposium"
Abstract: Estimation of protein-ligand binding affinity within chemical accuracy is one of the grand challenges in structure-based rational drug design. With the efforts over three decades, free energy methods based on equilibrium molecular dynamics (MD) simulations have become mature and are nowadays routinely applied in the community of computational chemistry. On the contrary, nonequilibrium MD simulation methods have attracted less attention, despite their underlying rigor in mathematics and potential advantage in efficiency. In this work, the equilibrium and nonequilibrium simulation methods are compared in terms of accuracy and convergence rate in the calculations of relative binding free energies. The proteins studied are T4-lysozyme mutant L99A and COX-2. For each protein, two ligands are studied. The results show that the nonequilibrium simulation method can be competitively as accurate as the equilibrium method, and the former is more efficient than the latter by considering the convergence rate with respect to the cost of wall clock time. In addition, Bennett acceptance ratio, which is a bidirectional post-processing method, converges faster than the unidirectional Jarzynski equality for the nonequilibrium simulations.
Key words: Free energy    Equilibrium    Nonequilibrium    Convergence rate    Accuracy    
Ⅰ. INTRODUCTION

The decrease of free energy is the driving force of any chemical or biological processes [1-7]. Estimation of free energy within the chemical accuracy is critical for quantitative understanding of biomolecular functions. To this end, much effort has been devoted to the development of new methods for the predictions of protein-ligand binding affinity over the past several decades. However, there are still many grand challenges in structure-based rational drug design [8-12]. Much of the difficulty lies in the accuracy of the employed force fields, the attainable spatiotemporal scale of computer simulations, and the reliability of the post-processing methods. There are an arsenal of methods with different features for the prediction of protein-ligand binding affinity, including docking and scoring [13, 14], the linear interaction energy (LIE) [15-17] and linear response approximation (LRA) [18] methods, molecular mechanics/Poisson-Boltzmann (generalized Born) surface area (MM/PB(GB)SA) simulations [19-21], and thermodynamic perturbation (TP) [22] based free energy methods. The end-point approximation methods (e.g., LIE, LRA, MM-PBSA/GBSA) compute ''bound'' and the ''free'' states. These methods often overestimate the enthalpic benefit upon ligand binding (by inadequate accounting for water movement) and underestimate the entropic penalty (due to undersampling of the relevant conformers) [23]. Recently, alchemical free energy simulations have been widely used to predict the relative binding free energy [24-29], in which a non-physical thermodynamic path is constructed to connect the two thermodynamic states in question. Due to that sufficient phase space overlap between the two states is the precondition for an accurate free energy estimation, a series of immediate states joining the two states with a slow varying coupling parameter $\lambda$ are usually imperative. From the perspective of computational cost, the end-point free energy methods are the most efficient approaches for calculating binding free energies, because they only consider the free and the bound states of system. However, the accuracy of these methods is always limited by the approximations adopted in their underground basis. The (alchemical-)pathway-based methods such as thermodynamic integration (TI) [30] and free energy perturbation (TP) are usually more accurate. However, they are much more time-consuming than the end-point approaches, because they require extensive sampling of some intermediate states in oder to obtain converged free energy changes along a transition path from the initial state to the final state [31].

All the equilibrium (EQ) free energy estimators, such as TI, TP, Bennett acceptance ratio (BAR) [32, 33] and its multiple state variant (Multistate BAR or MBAR) [34, 35], assume that equilibrium has been reached in each molecular dynamics simulation and the sampled conformations are uncorrelated. Rather than spending large computational costs on long equilibrium samplings, in 1997, Jarzynski proved that the equilibrium free energy difference can be extracted from a series of nonequilibrium (NE) trajectories, which has now become the well-known ''Jarzynski equality'' [36]. In our previous study [37], this extraordinary equality has been used to ''correct'' the solvation free energies of thirteen amino acid side chains analogs from a molecular mechanical (MM) level to a quantum mechanical/molecular mechanical (QM/MM) level. It has been found that when the phase space overlap between the two Hamiltonians is insufficient, the free energy difference between the MM Hamiltonian and the QM/MM Hamiltonian estimated by TP method becomes obscure. In contrast, averaging over many short nonequilibrium pulling simulations using the Jarzynski equality can yield significantly improved results. The performance of nonequilibrium MD simulations is determined by the transformation velocity between the thermodynamic states and the number of nonequilibrium simulation trajectories. With fixed total computational time, these two factors conflict with each other, and there must be a system-dependent balance between them that leads to the highest efficiency. However, in contrast to the free energy calculations based on equilibrium trajectories, nonequilibrium free energy methods are still in their infancy. Their performance in all aspects, for instance their performance in the calculations of relative binding free energy studied in this work, needs thorough investigations.

Calculating the binding free energy difference as the difference of absolute binding free energies (the vertical legs shown in FIG. 1) is not cost-effective, and quite often it may encounter numerical difficulties [38-44]. Instead, it can be calculated as the difference of two alchemical mutation free energies in the binding site and in the aqueous solution for the ligands [45-47]. In this work, the equilibrium and nonequilibrium simulations for the relative binding free energy calculations are compared in terms of their accuracy and convergence rate. Two well-studied systems were chosen. One is T4-lysozyme mutant L99A (L99A) as shown in FIG. 2. It is a good model system for the comparison of free energy methods [42, 43, 48, 49], due to its rigidity upon binding of benzene and phenol. Another receptor is cyclooxygenase-2 (COX-2), of which the structure is also shown in FIG. 2. This system is relatively more challenging [50-53]. Two ligands bound to COX-2 were studied. One ligand has a hydroxyl group, and the other has a methyl substituent.

FIG. 1 The thermodynamic cycle for the calculations of relative binding free energy of ligand A and ligand B to the same receptor.
FIG. 2 The structures of L99A/ligand complex and COX-2/ligand complex.
Ⅱ. METHOD

The relative binding free energy $\Delta \Delta G_{\rm{bind}}$ can be calculated by

$ \begin{equation} \Delta \Delta G_{\rm{bind}} = \Delta G_{\rm{bind}}({\rm{B}})-\Delta G_{\rm{bind}}(\rm{A}) \end{equation} $ (1)

where $\Delta G_{\rm{bind}}$(A) and $\Delta G_{\rm{bind}}$ (B) represent the binding free energy of ligand A and B bound to the receptor, respectively. As mentioned above, to calculate the difference in binding free energy directly as the difference in the absolute binding free energies may often encounter numerical difficulties, and the final results can be contaminated by variances of some intermediate quantities during the postprocessing that might be orders of magnitude larger than the final number. Fortunately, free energy is a state function, and the free energy difference between two states is independent of transformation path, and the total free energy change along a cycle is zero. Thereby, the relative binding free energy can be calculated as

$ \begin{equation} \Delta \Delta G_{\rm{bind}} = \Delta G_{\rm{pert}}({\rm{bound}})-\Delta G_{\rm{pert}}(\rm{free}) \end{equation} $ (2)

using the thermodynamic cycle depicted in FIG. 1. The process in the upper rung with a free energy change $\Delta G_{\rm{pert}}(\rm{bound})$ is the alchemical transformation of one ligand to another one in the binding pocket of the receptor, which is denoted by the ''bound'' state. In the bottom rung with free energy change $\Delta G_{\rm{pert}}(\rm{free})$, the transformation is in the solvent for the same ligands as in the upper rung, which is denoted by the ''free'' state. As in many other studies, each alchemical transformation is decomposed into three stages, which we denote as ''neutralization'', ''alchemy'', and ''charging''. In the neutralization stage, the partial charges of the perturbed atoms in state A are gradually turned off. In the alchemy stage, the interconversion of the Lennard-Jones parameters and bonded interactions take place, so that these interactions in state A is switched to those in state B. Softcore potential is employed to avoid the cusp in the gradient of the free energy change with respect to the coupling parameter when atoms appear or disappear. In the final stage, the charge neutral atoms in state B have their atomic charges growing up gradually. Inevitably, the two states in each transformation stage may have insufficient overlap in the phase space, which requires several intermediate states to mitigate the conversion. In each stage, both nonequilibrium and equilibrium simulations as schematically presented in FIG. 3 were performed in order to compare their respective performances, which is the central objective of this work.

FIG. 3 The schematic paradigms of the equilibrium simulations (up) and the nonequilibrium simulations (bottom).
A. Free energy estimators for equilibrium simulations

Within these stages, a series of intermediate states characterized by $H$($\lambda$) were introduced, where $\lambda$ is the coupling parameter. The free energy changes between two neighboring states can be estimated using BAR method [32, 54], of which a brief introduction is presented here. The free energy difference $\Delta G$ between two general states labeled as $i$ and $i$+1 can be estimated by solving the following two equations iteratively, which are the core formulas of BAR,

$ \begin{equation} \Delta G=\frac{1}{\beta}\ln \frac{\left< f(U_{i}-U_{i+1}+C)\right>_{i+1}}{\left< f(U_{i+1}-U_{i}-C)\right>_{i}}+C \label{Eq:BAR1} \end{equation} $ (3)

and

$ \begin{equation} C=\Delta G+\frac{1}{\beta}\ln \frac{N_{i+1}}{N_{i}} \end{equation} $ (4)

where $\beta$ is the inverse temperature, $f$ is the Fermi function $f(x)=1/[1+\exp(\beta x)]$ and $N_{i}$ and $N_{i+1}$ are the number of sampled configurations on the surfaces $U_{i}$ and $U_{i+1}$ respectively. Meanwhile, the variance $\delta^{2}$ of $\Delta G$ can be estimated as

$ \begin{eqnarray} \delta^{2}\hspace{-0.15cm}&=&\hspace{-0.15cm}\frac{1}{\beta^{2}N_{i}}\left[\frac{\langle f^{2}(U_{i+1}-U_{i}-C)\rangle_{i}}{\langle f(U_{i+1}-U_{i}-C) \rangle^{2}_{i}}-1\right]+ \nonumber \\ &&\frac{1}{\beta^{2}N_{i+1}}\left[\frac{\langle f^{2}(U_{i}-U_{i+1}+C)\rangle_{i+1}}{\langle f(U_{i}-U_{i+1}+C)\rangle_{i+1}^{2}}-1\right] \end{eqnarray} $ (5)
B. Free energy estimators for nonequilibrium simulations

In 1997, Jarzynski proved for the first time that the equilibrium free energy difference $\Delta G$ can be extracted from nonequilibrium simulations as an exponential average of work performed to the system during irreversible transformations between the two states [36, 55], viz.

$ \begin{equation} \exp(-\beta \Delta G)=\left<\exp(-\beta W)\right> \end{equation} $ (6)

Here, the angled bracket $\left<\cdots\right>$ denotes the average over an ensemble of nonequilibrium transformation processes initiated from an equilibrium ensemble. At each jump of the Hamiltonian from $H(\lambda(t))$ to $H$($\lambda$($t$+$\delta t$)), the instantaneous work done to the system is

$ \begin{equation} \delta W(t\rightarrow t+\delta t)=U({{\bf{r}}}(t), \lambda(t+ \delta t))-U({{\bf{r}}}(t), \lambda(t)) \end{equation} $ (7)

where $U$ is the potential energy of the system and is a function of both the coordinates and $\lambda$. The work $W$ in a single nonequilibrium trajectory is thus accumulated as [37, 56]

$ \begin{equation} W=\sum\limits_{t=0}^{\tau - \delta t}U\left({\bf{r}}(t), \lambda(t+ \delta t)\right)-U\left({\bf{r}}(t), \lambda(t)\right) \end{equation} $ (8)

The variance of $\Delta G$ can be calculated by

$ \begin{equation} \sigma^{2}=\frac{1}{\beta^2N}\left(\frac{\sigma_{x}}{\left<x\right>}\right)^2, \, \, \, {\rm{with}} \, \, x \equiv \exp\left(-\beta W\right) \end{equation} $ (9)

where $N$ is the number of nonequilibrium trajectories. The nonequilibrium simulation methods have been used in some free energy calculations [57-76]. Unfortunately, among these nonequilibrium trajectories, the ones which occur only rarely carry most of the statistical weight [57-60, 77]. Therefore, the severe sampling difficulty for large biomolecules often requires long-time computations, similar to TP [37, 54]. It has been suggested to alleviate this problem by combining forward (0$\rightarrow$1) and backward (1$\rightarrow$0) switching simulations [58]. Following the Crooks fluctuation theorem (CFT) [56, 78, 79], Bennetts acceptance ratio (BAR) is also applicable to NE calculations. Thus, bidirectional BAR method can be also used to overcome the severe sampling difficulty occurring in the unidirectional Jarzynski equality (or TP method). It is expected that the BAR method can converge faster than Jarzynski equality for nonequilibrium simulations. As a maximum likelihood method [80], BAR can be applied to the estimation of free energy differences via

$ \begin{eqnarray} \sum\limits_{i=1}^{n_{\rm{F}}}&&\hspace{-0.5cm}\frac{1}{1+\exp \left[\beta(M+W_{i}-\Delta G)\right]} \nonumber\\ &=&\hspace{-0.15cm}\sum\limits_{j=1}^{n_{\rm{R}}}\frac{1}{1+\exp \left[-\beta(M+W_{j}-\Delta G)\right]} \end{eqnarray} $ (10)

where $n_{\rm{F}}$ and $n_{\rm{R}}$ are the numbers of forward and backward trajectories respectively, $W_{{i}}$ and $W_{{j}}$ are the work of forward and backward measurements respectively, and $M$=$\beta^{-1}$ln($n_{\rm{F}}$/$n_{\rm{R}}$). The corresponding statistical variance of $ \Delta G $, $ \sigma^2 $, can be calculated using Eq.(10) in Ref.[80].

C. Molecular dynamics simulation details

For T4-lysozyme mutant L99A (L99A), two ligands (benzene and phenol) are studied. The initial structure of L99A is obtained from protein data bank (PDB ID: 181L). For cyclooxygenase-2 (COX-2), two celecoxib analogues are studied as the ligands. The initial structure of COX-2 is also obtained from protein data bank (PDB ID: 1CX2). The initial structures of the ligands were optimized at B3LYP/6-31G$^{**}$ level. The electrostatic potential (ESP) was calculated at HF/6-31G$^*$ level for the optimized geometries. The atomic charges were obtained by the restrained ESP fitting method (RESP) [81, 82] and all other MM parameters were taken from the general AMBER force field (GAFF) [83]. The proteins are modeled by AMEBR ff14SB [84] force field. The entire system was solvated in a periodic box of TIP3P water molecules with all the solute atoms no less than 15 Å away from the boundary of the unit cell. In the alchemy stage, the interconversion of Lennard-Jones parameters utilized the softcore potential, which is given by

$ \begin{eqnarray} V_{\rm{SC-vdW}}&=&4\epsilon_{ij}(1-\lambda)\cdot\nonumber\\ &&\left\{\frac{1}{\left[\alpha\lambda+\left(\displaystyle{\frac{r_{ij}}{\sigma_{ij}}}\right)^6\right]^2}- \frac{1}{\alpha\lambda+\left(\displaystyle{\frac{r_{ij}}{\sigma_{ij}}}\right)^6}\right\} \end{eqnarray} $ (11)

where $\epsilon_{ij}$ and $\sigma_{ij}$ are the common Lennard-Jones parameters, $r_{ij}$ is the distance between atom $i$ and $j$, and $\alpha$ is the parameter determining the softness of van der Waals potential. $\alpha$ was set to 0.5 in this work.

In equilibrium simulations, at each stage, eleven alchemical windows ($\lambda$=0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0) were used for the transformation. At each window, the energy of system was minimized using steepest decent method. The relaxed system was heated up to 300 K in 200 ps with a constant box volume. The system was then equilibrated for 500 ps followed by an isothermal-isobaric production simulation of 5 ns to 7 ns in length. The time step was set to 2 fs, and snapshots were saved every 2 ps for subsequent analysis. The temperature was regulated by Langevin dynamics [85] with a collision frequency of 1.0 ps-1. The pressure was regulated using Berendsen's barostat [86]. Particle mesh Ewald [87] with a cutoff of 13.0 Å in real space was utilized to calculate the long range Coulomb interaction. The van der Waals (vdW) interaction was truncated at 13.0 Å.

In nonequilibrium simulations, at each stage, one hundred forward transformation measurements and one hundred backward transformation measurements were combined to calculate the free energy changes. In each non-equilibrium trajectory, $\lambda$ changes from 0 to 1 in 100 jumps uniformly with an interval of 0.2 ps between two successive jumps of $\lambda$. Therefore, the total transformation time for each trajectory was 20 ps. The nonequilibrium simulations started with the coordinates and velocities saved during an equilibrium simulation of the initial state in a canonical ensemble. In the equilibrium simulation at initial state, the system was minimized using steepest decent method. Then the systems were heated up to 300 K in 1 ns and equilibrated for 1 ns followed by a 2-ns production simulation. The time step was set to 2 fs, and the snapshots (both coordinates and velocities) were saved every 20 ps. Other simulation conditions were kept the same as that in equilibrium simulations as described in the above paragraph.

All the MD simulations were carried out with the pmemd program in AMBER 14 package [88], and all the QM calculations were carried out with Gaussian 09 package [89].

Ⅲ. RESULTS AND DISCUSSION A. Relative binding free energies for L99A lysozyme system

As shown in FIG. 4, convergence of free energy calculations by the equilibrium and nonequilibrium methods is satisfactory. In FIG. 4(a), the calculated free energy difference at each stage can reach converged results in 5 ns for each window. Moreover, the standard deviations at each stage are in a range of 0.01-0.05 kcal/mol, which indicates that the precision of the calculated results is quite satisfactory. In fact, the results calculated in a 2 ns simulation for each window have already converged as can be seen in this figure. The calculated free energy difference at each stage, depicted as the black lines in FIG. 4(b), can also reach convergence with 100 nonequilibrium simulation trajectories. The standard deviations at each stage range in 0.01-0.07 kcal/mol, which have a similar precision to the equilibrium simulations. The free energy changes calculated by Jarzynski equality from the forward and backward nonequilibrium transformations are plotted in red and blue, respectively. It is well known to the community that unidirectional methods such as thermodynamic perturbation and the Jarzynski equality overestimate the free energy difference between the target Hamiltonian and the sampled Hamiltonian, mainly due to that in most cases the important region of the target Hamiltonian has not been well-sampled [57-60, 77]. Therefore, usually the free energy difference ($\Delta G$=$\Delta G_\rm{f}$) calculated from a forward NE calculation is too positive, while that from a backward one ($\Delta G$=$-\Delta G_\rm{b}$) will be too negative. It clearly shows in FIG. 4(b) that the bidirectional method, BAR, converges faster than both the forward and the backward methods, and the results from BAR are always between these two unidirectional methods.

FIG. 4 Convergence of the relative binding free energy calculations of the ligands to L99A by (a) the equilibrium and (b) the nonequilibrium simulations.

Therefore, all the data listed in Table Ⅰ are converged and reliable. As listed in Table Ⅰ, the free energy changes associated with the three transformation stages of the ''bound'' state calculated by the equilibrium simulations are -7.30, 4.57, and -34.34 kcal/mol, respectively. And those from the nonequilibrium simulations are -7.33, 4.53, and -34.33 kcal/mol, respectively. In the ''free'' state, the free energy changes are slightly weaker. The free energy changes in the neutralization, alchemy and charging stages are -6.45, 1.39, -33.85 kcal/mol from the equilibrium simulations, and those are -6.47, 1.36, and -33.92 kcal/mol from the nonequilibrium simulations. At all three stages, the free energy differences calculated by the nonequilibrium simulations are very close to those from the equilibrium simulations, indicating that the nonequilibrium simulations can yield statistically the same results as the equilibrium simulations.

Table Ⅰ The relative binding free energy (in kcal/mol) for the L99A system from the equilibrium and the nonequilibrium simulations.

The relative binding free energy for benzene and phenol binding into L99A system are (1.84$\pm$0.08) and (1.90$\pm$0.11) kcal/mol for the equilibrium and nonequilibrium simulations, which are also statistically identical. The experimental value is over 2.45 kcal/mol [48] and some other studies yielded (2.88$\pm$0.28) and (2.19$\pm$0.22) kcal/mol [42, 43, 49]. The results from different simulations depend on the specific force field parameters employed in those studies. Since the comparison of the relative binding affinity with other studies is not the concern of this work and we are only interested in the comparison between the equilibrium and nonequilibrium simulations methods, we will not look further into these numbers.

Listed in Table Ⅱ are the cost of wall clock time at each stage. All the calculations were done on servers containing two Intel Xeon CPU E5-2660 2.20 GHz CPU with 16 cores in total. For the equilibrium simulations, the wall clock time for the neutralization, alchemy and charging stages of the ''bound'' state are 97, 97, and 96 h, respectively. Before the nonequilibrium simulations, an equilibrium simulation is required to generate an ensemble of structures of the initial state as the starting structures for the nonequilibrium simulations. Therefore, the wall clock time used for the equilibrium simulation should also be taken into account as the cost of the nonequilibrium simulations. In the ''bound'' state, the initial equilibrium simulations took 3 h each. The total wall clock time of the neutralization, alchemy and charging stages of the ''bound'' state are 10 (3+7), 7 (3+4), and 7 (3+4) h, which are far less than that of the equilibrium simulation (shown in the parentheses are the wall clock time for the initial equilibrium simulation and the actual nonequilibrium simulation, respectively). In addition, the wall clock time of the neutralization, alchemy and charging stages of the ''free'' state respectively calculated by the equilibrium simulations are 5 h each, and those for the nonequilibrium simulations are 1.3 (0.4+0.9), 1.2 (0.4+0.8) and 0.9 (0.4+0.5) h, respectively. It shows again the nonequilibrium simulation methods are far more efficient than the equilibrium methods. The total time for the equilibrium and the nonequilibrium simulations are 305 and 27 h, respectively. The nonequilibrium simulation method yielded consistent results with comparable variances as the equilibrium simulation method using only about 9 percent of the wall clock time. In addition, while the explicit TIP3P water model was used in this work, implicit solvent models can be used to further enhance the efficiency of nonequilibrium sampling due to the reduction of dissipation as shown in a recent work of Liu et al. [90].

Table Ⅱ Wall clock time (hours) for the equilibrium and nonequilibrium simulations of the ligands binding to L99A Lysozyme. All the calculations were carried out using 16 cores of Intel Xeon CPU E5-2660 2.20 GHz.
B. Relative binding free energies for COX-2 system

As shown in FIG. 5, convergence of the free energy calculations by the equilibrium and nonequilibrium method is also satisfactory. In FIG. 5(a), the calculated free energy difference at each stage can reach converged results in 5 ns for each window. Moreover, the standard deviations at each stage range in 0.01-0.04 kcal/mol, which also indicates that the precision of free energy calculations is quite satisfactory. The calculated free energy difference at each stage, depicted as the black line in FIG. 5(b), can also reach convergence with 100 nonequilibrium simulation trajectories. The standard deviations at each stage range in 0.01-0.06 kcal/mol, which also has the similar precision as done by equilibrium simulations. The free energy changes calculated by Jarzynski equality from the forward and backward nonequilibrium transformations are plotted in red and blue, respectively. It can also been seen that all the results done by unidirectional nonequilibrium simulations can converge to the results from the analysis using the bidirectional BAR method, which combines forward and backward directions. The bidirectional BAR results indeed converge faster. Therefore, all the data listed in Table Ⅲ are reliable. The free energy changes associated with the three transformation stages of the ''bound'' state calculated by equilibrium simulations are 40.57, -11.49, and -27.75 kcal/mol, respectively. And those from the nonequilibrium simulations are 40.08, -11.09, and -27.80 kcal/mol. In the ''free'' state, the free energy changes in the neutralization, alchemy and charging stages are 39.61, -9.59, and -26.69 kcal/mol from the equilibrium simulations, and are 39.58, -9.59, and -26.72 kcal/mol from the nonequilibrium simulations. At each stage, the free energy difference calculated by the nonequilibrium simulation is also very close to that from the equilibrium simulation, which also indicates that the nonequilibrium simulations can yield statistically the same results as the equilibrium simulations. The relative binding free energy for ligand ''OH'' and ligand ''CH$_3$'' binding into COX-2 system are (-2.00$\pm$0.05) and (-2.08$\pm$0.08) kcal/mol for the equilibrium and nonequilibrium simulations, which are also statistically identical. The experimental value is less than -4.64 kcal/mol [50] and some other studies yielded (-2.72$\pm$0.58) and (-2.90$\pm$0.37) kcal/mol [51-53].

FIG. 5 Convergence of the relative binding free energy calculations of the ligands to COX-2 by (a) the equilibrium and (b) the nonequilibrium simulations.
Table Ⅲ The relative binding free energy (in kcal/mol) for the COX-2 system from the equilibrium and the nonequilibrium simulations.

Listed in Table Ⅳ are the wall clock time for each stage of the transformation using the same computers as above. For the equilibrium simulations, the wall clock time for the neutralization, alchemy and charging stages of the ''bound'' state respectively are all 235 h. For the nonequilibrium simulations, the wall clock time are 100 (45+55), 83 (37+46), and 17 (8+9) h, which are also far cheaper than the equilibrium simulations. For the equilibrium simulations of the ''free'' state, the wall clock time are around 17 h to 18 h for the neutralization, alchemy and charging transformations. Using the nonequilibrium simulations, the computer time reduces to 1.5 (0.5+1), 1.5 (0.5+1) and 2.3 (1.7+0.6) h. The total time for equilibrium and nonequilibrium simulations are 757 and 205 h, respectively, and the nonequilibrium simulation method saved about 73 percent of the wall clock time as compared to the equilibrium simulation method to reach the same uncertainties for the results.

Table Ⅳ Wall clock time (hours) for the equilibrium and nonequilibrium simulations of the ligands binding to COX-2. All the calculations were carried out using 16 cores of Intel Xeon CPU E5-2660 2.20 GHz.
Ⅳ. CONCLUSION

Computer-aided rational drug design is still facing the gap in spatiotemporal resolution between the experimental measurement and that can be afforded by computer simulations. With decades of improvement, equilibrium simulations-based methods have become the standard technique for the calculations of free energies. However, these methods are still too expensive, for instance when used in high-throughput screening. With the discovery of the Jarzynski equality, it becomes more and more clearly that equilibrium simulations methods are nothing but special cases of nonequilibirum methods, and the latter gradually attracts wide attention from the community. In this work, we applied both the equilibrium and the nonequilibrium simulation methods to the calculations of the relative binding free energies of two ligands to L99A Lysozyme and two ligands to COX-2. The results show that the nonequilibrium molecular dynamics simulations can yield statistically the same results as their equilibrium counterparts in the calculations of the relative binding free energies, but with higher efficiency in term of the wall clock time. The nonequilibrium simulation method saved about 91 percent and 73 percent of wall clock time as compared to the equilibrium method for L99A Lysozyme and COX-2, respectively. Moreover, the bidirectional analysis of the NE trajectories with BAR converges much faster than the unidirectional Jarzynski equality.

Ⅴ. ACKNOWLEDGMENTS

This work is supported by the National Natural Science Foundation of China (No.21773066). CPU time was provided by the Supercomputer Center of East China Normal University.

Reference
[1] P. A. Bash, U. C. Singh, R. Langridge, and P. A. Kollman, Science 236 , 564 (1987). DOI:10.1126/science.3576184
[2] W. L. Jorgensen, Acc. Chem. Res. 22 , 184 (1989). DOI:10.1021/ar00161a004
[3] P. Kollman, Chem. Rev. 93 , 2395 (1993). DOI:10.1021/cr00023a004
[4] A. Pohorille, C. Jarzynski, and C. Chipot, J. Phys. Chem. B 114 , 10235 (2010). DOI:10.1021/jp102971x
[5] C. D. Christ, and A. E. Mark, W. F. van Gunsteren, J. Comput. Chem. 31 , 1569 (2010).
[6] N. Hansen, W. F. van Gunsteren, J. Chem. Theory Comput. 10 , 2632 (2014). DOI:10.1021/ct500161f
[7] C. Chipot, A. Pohorille, and E ds, Free Energy Calculations:Theory and Applications in Chemistry and Biology, Springer Series in Chemical Physics, Vol. 86, Verlag Berlin Heidelberg:Springer (2007).
[8] W. L. Jorgensen, Acc. Chem. Res. 42 , 724 (2009). DOI:10.1021/ar800236t
[9] W. L. Jorgensen, Science 303 , 1813 (2004). DOI:10.1126/science.1096361
[10] L. Wang, Y. Wu, Y. Deng, B. Kim, L. Pierce, G. Krilov, D. Lupyan, S. Robinson, M. K. Dahlgren, J. Greenwood, D. L. Romero, C. Masse, J. L. Knight, T. Steinbrecher, T. Beuming, W. Damm, E. Harder, W. Sherman, M. Brewer, R. Wester, M. Murcko, L. Frye, R. Farid, T. Lin, D. L. Mobley, W. L. Jorgensen, B. J. Berne, R. A. Friesner, and R. Abel, J. Am. Chem. Soc. 137 , 2695 (2015). DOI:10.1021/ja512751q
[11] D. J. Huggins, W. Sherman, and B. Tidor, J. Med. Chem. 55 , 1424 (2012). DOI:10.1021/jm2010332
[12] C. Bissantz, B. Kuhn, and M. Stahl, J. Med. Chem. 53 , 5061 (2010). DOI:10.1021/jm100112j
[13] E. Yuriev, M. Agostino, and P. A. Ramsland, J. Mol. Recognit. 24 , 149 (2011). DOI:10.1002/jmr.v24.2
[14] E. Yuriev, and P. A. Ramsland, J. Mol. Recognit. 26 , 215 (2013). DOI:10.1002/jmr.v26.5
[15] J. Aqvist, C. Medina, and J. Samuelsson, Protein Eng. 7 , 385 (1994). DOI:10.1093/protein/7.3.385
[16] J. Aqvist, and J. Marelius, Comb. Chem. High Throughput Screening 4 , 613 (2001). DOI:10.2174/1386207013330661
[17] X. Jia, J. Zeng, J. Z. H. Zhang, and Y. Mei, J. Comput. Chem. 35 , 737 (2014). DOI:10.1002/jcc.v35.9
[18] N. Makri, J. Phys. Chem. B 103 , 2823 (1999). DOI:10.1021/jp9847540
[19] J. Srinivasan, T. E. Cheatham, P. Cieplak, P. A. Kollman, and D. A. Case, J. Am. Chem. Soc. 120 , 9401 (1998). DOI:10.1021/ja981844+
[20] N. Homeyer, and H. Gohlke, Mol. Inf. 31 , 114 (2012). DOI:10.1002/minf.v31.2
[21] T. Hou, J. Wang, Y. Li, and W. Wang, J. Chem. Inf. Model. 51 , 69 (2011). DOI:10.1021/ci100275a
[22] R. W. Zwanzig, J. Chem. Phys. 22 , 1420 (1954). DOI:10.1063/1.1740409
[23] B. J. Williams-Noonan, E. Yuriev, and D. K. Chalmers, J. Med. Chem (2017). DOI:10.1021/acs.jmedchem.7b00681(2017)
[24] S. Bruckner, and S. Boresch, J. Comput. Chem. 32 , 1303 (2011). DOI:10.1002/jcc.v32.7
[25] P. V. Klimovich, M. R. Shirts, and D. L. Mobley, J. Comput. Aided Mol. Des. 29 , 397 (2015). DOI:10.1007/s10822-015-9840-9
[26] M. R. Shirts, and V. S. Pande, J. Chem. Phys. 122 , 114107 (2005). DOI:10.1063/1.1861872
[27] F. M. Ytreberg, R. H. Swendsen, and D. M. Zuckerman, J. Chem. Phys. 125 , 184114 (2006). DOI:10.1063/1.2378907
[28] D. Rodriguez, E. Gomez-Darve, and A. Pohorille, J. Chem. Phys. 120 , 3563 (2004). DOI:10.1063/1.1642607
[29] H. Paliwal, and M. R. Shirts, J. Chem. Theory Comput. 7 , 4115 (2011). DOI:10.1021/ct2003995
[30] J. G. Kirkwood, J. Chem. Phys. 3 , 300 (1935). DOI:10.1063/1.1749657
[31] N. Homeyer, F. Stoll, A. Hillisch, and H. Gohlke, J. Chem. Theory Comput. 10 , 3331 (2014). DOI:10.1021/ct5000296
[32] C. H. Bennett, J. Comput. Phys. 22 , 245 (1976). DOI:10.1016/0021-9991(76)90078-4
[33] A. Hahn, and H. Then, Phys. Rev. E 80 , 031111 (2009). DOI:10.1103/PhysRevE.80.031111
[34] M. R. Shirts, and J. D. Chodera, J. Chem. Phys. 129 , 124105 (2008). DOI:10.1063/1.2978177
[35] M. R. Shirts, Reweighting from the Mixture Distribution as a Better Way to Describe the Multistate Bennett Acceptance Ratio, arXiv.org (2017).
[36] C. Jarzynski, Phys. Rev. Lett. 78 , 2690 (1997). DOI:10.1103/PhysRevLett.78.2690
[37] M. Wang, P. Li, X. Jia, W. Liu, Y. Shao, W. Hu, J. Zheng, B. R. Brooks, and Y. Mei, J. Chem. Inf. Model. 57 , 2476 (2017). DOI:10.1021/acs.jcim.7b00001
[38] D. Hamelberg, and J. A. McCammon, J. Am. Chem. Soc. 126 , 7683 (2004). DOI:10.1021/ja0377908
[39] S. Boresch, F. Tettinger, M. Leitgeb, and M. Karplus, J. Phys. Chem. B 107 , 9535 (2003). DOI:10.1021/jp0217839
[40] J. Wang, Y. Deng, and B. Roux, Biophys. J. 91 , 2798 (2006). DOI:10.1529/biophysj.106.084301
[41] M. Aldeghi, A. Heifetz, M. J. Bodkin, S. Knapp, and P. C. Biggin, Chem. Sci. 7 , 207 (2016). DOI:10.1039/C5SC02678D
[42] D. L. Mobley, A. P. Graves, J. D. Chodera, A. C. McReynolds, B. K. Shoichet, and K. A. Dill, J. Mol. Biol. 371 , 1118 (2007). DOI:10.1016/j.jmb.2007.06.002
[43] Y. Deng, and B. Roux, J. Phys. Chem. B. 113 , 2234 (2009). DOI:10.1021/jp807701h
[44] W. Liu, X. Jia, M. Wang, P. Li, X. Wang, W. Hu, J. Zheng, and Y. Mei, RSC Adv. 7 , 38570 (2017). DOI:10.1039/C7RA06215J
[45] M. R. Reddy, M. D. Varney, V. Kalish, V. N. Viswanadhan, and K. Appelt, J. Med. Chem. 37 , 1145 (1994). DOI:10.1021/jm00034a012
[46] M. R. Reddy, and M. D. Erion, J. Am. Chem. Soc. 123 , 6246 (2001). DOI:10.1021/ja0103288
[47] C. Oostenbrink, W. F. van Gunsteren, Proc. Natl. Acad. Sci. USA 102 , 6750 (2005). DOI:10.1073/pnas.0407404102
[48] A. Morton, W. A. Baase, and B. W. Matthews, Biochemistry 34 , 8564 (1995). DOI:10.1021/bi00027a006
[49] Y. Deng, and B. Roux, J. Chem. Theory Comput. 2 , 1255 (2006). DOI:10.1021/ct060037v
[50] T. D. Penning, J. J. Talley, S. R. Bertenshaw, J. S. Carter, P. W. Collins, S. Docter, M. J. Graneto, L. F. Lee, J. W. Malecha, J. M. Miyashiro, R. S. Rogers, D. J. Rogier, S. S. Yu, G. D. Anderson, E. G. Burton, J. N. Cogburn, S. A. Gregory, C. M. Koboldt, W. E. Perkins, K. Seibert, A. W. Veenhuizen, Y. Y. Zhang, and P. C. Isakson, J. Med. Chem. 40 , 1347 (1997). DOI:10.1021/jm960803q
[51] J. Michel, M. L. Verdonk, and J. W. Essex, J. Chem. Theory Comput. 3 , 1645 (2007). DOI:10.1021/ct700081t
[52] J. Michel, M. L. Verdonk, and J. W. Essex, J. Med. Chem. 49 , 7427 (2006). DOI:10.1021/jm061021s
[53] M. L. Plount Price, and W. L. Jorgensen, J. Am. Chem. Soc. 122 , 9455 (2000). DOI:10.1021/ja001018c
[54] X. Jia, M. Wang, Y. Shao, G. Konig, B. R. Brooks, J. Z. H. Zhang, and Y. Mei, J. Chem. Theory Comput. 12 , 499 (2016). DOI:10.1021/acs.jctc.5b00920
[55] C. Jarzynski, Phys. Rev. E 56 , 5018 (1997). DOI:10.1103/PhysRevE.56.5018
[56] G. Crooks, J. Statis. Phys. 90 , 1481 (1998). DOI:10.1023/A:1023208217925
[57] B. P. Cossins, S. Foucher, and C. M. Edge, J. W. Essex, J. Phys. Chem. B 113 , 5508 (2009). DOI:10.1021/jp803532z
[58] M. Goette, and H. Grubmuller, J. Comput. Chem. 30 , 447 (2009). DOI:10.1002/jcc.v30:3
[59] D. A. Hendrix, and C. Jarzynski, J. Chem. Phys. 114 , 5974 (2001). DOI:10.1063/1.1353552
[60] G. Hummer, J. Chem. Phys. 114 , 7330 (2001). DOI:10.1063/1.1363668
[61] F. M. Ytreberg, and D. M. Zuckerman, J. Chem. Phys. 120 , 10876 (2004). DOI:10.1063/1.1760511
[62] W. Lechner, H. Oberhofer, C. Dellago, and P. L. Geissler, J. Chem. Phys. 124 , 044113 (2006). DOI:10.1063/1.2162874
[63] P. Nicolini, D. Frezzato, and R. Chelli, J. Chem. Theory Comput. 7 , 582 (2011). DOI:10.1021/ct100568n
[64] H. Oberhofer, C. Dellago, and P. L. Geissler, J. Phys. Chem. B 109 , 6902 (2005). DOI:10.1021/jp044556a
[65] C. Oostenbrink, W. F. van Gunsteren, Chem. Phys. 323 , 102 (2006).
[66] M. Spichty, M. Cecchini, and M. Karplus, J. Phys. Chem. L 1 , 1922 (2010). DOI:10.1021/jz1005016
[67] P. S. Hudson, H. L. Woodcock, and S. Boresch, J. Phys. Chem. L 6 , 4850 (2015). DOI:10.1021/acs.jpclett.5b02164
[68] V. A. Ngo, I. Kim, T. W. Allen, and S. Y. Noskov, J. Chem. Theory Comput. 12 , 1000 (2016). DOI:10.1021/acs.jctc.5b01050
[69] I. Echeverria, and L. M. Amzel, J. Phys. Chem. B 116 , 10986 (2012). DOI:10.1021/jp300527q
[70] D. M. Zuckerman, and T. B. Woolf, Chem. Phys. Lett. 351 , 445 (2002). DOI:10.1016/S0009-2614(01)01397-5
[71] R. Chelli, S. Marsili, A. Barducci, and P. Procacci, Phys. Rev. E 75 , 050101 (2007).
[72] R. Chelli, S. Marsili, and P. Procacci, Phys. Rev. E 77 , 031104 (2008). DOI:10.1103/PhysRevE.77.031104
[73] R. Chelli, and P. Procacci, Phys. Chem. Chem. Phys. 11 , 1152 (2009). DOI:10.1039/b810914c
[74] P. Nicolini, P. Procacci, and R. Chelli, J. Phys. Chem. B 114 , 9546 (2010). DOI:10.1021/jp102263y
[75] P. Procacci, and C. Cardelli, J. Chem. Theory Comput. 10 , 2813 (2014). DOI:10.1021/ct500142c
[76] R. B. Sandberg, M. Banchelli, C. Guardiani, S. Menichetti, G. Caminati, and P. Procacci, J. Chem. Theory Comput. 11 , 423 (2015). DOI:10.1021/ct500964e
[77] C. Jarzynski, Phys. Rev. E 73 , 046105 (2006). DOI:10.1103/PhysRevE.73.046105
[78] G. Crooks, Phys. Rev. E 60 , 2721 (1999). DOI:10.1103/PhysRevE.60.2721
[79] G. Crooks, Phys. Rev. E 61 , 2361 (1999).
[80] M. R. Shirts, E. Bair, G. Hooker, and V. S. Pande, Phys. Rev. Lett. 91 , 140601 (2003). DOI:10.1103/PhysRevLett.91.140601
[81] C. I. Bayly, P. Cieplak, W. Cornell, and P. A. Kollman, J. Phys. Chem. 97 , 10269 (1993). DOI:10.1021/j100142a004
[82] P. Cieplak, W. D. Cornell, C. Bayly, and P. A. Kollman, J. Comput. Chem. 16 , 1357 (1995). DOI:10.1002/(ISSN)1096-987X
[83] J. Wang, R. M. Wolf, J. W. Caldwell, P. A. Kollman, and D. A. Case, J. Comput. Chem. 25 , 1157 (2004). DOI:10.1002/(ISSN)1096-987X
[84] J. A. Maier, C. Martinez, K. Kasavajhala, L. Wickstrom, K. E. Hauser, and C. Simmerling, J. Chem. Theory Comput. 11 , 3696 (2015). DOI:10.1021/acs.jctc.5b00255
[85] R. W. Pastor, B. R. Brooks, and A. Szabo, Mol. Phys. 65 , 1409 (1988). DOI:10.1080/00268978800101881
[86] H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, A. DiNola, and J. R. Haak, J. Chem. Phys. 81 , 3684 (1984). DOI:10.1063/1.448118
[87] T. Darden, D. York, and L. Pedersen, J. Chem. Phys. 98 , 10089 (1993). DOI:10.1063/1.464397
[88] D. A. Case, J. T. Berryman, R. M. Betz, D. S. Cerutti, T. E. Cheatham, Ⅲ, T. A. Darden, R. E. Duke, T. J. Giese, H. Gohlke, A. W. Goetz, N. Homeyer, N. Izadi, P. Janowski, J. Kaus, A. Kovalenko, T. S. Lee, S. LeGrand, P. Li, T. Luchko, R. Luo, B. Madej, K. M. Merz, G. Monard, P. Needham, H. Nguyen, H. T. Nguyen, I. Omelyan, A. Onufriev, D. R. Roe, A. Roitberg, R. Salomon-Ferrer, C. L. Simmerling, W. Smith, J. Swails, R. C. Walker, J. Wang, R. M. Wolf, X. Wu, D. M. York, and P. A. Kollman, AMBER 2014, San Francisco:University of California (2014).
[89] M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, G. Scalmani, V. Barone, B. Mennucci, G. A. Petersson, H. Nakatsuji, M. Caricato, X. Li, H. P. Hratchian, A. F. Izmaylov, J. Bloino, G. Zheng, J. L. Sonnenberg, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, T. Vreven, J. A. Montgomery, J. E. Peralta, F. Ogliaro, M. Bearpark, J. J. Heyd, E. Brothers, K. N. Kudin, V. N. Staroverov, T. Keith, R. Kobayashi, J. Normand, K. Raghavachari, A. Rendell, J. C. Burant, S. S. Iyengar, J. Tomasi, M. Cossi, N. Rega, J. M. Millam, M. Klene, J. E. Knox, J. B. Cross, V. Bakken, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, R. L. Martin, K. Morokuma, V. G. Zakrzewski, G. A. Voth, P. Salvador, J. J. Dannenberg, S. Dapprich, A. D. Daniels, O. Farkas, J. B. Foresman, J. V. Ortiz, J. Cioslowski, and D. J. Fox, Gaussian 09, Revision B.01, Wallingford, CT:Gaussian, Inc. (2010).
[90] H. Liu, F. Chen, H. Sun, D. Li, and T. Hou, J. Chem. Theory Comput. 13 , 1827 (2017). DOI:10.1021/acs.jctc.6b01139
平衡和非平衡方法计算相对结合自由能的精度和效率比较
李鹏飞a, 贾相瑜a,b, 王美婷a, 梅晔a,b,c     
a. 华东师范大学物理与材料科学学院精密光谱科学与技术国家重点实验室, 上海 200062;
b. 华东师范大学-上海纽约大学计算化学联合中心, 上海 200062;
c. 俄克拉荷马大学化学和生物化学系, 诺曼 73019
摘要: 本文比较了平衡分子动力学模拟的自由能计算方法和非平衡动力学模拟方法计算相对结合自由能时的效率和精度.研究的体系包括T4溶菌酶的L99A变异体和COX-2蛋白.对于每个体系,分别研究了两个配体.结果表明非平衡动力学模拟方法可以获得与平衡动力学模拟相同的精度,然而收敛速度却快得多.并且,双向的Bennett接受率方法收敛速度比基于Jarzynski等式的单向方法更快.
关键词: 自由能    平衡动力学模拟    非平衡动力学模拟    收敛速率    准确度