Chinese Journal of Chemical Physics  2019, Vol. 32 Issue (3): 277-286

#### The article information

An-hui Wang, Zhi-chao Zhang, Guo-hui Li

Advances in Enhanced Sampling Molecular Dynamics Simulations for Biomolecules

Chinese Journal of Chemical Physics, 2019, 32(3): 277-286

http://dx.doi.org/10.1063/1674-0068/cjcp1905091

### Article history

Received on: May 10, 2019
Accepted on: June 5, 2019
Advances in Enhanced Sampling Molecular Dynamics Simulations for Biomolecules
An-hui Wanga,b , Zhi-chao Zhangb , Guo-hui Lia
Dated: Received on May 10, 2019; Accepted on June 5, 2019
a. Laboratory of Molecular Modeling and Design, State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China;
b. State Key Laboratory of Fine Chemicals, School of Chemistry, Dalian University of Technology, Dalian 116024, China
Abstract: Molecular dynamics simulation has emerged as a powerful computational tool for studying biomolecules as it can provide atomic insights into the conformational transitions involved in biological functions. However, when applied to complex biological macromolecules, the conformational sampling ability of conventional molecular dynamics is limited by the rugged free energy landscapes, leading to inherent timescale gaps between molecular dynamics simulations and real biological processes. To address this issue, several advanced enhanced sampling methods have been proposed to improve the sampling efficiency in molecular dynamics. In this review, the theoretical basis, practical applications, and recent improvements of both constraint and unconstrained enhanced sampling methods are summarized. Furthermore, the combined utilizations of different enhanced sampling methods that take advantage of both approaches are also briefly discussed.
Key words: Enhanced sampling    Umbrella sampling    Replica exchange    Metadynamics    Accelerated molecular dynamics
Ⅰ. INTRODUCTION

With the recent advances in graphics processing unit (GPU) and high-performance computers, molecular dynamics (MD) simulation has already become a powerful computational tool for investigating biomolecules, with applications in various biological processes such as protein folding, enzyme reactions, ligand binding/unbinding, and functional conformational transition [1-5]. These simulations can provide atomic insights into the temporal evolution of molecular systems, thus complement experiments in verifying existing mechanisms, revealing missing details and even proposing new hypotheses which can be conversely examined by further experiments [6-10].

Despite its success in a set of applications, MD simulations are still limited by the efficiency of conformational sampling [11, 12]. MD simulations have always been viewed as a general sampling method to capture the conformational transition details of biomolecules [13]. Generally, complicated biomolecular systems are known to have rough energy landscapes, with numerous metastable states separated by high-energy barriers, making the system trapped in an energy well which is difficult to jump out of in common straightforward MD simulations (see FIG. 1 for details) [14]. In these situations, the biological processes of interest usually take place over timescales of milliseconds or beyond, and make it hard for MD simulations to cross the energy barrier to visit new states, not to speak of exploring the conformational dynamics involved in biological functions.

 FIG. 1 The free energy landscape along the conformational transition coordinate defines the timescales of the biological process. Figure adapted with permission from Ref.[15], copyright 2007 Springer Nature.

To facilitate the crossing of energy barriers and extend the sampling timescales of conventional MD (cMD) simulations, various enhanced sampling algorithms have been proposed. Depending on whether explicit constraints are applied to the collective variables (namely the reaction coordinates), these sampling methods can be roughly divided into two groups: constraint enhanced sampling and unconstrained enhanced sampling. In this review, we present a brief overview of several such sampling methods including their theoretical basis, practical applications and recent improvements for biomolecules.

Ⅱ. CONSTRAINT ENHANCED SAMPLING

To improve the efficiency in sampling conformations of interest, a bias potential as a function of selected collective variables (CVs) can be added to the simulated system to constrain it into a desired CVs space. Generally, these selected CVs provide a coarse-grained description of this system and are expected to describe the transitions between all metastable states in full phase space. The selection of CVs ranges from simple atomic positions to complicated ones such as native contacts or secondary structures of the system. More details of selecting CVs have been discussed in recent contributions [16-18]. There are a variety of ways to add bias potential to the system to achieve an efficient conformation sampling, including (but not limited to) umbrella sampling, metadynamics, and steered molecular dynamics. In this part, some advances of these methods will be reported.

A. Umbrella sampling

Because of its quick convergence and accurate estimation of the potential of mean force (PMF), the umbrella sampling introduced by Torrie and co-workers [19] serves as one of the most historically important enhanced sampling approaches. As seen in FIG. 2, the umbrella sampling simulation is usually performed along a series of equally spaced windows that connect the two endpoints of selected CVs. For each window $i$, a simple harmonic potential $\delta$$U_i$($x$) depending on the CV $\theta$($x$) and a reference point $\theta_i$ in the CV space is added to the system's Hamiltonian:

 FIG. 2 Separation of the simulating windows along CVs in umbrella sampling. The endpoints A and B are defined as two minima on the two-dimensional free energy landscape. Figure adapted with permission from Ref.[24], copyright 2011 John Wiley and Sons.
 $\delta U_i (x)=\frac{k}{2}\left(\theta(x)-\theta_i\right)^2$ (1)

Once finished, simulations of all windows can be integrated using the Weighted Histogram Analysis Method [20] (WHAM), the Umbrella Integration method [21] (UI), or the multistate Bennett acceptance ratio method [22, 23] (MBAR) to recover the unbiased free energy landscapes.

Although the global free energy profiles can be reconstructed by integrating the individual windows, this approach also has its inherent drawback that the sampling results of each independent window are extremely limited to its predefined window center. To address this issue, several adaptive umbrella samplings (AUS) have been proposed to alter the initial set of window centers during the simulation [25-27]. Dasgupta and co-workers [28] developed the virtual-system adaptive umbrella sampling (VAUS) in which the real system can be coupled with an arbitrary virtual system composed of virtual states. This technique was then applied to the simulation of a two Ace-(Ala)$_5$-Nme peptides system, and broader conformational space including bound states, strongly bound states, and unbound states was successfully covered. Zacharias and co-workers [29] combined Hamiltonian replica-exchange with the umbrella sampling to improve the convergence of PMF calculations in which the configurations are allowed to exchange between different simulation windows. Its application to a DNA-ligand binding system suggested both improved convergence and more accurate binding free energy calculations compared to standard umbrella sampling.

Benefiting from its quick convergence and easy expansibility, umbrella sampling has achieved great success in the simulation of biomolecules [30-33]. For example, in a recent contribution using umbrella sampling to restrain the motions of Na$^+$ and K$^+$ at some dispersed positions along the gA channel [34], the authors successfully estimated free energy profiles of ions permeating through the channel with the AMOEBA polarizable force field and predicted conductance results that are in excellent agreement with experiments.

Metadynamics (MetaD) was first developed by Parrinello and co-workers [35] to improve the sampling of rare events by imposing a history-dependent bias potential on the system to discourage visiting those already explored regions in the free energy landscape (see FIG. 3 for details). The bias potential $\Delta V(\theta(x), t)$ is expressed as a sum of Gaussians deposited in the already visited CV space:

 FIG. 3 Schematic representation of the metadynamics protocol. In metadynamics, the simulated system (shown as a purple dot) can jump out of the energy well following the Gaussians deposits (a). The deposits are continued until all energy wells are filled with Gaussians (b). Then the free energy profiles can be reconstructed by reversing the sum of deposited bias (c). (Figure adapted with permission from Ref.[37], copyright 2014 Royal Society of Chemistry).
 $\begin{eqnarray} \Delta V(\theta(x), t)&&=\nonumber\\ &&{\sum\limits_{{t^{'}} < t}} h(t') \textrm{exp}\left(\frac{-1}{2\omega^2}||\theta (x(t))\hspace{-0.05cm}-\hspace{-0.05cm}\theta(x(t'))||^2 \right) \end{eqnarray}$ (2)

where $h$($t'$) and $\omega$ is the height and width of each Gaussian function. The previous time points $t'$=$\tau$, $2\tau$, $\cdots$ are defined by the time interval $\tau$ between successive Gaussians deposits. The height and width of the Gaussian define the energy deposition rate and the details needed to resolve the free energy surface features, respectively. Additionally, the bias potential asymptotically converges to the negative of the free energy, thus the free energy profiles can be reconstructed by reversing the deposited bias [36]:

 $\begin{eqnarray} \textrm{lim} _{t\rightarrow \infty} \Delta V(\theta(x), t)=-F(x)+C \end{eqnarray}$ (3)

where $C$ is a constant. The deposited Gaussian bias potential leads to the high sampling efficiency of MetaD in which the energy barriers between separate conformational states can be easily overcome.

Meanwhile, oscillations of the deposited bias potential can also lead to difficulties in achieving simulation convergence. To address this problem, the well-tempered metadynamics [38, 39] (WTMetaD) was developed, in which the heights of the deposited Gaussians are rescaled at each step. This technique indeed ensures more smooth convergence of the bias potential. Another advantage of this approach is that the smaller bias growth reduces the dependence of the initial parameter settings of Gaussian height and width. On the basis of WTMetaD, Parrinello and co-workers [40] proposed an improved algorithm in which the widths of the Gaussians are self-adaptive based on the time evolution of CVs or geometrical coordinates. This approach is shown to significantly improve the convergence speed, especially when CVs are nonlinear functions of the atomic coordinates. More recently, parallel bias metadynamics [41, 42] (PBMetaD) was developed to accelerate the crossing of high barriers that characterize multidimensional free-energy profiles by simultaneously applying multiple parallel bias potentials in low dimensionality.

Not only thermodynamic properties but also long-time scale kinetics of studied systems can be accurately predicted through the newly proposed infrequent metadynamics [43] (InMetaD) and Frequency-Adaptive Metadynamics [44] (FaMetaD) strategies at a fixed computational cost. Furthermore, apart from these adaptations in modifying the deposited bias potential, other methods aimed at obtaining better convergence by selecting more reasonable CVs have also been discussed in recent articles [16-18].

MetaD is one of the most attractive enhanced sampling methods whose popularity continues to grow, and it has been successfully adopted in a wide range of biomolecule systems, including evaluations of drug molecules unbinding kinetics [45-47], predictions of protein-ligand binding pose [48], refinements of protein-protein complex [49], transfers of proton-coupled electron [50], and designs of selective cyclic pentapeptide [51].

C. Steered molecular dynamics

Steered molecular dynamics [52] (SMD) is inspired by the atomic force microscopy [53] (AFM) experiments. In SMD simulations, a harmonic external force $f$ with predefined direction $\boldsymbol{n}$ is applied to guide the motion of specific atoms:

 $\begin{eqnarray} {f}=-1/2 k\nabla[vt-(\boldsymbol{r}-\boldsymbol{r}_0 )\cdot \boldsymbol{n}]^2 \end{eqnarray}$ (4)

where $k$ is the harmonic constant, $v$ is the velocity of the pulling force, $\boldsymbol{r}$ and $\boldsymbol{r}_0$ are the current and initial positions of the pulled atoms. It should be noted that SMD is a nonequilibrium simulation method, and the two parameters harmonic constant and pulling velocity are of crucial importance to the simulation results. During the simulation, the selected atoms move smoothly along the predefined direction so as to enable enhanced sampling of conformational transition. The movement of the pulled atoms will generally cause some steric hindrance when these atoms collide with adjacent atoms. To optimize the pulling direction, the minimal steric hindrance [54] (MSH) method can be used to quantify the steric hindrance contributed by each atom once a pulling direction is given. Then the optimal direction will be approximatively accessible when the minimal steric hindrance is obtained.

The SMD framework is widely adopted in the studies of ligand's dissociation process from its binding pocket in which the PMF along the dissociation pathway can be evaluated using the external force or work applied during the pulling process [55-59]. For example, SMD simulation has been performed to compare the binding characteristics of five structurally related flavonoids to the protein FabZ [60]. Similar to single-molecule pulling experiments, the force required to pull out the compound from its binding pocket was calculated. Interestingly, the force profiles derived from SMD could distinguish compounds with stronger binding affinity from those with weaker binding affinity. Based on this simulation results, a new compound was designed and confirmed with biological activity by subsequent experiments. This paradigm distinctly demonstrates the role of SMD in computer-aided drug design. Apart from the binding/unbinding of different ligands, SMD is also extensively applied in identifying the sequence dependence of hydrophobic interactions in peptide systems and the self-assembly of peptide amphiphile (PA) nanofiber [61, 62].

Ⅲ. UNCONSTRAINED ENHANCED SAMPLING

Although constraint enhanced sampling techniques can notably extend the timescales of MD simulations, they require predefined CVs to precisely describe the actual biological process and usually suffer from hidden energy barriers. In real MD simulation runs, these CVs are usually not straightforward to determine, especially when the transition process to be studied is too complicated. See FIG. 4 as an example, there are one basin along the $x$ direction and two basins along the $y$ direction. If only $x$ is predefined as the CV, the convergence of $y$ cannot be guaranteed even though $x$ has been well sampled. Therefore, much valuable information might be missing when the transition along $y$ is crucial for the system. To address this "hidden barrier" issue, unconstrained enhanced sampling methods, including replica-exchange MD, accelerated MD, random acceleration MD, and the adaptive biasing force method are also widely adopted in investigating complicated conformational transitions. Compared to constraint enhanced sampling, these approaches don't need predefined CVs, thus require no a priori knowledge of the biomolecules to be studied. In this part, a brief overview of these unconstrained enhanced sampling methods will be provided.

 FIG. 4 An example of a hidden energy barrier along the $y$ direction.
A. Replica-exchange molecular dynamics

Replica-exchange molecular dynamics [63] (REMD) is also known as parallel tempering. This method originates from the efforts in improving the Monte Carlo framework [64]. In REMD, several independent simulations at different temperatures are performed in parallel, and the system configurations are exchanged with two neighboring replicas depending on their temperature and potential energy. During a simulation, the exchange is attempted at a regular frequency and only accepted when the Metropolis criterion is satisfied. The exploration of conformation space is thus enhanced by transferring configurations which are only accessible in higher temperature replicas into lower room temperature replicas.

It should be noted that the number of replicas and the range of temperatures needed for REMD are often positively correlated with the size of the simulated system. It means much more computational cost is required for complicated biomolecules. Fortunately, many efforts have been made to further improve the efficiency of the standard REMD since its initial application to the folding of a pentapeptide. In a more general form of REMD, the Hamiltonians instead of the configurations are exchanged, and the reversible folding/unfolding behavior of two medium-sized proteins can be observed with a limited number of replicas [65-67]. Furthermore, in a more efficient approach named replica-exchange with solute scaling [68, 69] (REST2), all replicas are set at the same temperature while the interaction energy between solute-solute and solute-solvent are scaled by different factors. This method is believed to be more suitable for large protein-water systems especially when striking conformational energy changes are involved in the proteins. Other variants of traditional REMD include surface-tension replica-exchange [70], constant pH replica-exchange [71], $\lambda$-REMD [72], replica-exchange umbrella sampling [73], multiplexed-REMD [74], and ab initio replica-exchange [75].

Since REMD and its adaptations allow unconstrained enhanced sampling without setting specific CVs to use them, many applications of these methods exist, including the studies of protein folding [3, 76], protein structural ensemble generation [77, 78], protein-protein recognition mechanism [79], peptide-nanomaterials interaction [80, 81], binding/unbinding kinetics [82], RNA structural dynamics [83], and polymers assembly [84].

B. Accelerated molecular dynamics and adaptations

Accelerated molecular dynamics (aMD) introduced by the McCammon group [85, 86] is another promising enhanced sampling method with no predefined CVs required. Hundreds-of-nanosecond aMD simulations are reported to match millisecond-timescale conformational dynamics for a variety of biomolecules [87-90]. For example, in a benchmark study of bovine pancreatic trypsin inhibitor (BPTI, the first protein simulated by MD technique) [89], all five long-lived structures sampled in 1 ms long unbiased cMD can be captured by 500 ns aMD simulation while 500 ns cMD simulation is still trapped within a local conformational space around the crystal structure (see FIG. 5 for details).

 FIG. 5 The projection of the first two principal components (PC1, PC2) for 1 ms cMD (a), 500 ns cMD (b), and 500 ns aMD (c) simulations. The crystal structure of BPTI is labeled as red diamond, and the long-lived structures sampled in 1 ms cMD are labeled as triangles. Figure reprinted with permission from Ref.[89], copyright 2012 American Chemical Society.

aMD works by adding a boost potential to the minima of potential energy surface to enable the crossing of energy barriers between different conformational states. The boost potential $\Delta V$($x$) is triggered when the system's potential energy gets below the threshold energy $E$ and its magnitude is determined using the following equation:

 $\begin{eqnarray} \Delta V({x})= \left( {\begin{array}{*{20}c} \displaystyle{\frac{{{{(E - V(x))^2}}}}{{{{\alpha + (E - V(x))}}}}{\rm{, }}} & {{{V(x) < E}}} \\ {0, } & {{{V(x)}} \ge {{E}}} \\ \end{array}} \right) \end{eqnarray}$ (5)

where $V(x)$ is the potential energy of state $x$, $\alpha$ is a predefined acceleration factor that controls the flatness of the final potential energy. In general, the boost energy can either be implemented on the torsional terms of the potential or the whole potential or even both of them [89, 90]. To facilitate the sampling of specific conformational transition such as the motion of flexible loops, the boost potential can also be added to the loop regions of the biomolecule [91].

After the simulation biased by aMD, the original free energy profiles can be recovered by multiplying a Boltzmann factor $\textrm{e}^{\Delta V(x)/k_\textrm{B} T}$ of the boost potential imposed on each conformational state. Because states with lower boost potential contribute little to the Boltzmann factors, the free energy reweighting is typically dominated by a very few states with higher boost potential (conformational states with lower potential energy). In real simulations, as seen in FIG. 6, aMD simulations with higher enhanced sampling levels (smaller $\alpha$ values) sample only very few low-potential conformations. It will lead to observable energetic fluctuations and statistical noises especially when the degree of freedom of the studied biomolecules increases [90, 92, 93].

 FIG. 6 The distribution of potential energy obtained from simulations of cMD, aMD, and IaMD. Simulations of aMD with higher enhanced sampling levels (smaller $\alpha$ values) sample few low-potential conformations, while IaMD provides a uniform sampling in broader potential regions. Figure reprinted with permission from Ref.[93], copyright 2018 American Chemical Society.

To reduce the statistical noise, several enhancements including adaptions to both the reweighing procedure and the boost potential have been proposed. For the reweighing procedure, the exponential term in Boltzmann factors can be replaced by a summation of the Maclaurin series expansion of the boost potential. This form of Maclaurin series expansion is shown to produce PMF profiles with less noise than those obtained from an exponential average method [89]. In a recent contribution, the reweighting accuracy can be further improved by a cumulant expansion to the second-order [94].

The statistical noise can also be reduced by altering the form of the boost potential. For example, in Gaussian accelerated molecular dynamics [95] (GaMD), the original boost potential was modified into a harmonic function of the difference between the threshold energy $E$ and the potential energy of current state $V$($x$). GaMD can integrate various statistical information of the potential energy such as its mean, minimum, maximum and standard deviations values, thus shows more efficiency in identifying the protein-folding and ligand-binding pathways [95, 96].

More recently, the integrated accelerated molecular dynamics (IaMD) which integrates several aMD with varying levels of acceleration into a single boost potential was proposed by Dr. Li's group [93]. Compared to the original aMD, IaMD can provide a uniform sampling in broader energy regions especially in the low-energy regions that are inaccessible in standard aMD (see FIG. 6 for details). This method is demonstrated to markedly improve the sampling efficiency and maintain the reweighting accuracy simultaneously in simulating protein's conformational transitions [91, 93].

There are some other enhancements to the standard aMD scheme, including the windowed aMD [97], windowed replica-exchange aMD [98], adaptive aMD [99], accelerated adaptive integration method [100], and ab initio aMD [101]. More detailed discussions of the development of aMD can also be found in other review articles [92, 102].

Unlike REMD, aMD needs only one single replica of the system, and more successful applications have been achieved with this method. These applications cover a broad area including binding/unbinding of ligands [103], conformational transitions of peptides and proteins [5, 104], and the adsorption of proteins on nanometer materials [105, 106].

C. Random acceleration molecular dynamics

Random acceleration molecular dynamics [107] (RaMD) is another effective enhanced sampling scheme used for characterizing the ligand unbinding from the buried binding pocket of proteins. In contrast with SMD, RaMD works by imposing a small expulsion force with random direction on the center of mass of the ligand. Therefore, the ligand's exploring capacity to the protein space will be enhanced under tractions of the expulsion force. During the simulation, the force direction will vary stochastically whenever the movement along initial direction causes conspicuous steric hindrance. In general, the expulsion force continues within the timescale of nanoseconds, allowing further comprehension of the structure-affinity relationship (SAR) of druggable molecules.

Compared with SMD, RaMD requires no pre-definition of ligand dissociation pathway and thus enables detecting alternative ligand release pathway. However, careful selections of the force magnitude and threshold distance that are used to determine whether the force direction should be changed, are still needed [108]. Improper selection of force might lead to unexpected conformational changes in protein structure, and specific parameters should be set for different biomolecules [109-111].

Since RaMD does not require a pre-defined expulsion direction, the dissociation channels derived by this method remain unbiased [112]. RaMD approach has been widely used to explore the complicated ligand unbinding mechanism which provides detailed insights into the ingress/egress routes of drug molecules [113-116].

D. Adaptive biasing force method

Adaptive biasing force (ABF) method proposed by Pohorille and co-workers [117, 118] is based on the thermodynamic integration. In ABF, insufficient sampling is considered as a result of the impediment from the mean force along the CV (i.e., the negative of the gradient of the potential). During ABF simulation, the instantaneous mean force is calculated and recorded on-the-fly, and an external biasing force is adaptively applied to counter against it to encourage the crossing of energy barriers. Subsequently, free energy can be derived by integrating the force along the CV. The mean force gradually converges to the average force at equilibrium, so does the free energy. The procedure of calculating mean force and applying biasing force is shown in FIG. 7. To improve ABF's feasibility when high-dimensional CVs are involved in the practical applications, several enhancements of this method including the eABF [119, 120], gABF [121], and meta-eABF [122] have been proposed.

 FIG. 7 Schematic representation of the ABF method. The purple dots represent different states along the CV. The blue and red arrows represent instantaneous mean forces and external biasing forces, respectively.

Compared with many other enhanced sampling methods, ABF is physically intuitive and depends little on specific information about the free energy profiles. As a consequence, ABF is widely applied in protein-ligand binding [1, 123], the host-guest system [124], the conformational transition of peptides and proteins [4, 125, 126], and the membrane permeability of small molecules [127, 128].

Ⅳ. COMBINATION OF ENHANCED SAMPLING METHODS

What we have depicted above highlights the features of various popular enhanced sampling schemes. For practical purposes, some of the aforementioned enhanced sampling algorithms can be combined in one simulation to further optimize the computational efficiency. For example, REMD simulations can improve the sampling of all degrees of freedom using multiple replicas whereas the number of replicas increases rapidly as the system size gets larger. On the other hand, MetaD is generally used to enhance the sampling along predefined CVs. To take advantage of both replica-exchange and MetaD, these two computational schemes are combined in bias-exchange metadynamics [129] (BE-MetaD). In BE-MetaD, multiple metadynamics simulations are performed in parallel replicas at the same temperature. Bias potentials are imposed on different CVs in different replicas. Configuration exchange between different replicas is attempted at regular intervals according to the bias potentials. Ultimately, a set of one-dimensional free energy surfaces recovered from each replica can be integrated to construct the multi-dimensional free energy surface of the studied system [130]. Due to the simultaneous multi-dimensional biasing acting on the same system, BE-MetaD markedly promotes MD's power in sampling complicated conformational transitions and its efficiency has been well validated by numerous applications [131-134].

There are also some other successful approaches in which diverse enhanced sampling methods are coupled to explore broader conformational space including replica-exchange metadynamics [135], meta-eABF [122], replica-exchange US [29], replica-exchange with collective variable tempering [136], replica averaged metadynamics [137], and MetaITS [138]. More details about their theoretical backgrounds, development process, and latest applications are reviewed in other articles [30, 83, 139, 140].

Ⅴ. PROGRAMS AND PACKAGES

Many programs are under active development with the popularity of MD simulations, some of which such as Amber [141], Gromacs [142], NAMD [143], and OpenMM [144] are predominantly used for biomolecules. These long-accepted programs have already implemented several enhanced sampling methods. For example, Amber supports the usage of US, REMD, SMD, aMD, GaMD, RAMD, replica-exchange US, targeted MD [145], and self-guided Langevin dynamics [146]. For simulations with complicated reaction coordinates, the colvars module in NAMD is an excellent choice to define CVs and impose bias potentials during the MD run. Furthermore, a flexible and modularized plugin named plumed [147] can also be incorporated into most MD programs to analyze or bias the MD simulation on-the-fly.

Ⅵ. SUMMARY AND OUTLOOK

In this review, we summarized the current enhanced sampling methods used to extend the timescales of standard MD simulations. Both constraint and unconstrained enhanced sampling methods are briefly overviewed with the focus on their origins, development process, and latest improvements. Although the efficiency of these methods has been continually demonstrated by numerous studies mentioned earlier in this review, there are still some challenges in bridging the timescale gaps between MD simulations and complicated biological processes. More and more biomolecules with millions of atoms have been resolved with the recent developments of cryo-EM [148-151], indicating the requirements of longer timescales and more massive computational cost in MD simulations. However, the intrinsic nature of conventional enhanced sampling methods such as the demand for multiple replicas in REMD further increases MD's computational expense. As a consequence, standard enhanced sampling methods should be coupled with other advanced techniques such as multiple time-stepping methods [152] and coarse-grained models [153-155] to further improve their sampling efficiency for extremely large-scale biomolecular systems. On the other hand, the precision of force fields is another critical factor that affects the accuracy of MD simulations. The most significant advantage of enhanced sampling can only be achieved when these methods are in conjunction with high-precision force fields [156]. As reported by Shaw and co-workers, none of the state-of-the-art force fields can simultaneously provide accurate descriptions for both folded and disordered proteins [157]. It suggests the essential urgency of developing higher-precision force fields besides the development of enhanced sampling methods. Therefore, with the help of other advanced techniques and higher-precision force fields, MD simulations with enhanced sampling methods will continue to provide more useful and reliable information for the computational modeling of biomolecules.

Ⅶ. ACKNOWLEDGMENTS

This work was supported by the National Natural Science Foundation of China (No.31700647, No.21625302, and No.21573217).

Reference