Analyse ResultsΒΆ
Example on how to analyse Gromacs results of non-equilibrium free energy
calculations. For equilibrium results, you can analyse the data with the Gromacs
gmx bar
command, or with the alchemlyb library.
The data you need for estimating free energy differences is contained in the
dgdl.xvg
files that Gromacs outputs during a non-equilibrium simulation.
The free energy difference can then be estimated using the estimators available in
pmx:
Bennet Acceptance Ratio
(BAR)Jarzynski equality
(Jarz)
In practice, we would recommend to either use BAR or CGI as they are more robust. Furthermore, an advantage of BAR over CGI is that it does not rely on the Gaussian assumption for the distribution of the work values.
To get the free energy estimate, you first need to recover the work values by integrating the data.
>>> from glob import glob
>>> from pmx.analysis import read_dgdl_files
>>> from pmx.estimators import BAR
>>> # get dgdl.xvg files for the forward and reverse transitions
>>> ff = glob('for*.xvg') # forward
>>> fr = glob('rev*.xvg') # reverse
>>> # now get integrated work values from them
>>> wf = read_dgdl_files(ff, verbose=False)
>>> wr = read_dgdl_files(fr, verbose=False)
Then, you can use the integrated work values for the estimation of the free energy difference. E.g. for BAR:
>>> # use BAR to estimate the free energy difference
>>> bar = BAR(wf=wf, wr=wr, T=298)
>>> # print estimated dG and its uncertainty
>>> print('%.2f +/- %.2f kJ/mol' % (bar.dg, bar.err))
0.85 +/- 0.56 kJ/mol
Similarly, for the CGI estimator:
>>> cgi = Crooks(wf=wf, wr=wr)
>>> jarz = Jarz(wf=wf, wr=wr, T=298)
>>> # print Crooks estimate
>>> print('%.2f kJ/mol' % cgi.dg)
0.93 kJ/mol
Numerical uncertainty estimates can be obtain by bootstrap, by setting the
number of bootstrap samples with the nboots
argument.
>>> bar = BAR(wf=wf, wr=wr, T=298, nboots=100)
>>> print('%.2f kJ/mol' % bar.err_boot)
0.66 kJ/mol
The CGI estimator has two types of bootstrapped uncertainties: attributes err_boot1
and err_boot2
. The first uses parametric boostrap by resampling a Gaussian fitted to the
distribution of work values, while the second uses non-parametric bootstrap by
resampling directly the set of work values. Typically, they return similar estimates of
the uncertainty.
>>> cgi = Crooks(wf=wf, wr=wr, nboots=100)
>>> print('%.2f kJ/mol' % cgi.err_boot1)
0.77 kJ/mol
>>> print('%.2f kJ/mol' % cgi.err_boot2)
0.82 kJ/mol
A simple way to inspect the results and identify potential issues with the reliability of the calculations is to plot the distribution of forward and reverse work values.
>>> from pmx.analysis import plot_work_dist
>>> # write distribution plot to 'Wdist.png'
>>> plot_work_dist(wf=wf, wr=wr, nbins=10, dG=cgi.dg, dGerr=cgi.err_boot1)