Author: | David Caplan, Lukas Grossar, Oliver Beckstein |
---|---|
Year: | 2010-2012 |
Copyright: | GNU Public License v3 |
Given a Universe (a simulation trajectory with one or more frames), measure all hydrogen bonds for each frame between selections 1 and 2.
The HydrogenBondAnalysis class is modeled after the VMD HBONDS plugin.
The results are hydrogen bond data per frame (# indicates comments that are not part of the output), stored in HydrogenBondAnalysis.timeseries:
results = [
[ # frame 1
[ # hbond 1
<donor index>, <acceptor index>, <donor string>, <acceptor string>, <distance>, <angle>
],
[ # hbond 2
<donor index>, <acceptor index>, <donor string>, <acceptor string>, <distance>, <angle>
],
....
],
[ # frame 2
[ ... ], [ ... ], ...
],
...
]
Note
For historical reasons, the donor index and acceptor index are 1-based indices. To get the Atom.number (the 0-based index typically used in MDAnalysis), simply subtract 1. For instance, to find an atom in Universe.atoms by index from the output, one would use u.atoms[index-1].
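A minimal sketch of walking this nested structure and converting the indices. The data are invented for illustration and are not output of a real run; in particular, the donor and acceptor strings are placeholders whose format is an assumption.

```python
# Mock data shaped like HydrogenBondAnalysis.timeseries; every value is invented.
timeseries = [
    [  # frame 1
        [12, 345, "SER5:OG", "TIP3101:OH2", 2.8, 165.0],
        [40, 512, "LYS7:NZ", "TIP3150:OH2", 2.9, 150.5],
    ],
    [  # frame 2
        [12, 345, "SER5:OG", "TIP3101:OH2", 2.7, 170.2],
    ],
]


def zero_based_pairs(frame):
    """Convert the 1-based donor/acceptor indices of one frame to 0-based."""
    return [(hbond[0] - 1, hbond[1] - 1) for hbond in frame]


for frame_number, frame in enumerate(timeseries, start=1):
    # Prints the frame number, the number of hydrogen bonds, and the index pairs.
    print(frame_number, len(frame), zero_based_pairs(frame))
```

The 0-based values are the ones that would be used to index Universe.atoms directly.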
Using the HydrogenBondAnalysis.generate_table() method one can reformat the results as a flat “normalised” table that is easier to import into a database for further processing. HydrogenBondAnalysis.save_table() saves the table to a pickled file. The table itself is a numpy.recarray.
Hydrogen bonds are recorded based on a geometric criterion that combines a distance cut-off and an angle cut-off.
The cut-off values angle and distance can be set as keywords to HydrogenBondAnalysis.
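A geometric test of this kind can be sketched in plain Python. The sketch assumes that the distance is measured between hydrogen and acceptor and must not exceed the cut-off, and that the donor-hydrogen-acceptor angle (hydrogen at the apex) must be at least the angle cut-off. The function names are hypothetical, and the cut-off values 3.0 Å and 120° are simply those of the usage example in this document.

```python
import math


def distance(a, b):
    """Euclidean distance between two points given as (x, y, z) tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def angle_at_hydrogen(donor, hydrogen, acceptor):
    """Angle donor-hydrogen-acceptor in degrees, with the hydrogen at the apex."""
    v1 = [d - h for d, h in zip(donor, hydrogen)]
    v2 = [a - h for a, h in zip(acceptor, hydrogen)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     distance_cutoff=3.0, angle_cutoff=120.0):
    """Assumed criterion: H...A distance <= cut-off and D-H-A angle >= cut-off."""
    return (distance(hydrogen, acceptor) <= distance_cutoff
            and angle_at_hydrogen(donor, hydrogen, acceptor) >= angle_cutoff)


# Linear arrangement: accepted. Right-angle arrangement: rejected.
print(is_hydrogen_bond((0, 0, 0), (1, 0, 0), (3, 0, 0)))
print(is_hydrogen_bond((0, 0, 0), (1, 0, 0), (1, 2, 0)))
```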
Donor and acceptor heavy atoms are detected from atom names. The current defaults are appropriate for the CHARMM27 and GLYCAM06 force fields as defined in Table Default atom names for hydrogen bonding analysis.
Hydrogen atoms bonded to a donor are searched with one of two algorithms, selected with the detect_hydrogens keyword.
distance
Searches for all hydrogens (name “H*” or name “[123]H” or type “H”) in the same residue as the donor atom within a cut-off distance of 1.2 Å.
heuristic
Looks at the next three atoms in the list of atoms following the donor and selects any atom whose name matches (name “H*” or name “[123]H”).
The distance search is more rigorous but slower and is set as the default. Until release 0.7.6, only the heuristic search was implemented.
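The difference between the two searches can be illustrated with a toy atom list. This is a sketch under stated assumptions, not the library's implementation: atoms are plain dicts, the name patterns are interpreted as shell-style globs, the type “H” test is omitted, and the heuristic is shown without any additional distance check.

```python
import math
from fnmatch import fnmatchcase

HYDROGEN_PATTERNS = ("H*", "[123]H")  # name patterns quoted in the text


def looks_like_hydrogen(name):
    """True if the atom name matches one of the hydrogen name patterns."""
    return any(fnmatchcase(name, pattern) for pattern in HYDROGEN_PATTERNS)


def hydrogens_by_distance(atoms, donor_index, cutoff=1.2):
    """Hydrogens in the donor's residue that lie within the cut-off distance."""
    donor = atoms[donor_index]
    return [a for a in atoms
            if a is not donor
            and a["resid"] == donor["resid"]
            and looks_like_hydrogen(a["name"])
            and math.dist(a["pos"], donor["pos"]) <= cutoff]


def hydrogens_by_heuristic(atoms, donor_index):
    """Hydrogen-like names among the three atoms listed after the donor."""
    following = atoms[donor_index + 1:donor_index + 4]
    return [a for a in following if looks_like_hydrogen(a["name"])]


# Invented coordinates (in Angstrom) for a small fragment.
atoms = [
    {"name": "N", "resid": 1, "pos": (0.0, 0.0, 0.0)},
    {"name": "HN", "resid": 1, "pos": (1.0, 0.0, 0.0)},
    {"name": "CA", "resid": 1, "pos": (-0.5, 1.4, 0.0)},
    {"name": "HA", "resid": 1, "pos": (-0.5, 2.4, 0.0)},
]

print([a["name"] for a in hydrogens_by_distance(atoms, 0)])
print([a["name"] for a in hydrogens_by_heuristic(atoms, 0)])
```

In this toy case the name-only heuristic also picks up HA, which is bonded to CA rather than to the donor N, while the distance search rejects it.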
Changed in version 0.7.6: Distance search added (see HydrogenBondAnalysis._get_bonded_hydrogens_dist()) and heuristic search improved (HydrogenBondAnalysis._get_bonded_hydrogens_list())
Default heavy atom names for CHARMM27 force field.

group | donor | acceptor | comments |
---|---|---|---|
main chain | N | O | |
water | OH2, OW | OH2, OW | SPC, TIP3P, TIP4P (CHARMM27, Gromacs) |
ARG | NE, NH1, NH2 | | |
ASN | ND2 | OD1 | |
ASP | | OD1, OD2 | |
CYS | | SG | |
CYH | SG | | possible false positives for CYS |
GLN | NE2 | OE1 | |
GLU | | OE1, OE2 | |
HIS | ND1, NE2 | ND1, NE2 | presence of H determines if donor |
HSD | ND1 | NE2 | |
HSE | NE2 | ND1 | |
HSP | ND1, NE2 | | |
LYS | NZ | | |
MET | | SD | see e.g. [Gregoret1991] |
SER | OG | OG | |
THR | OG1 | OG1 | |
TRP | NE1 | | |
TYR | OH | OH | |

Default donor and acceptor atoms for the GLYCAM06 force field, by element.

element | donor | acceptor |
---|---|---|
N | N, NT, N3 | N, NT |
O | OH, OW | O, O2, OH, OS, OW, OY |
S | | SM |
Donor and acceptor names for the CHARMM27 force field will also work for e.g. OPLS/AA (tested in Gromacs). Residue names in the table are for information only and are not taken into account when determining acceptors and donors. This can potentially lead to some ambiguity in the assignment of donors/acceptors for residues such as histidine or cytosine.
For more information about the naming convention in GLYCAM06 have a look at the Carbohydrate Naming Convention in Glycam.
The lists of donor and acceptor names can be extended by providing lists of atom names in the donors and acceptors keywords to HydrogenBondAnalysis. If the lists are entirely inappropriate (e.g. when analysing simulations done with a force field that uses very different atom names) then one should either use the value “other” for forcefield to set no default values, or derive a new class and set the default list oneself:
class HydrogenBondAnalysis_OtherFF(HydrogenBondAnalysis):
    DEFAULT_DONORS = {"OtherFF": tuple(set([...]))}
    DEFAULT_ACCEPTORS = {"OtherFF": tuple(set([...]))}
Then simply use the new class instead of the parent class and call it with forcefield = “OtherFF”. Please also consider contributing the list of heavy atom names to MDAnalysis.
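How a per-force-field default and user-supplied names might combine can be sketched as follows. This is only an illustration of the idea. The helper donor_names is hypothetical, the dictionary keys are assumptions, and the CHARMM27 tuple holds just a subset of the donor names from the table above.

```python
# Illustrative subset of the CHARMM27 donor names; keys are assumptions.
DEFAULT_DONORS = {
    "CHARMM27": ("N", "OH2", "OW", "NE", "NH1", "NH2", "ND2", "NZ"),
    "other": (),  # no default names at all
}


def donor_names(forcefield, donors=()):
    """Union of the defaults for a force field and any extra donor names."""
    return set(DEFAULT_DONORS[forcefield]) | set(donors)


# With "other", only the names supplied by the user are considered.
print(sorted(donor_names("other", donors=["N1", "O6"])))
# With a named force field, the extra names extend the defaults.
print(sorted(donor_names("CHARMM27", donors=["N1"])))
```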
References
[Gregoret1991] | L.M. Gregoret, S.D. Rader, R.J. Fletterick, and F.E. Cohen. Hydrogen bonds involving sulfur atoms in proteins. Proteins, 9(2):99–107, 1991. doi:10.1002/prot.340090204. |
All protein-water hydrogen bonds can be analysed with
import MDAnalysis.analysis.hbonds
# PSF and PDB stand for the paths to the topology and coordinate files
u = MDAnalysis.Universe(PSF, PDB, permissive=True)
h = MDAnalysis.analysis.hbonds.HydrogenBondAnalysis(u, 'protein', 'resname TIP3', distance=3.0, angle=120.0)
h.run()
The results are stored as the attribute HydrogenBondAnalysis.timeseries; see Output for the format and further options.
Note
Due to the way HydrogenBondAnalysis is implemented, it is more efficient to have the second selection (selection2) be the larger group, e.g. the water when looking at water-protein H-bonds or the whole protein when looking at ligand-protein interactions.
Perform a hydrogen bond analysis
The analysis of the trajectory is performed with the HydrogenBondAnalysis.run() method. The result is stored in HydrogenBondAnalysis.timeseries. See run() for the format.
The default atom names are taken from the CHARMM 27 force field files, which will also work for e.g. OPLS/AA in Gromacs, and GLYCAM06.
See also Table Default atom names for hydrogen bonding analysis.
Changed in version 0.7.6: DEFAULT_DONORS/ACCEPTORS is now embedded in a dict to switch between default values for different force fields.
Set up calculation of hydrogen bonds between two selections in a universe.
The timeseries is accessible as the attribute HydrogenBondAnalysis.timeseries.
Changed in version 0.7.6: New verbose keyword (and per-frame debug logging disabled by default).
New detect_hydrogens keyword to switch between two different algorithms to detect hydrogens bonded to donor. “distance” is a new, rigorous distance search within the residue of the donor atom, “heuristic” is the previous list scan (improved with an additional distance check).
New forcefield keyword to switch between different values of DEFAULT_DONORS/ACCEPTORS to accommodate different force fields. Also has an option “other” for no default values.
List of the times of each timestep. This can be used together with timeseries to find the specific time point of a hydrogen bond existence, or see table.
Results of the hydrogen bond analysis, stored for each frame. In the following description, # indicates comments that are not part of the output:
results = [
[ # frame 1
[ # hbond 1
<donor index>, <acceptor index>, <donor string>, <acceptor string>, <distance>, <angle>
],
[ # hbond 2
<donor index>, <acceptor index>, <donor string>, <acceptor string>, <distance>, <angle>
],
....
],
[ # frame 2
[ ... ], [ ... ], ...
],
...
]
The time of each step is not stored with each hydrogen bond frame but in timesteps.
Note
The index is a 1-based index. To get the Atom.number (the 0-based index typically used in MDAnalysis), simply subtract 1. For instance, to find an atom in Universe.atoms by index, one would use u.atoms[index-1].
A normalised table of the data in HydrogenBondAnalysis.timeseries, generated by HydrogenBondAnalysis.generate_table(). It is a numpy.recarray with the following columns:
- “time”
- “donor_idx”
- “acceptor_idx”
- “donor_resnm”
- “donor_resid”
- “donor_atom”
- “acceptor_resnm”
- “acceptor_resid”
- “acceptor_atom”
- “distance”
- “angle”
It takes up more space than timeseries but it is easier to analyze and to import into databases (e.g. using recsql).
Note
The index is a 1-based index. To get the Atom.number (the 0-based index typically used in MDAnalysis), simply subtract 1. For instance, to find an atom in Universe.atoms by index, one would use u.atoms[index-1].
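The idea of the normalised table can be sketched without MDAnalysis or numpy: every hydrogen bond becomes one flat row carrying the time of its frame. The sketch uses a namedtuple instead of a numpy.recarray, and a hypothetical list of (resname, resid, name) tuples stands in for Universe.atoms. All data are invented.

```python
from collections import namedtuple

Row = namedtuple(
    "Row",
    "time donor_idx acceptor_idx donor_resnm donor_resid donor_atom "
    "acceptor_resnm acceptor_resid acceptor_atom distance angle",
)


def flatten(timesteps, timeseries, atoms):
    """Turn the nested per-frame lists into one flat list of rows."""
    rows = []
    for time, frame in zip(timesteps, timeseries):
        for donor_idx, acceptor_idx, _dstr, _astr, dist, ang in frame:
            # Indices in the results are 1-based, the mock atom list is 0-based.
            d = atoms[donor_idx - 1]
            a = atoms[acceptor_idx - 1]
            rows.append(Row(time, donor_idx, acceptor_idx,
                            d[0], d[1], d[2], a[0], a[1], a[2], dist, ang))
    return rows


# Invented stand-ins for Universe.atoms, timesteps and timeseries.
atoms = [("SER", 5, "OG"), ("TIP3", 101, "OH2"), ("LYS", 7, "NZ")]
timesteps = [0.0, 1.0, 2.0]
timeseries = [
    [[1, 2, "SER5:OG", "TIP3101:OH2", 2.8, 165.0],
     [3, 2, "LYS7:NZ", "TIP3101:OH2", 2.9, 150.5]],
    [],
    [[1, 2, "SER5:OG", "TIP3101:OH2", 2.7, 170.2]],
]

rows = flatten(timesteps, timeseries, atoms)
for row in rows:
    print(row)
```

Each row repeats the time and the residue information, which is why such a table is larger than the nested lists but easier to filter or load into a database.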
Find hydrogens bonded to atom.
This method is typically not called by a user but it is documented to facilitate understanding of the internals of HydrogenBondAnalysis.
Returns : | list of hydrogens (can be an AtomGroup) or empty list [] if none were found. |
---|
See also
_get_bonded_hydrogens_dist() and _get_bonded_hydrogens_list()
Changed in version 0.7.6: Can switch algorithm by using the detect_hydrogens keyword to the constructor. kwargs can be used to supply arguments for algorithm.
Find hydrogens bonded within cutoff to atom.
The performance of this implementation could be improved once the topology always contains bonded information; it currently uses the selection parser with an “around” selection.
New in version 0.7.6.
Find “bonded” hydrogens to the donor atom.
At the moment this relies on the assumption that the hydrogens are listed directly after the heavy atom in the topology. If this is not the case then this function will fail.
Hydrogens are detected by name H*, [123]H* and they have to be within a maximum distance from the heavy atom. The cutoff distance depends on the heavy atom and is parameterized in HydrogenBondAnalysis.r_cov.
Changed in version 0.7.6.
Default atom names that are treated as hydrogen acceptors (see Default heavy atom names for CHARMM27 force field). Use the keyword acceptors to add a list of additional acceptor names.
Default heavy atom names whose hydrogens are treated as donors (see Default heavy atom names for CHARMM27 force field). Use the keyword donors to add a list of additional donor names.
Calculate the angle (in degrees) between two atoms with H at apex.
Calculate the Euclidean distance between two atoms.
Counts the number of hydrogen bonds per timestep.
Returns : | a numpy.recarray |
---|
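The per-timestep count can be sketched on mock data. This hypothetical helper returns a plain list of (time, count) tuples rather than a numpy.recarray, and the input values are invented.

```python
def count_by_time(timesteps, timeseries):
    """Return (time, number of hydrogen bonds) pairs, one per frame."""
    return [(time, len(frame)) for time, frame in zip(timesteps, timeseries)]


timesteps = [0.0, 1.0, 2.0]
timeseries = [
    [[1, 2, "SER5:OG", "TIP3101:OH2", 2.8, 165.0],
     [3, 2, "LYS7:NZ", "TIP3101:OH2", 2.9, 150.5]],
    [],
    [[1, 2, "SER5:OG", "TIP3101:OH2", 2.7, 170.2]],
]

counts = count_by_time(timesteps, timeseries)
print(counts)
```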
Counts the frequency of hydrogen bonds of a specific type.
Processes HydrogenBondAnalysis.timeseries and returns a numpy.recarray containing atom indices, residue names, residue numbers (for donors and acceptors) and the fraction of the total time during which the hydrogen bond was detected.
Returns : | a numpy.recarray |
---|
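A frequency of this kind can be sketched by counting the frames in which each donor/acceptor pair occurs. The sketch assumes evenly spaced frames, so that the fraction of frames equals the fraction of time, identifies a bond only by its pair of indices, and returns a dict instead of a numpy.recarray. The helper and the data are hypothetical.

```python
from collections import Counter


def count_by_type(timeseries):
    """Map (donor_idx, acceptor_idx) to the fraction of frames with that bond."""
    counts = Counter()
    for frame in timeseries:
        # A set ensures each donor/acceptor pair is counted once per frame.
        for pair in {(hbond[0], hbond[1]) for hbond in frame}:
            counts[pair] += 1
    n_frames = len(timeseries)
    return {pair: count / n_frames for pair, count in counts.items()}


# Four invented frames: pair (1, 2) occurs in three of them, pair (3, 2) in one.
timeseries = [
    [[1, 2, "SER5:OG", "TIP3101:OH2", 2.8, 165.0],
     [3, 2, "LYS7:NZ", "TIP3101:OH2", 2.9, 150.5]],
    [[1, 2, "SER5:OG", "TIP3101:OH2", 2.7, 170.2]],
    [],
    [[1, 2, "SER5:OG", "TIP3101:OH2", 2.9, 158.0]],
]

frequencies = count_by_type(timeseries)
print(frequencies)
```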
Generate a normalised table of the results.
The table is stored as a numpy.recarray in the attribute table and can be used with e.g. recsql.
A collections.defaultdict of covalent radii of common donors (used in _get_bonded_hydrogens_list() to check if a hydrogen is sufficiently close to its donor heavy atom). Values are stored for N, O, P, and S. Any other heavy atoms are assumed to have hydrogens covalently bound at a maximum distance of 1.5 Å.
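The fallback behaviour of such a dictionary can be sketched with collections.defaultdict. Only the 1.5 Å default comes from the text. The four explicit values are placeholders chosen for illustration, not the values stored in HydrogenBondAnalysis.r_cov, and looking up the entry by the first letter of the atom name is an assumption of this sketch.

```python
from collections import defaultdict

# Placeholder radii in Angstrom (NOT the library's values); default is 1.5.
r_cov = defaultdict(lambda: 1.5, N=1.3, O=1.3, P=1.6, S=1.6)


def max_bond_length(atom_name):
    """Maximum donor-hydrogen distance, keyed by the first letter of the name."""
    return r_cov[atom_name[0]]


print(max_bond_length("NZ"))   # explicit entry for N
print(max_bond_length("CA"))   # no entry for C, falls back to the default
```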
Analyze trajectory and produce timeseries.
Stores the hydrogen bond data per frame as HydrogenBondAnalysis.timeseries (see there for output format).
See also
HydrogenBondAnalysis.generate_table() for processing the data into a different format.
Changed in version 0.7.6: Results are not returned, only stored in timeseries.
Saves table to a pickled file.
Load with
import cPickle
with open(filename, "rb") as f:
    table = cPickle.load(f)
See also
cPickle module and numpy.recarray
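For readers on Python 3, where the cPickle module no longer exists and pickle takes its place, a save and load round trip can be sketched as follows. The table here is an invented list of tuples standing in for the numpy.recarray, and the file name is hypothetical.

```python
import os
import pickle
import tempfile

# Invented stand-in for the table: (time, donor_idx, acceptor_idx, distance, angle).
table = [
    (0.0, 1, 2, 2.8, 165.0),
    (2.0, 1, 2, 2.7, 170.2),
]

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "hbond_table.pickle")
    # Pickle files are binary, so open them in "wb" and "rb" mode.
    with open(path, "wb") as handle:
        pickle.dump(table, handle)
    with open(path, "rb") as handle:
        restored = pickle.load(handle)

print(restored == table)
```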
Frames during which each hydrogen bond existed, sorted by hydrogen bond.
Processes HydrogenBondAnalysis.timeseries and returns a numpy.recarray containing atom indices, residue names, residue numbers (for donors and acceptors) and a list of timesteps at which the hydrogen bond was detected.
Returns : | a numpy.recarray |
---|
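Grouping the times by hydrogen bond can be sketched on mock data. As a simplification, a bond is identified only by its (donor index, acceptor index) pair, and the hypothetical helper returns a dict instead of a numpy.recarray.

```python
from collections import defaultdict


def timesteps_by_type(timesteps, timeseries):
    """Map (donor_idx, acceptor_idx) to the list of times at which it occurs."""
    result = defaultdict(list)
    for time, frame in zip(timesteps, timeseries):
        # Use a set so that a pair is recorded at most once per frame.
        for pair in sorted({(hbond[0], hbond[1]) for hbond in frame}):
            result[pair].append(time)
    return dict(result)


timesteps = [0.0, 1.0, 2.0]
timeseries = [
    [[1, 2, "SER5:OG", "TIP3101:OH2", 2.8, 165.0],
     [3, 2, "LYS7:NZ", "TIP3101:OH2", 2.9, 150.5]],
    [],
    [[1, 2, "SER5:OG", "TIP3101:OH2", 2.7, 170.2]],
]

by_type = timesteps_by_type(timesteps, timeseries)
print(by_type)
```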