3.1.4. Calculating root mean square quantities — MDAnalysis.analysis.rms

Author:Oliver Beckstein, David L. Dotson
Year:2012
Copyright:GNU Public License v2

New in version 0.7.7.

Changed in version 0.11.0: Added RMSF analysis.

The module contains code to analyze root mean square quantities such as the coordinat root mean square distance (RMSD) or the per-residue root mean square fluctuations (RMSF).

This module uses the fast QCP algorithm [Theobald2005] to calculate the root mean square distance (RMSD) between two coordinate sets (as implemented in MDAnalysis.lib.qcprot.CalcRMSDRotationalMatrix()).

When using this module in published work please cite [Theobald2005].

See also

MDAnalysis.analysis.align
aligning structures based on RMSD
MDAnalysis.lib.qcprot
implements the fast RMSD algorithm.

3.1.4.1. Examples

3.1.4.1.1. Calculating RMSD for multiple domains

In this example we will globally fit a protein to a reference structure and investigate the relative movements of domains by computing the RMSD of the domains to the reference. The example is a DIMS trajectory of adenylate kinase, which samples a large closed-to-open transition. The protein consists of the CORE, LID, and NMP domain.

  • superimpose on the closed structure (frame 0 of the trajectory), using backbone atoms
  • calculate the backbone RMSD and RMSD for CORE, LID, NMP (backbone atoms)

The trajectory is included with the test data files. The data in RMSD.rmsd is plotted with matplotlib.pyplot.plot():

import MDAnalysis
from MDAnalysis.tests.datafiles import PSF,DCD,CRD
u = MDAnalysis.Universe(PSF,DCD)
ref = MDAnalysis.Universe(PSF,DCD)     # reference closed AdK (1AKE) (with the default ref_frame=0)
#ref = MDAnalysis.Universe(PSF,CRD)    # reference open AdK (4AKE)

import MDAnalysis.analysis.rms

R = MDAnalysis.analysis.rms.RMSD(u, ref,
           select="backbone",             # superimpose on whole backbone of the whole protein
           groupselections=["backbone and (resid 1-29 or resid 60-121 or resid 160-214)",   # CORE
                            "backbone and resid 122-159",                                   # LID
                            "backbone and resid 30-59"],                                    # NMP
           filename="rmsd_all_CORE_LID_NMP.dat")
R.run()
R.save()

import matplotlib.pyplot as plt
rmsd = R.rmsd.T   # transpose makes it easier for plotting
time = rmsd[1]
fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot(111)
ax.plot(time, rmsd[2], 'k-',  label="all")
ax.plot(time, rmsd[3], 'k--', label="CORE")
ax.plot(time, rmsd[4], 'r--', label="LID")
ax.plot(time, rmsd[5], 'b--', label="NMP")
ax.legend(loc="best")
ax.set_xlabel("time (ps)")
ax.set_ylabel(r"RMSD ($\AA$)")
fig.savefig("rmsd_all_CORE_LID_NMP_ref1AKE.pdf")

3.1.4.2. Functions

MDAnalysis.analysis.rms.rmsd(a, b, weights=None, center=False)[source]

Returns RMSD between two coordinate sets a and b.

a and b are arrays of the coordinates of N atoms of shape N*3 as generated by, e.g., MDAnalysis.core.AtomGroup.AtomGroup.coordinates().

An implicit optimal superposition is performed, which minimizes the RMSD between a and b although both a and b must be centered on the origin before performing the RMSD calculation so that translations are removed.

One can use the center = True keyword, which subtracts the center of geometry (for weights = None) before the superposition. With weights, a weighted average is computed as the center.

The weights can be an array of length N, containing e.g. masses for a weighted RMSD calculation.

The function uses Douglas Theobald’s fast QCP algorithm [Theobald2005] to calculate the RMSD.

Example::
>>> u = Universe(PSF,DCD)
>>> bb = u.select_atoms('backbone')
>>> A = bb.coordinates()  # coordinates of first frame
>>> u.trajectory[-1]      # forward to last frame
>>> B = bb.coordinates()  # coordinates of last frame
>>> rmsd(A,B)
6.8342494129169804

3.1.4.3. Analysis classes

class MDAnalysis.analysis.rms.RMSD(traj, reference=None, select='all', groupselections=None, filename='rmsd.dat', mass_weighted=False, tol_mass=0.1, ref_frame=0)[source]

Class to perform RMSD analysis on a trajectory.

Run the analysis with RMSD.run(), which stores the results in the array RMSD.rmsd:

frame    time (ps)    RMSD (A)

This class uses Douglas Theobald’s fast QCP algorithm [Theobald2005] to calculate the RMSD.

New in version 0.7.7.

Setting up the RMSD analysis.

The RMSD will be computed between select and reference for all frames in the trajectory in universe.

Arguments:
traj

universe (MDAnalysis.Universe object) that contains a trajectory

reference

reference coordinates; MDAnalysis.Universe object; if None the traj is used (uses the current time step of the object) [None]

select

The selection to operate on; can be one of:

  1. any valid selection string for select_atoms() that produces identical selections in mobile and reference; or
  2. a dictionary {'mobile':sel1, 'reference':sel2} (the MDAnalysis.analysis.align.fasta2select() function returns such a dictionary based on a ClustalW or STAMP sequence alignment); or
  3. a tuple (sel1, sel2)

When using 2. or 3. with sel1 and sel2 then these selections can also each be a list of selection strings (to generate a AtomGroup with defined atom order as described under Ordered selections).

groupselections

A list of selections as described for select. Each selection describes additional RMSDs to be computed after the structures have be superpositioned according to select. The output contains one additional column for each selection. [None]

Note

Experimental feature. Only limited error checking implemented.

filename

If set, filename can be used to write the results with RMSD.save() [None]

mass_weighted

do a mass-weighted RMSD fit

tol_mass

Reject match if the atomic masses for matched atoms differ by more than tol_mass [0.1]

ref_frame

frame index to select frame from reference [0]

New in version 0.7.7.

Changed in version 0.8: groupselections added

rmsd

Results are stored in this N×3 numpy.ndarray array, (frame, time (ps), RMSD (Å)).

run(**kwargs)[source]

Perform RMSD analysis on the trajectory.

A number of parameters can be changed from the defaults. The result is stored as the array RMSD.rmsd.

Keywords:
start, stop, step

start and stop frame index with step size: analyse trajectory[start:stop:step] [None]

mass_weighted

do a mass-weighted RMSD fit

tol_mass

Reject match if the atomic masses for matched atoms differ by more than tol_mass

ref_frame

frame index to select frame from reference

save(filename=None)[source]

Save RMSD from RMSD.rmsd to text file filename.

If filename is not supplied then the default provided to the constructor is used.

The data are saved with numpy.savetxt().

class MDAnalysis.analysis.rms.RMSF(atomgroup)[source]

Class to perform RMSF analysis on a set of atoms across a trajectory.

Run the analysis with RMSF.run(), which stores the results in the array RMSF.rmsf.

This class performs no coordinate transforms; RMSFs are obtained from atom coordinates as-is.

New in version 0.11.0.

Calculate RMSF of given atoms across a trajectory.

Arguments:
atomgroup

AtomGroup to obtain RMSF for

rmsf

Results are stored in this N-length numpy.ndarray array, giving RMSFs for each of the given atoms.

rmsf

RMSF data; only available after using RMSF.run()

run(start=0, stop=-1, step=1, progout=10, quiet=False)[source]

Calculate RMSF of given atoms across a trajectory.

This method implements an algorithm for computing sums of squares while avoiding overflows and underflows; please reference:

[Welford1962]B. P. Welford (1962). “Note on a Method for Calculating Corrected Sums of Squares and Products.” Technometrics 4(3):419-420.
Keywords:
start

starting frame [0]

stop

stopping frame [-1]

step

step between frames [1]

progout

number of frames to iterate through between updates to progress output; None for no updates [10]

quiet

if True, suppress all output (implies progout = None) [False]