5.15. Gromacs TRR file IO — MDAnalysis.coordinates.TRR
¶
The Gromacs TRR trajectory format is a lossless format like
e.g. the DCD format (see DCD
) and
unlike the XTC
format, which stores
reduced precision coordinates. Therefore, if one wants to convert
to Gromacs trajectories without loss of precision then one should
use the TRR format.
The TRR format can store velocities and forces in addition to coordinates. It is also used by other Gromacs tools to store and process other data such as modes from a principal component analysis.
The TRR I/O interface uses libxdrfile2
to implement random access to frames. This works by initially building an
internal index of all frames and then using this index for direct
seeks. Building the index is triggered by
read_trr_n_frames()
, which
typically happens when one accesses the TRRReader.n_frames
attribute
for the first time. Building the index may take many minutes for large
trajectories but afterwards access is faster than with native Gromacs tools.
Changed in version 0.8.0: The TRR I/O interface now uses
libxdrfile2
, which has seeking and
indexing capabilities. Note that unlike
libxdrfile
before it,
libxdrfile2
is distributed under the
GNU GENERAL PUBLIC LICENSE, version 2 (or higher).
Timestep
now correctly deals
with presence/absence of coordinate/velocity/force information on a
per-frame basis.
5.15.1. Tips and Tricks¶
5.15.1.1. Filling a TRR with PCA modes¶
The following recipe by Ramon Crehuet shows how to convert modes
stored in a NumPy-like array (e.g. from a PCA analysis with MMTK) to
a TRR usable by Gromacs. The idea is to manually fill a
Timestep
with the desired
values and then write it to a file with the appropriate
TRRWriter
. In order to
respect the Gromacs format for modes in a TRR file, one must write the
average coordinates in the first frame of the TRR and the modes into
subsequent ones. The mode number is stored in the
step
attribute
and the mode coordinates are filling the
positions
attribute of
Timestep
:
# 'modes' is a mode object with M PCs, similar to a MxNx3 array
# 'xav' the average coordinates, a Nx3 array for N atoms
N = len(xav) # number of atoms, i.e. number of coordinates
W = Writer('pca.trr', n_atoms=N) # TRR writer
ts = MDAnalysis.coordinates.TRR.Timestep(N) # TRR time step
# N of atoms is passed.
for frame,mode in enumerate(modes[4:16]):
ts.lmbda = -1
ts.frame = frame # manually change the frame number
ts._frame = frame - 1
if frame<=1:
ts.positions = xav
else:
ts.positions = mode.scaledToNorm(1.).array*10 # nm to angstroms
if frame <= 1:
ts.time = frame-1
else:
ts.time = mode.frequency
W.write(ts) # converts angstrom to nm for gmx
W.close()
5.15.2. Module reference¶
-
class
MDAnalysis.coordinates.TRR.
Timestep
(n_atoms, **kwargs)[source]¶ Timestep for a Gromacs TRR trajectory.
TRR Timestep always has positions, velocities and forces allocated, meaning that the _pos, _velocities and _forces attributes will always exist. Whether or not this data is valid for the current frame is determined by the flags has_positions, has_velocities and has_forces. These are controlled by the entries in the data dictionary: position_source, velocity_source and force_source. When accessing the .positions attribute, this will only be returned if position_source matches the current frame, otherwise a
NoDataError
will be raised. This scheme applies to both velocities and forces.When writing to the Timestep, it is important that the
frame
attribute is updated first! This is because the source of the data is taken as the current frame of the Timestep.For example:
ts = MDAnalysis.coordinates.TRR.Timestep(N) # N being the number of atoms ts.frame = 0 # The first frame ts.velocities = vel_array # Where vel_array is an existing array of shape (N, DIM) # This will also automatically set the `has_velocities` flag # And the velocity source will be "0"
Attempting to populate the array before setting the frame instead will cause the velocity data to be considered “out of date”:
ts = MDAnalysis.coordinates.TRR.Timestep(N) # N being the number of atoms ts.velocities = vel_array ts.frame = 0 # This frame number doesn't match the frame number of when # velocities was populated
Changed in version 0.8.0: TRR
Timestep
objects are now fully aware of the existence or not of coordinate/velocity/force information in frames.Changed in version 0.11.0: Velocities and forces can now be directly accessed via the velocities and forces attributes, with a
NoDataError
being raised if no data was given for that frame. Management redone to use has_positions, has_velocities and has_forces. Setting these will update the data dictionary entries which track the data. Timestep lmbda now stored in the data dictionary-
copy_slice
(sel)[source]¶ Make a new Timestep containing a subset of the original Timestep.
ts.copy_slice(slice(start, stop, skip))
ts.copy_slice([list of indices])
Returns: A Timestep object of the same type containing all header information and all atom information relevant to the selection. Note
The selection must be a 0 based slice or array of the atom indices in this Timestep
New in version 0.8.
Changed in version 0.11.0: Reworked to follow new Timestep API. Now will strictly only copy official attributes of the Timestep.
-
dimensions
¶ unitcell dimensions (A, B, C, alpha, beta, gamma)
- A, B, C are the lengths of the primitive cell vectors e1, e2, e3
- alpha = angle(e1, e2)
- beta = angle(e1, e3)
- gamma = angle(e2, e3)
-
dt
¶ The time difference in ps between timesteps
Note
This defaults to 1.0 ps in the absence of time data
New in version 0.11.0.
-
from_coordinates
(positions, velocities=None, forces=None, **kwargs)[source]¶ Create an instance of this Timestep, from coordinate data
New in version 0.11.0.
-
from_timestep
(other, **kwargs)[source]¶ Create a copy of another Timestep, in the format of this Timestep
New in version 0.11.0.
-
n_atoms
¶ A read only view of the number of atoms this Timestep has
Changed in version 0.11.0: Changed to read only property
-
time
¶ The time in ps of this timestep
This is calculated as:
time = ts.data['time_offset'] + ts.time
Or, if the trajectory doesn’t provide time information:
time = ts.data['time_offset'] + ts.frame * ts.dt
New in version 0.11.0.
-
triclinic_dimensions
¶ The unitcell dimensions represented as triclinic vectors
Returns: A (3, 3) numpy.ndarray of unit cell vectors For example:
>>> ts.dimensions array([ 13., 14., 15., 90., 90., 90.], dtype=float32) >>> ts.triclinic_dimensions array([[ 13., 0., 0.], [ 0., 14., 0.], [ 0., 0., 15.]], dtype=float32)
Setting the attribute also works:
>>> ts.triclinic_dimensions = [[15, 0, 0], [5, 15, 0], [5, 5, 15]] >>> ts.dimensions array([ 15. , 15.81138802, 16.58312416, 67.58049774, 72.45159912, 71.56504822], dtype=float32)
New in version 0.11.0.
-
volume
¶ volume of the unitcell
-
-
class
MDAnalysis.coordinates.TRR.
TRRReader
(filename, sub=None, **kwargs)[source]¶ Read a Gromacs TRR trajectory.
Changed in version 0.8.0:
Timestep
objects returned from TRR files now havehas_x
,has_velocities
, andhas_forces
flags reflecting whether coordinates/velocities/forces were read. Attempting to access such data when the corresponding flag is set toFalse
will raise aNoDataError
.Changed in version 0.11.0: Returned Timesteps will always be a
Timestep
instance. Attributes lmbda and status now stored in the TS.data dictionaryArguments: - filename
the name of the trr file.
Keywords: - sub
an numpy integer array of what subset of trajectory atoms to load into the timestep. Intended to work similarly to the ‘sub’ argument to Gromacs‘ trjconv.
This is usefull when one has a Universe loaded with only an unsolvated protein, and wants to read a solvated trajectory.
The length of this array must be <= to the actual number of atoms in the trajectory, and equal to number of atoms in the Universe.
- refresh_offsets
if
True
, do not retrieve stored offsets, but instead generate new ones; ifFalse
, use retrieved offsets if available [False
]
Changed in version 0.9.0: New keyword refresh_offsets
Changed in version 0.11.0: Renamed “delta” attribute to “dt” Now passes weakref of self to ts (as “_reader”)
-
OtherWriter
(filename, **kwargs)¶ Returns a writer appropriate for filename.
Sets the default keywords start, step and dt (if available). n_atoms is always set from
Reader.n_atoms
.See also
Reader.Writer()
andMDAnalysis.Writer()
-
Writer
(filename, **kwargs)¶ Returns a Gromacs TrjWriter for filename with the same parameters as this trajectory.
All values can be changed through keyword arguments.
Arguments: - filename
filename of the output trajectory
Keywords: - n_atoms
number of atoms
- dt
Time interval between frames.
- precision
accuracy for lossy XTC format as a power of 10 (ignored for TRR) [1000.0]
Returns: appropriate
TrjWriter
Changed in version 0.11.0: Changed “delta” keyword to “dt”
-
close
()¶ Close xdr trajectory file if it was open.
-
convert_forces_from_native
(force, inplace=True)¶ In-place conversion of forces array force from native units to base units.
By default, the input force is modified in place and also returned.
New in version 0.7.7.
-
convert_forces_to_native
(force, inplace=True)¶ In-place conversion of force array force from base units to native units.
By default, the input force is modified in place and also returned.
New in version 0.7.7.
-
convert_pos_from_native
(x, inplace=True)¶ In-place conversion of coordinate array x from native units to base units.
By default, the input x is modified in place and also returned.
Changed in version 0.7.5: Keyword inplace can be set to
False
so that a modified copy is returned unless no conversion takes place, in which case the reference to the unmodified x is returned.
-
convert_pos_to_native
(x, inplace=True)¶ Conversion of coordinate array x from base units to native units.
By default, the input x is modified in place and also returned.
Changed in version 0.7.5: Keyword inplace can be set to
False
so that a modified copy is returned unless no conversion takes place, in which case the reference to the unmodified x is returned.
-
convert_time_from_native
(t, inplace=True)¶ Convert time t from native units to base units.
By default, the input t is modified in place and also returned (although note that scalar values t are passed by value in Python and hence an in-place modification has no effect on the caller.)
Changed in version 0.7.5: Keyword inplace can be set to
False
so that a modified copy is returned unless no conversion takes place, in which case the reference to the unmodified x is returned.
-
convert_time_to_native
(t, inplace=True)¶ Convert time t from base units to native units.
By default, the input t is modified in place and also returned. (Also note that scalar values t are passed by value in Python and hence an in-place modification has no effect on the caller.)
Changed in version 0.7.5: Keyword inplace can be set to
False
so that a modified copy is returned unless no conversion takes place, in which case the reference to the unmodified x is returned.
-
convert_velocities_from_native
(v, inplace=True)¶ In-place conversion of velocities array v from native units to base units.
By default, the input v is modified in place and also returned.
New in version 0.7.5.
-
convert_velocities_to_native
(v, inplace=True)¶ In-place conversion of coordinate array v from base units to native units.
By default, the input v is modified in place and also returned.
New in version 0.7.5.
-
dt
¶ Time between two trajectory frames in picoseconds.
-
frame
¶ Frame number of the current time step.
This is a simple short cut to
Timestep.frame
.
-
load_offsets
(filename, check=False)¶ Loads current trajectory offsets from pickled filename.
Checks if ctime and size of trajectory file matches that stored in pickled filename. If either one does not match (and check ==
True
) then the offsets are not loaded. This is intended to conservatively avoid loading out-of-date offsets.- The offset file is expected to be a pickled dictionary with keys/values::
- ctime
- the ctime of the trajectory file
- size
- the size of the trajectory file
- offsets
- a numpy array of the offsets themselves
Arguments: - filename
filename of pickle file saved with
save_offsets()
with the frame offsets for the loaded trajectory
Keywords: - check
if False, ignore ctime and size check of trajectory file
Raises: IOError
if the file cannot be read (seeopen()
).
-
n_atoms
¶ The number of publically available atoms that this reader will store in the timestep.
If ‘sub’ was not given in the ctor, then this value will just be the actual number of atoms in the underlying trajectory file. If however ‘sub’ was given, then this value is the number specified by the ‘sub’ sub-selection.
If for any reason the trajectory cannot be read then a negative value is returned.
-
n_frames
¶ Read the number of frames from the trajectory.
The result is cached. If for any reason the trajectory cannot be read then 0 is returned.
This takes a long time because the frames are counted by iterating through the whole trajectory. If the trajectory was previously loaded and saved offsets exist, then loading will be significantly faster.
See also
TrjReader.load_offsets()
andTrjReader.save_offsets()
-
next
()¶ Forward one step to next frame.
-
open_trajectory
()¶ Open xdr trajectory file.
Returns: pointer to XDRFILE (and sets self.xdrfile) Raises: IOError
with code EALREADY if file was already opened or ENOENT if the file cannot be found
-
rewind
()¶ Position at beginning of trajectory
-
save_offsets
(filename)¶ Saves current trajectory offsets into filename, as a pickled object.
Along with the offsets themselves, the ctime and file size of the trajectory file are also saved. These are used upon load as a check to ensure the offsets still match the trajectory they are being applied to.
- The offset file is a pickled dictionary with keys/values::
- ctime
- the ctime of the trajectory file
- size
- the size of the trajectory file
- offsets
- a numpy array of the offsets themselves
Arguments: - filename
filename in which to save the frame offsets
-
time
¶ Time of the current frame in MDAnalysis time units (typically ps).
This is either read straight from the Timestep, or calculated as time =
Timestep.frame
*Timestep.dt
-
totaltime
¶ Total length of the trajectory n_frames * dt.
-
class
MDAnalysis.coordinates.TRR.
TRRWriter
(filename, n_atoms, start=0, step=1, dt=None, precision=1000.0, remarks=None, convert_units=None)[source]¶ Write a Gromacs TRR trajectory.
Create a new TrjWriter
Arguments: - filename
name of output file
- n_atoms
number of atoms in trajectory file
Keywords: - start
starting timestep frame; only used when dt is set.
- step
skip in frames between subsequent timesteps; only used when dt is set.
- dt
time between frames to use. If set will override any time information contained in the passed
Timestep
objects, which will otherwise be used. Thetime
attribute defaults to a timestep of to setting the trajectory time at 1 ps per step if there is no time information.- precision
accuracy for lossy XTC format as a power of 10 (ignored for TRR) [1000.0]
- convert_units
True
: units are converted to the MDAnalysis base format;None
selects the value ofMDAnalysis.core.flags
[‘convert_lengths’]. (see Flags)
Changed in version 0.8.0: The TRR writer is now able to write TRRs without coordinates/velocities/forces, depending on the properties available in the
Timestep
objects passed towrite()
.Changed in version 0.11.0: Keyword “delta” renamed to “dt”
-
convert_dimensions_to_unitcell
(ts)¶ Read dimensions from timestep ts and return Gromacs box vectors
-
convert_forces_from_native
(force, inplace=True)¶ In-place conversion of forces array force from native units to base units.
By default, the input force is modified in place and also returned.
New in version 0.7.7.
-
convert_forces_to_native
(force, inplace=True)¶ In-place conversion of force array force from base units to native units.
By default, the input force is modified in place and also returned.
New in version 0.7.7.
-
convert_pos_from_native
(x, inplace=True)¶ In-place conversion of coordinate array x from native units to base units.
By default, the input x is modified in place and also returned.
Changed in version 0.7.5: Keyword inplace can be set to
False
so that a modified copy is returned unless no conversion takes place, in which case the reference to the unmodified x is returned.
-
convert_pos_to_native
(x, inplace=True)¶ Conversion of coordinate array x from base units to native units.
By default, the input x is modified in place and also returned.
Changed in version 0.7.5: Keyword inplace can be set to
False
so that a modified copy is returned unless no conversion takes place, in which case the reference to the unmodified x is returned.
-
convert_time_from_native
(t, inplace=True)¶ Convert time t from native units to base units.
By default, the input t is modified in place and also returned (although note that scalar values t are passed by value in Python and hence an in-place modification has no effect on the caller.)
Changed in version 0.7.5: Keyword inplace can be set to
False
so that a modified copy is returned unless no conversion takes place, in which case the reference to the unmodified x is returned.
-
convert_time_to_native
(t, inplace=True)¶ Convert time t from base units to native units.
By default, the input t is modified in place and also returned. (Also note that scalar values t are passed by value in Python and hence an in-place modification has no effect on the caller.)
Changed in version 0.7.5: Keyword inplace can be set to
False
so that a modified copy is returned unless no conversion takes place, in which case the reference to the unmodified x is returned.
-
convert_velocities_from_native
(v, inplace=True)¶ In-place conversion of velocities array v from native units to base units.
By default, the input v is modified in place and also returned.
New in version 0.7.5.
-
convert_velocities_to_native
(v, inplace=True)¶ In-place conversion of coordinate array v from base units to native units.
By default, the input v is modified in place and also returned.
New in version 0.7.5.
-
has_valid_coordinates
(criteria, x)¶ Returns
True
if all values are within limit values of their formats.Due to rounding, the test is asymmetric (and min is supposed to be negative):
min < x <= maxArguments: - criteria
dictionary containing the max and min values in native units
- x
np.ndarray
of(x, y, z)
coordinates of atoms selected to be written out.
Returns: boolean