Table Of Contents

Previous topic

7.1. Core functions of MDAnalysis

Next topic

7.3. Atom selection Hierarchy — MDAnalysis.core.Selection

This Page

7.2. Fundamental building blocks — MDAnalysis.core.AtomGroup

The most important data structure in MDAnalysis is the AtomGroup, which contains Atom instances.

A Universe is the user-visible entry point and collects all information needed to analyze a structure or a whole trajectory.

Segments and residues are a way to refer to a collection of atoms. By convention, a Residue is a single amino acid, or a water molecule, ion, or ligand. A Segment is a collection of residues such as a whole protein or a chain in a protein or all the water in the system.

7.2.1. Class Hierarchy

A Universe contains Segments, which contain Residues, which contain Atoms; all containers are derived from AtomGroup, and thus one can always analyze them as a collection of atoms, independent of the hierarchical level.

Each Atom can only belong to a single Residue, and a Residue belongs to one specific Segment. This hierarchy can be described as

Segment > Residue > Atom

Depending on the use case, it can be more convenient to access data on, for instance, the basis of residues than atoms, or to write out individual chains (segments) of a protein. MDAnalysis simply provides access to these groupings and keeps track of where an atom belongs. Each object provides three attributes (atoms, residues or residue, segments or segment) that give access to the tiers in the hierarchy that the object belongs to.

7.2.2. Manipulating atoms, residues, and segments

When working with MDAnalysis it is useful to remember that the fundamental object is the Atom. Each particle in the topology is represented by exactly one Atom instance. One Atom, however, can be a member of multiple AtomGroup collections, for instance from different selections even though they all refer to the same Atom object. Thus, changing a property of a specific and Atom in one AtomGroup changes it “everywhere”.

The same is mostly true for Residue instances although they are derived from Atom instances: all Atom objects with the same Atom.resid are bundled into a single Residue with Residue.id = resid. This means that just changing, say, the residue name with a command such as

>>> r = u.selectAtoms("resid 99").residues[0]
>>> print(r)
<Residue 'ALA', 99>
>>> r.name = "UNK"
>>> print(r)
<Residue 'UNK', 99>
>>> rnew = u.selectAtoms("resid 99").residues[0]
>>> print(rnew)
<Residue 'UNK', 99>

will typically work as expected. When working with collections such as AtomGroup or ResidueGroup it is generally better to use provided setter methods such as AtomGroup.set_resname() or ResidueGroup.set_resname().

There are two cases when it is very important to use the setters:

Because residues are determined by the Atom.resid and segments by Atom.segid, the above methods take extra care to rebuild the list of segments and residues.

Note

AtomGroup.set_resid(), ResidueGroup.set_resid(), AtomGroup.set_segid(), ResidueGroup.set_segid() can change the topology: they can split or merge residues or segments.

Splitting/merging of residues is probably not very useful because no chemical rearrangements are carried out. Manipulating segments might be more useful in order to add additional structure to a Universe and provide instant segment selectors for interactive work:

u.selectAtoms("protein").set_segid("protein")
u.selectAtoms("resname POPE or resname POPC").set_segid("lipids")
u.selectAtoms("resname SOL").set_segid("water")
u.selectAtoms("resname NA or resname CL").set_segid("ions")

u.protein.numberOfResidues()
water_oxygens = u.water.OW

The setter methods have the additional advantage that they can assign lists. For instance, many MD codes number residues consecutively starting from 1. However, the original structure might be missing a few atoms at the N-terminus. Let’s say that the first residue is really residue 10. In order to store the canonical residue IDs (“resnum”) one could the use

import numpy as np
protein = u.selectAtoms("protein").residues
protein.set_resnum(np.array(protein.resnums()) + 9)

One can then use protein.select("resnum 42") to select the residue that has the canonical residue id 42 (instead of resid 33).

One can also read the resids directly from an original PDB file:

orig = MDAnalysis.Universe(“2jln.pdb”) protein.set_resnum(orig.selectAtoms(“protein”).resids())

7.2.3. Classes and functions

class MDAnalysis.core.AtomGroup.Universe(*args, **kwargs)[source]

The MDAnalysis Universe contains all the information describing the system.

The system always requires a topology file — in the simplest case just a list of atoms. This can be a CHARMM/NAMD PSF file or a simple coordinate file with atom informations such as PDB, Gromacs GRO, or CHARMM CRD. See Table of Supported topology formats for what kind of topologies can be read.

A trajectory provides coordinates; the coordinates have to be ordered in the same way as the list of atoms in the topology. A trajectory can be a single frame such as a PDB, CRD, or GRO file, or it can be a MD trajectory (in CHARMM/NAMD/LAMMPS DCD, Gromacs XTC/TRR, or generic XYZ format). See Table of Supported coordinate formats for what can be read as a “trajectory”.

As a special case, when the topology is a PDB, GRO or CRD file then the coordinates are immediately loaded from the “topology” file unless a trajectory is supplied.

Examples for setting up a universe:

u = Universe(topology, trajectory)          # read system from file(s)
u = Universe(pdbfile)                       # read atoms and coordinates from PDB or GRO
u = Universe(topology, [traj1, traj2, ...]) # read from a list of trajectories
u = Universe(topology, traj1, traj2, ...)   # read from multiple trajectories

Load new data into a universe (replaces old trajectory and does not append):

u.load_new(trajectory)                      # read from a new trajectory file

Select atoms, with syntax similar to CHARMM (see selectAtoms for details):

u.selectAtoms(...)

Attributes:

  • Universe.trajectory: currently loaded trajectory reader; Universe.trajectory.ts is the current time step
  • Universe.dimensions: current system dimensions (simulation unit cell, if set in the trajectory)
  • bonds, angles, dihedrals, impropers (low level access through Universe._psf)

Note

If atom attributes such as element, mass, or charge are not explicitly provided in the topology file then MDAnalysis tries to guess them (see MDAnalysis.topology.tables). This does not always work and if you require correct values (e.g. because you want to calculate the center of mass) then you need to make sure that MDAnalysis gets all the information needed. Furthermore, the list of bonds is only constructed when provided in the topology and never guessed (see Issue 23).

Changed in version 0.7.5: Can also read multi-frame PDB files with the PrimitivePDBReader.

Changed in version 0.8: Parse arbitrary number of arguments as a single topology file and a a sequence of trajectories.

Initialize the central MDAnalysis Universe object.

Arguments:
topologyfile

A CHARMM/XPLOR PSF topology file, PDB file or Gromacs GRO file; used to define the list of atoms. If the file includes bond information, partial charges, atom masses, ... then these data will be available to MDAnalysis. A “structure” file (PSF, PDB or GRO, in the sense of a topology) is always required.

coordinatefile

A trajectory (such as CHARMM DCD, Gromacs XTC/TRR/GRO, XYZ, XYZ.bz2) or a PDB that will provide coordinates, possibly multiple frames. If a list of filenames is provided then they are sequentially read and appear as one single trajectory to the Universe. The list can contain different file formats.

Deprecated since version 0.8: Do not use the coordinatefile keyword argument, just provide trajectories as positional arguments.

permissive

currently only relevant for PDB files: Set to True in order to ignore most errors and read typical MD simulation PDB files; set to False to read with the Bio.PDB reader, which can be useful for real Protein Databank PDB files. None selects the MDAnalysis default (which is set in MDAnalysis.core.flags) [None]

topology_format

provide the file format of the topology file; None guesses it from the file extension [None]

format

provide the file format of the coordinate or trajectory file; None guesses it from the file extension. Note that this keyword has no effect if a list of file names is supplied because the “chained” reader has to guess the file format for each individual list member. [None]

bonds

bond handling for PDB files. The default is to read and store the CONECT records only. When set to True it will attempt to guess connectivity between all atoms in the Universe. Each bond knows if it was guessed or was a CONECT record, so when saving out one can specify which ones to write out by

u = Universe("example.pdb")
u.atoms.write("output.pdb", bonds="conect") # default, only CONECT
u.atoms.write("output.pdb", bonds="all")
u.atoms.write("output.pdb", bonds=None)

This routine tries to do the right thing:

  1. If a pdb/gro file is provided instead of a psf and no coordinatefile then the coordinates are taken from the first file. Thus you can load a functional universe with

    u = Universe('1ake.pdb')
    

    If you want to specify the coordinate file format yourself you can do so using the format keyword:

    u = Universe('1ake.ent1', format='pdb')
    
  2. If only a topology file without coordinate information is provided one will have to load coordinates manually using Universe.load_new(). The file format of the topology file can be explicitly set with the topology_format keyword.

Changed in version 0.7.4: New topology_format and format parameters to override the file format detection.

coord[source]

Reference to current timestep and coordinates of universe.

The raw trajectory coordinates are Universe.coord._pos, represented as a numpy.float32 array.

Because coord is a reference to a Timestep, it changes its contents while one is stepping through the trajectory.

Note

In order to access the coordinates it is probably better to use the AtomGroup.coordinates() method; for instance, all coordinates of the Universe as a numpy array: Universe.atoms.coordinates().

dimensions[source]

Current dimensions of the unitcell

load_new(filename, **kwargs)[source]

Load coordinates from filename, using the suffix to detect file format.

Arguments:
filename

the coordinate file (single frame or trajectory) or a list of filenames, which are read one after another.

permissive

currently only relevant for PDB files: Set to True in order to ignore most errors and read typical MD simulation PDB files; set to False to read with the Bio.PDB reader, which can be useful for real Protein Databank PDB files. None selects the MDAnalysis default (which is set in MDAnalysis.core.flags) [None]

format

provide the file format of the coordinate or trajectory file; None guesses it from the file extension. Note that this keyword has no effect if a list of file names is supplied because the “chained” reader has to guess the file format for each individual list member [None]

kwargs

other kwargs are passed to the trajectory reader (only for advanced use)

Returns:

(filename, trajectory_format) or None if filename == None

Raises:

TypeError if trajectory format can not be determined or no appropriate trajectory reader found

Changed in version 0.8: If a list or sequence that is provided for filename only contains a single entry then it is treated as single coordinate file. This has the consequence that it is not read by the ChainReader but directly by its specialized file format reader, which typically has more features than the ChainReader.

selectAtoms(sel, *othersel)[source]

Select atoms using a CHARMM selection string.

Returns an AtomGroup with atoms sorted according to their index in the psf (this is to ensure that there aren’t any duplicates, which can happen with complicated selections).

Subselections can be grouped with parentheses.

Example::
>>> universe.selectAtoms("segid DMPC and not ( name H* or name O* )")
<AtomGroup with 3420 atoms>

Note

If exact ordering of atoms is required (for instance, for angle() or dihedral() calculations) then one supplies selections separately in the required order. Also, when multiple AtomGroup instances are concatenated with the + operator then the order of Atom instances is preserved and duplicates are not removed.

See also

Selection Commands for further details and examples.

The selection parser understands the following CASE SENSITIVE keywords:

Simple selections

protein, backbone, nucleic, nucleicbackbone
selects all atoms that belong to a standard set of residues; a protein is identfied by a hard-coded set of residue names so it may not work for esoteric residues.
segid seg-name
select by segid (as given in the topology), e.g. segid 4AKE or segid DMPC
resid residue-number-range
resid can take a single residue number or a range of numbers. A range consists of two numbers separated by a colon (inclusive) such as resid 1:5. A residue number (“resid”) is taken directly from the topology.
resnum resnum-number-range
resnum is the canonical residue number; typically it is set to the residue id in the original PDB structure.
resname residue-name
select by residue name, e.g. resname LYS
name atom-name
select by atom name (as given in the topology). Often, this is force field dependent. Example: name CA (for C&alpha; atoms) or name OW (for SPC water oxygen)
type atom-type
select by atom type; this is either a string or a number and depends on the force field; it is read from the topology file (e.g. the CHARMM PSF file contains numeric atom types). It has non-sensical values when a PDB or GRO file is used as a topology.
atom seg-name residue-number atom-name
a selector for a single atom consisting of segid resid atomname, e.g. DMPC 1 C2 selects the C2 carbon of the first residue of the DMPC segment

Boolean

not
all atoms not in the selection, e.g. not protein selects all atoms that aren’t part of a protein
and, or
combine two selections according to the rules of boolean algebra, e.g. protein and not (resname ALA or resname LYS) selects all atoms that belong to a protein, but are not in a lysine or alanine residue

Geometric

around distance selection
selects all atoms a certain cutoff away from another selection, e.g. around 3.5 protein selects all atoms not belonging to protein that are within 3.5 Angstroms from the protein
point x y z distance
selects all atoms within a cutoff of a point in space, make sure coordinate is separated by spaces, e.g. point 5.0 5.0 5.0  3.5 selects all atoms within 3.5 Angstroms of the coordinate (5.0, 5.0, 5.0)
prop [abs] property operator value
selects atoms based on position, using property x, y, or z coordinate. Supports the abs keyword (for absolute value) and the following operators: <, >, <=, >=, ==, !=. For example, prop z >= 5.0 selects all atoms with z coordinate greater than 5.0; prop abs z <= 5.0 selects all atoms within -5.0 <= z <= 5.0.

Connectivity

byres selection
selects all atoms that are in the same segment and residue as selection, e.g. specify the subselection after the byres keyword

Index

bynum index-range
selects all atoms within a range of (1-based) inclusive indices, e.g. bynum 1 selects the first atom in the universe; bynum 5:10 selects atoms 5 through 10 inclusive. All atoms in the MDAnalysis.Universe are consecutively numbered, and the index runs from 1 up to the total number of atoms.

Changed in version 0.7.4: Added resnum selection.

trajectory[source]

Reference to trajectory reader object containing trajectory data.

class MDAnalysis.core.AtomGroup.AtomGroup(atoms)[source]

A group of atoms.

ag = universe.selectAtoms(atom-list)

The AtomGroup contains a list of atoms; typically, a AtomGroup is generated from a selection. It is build from any list-like collection of Atom instances. It is also possible to create an empty AtomGroup from an empty list.

An AtomGroup can be indexed and sliced like a list:

ag[0], ag[-1]

will return the first and the last Atom in the group whereas the slice

ag[0:6:2]

returns every second element, corresponding to indices 0, 2, and 4.

It also supports “advanced slicing” when the argument is a numpy.ndarray or a list:

aslice = [0, 3, -1, 10, 3]
ag[aslice]

will return a new AtomGroup containing (ag[0], ag[3], ag[-1], ag[10], ag[3]).

Note

AtomGroups originating from a selection are sorted and duplicate elements are removed. This is not true for AtomGroups produced by slicing. Thus slicing can be used when the order of atoms is crucial (for instance, in order to define angles or dihedrals).

Atoms can also be accessed in a Pythonic fashion by using the atom name as an attribute. For instance,

ag.CA

will provide a AtomGroup of all CA atoms in the group. These instant selector attributes are auto-generated for each atom name encountered in the group.

Note

The name-attribute instant selector access to atoms is mainly meant for quick interactive work. Thus it either returns a single Atom if there is only one matching atom, or a new AtomGroup for multiple matches. This makes it difficult to use the feature consistently in scripts but it is much better for interactive work.

References for analysis methods

[Dima2004](1, 2) Dima, R. I., & Thirumalai, D. (2004). Asymmetry in the shapes of folded and denatured states of proteins. J Phys Chem B, 108(21), 6564-6570. doi:10.1021/jp037128y

Changed in version 0.7.6: An empty AtomGroup can be created and no longer raises a NoDataError.

_atoms[source]

immutable list of references to the atoms Atom in the group

_rebuild_caches()[source]

Rebuild all AtomGroup caches.

A number of lists and attributes are cached. These caches are lazily built the first time they are needed. When editing the topology it might happen that not all caches were synced properly (even though that this is supposed to happen eventually). In this case the user can manually force a complete cache rebuild.

Currently the following caches are used:

See also

_clear_caches()

New in version 0.7.5.

_clear_caches(*args)[source]

Clear cache for all args.

If no args are provided, all caches are cleared.

New in version 0.8.

_fill_cache(name, value)[source]

Populate __cache[name] with value.

New in version 0.8.

align_principalAxis(axis, vector)[source]

Align principal axis with index axis with vector.

Arguments:
axis

Index of the principal axis (0, 1, or 2), as produced by principalAxes().

vector

A 3D vector such as the z-axis ([0,0,1]); can be anything that looks like a list with three entries.

To align the long axis of a channel (the first principal axis, i.e. axis = 0) with the z-axis:

u.atoms.align_principalAxis(0, [0,0,1])
u.atoms.write("aligned.pdb")
angle()[source]

Returns the angle in degrees between atoms 0, 1, 2.

Angle between atoms 0 and 2 with apex at 1:

   2
  /
 /
1------0

Note

Only makes sense for a AtomGroup with exactly 3 Atom; anything else will raise a ValueError.

New in version 0.7.3.

angleDict[source]

A MDAnalysis.topology.core.TopologyDict of angles within this AtomGroup.

New in version 0.8.

asphericity()[source]

Asphericity.

See [Dima2004] for background information.

New in version 0.7.7.

atoms[source]

AtomGroup of all atoms in this group.

If this is a AtomGroup then it returns itself. Otherwise, it will return a new AtomGroup based on all Atom instances contained.

Apply :func:`list to atoms or use _atoms if you really only need a list of individual Atom instances.

bbox()[source]

Return the bounding box of the selection.

The lengths A,B,C of the orthorhombic enclosing box are

L = AtomGroup.bbox()
A,B,C = L[1] - L[0]
Returns:[[xmin, ymin, zmin], [xmax, ymax, zmax]]

New in version 0.7.2.

bfactors[source]

Crystallographic B-factors (from PDB) in A**2.

Deprecated since version 0.8: This managed attribute will become a method in 0.8 in order to provide a unified interface to querying properties: AtomGroup.bfactors will become AtomGroup.bfactors()

bond()[source]

Returns the distance between atoms in a 2-atom group.

Distance between atoms 0 and 1:

0---1

Note

Only makes sense for a AtomGroup with exactly 2 Atom; anything else will raise a ValueError.

New in version 0.7.3.

bondDict[source]

A MDAnalysis.topology.core.TopologyDict of bonds within this AtomGroup.

New in version 0.8.

bsphere()[source]

Return the bounding sphere of the selection.

The sphere is calculated relative to the centre of geometry.

Returns:(R, [xcen,ycen,zcen])

New in version 0.7.3.

centerOfGeometry()[source]

Center of geometry (also known as centroid) of the selection.

centerOfMass()[source]

Center of mass of the selection.

centroid()

Center of geometry (also known as centroid) of the selection.

charges()[source]

Array of partial charges of the atoms (as defined in the topology)

coordinates(ts=None, copy=False, dtype=<type 'numpy.float32'>)

NumPy array of the coordinates.

See also

positions and get_positions()

Deprecated since version 0.7.6: In new scripts use AtomGroup.get_positions() preferrably.

dihedral()[source]

Calculate the dihedral angle in degrees.

Dihedral angle around axis connecting atoms 1 and 2 (i.e. the angle between the planes spanned by atoms (0,1,2) and (1,2,3)):

        3
        |
  1-----2
 /
0

Note

Only makes sense for a AtomGroup with exactly 4 Atom; anything else will raise a ValueError.

New in version 0.7.0.

dimensions[source]

Dimensions of the Universe to which the group belongs, at the current time step.

forces

Forces on the atoms in the AtomGroup.

The forces can be changed by assigning an array of the appropriate shape, i.e. either Nx3 to assign individual force or 3, to assign the same force to all atoms (e.g. ag.forces = array([0,0,0]) will set all forces to (0.,0.,0.)).

For more control use the get_forces() and set_forces() methods.

New in version 0.7.7.

get_forces(ts=None, copy=False, dtype=<type 'numpy.float32'>)[source]

Get a NumPy array of the atomic forces (if available). Currently only supported for Gromacs .trr trajectories.

Keywords:
ts

If ts is provided then positions are read from that Timestep instead of the one from the current trajectory belonging to this universe. The ts is indexed with the indices returned by indices() and it is the user’s responsibility to provide a time step that has the appropriate dimensions.

copy

True: always make a copy (slow), False: Try to return a array view or reference (faster); note that for passing coordinates to C-code it can be necessary to use a copy [False]

dtype

NumPy Data type of the array; the default is usually entirely appropriate. Most C-code actually requires the default [numpy.float32]

Forces can also be directly obtained from the attribute forces.

Forces can be directly set with set_forces() or by assigning to forces.

New in version 0.7.7.

get_positions(ts=None, copy=False, dtype=<type 'numpy.float32'>)[source]

Get a NumPy array of the coordinates.

Keywords:
ts

If ts is provided then positions are read from that Timestep instead of the one from the current trajectory belonging to this universe. The ts is indexed with the indices returned by indices() and it is the user’s responsibility to provide a time step that has the appropriate dimensions.

copy

True: always make a copy (slow), False: Try to return a array view or reference (faster); note that for passing coordinates to C-code it can be necessary to use a copy [False]

dtype

NumPy Data type of the array; the default is usually entirely appropriate. Most C-code actually requires the default [numpy.float32]

Coordinates can also be directly obtained from the attribute positions.

Coordinates can be directly set with set_positions() or by assigning to positions.

This method is identical with coordinates() but named differently for symmetry with with set_positions().

New in version 0.7.6.

get_velocities(ts=None, copy=False, dtype=<type 'numpy.float32'>)[source]

NumPy array of the velocities.

Raises a NoDataError if the underlying Timestep does not contain _velocities.

See also AtomGroup.set_velocities() and attribute access through AtomGroup.velocities.

New in version 0.7.6.

improper()[source]

Returns the improper dihedral between 4 atoms.

The improper dihedral is calculated in the same way as the proper dihedral(): The angle between the planes formed by atoms (0,1,2) and (1,2,3).

Note

Only makes sense for a AtomGroup with exactly 4 Atom; anything else will raise a ValueError. The interpretation of the angle as an “improper” solely depends on the selection of atoms and thus the user input!

New in version 0.7.3.

indices()[source]

Array of all Atom.number in the group.

These indices are 0-based and can be used to directly index Universe.atoms or the coordinate array MDAnalysis.coordinates.base.Timestep._pos.

masses()[source]

Array of atomic masses (as defined in the topology)

momentOfInertia()[source]

Tensor of inertia as 3x3 NumPy array.

names()[source]

Returns a list of atom names.

Changed in version 0.8: Returns a numpy.ndarray

numberOfAngleTypes()[source]

Number of different angle types in this AtomGroup

numberOfAtoms()[source]

Total number of atoms in the group

numberOfBondTypes()[source]

Number of different bond types in this AtomGroup

numberOfResidues()[source]

Total number of residues in the group

numberOfSegments()[source]

Total number of segments in the group

numberOfTorsionTypes()[source]

Number of different torsions types in this AtomGroup

packintobox(box=None)[source]

Shift all atoms in this group to be within the primary unit cell.

AtomGroup.packintobox([box])

All atoms will be moved so that they lie between 0 and boxlength \(L_i\) in all dimensions, i.e. the lower left corner of the simulation box is taken to be at (0,0,0):

\[x_i' = x_i - \left\lfloor \]

rac{x_i}{L_i} ight floor

The default is to take unit cell information from the underlying Timestep instance. The optional argument box can be used to provide alternative unit cell information (in the MDAnalysis standard format [Lx, Ly, Lz, alpha, beta, gamma]).

Warning

Currently only works with orthogonal boxes. See Issue 136

positions

Coordinates of the atoms in the AtomGroup.

The positions can be changed by assigning an array of the appropriate shape, i.e. either Nx3 to assign individual coordinates or 3, to assign the same coordinate to all atoms (e.g. ag.positions = array([0,0,0]) will move all particles to the origin).

For more control use the get_positions() and set_positions() methods.

New in version 0.7.6.

principalAxes()[source]

Calculate the principal axes from the moment of inertia.

e1,e2,e3 = AtomGroup.principalAxes()

The eigenvectors are sorted by eigenvalue, i.e. the first one corresponds to the highest eigenvalue and is thus the first principal axes.

Returns:numpy.array v with v[0] as first, v[1] as second, and v[2] as third eigenvector.
radii()[source]

Array of atomic radii (as defined in the PQR file)

radiusOfGyration()[source]

Radius of gyration.

resids()[source]

Returns a list of residue numbers.

Changed in version 0.8: Returns a numpy.ndarray

residues[source]

Read-only list of Residue objects.

A ResidueGroup of all residues that contain atoms in this group.

resnames()[source]

Returns a list of residue names.

Changed in version 0.8: Returns a numpy.ndarray

resnums()[source]

Returns a list of canonical residue numbers.

New in version 0.7.4.

Changed in version 0.8: Returns a numpy.ndarray

rotate(R)[source]

Apply a rotation matrix R to the selection’s coordinates.

AtomGroup.rotate(R)

R is a 3x3 orthogonal matrix that transforms x –> x’:

x’ = R.x
rotateby(angle, axis, point=None)[source]

Apply a rotation to the selection’s coordinates.

AtomGroup.rotateby(angle,axis[,point])

The transformation from current coordinates x to new coordinates x’ is

x’ = R.(x-p) + p

where R is the rotation by angle around the axis going through point p.

Arguments:
angle

rotation angle in degrees

axis

rotation axis vector, a 3-tuple, list, or array, or a 2-tuple of two MDAnalysis objects from which the axis is calculated as the vector from the first to the second center of geometry.

point

point on the rotation axis; by default (None) the center of geometry of the selection is chosen, or, if axis is a tuple of selections, it defaults to the first point of the axis. point can be a 3-tuple, list, or array or a MDAnalysis object (in which case its centroid() is used).

Returns:

The 4x4 matrix which consists of the rotation matrix M[:3,:3] and the translation vectort M[:3,3].

segids()[source]

Returns a list of segment ids (=segment names).

Changed in version 0.8: Returns a numpy.ndarray

segments[source]

Read-only list of Segment objects.

A SegmentGroup of all segments that contain atoms in this group.

selectAtoms(sel, *othersel)[source]

Selection of atoms using the MDAnalysis selection syntax.

AtomGroup.selectAtoms(selection[,selection[,...]])

selectBonds(criteria)[source]

Select all of a given bond, angle or torsion type from the AtomGroup. This method makes use of the MDAnalysis.topology.core.TopologyDict instances for this AtomGroup.

Usage::
ag.selectBonds(criteria)

criteria must match a key from the appropriate TopologyDict, viewable through the MDAnalysis.topology.core.TopologyDict.keys() method. These keys are a tuple of the atom types in the bond.

Returns:a MDAnalysis.topology.core.TopologyGroup

New in version 0.8.

set(name, value, **kwargs)

Set attribute name to value for all atoms in the AtomGroup.

If value is a sequence of the same length as the AtomGroup then each Atom‘s property name is set to the corresponding value. If value is neither of length 1 (or a scalar) nor of the length of the AtomGroup then a ValueError is raised.

New in version 0.7.4.

Changed in version 0.8: Can set atoms to distinct values by providing a sequence or iterable.

set_bfactor(bfactor)[source]

Set the atom bfactor to float bfactor for all atoms in the AtomGroup.

New in version 0.7.4.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable.

set_charge(charge)[source]

Set the partial charge to float charge for all atoms in the AtomGroup.

New in version 0.7.4.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable.

set_forces(forces, ts=None)[source]

Set the forces for all atoms in the group.

Arguments:
forces

a Nx3 NumPy numpy.ndarray where N is the number of atoms in this atom group.

Keywords:
ts

Timestep, defaults to None and then the current time step is used.

Note

If the group contains N atoms and force is a single point (i.e. an array of length 3) then all N atom positions are set to force (due to NumPy’s broadcasting rules), as described for forces.

See also get_forces() and attribute access through forces.

New in version 0.7.7.

set_mass(mass)[source]

Set the atom mass to float mass for all atoms in the AtomGroup.

New in version 0.7.4.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable.

set_name(name)[source]

Set the atom name to string for all atoms in the AtomGroup.

If value is a sequence of the same length as the AtomGroup then each Atom.name is set to the corresponding value. If value is neither of length 1 (or a scalar) nor of the length of the AtomGroup then a ValueError is raised.

New in version 0.7.4.

Changed in version 0.8: Can set atoms to distinct values by providing a sequence or iterable.

set_positions(coords, ts=None)[source]

Set the positions for all atoms in the group.

Arguments:
coords

a Nx3 NumPy numpy.ndarray where N is the number of atoms in this atom group.

Keywords:
ts

Timestep, defaults to None and then the current time step is used.

Note

If the group contains N atoms and coord is a single point (i.e. an array of length 3) then all N atom positions are set to coord (due to NumPy’s broadcasting rules), as described for positions.

See also get_positions() and attribute access through positions.

New in version 0.7.6.

set_radius(radius)[source]

Set the atom radius to float radius for all atoms in the AtomGroup.

New in version 0.7.4.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable.

set_resid(resid)[source]

Set the resid to integer resid for all atoms in the AtomGroup.

If resid is a sequence of the same length as the AtomGroup then each Atom.resid is set to the corresponding value together with the Residue.id of the residue the atom belongs to. If value is neither of length 1 (or a scalar) nor of the length of the AtomGroup then a ValueError is raised.

Note

Changing resids can change the topology.

Assigning the same resid to multiple residues will merge these residues. Assigning different resid to atoms in the same residue will split a residue (and potentially merge with another one).

New in version 0.7.4.

Changed in version 0.7.5: Also changes the residues.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable and can change the topology via MDAnalysis.topology.core.build_residues().

set_resname(resname)[source]

Set the resname to string resname for all atoms in the AtomGroup.

If resname is a sequence of the same length as the AtomGroup then each Atom.resname is set to the corresponding value together with the Residue.name of the residue the atom belongs to. If value is neither of length 1 (or a scalar) nor of the length of the AtomGroup then a ValueError is raised.

New in version 0.7.4.

Changed in version 0.7.5: Also changes the residues.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable.

set_resnum(resnum)[source]

Set the resnum to resnum for all atoms in the AtomGroup.

If resnum is a sequence of the same length as the AtomGroup then each Atom.resnum is set to the corresponding value together with the Residue.resnum of the residue the atom belongs to. If value is neither of length 1 (or a scalar) nor of the length of the AtomGroup then a ValueError is raised.

Note

Changing resnum will not affect the topology: you can have multiple residues with the same resnum.

See also

set_resid()

New in version 0.7.4.

Changed in version 0.7.5: Also changes the residues.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable.

set_segid(segid, buildsegments=True)[source]

Set the segid to segid for all atoms in the AtomGroup.

If segid is a sequence of the same length as the AtomGroup then each Atom.segid is set to the corresponding value together with the Segment.id of the residue the atom belongs to. If value is neither of length 1 (or a scalar) nor of the length of the AtomGroup then a ValueError is raised.

Note

set_segid() can change the topology.

With the default buildsegments = True it can be used to join segments or to break groups into multiple disjoint segments. Note that each Atom can only belong to a single Segment.

For performance reasons, buildsegments can be set to False. Then one needs to run Universe._build_segments() manually later in order to update the list of Segment instances and regenerate the segid instant selectors.

New in version 0.7.4.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable.

set_type(atype)[source]

Set the atom type to atype for all atoms in the AtomGroup.

New in version 0.7.4.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable.

set_velocities(v, ts=None)[source]

Assign the velocities v to the timestep.

Raises a NoDataError if the underlying Timestep does not contain _velocities.

See also AtomGroup.get_velocities() and AtomGroup.velocities for attribute access.

New in version 0.7.6.

shapeParameter()[source]

Shape parameter.

See [Dima2004] for background information.

New in version 0.7.7.

torsionDict[source]

A MDAnalysis.topology.core.TopologyDict of angles within this AtomGroup.

New in version 0.8.

totalCharge()[source]

Sum of all partial charges (must be defined in topology).

totalMass()[source]

Total mass of the selection (masses are taken from the topology or guessed).

transform(M)[source]

Apply homogenous transformation matrix M to the coordinates.

The matrix M must be a 4x4 matrix, with the rotation in M[:3,:3] and the translation in M[:3,3].

The rotation is applied before the translation:

x' = R.x + t
translate(t)[source]

Apply translation vector t to the selection’s coordinates.

>>> AtomGroup.translate(t)
>>> AtomGroup.translate((A, B))

The method applies a translation to the AtomGroup from current coordinates x to new coordinates x’:

x' = x + t

The translation can also be given as a tuple of two MDAnalysis objects such as two selections (selA, selB), i.e. two AtomGroup, or two Atom. The translation vector is computed as the difference between the centers of geometry (centroid) of B and A:

t = B.centroid() - A.centroid()
ts[source]

Temporary Timestep that contains the selection coordinates.

A Timestep instance, which can be passed to a trajectory writer.

If ts is modified then these modifications will be present until the frame number changes (which typically happens when the underlying trajectory frame changes).

It is not possible to assign a new Timestep to the AtomGroup.ts attribute; change attributes of the object.

universe[source]

The universe to which the atoms belong (read-only).

velocities(**kwargs)[source]

NumPy array of the velocities of the atoms in the group.

If the trajectory does not contain velocity information then a NoDataError is raised.

New in version 0.7.5.

Deprecated since version 0.7.6: In 0.8 this will become an attribute! You can already use get_velocities() and set_velocities().

write(filename=None, format='PDB', filenamefmt='%(trjname)s_%(frame)d', **kwargs)[source]

Write AtomGroup to a file.

AtomGroup.write(filename[,format])

Keywords:
filename

None: create TRJNAME_FRAME.FORMAT from filenamefmt [None]

format

PDB, CRD, GRO; case-insensitive and can also be supplied as the filename extension [PDB]

filenamefmt

format string for default filename; use substitution tokens ‘trjname’ and ‘frame’ [“%(trjname)s_%(frame)d”]

bonds

how to handle bond information, especially relevant for PDBs [“conect”] - write only the CONECT records defined in the original file “all” - write out all bonds, both the original defined and those

guessed by MDAnalysis

None - do not write out bonds

write_selection(filename=None, format='vmd', filenamefmt='%(trjname)s_%(frame)d', **kwargs)[source]

Write AtomGroup selection to a file to be used in another programme.

Keywords:
filename

None: create TRJNAME_FRAME.FORMAT from filenamefmt

format

output file format: VMD (tcl), PyMol (pml), Gromacs (ndx), CHARMM (str); can also be supplied as the filename extension. Case insensitive. [vmd]

filenamefmt

format string for default filename; use ‘%(trjname)s’ and ‘%(frame)s’ placeholders; the extension is set according to the format [“%(trjname)s_%(frame)d”]

kwargs

additional keywords are passed on to the appropriate SelectionWriter

class MDAnalysis.core.AtomGroup.Atom(number, name, type, resname, resid, segid, mass, charge, residue=None, segment=None, radius=None, bfactor=None, resnum=None, serial=None)[source]

A class representing a single atom.

Atom is the basic building block of all larger data structures in MDAnalysis, in particular of the AtomGroup.

An Atom is typically generated by a topology reader from MDAnalysis.topology.

For performance reasons, only a predefined number of attributes are included (and thus it is not possible to add attributes “on the fly”; they have to be included in the class definition).

number

atom number

segid

name of the segment

resid

residue number

resnum

canonical residue number as, for instance, used in the original PDB file

New in version 0.7.4.

resname

residue name

residue

Residue object containing the atoms

id

atom number inside the residue

name

string, short name

type

string or number (from force field), describing the atom type; stored as a string.

Changed in version 0.7.6: The Atom.type attribute is always stored as a string.

mass

float, in atomic mass units (u)

charge

float, in electron charges (e)

radius

Born-radius for electrostatic calculations. (Only if read from a PQR file with the PQRReader.)

centroid()[source]

The centroid of an atom is its position, Atom.pos.

pos[source]

deprecated, use position, get the Current cartesian coordinates of the atom.

position[source]

Get the current cartesian coordinates of the atom. @return: a numpy 1x3 array

universe[source]

a pointer back to the Universe

velocity[source]

Current velocity of the atom.

A NoDataError is raised if the trajectory does not contain velocities.

New in version 0.7.5.

class MDAnalysis.core.AtomGroup.Residue(name, id, atoms, resnum=None)[source]

A group of atoms corresponding to a residue.

Pythonic access to atoms:
  • Using a atom name as attribute returns the matching atom (a Atom instance), i.e. r.name. Example:

    >>> from MDAnalysis.tests.datafiles import PSF,DCD
    >>> u = Universe(PSF,DCD)
    >>> print(u.s4AKE.r1.CA)  # C-alpha of M1
    < Atom 5: name 'CA' of type '22' of resname 'MET', resid 1 and segid '4AKE'>
    
  • r['name'] or r[id] - returns the atom corresponding to that name

Data:
Residue.name

Three letter residue name.

Residue.id

Numeric (integer) resid, taken from the topology.

Residue.resnum

Numeric canonical residue id (e.g. as used in the PDB structure).

Note

Creating a Residue modifies the underlying Atom instances. Each Atom can only belong to a single Residue.

Changed in version 0.7.4: Added Residue.resnum attribute and resnum keyword argument.

chi1_selection()[source]

AtomGroup corresponding to the chi1 sidechain dihedral N-CA-CB-CG.

Returns:4-atom selection in the correct order. If no CB and/or CG is found then this method returns None.

New in version 0.7.5.

omega_selection()[source]

AtomGroup corresponding to the omega protein backbone dihedral CA-C-N’-CA’.

omega describes the -C-N- peptide bond. Typically, it is trans (180 degrees) although cis-bonds (0 degrees) are also occasionally observed (especially near Proline).

Returns:4-atom selection in the correct order. If no C’ found in the previous residue (by resid) then this method returns None.
phi_selection()[source]

AtomGroup corresponding to the phi protein backbone dihedral C’-N-CA-C.

Returns:4-atom selection in the correct order. If no C’ found in the previous residue (by resid) then this method returns None.
psi_selection()[source]

AtomGroup corresponding to the psi protein backbone dihedral N-CA-C-N’.

Returns:4-atom selection in the correct order. If no N’ found in the following residue (by resid) then this method returns None.
class MDAnalysis.core.AtomGroup.ResidueGroup(residues)[source]

A group of residues.

Pythonic access to atoms:
  • Using a atom name as attribute returns a list of all atoms (a AtomGroup) of the same name. Example:

    >>> from MDAnalysis.tests.datafiles import PSF,DCD
    >>> u = Universe(PSF,DCD)
    >>> print(u.s4AKE.MET.CA)  # C-alpha of all Met
    <AtomGroup with 6 atoms>
    
Data:ResidueGroup._residues

Initialize the ResidueGroup with a list of Residue instances.

set(name, value, **kwargs)

Set attribute name to value for all residues in the ResidueGroup.

If value is a sequence of the same length as the ResidueGroup (AtomGroup.residues) then each Residue‘s property name is set to the corresponding value. If value is neither of length 1 (or a scalar) nor of the length of the ResidueGroup then a ValueError is raised.

New in version 0.7.5.

Changed in version 0.8: Can set residues to distinct values by providing a sequence or iterable.

set_resid(resid)[source]

Set the resid to integer resid for all residues in the ResidueGroup.

If resid is a sequence of the same length as the ResidueGroup then each Atom.resid is set to the corresponding value together with the Residue.id of the residue the atom belongs to. If value is neither of length 1 (or a scalar) nor of the length of the AtomGroup then a ValueError is raised.

Note

Changing resids can change the topology.

Assigning the same resid to multiple residues will merge these residues. The new residue name will be the name of the first old residue in the merged residue.

Warning

The values of this ResidueGroup are not being changed. You must create a new :class:`ResidueGroup` from the :class:`Universe` — only Atom instances are changed, everything else is derived from these atoms.

New in version 0.8.

set_resname(resname)[source]

Set the resname to string resname for all residues in the ResidueGroup.

If resname is a sequence of the same length as the ResidueGroup then each Atom.resname is set to the corresponding value together with the Residue.name of the residue the atom belongs to. If value is neither of length 1 (or a scalar) nor of the length of the AtomGroup then a ValueError is raised.

New in version 0.7.4.

Changed in version 0.7.5: Also changes the residues.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable.

set_resnum(resnum)[source]

Set the resnum to resnum for all residues in the ResidueGroup.

If resnum is a sequence of the same length as the ResidueGroup then each Atom.resnum is set to the corresponding value together with the Residue.resnum of the residue the atom belongs to. If value is neither of length 1 (or a scalar) nor of the length of the AtomGroup then a ValueError is raised.

Note

Changing resnum will not affect the topology: you can have multiple residues with the same resnum.

See also

set_resid()

New in version 0.7.4.

Changed in version 0.7.5: Also changes the residues.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable.

class MDAnalysis.core.AtomGroup.Segment(name, residues)[source]

A group of residues corresponding to one segment of the topology.

Pythonic access to residues:

  • The attribute rN returns the N-th residue Residue of the segment (numbering starts at N=1). Example:

    >>> from MDAnalysis.tests.datafiles import PSF,DCD
    >>> u = Universe(PSF,DCD)
    >>> print(u.s4AKE.r1)
    <Residue 'MET', 1>
    
  • Using a residue name as attribute returns a list of all residues (a ResidueGroup) of the same name. Example:

    >>> from MDAnalysis.tests.datafiles import PSF,DCD
    >>> u = Universe(PSF,DCD)
    >>> print(u.s4AKE.CYS)
    <ResidueGroup [<Residue 'CYS', 77>]>
    >>> print(u.s4AKE.MET)
    <ResidueGroup [<Residue 'MET', 1>, <Residue 'MET', 21>, <Residue 'MET', 34>, <Residue 'MET', 53>, <Residue 'MET', 96>, <Residue 'MET', 174>]>
    
Data:Segment.name is the segid from the topology or the chain identifier when loaded from a PDB

Initialize a Segment with segid name from a list of Residue instances.

id[source]

Segment id (alias for Segment.name)

id_setter[source]

Segment id (alias for Segment.name)

class MDAnalysis.core.AtomGroup.SegmentGroup(segments)[source]

A group of segments.

Pythonic access to segments:
  • Using a segid as attribute returns the segment. Because of python language rule, any segid starting with a non-letter character is prefixed with ‘s’, thus ‘4AKE’ –> ‘s4AKE’.

    Example:

    >>> from MDAnalysis.tests.datafiles import PSF,DCD
    >>> u = Universe(PSF,DCD)
    >>> print(u.atoms.segments.s4AKE)  # segment 4AKE
    <AtomGroup with 3314 atoms>
    
  • Indexing the group returns the appropriate segment.

Data:SegmentGroup._segments

Initialize the SegmentGroup with a list of Segment instances.

set(name, value, **kwargs)

Set attribute name to value for all Segment in this AtomGroup.

If value is a sequence of the same length as the SegmentGroup (AtomGroup.residues) then each Segment‘s property name is set to the corresponding value. If value is neither of length 1 (or a scalar) nor of the length of the SegmentGroup then a ValueError is raised.

New in version 0.8.

set_segid(segid, buildsegments=True)[source]

Set the segid to segid for all atoms in the SegmentGroup.

If segid is a sequence of the same length as the SegmentGroup then each Atom.segid is set to the corresponding value together with the Segment.id of the segment the atom belongs to. If value is neither of length 1 (or a scalar) nor of the length of the AtomGroup then a ValueError is raised.

Note

set_segid() can change the topology.

With the default buildsegments = True it can be used to join segments or to break groups into multiple disjoint segments. Note that each Atom can only belong to a single Segment.

For performance reasons, buildsegments can be set to False. Then one needs to run Universe._build_segments() manually later in order to update the list of Segment instances and regenerate the segid instant selectors.

New in version 0.7.4.

Changed in version 0.8: Can set atoms and residues to distinct values by providing a sequence or iterable.

MDAnalysis.core.AtomGroup.asUniverse(*args, **kwargs)[source]

Return a universe from the input arguments.

  1. If the first argument is a universe, just return it:

    as_universe(universe) --> universe
  2. Otherwise try to build a universe from the first or the first and second argument:

    asUniverse(PDB, **kwargs) --> Universe(PDB, **kwargs)
    asUniverse(PSF, DCD, **kwargs) --> Universe(PSF, DCD, **kwargs)
    asUniverse(*args, **kwargs) --> Universe(*args, **kwargs)
Returns:an instance of Universe
exception MDAnalysis.core.AtomGroup.SelectionError

Raised when a atom selection failed.

x.__init__(...) initializes x; see help(type(x)) for signature

exception MDAnalysis.core.AtomGroup.SelectionWarning

Warning indicating a possible problem with a selection.

x.__init__(...) initializes x; see help(type(x)) for signature

exception MDAnalysis.core.AtomGroup.NoDataError

Raised when empty input is not allowed or required data are missing.

x.__init__(...) initializes x; see help(type(x)) for signature