MDAnalysis reads coordinates from PDB files and additional optional data such as B-factors. It is also possible to substitute a PDB file instead of PSF file in order to define the list of atoms (but no connectivity information will be available in this case).
PDB files contain both coordinate and atom information. It is also possible to write trajectories as multi-frame (or multi-model) PDB files. This is not very space efficient but is sometimes the lowest common denominator for exchanging trajectories. Single frame and multi-frame PDB files are automatically recognized; the only difference is thath the single-frame PDB is represented as a trajectory with only one frame.
In order to write a selection to a PDB file one typically uses the MDAnalysis.core.AtomGroup.AtomGroup.write() method of the selection:
calphas = universe.selectAtoms("name CA")
calphas.write("calpha_only.pdb")
This uses the coordinates from the current timestep of the trajectory.
In order to write a PDB trajectory one needs to first obtain a multi-frame writer (keyword multiframe = True) and then iterate through the trajectory, while writing each frame:
calphas = universe.selectAtoms("name CA")
W = MDAnalysis.Writer("calpha_traj.pdb", multiframe=True)
for ts in u.trajectory:
W.write(calphas)
W.close()
It is important to always close the trajectory when done because only at this step is the final END record written, which is required by the PDB standard.
Two different implementations of PDB I/O are available: the “permissive” and the “strict” Reader/Writers. The default are the “permissive” ones but this can be changed by setting the flag “permissive_pdb_reader” in MDAnalysis.core.flags (see Flags) to False:
MDAnalysis.core.flags["permissive_pdb_reader"] = False
The default for MDAnalysis is to use the “permissive” PrimitivePDBReader and PrimitivePDBWriter, corresponding to
MDAnalysis.core.flags["permissive_pdb_reader"] = True
On a case-by-case basis one kind of reader can be selected with the permissive keyword to Universe, e.g.
u = MDAnalysis.Universe(PDB, permissive=False)
would select PDBReader instead of the default PrimitivePDBReader.
All classes make use of the Timestep class, which represents the current timestep.
A pure-Python implementation for PDB files commonly encountered in MD simulations comes under the names PrimitivePDBReader and PrimitivePDBWriter. It only implements a subset of the PDB standard (for instance, it does not deal with insertion codes) and also allows some typical enhancements such as 4-letter resids (introduced by CHARMM/NAMD). The “primitive PDB Reader/Writer” are the default in MDAnalysis (equivalent to supplying the permissive = True keyword to Universe).
The PrimitivePDBReader can read multi-frame PDB files and represents them as a trajectory. The PrimitivePDBWriter can write single and multi-frame PDB files as specified by the multiframe keyword. By default, it writes single frames. On the other hand, the MultiPDBWriter is set up to write a PDB trajectory by default (equivalent to using multiframe = True).
In order to write a multi-frame PDB trajectory from a universe u one can do the following:
pdb = MDAnalysis.Writer("all.pdb", multiframe=True)
for ts in u.trajectory:
pdb.write(u)
pdb.close()
Similarly, writing only a protein:
pdb = MDAnalysis.Writer("protein.pdb", multiframe=True)
protein = u.selectAtoms("protein")
for ts in u.trajectory:
pdb.write(protein)
pdb.close()
A single frame can be written with the write() method of any AtomGroup:
protein = u.selectAtoms("protein")
protein.write("protein.pdb")
Alternatively, get the single frame writer and supply the AtomGroup:
pdb = MDAnalysis.Writer("protein.pdb")
protein = u.selectAtoms("protein")
pdb.write(protein)
pdb.close()
PDBReader that reads a PDB-formatted file, no frills.
Reads multi-MODEL PDB files as trajectories.
COLUMNS | DATA TYPE | FIELD | DEFINITION |
---|---|---|---|
1 - 6 | Record name | “CRYST1” | |
7 - 15 | Real(9.3) | a | a (Angstroms). |
16 - 24 | Real(9.3) | b | b (Angstroms). |
25 - 33 | Real(9.3) | c | c (Angstroms). |
34 - 40 | Real(7.2) | alpha | alpha (degrees). |
41 - 47 | Real(7.2) | beta | beta (degrees). |
48 - 54 | Real(7.2) | gamma | gamma (degrees). |
1 - 6 | Record name | “ATOM “ | |
7 - 11 | Integer | serial | Atom serial number. |
13 - 16 | Atom | name | Atom name. |
17 | Character | altLoc | Alternate location indicator. IGNORED |
18 - 21 | Residue name | resName | Residue name. |
22 | Character | chainID | Chain identifier. |
23 - 26 | Integer | resSeq | Residue sequence number. |
27 | AChar | iCode | Code for insertion of residues. IGNORED |
31 - 38 | Real(8.3) | x | Orthogonal coordinates for X in Angstroms. |
39 - 46 | Real(8.3) | y | Orthogonal coordinates for Y in Angstroms. |
47 - 54 | Real(8.3) | z | Orthogonal coordinates for Z in Angstroms. |
55 - 60 | Real(6.2) | occupancy | Occupancy. |
61 - 66 | Real(6.2) | tempFactor | Temperature factor. |
67 - 76 | String | segID | (unofficial CHARMM extension ?) |
77 - 78 | LString(2) | element | Element symbol, right-justified. IGNORED |
79 - 80 | LString(2) | charge | Charge on the atom. IGNORED |
See also
Read coordinates from filename.
filename can be a gzipped or bzip2ed compressed PDB file.
If the pdb file contains multiple MODEL records then it is read as a trajectory where the MODEL numbers correspond to frame numbers. Therefore, the MODEL numbers must be a sequence of integers (typically starting at 1 or 0).
PDB writer that implements a subset of the PDB 3.2 standard .
PDB format as used by NAMD/CHARMM: 4-letter resnames and segID are allowed, altLoc is written.
The PrimitivePDBWriter can be used to either a dump a coordinate set to a PDB file (operating as a “single frame writer”, selected with the constructor keyword multiframe = False, the default) or by writing a PDB “movie” (multi frame mode, multiframe = True), consisting of multiple models (using the MODEL and ENDMDL records).
See also
This class is identical to MultiPDBWriter with the one exception that it defaults to writing single-frame PDB files as if multiframe = False was selected.
Changed in version 0.7.5: Initial support for multi-frame PDB files.
Changed in version 0.7.6: The multiframe keyword was added to select the writing mode. The writing of bond information (CONECT records) is now disabled by default but can be enabled with the bonds keyword.
Create a new PDBWriter
Arguments: |
|
---|
Check if the coordinate values fall within the range allowed for PDB files.
Deletes the output file if this is the first frame or if frames have already been written (in multi-frame mode) adds a REMARK instead of the coordinates and closes the file.
Raises ValueError if the coordinates fail the check.
Writes out all the bond records; works only for Universe objects.
Warning
All bonds are written out, using the old atom numbers - this is incorrect. Once a selection is made, the atom numbers have to be updated (currently they are unmodified) and bonds have to be selected for, only if all the atoms for a bond are still present.
Therefore, this method raises a NotImplementedError if CONECT records for anything smaller than the Universe are written.
Changed in version 0.7.6: Only write CONECT records if PrimitivePDBWriter.bonds == True. Raises NotImplementedError if it would produce wrong output.
Method to initialize important attributes in writer from a AtomGroup or Universe obj.
Attributes initialized/updated:
Before calling write_next_timestep() this method must be called at least once to enable extracting topology information from the current frame.
Write a new timestep ts to file
Does unit conversion if necessary.
By setting multiframe = True, MODEL ... ENDMDL records are written to represent trajectory frames in a multi-model PDB file. (At the moment we do not write the NUMMDL record.)
The multiframe = False keyword signals that the PrimitivePDBWriter is in single frame mode and no MODEL records are written.
Changed in version 0.7.6: The multiframe keyword was added, which completely determines if MODEL records are written. (Previously, this was decided based on the underlying trajectory and only if len(traj) > 1 would MODEL records have been written.)
Write ATOM record.
Only some keword args are optional (altLoc, iCode, chainID), for some defaults are set.
All inputs are cut to the maximum allowed length. For integer numbers the highest-value digits are chopped (so that the serial and reSeq wrap); for strings the trailing characters are chopped. The last character of chainID becomes the PDB chainID (unless it has the special value “SYSTEM” (assigned by MDAnalysis if neither segID nor chainID were available), in which case the PDB will have an empty chainID).
Changed in version 0.7.6: If the chainID has the special value “SYSTEM” (case insensitive) then the chain is set to the empty string “”.
Write END record.
Only a single END record is written. Calling END multiple times has no effect. Because close() also calls this method right before closing the file it is recommended to not call END() explicitly.
Write generic REMARK record (without number).
Each string provided in remarks is written as a separate REMARK record.
See also REMARK (update).
wrap comments into REMARK records that are not longer than
Write object obj at current trajectory frame to file.
obj can be a selection (i.e. a AtomGroup) or a whole Universe.
The last letter of the segid is used as the PDB chainID (but see ATOM() for details).
Arguments: |
---|
Write all timesteps associated with obj to the PDB file.
obj can be a AtomGroup or a Universe.
The method writes the frames from the one specified as start until the end, using a step of skip (start and skip are set in the constructor). Thus, if u is a Universe then
u.trajectory[-2]
pdb = PrimitivePDBWriter("out.pdb", u.atoms.numberOfAtoms())
pdb.write_all_timesteps(u)
will write a PDB trajectory containing the last 2 frames and
pdb = PrimitivePDBWriter("out.pdb", u.atoms.numberOfAtoms(), start=12, skip=2)
pdb.write_all_timesteps(u)
will be writing frames 12, 14, 16, ...
write a new timestep to the PDB file
Keywords: |
|
---|
Note
Before using this method with another Timestep in the ts argument, PrimitivePDBWriter._update_frame() must be called with the Universe as its argument so that topology information can be gathered.
PDB writer that implements a subset of the PDB 3.2 standard .
PDB format as used by NAMD/CHARMM: 4-letter resnames and segID, altLoc is written.
By default, MultiPDBWriter writes a PDB “movie” (multi frame mode, multiframe = True), consisting of multiple models (using the MODEL and ENDMDL records).
See also
This class is identical to PrimitivePDBWriter with the one exception that it defaults to writing multi-frame PDB files instead of single frames.
New in version 0.7.6.
Create a new PDBWriter
Arguments: |
|
---|
The PDB module can make use of Biopython’s Bio.PDB [Hamelryck2003] but replaces the standard PDB file parser with one that uses the MDAnalysis.coordinates.pdb.extensions.SloppyStructureBuilder to cope with very large pdb files as commonly encountered in MD simulations. The Biopython-based PDBReader has the advantage that it implements the PDB standard rigorously but this comes at the cost of flexibility and performance. It is also difficult to write out selections using this implementation (PDBWriter) and multi frame PDB files are not implemented. The Biopython Reader/Writer can be selected when loading data into a Universe by providing the keyword permissive = False.
The Biopython PDB parser Bio.PDB.PDBParser is fairly strict and even in its own permissive mode (which MDAnalysis employs) typically warns about missing element names with a Bio.PDB.PDBExceptions.PDBConstructionWarning . Such warnings, however, are generally harmless and therefore are filtered (and ignored) by MDAnalysis with the help of warnings.filterwarnings().
Read a pdb file into a BioPython pdb structure.
The coordinates are also supplied as one numpy array and wrapped into a Timestep object; attributes are set so that the PDBReader object superficially resembles the DCDReader object.
Write out the current time step as a pdb file.
This is not cleanly implemented at the moment. One must supply a universe, even though this is nominally an optional argument. The class behaves slightly differently depending on if the structure was loaded from a PDB (then the full-fledged Bio.PDB writer is used) or if this is really only an atom selection (then a less sophistiocated writer is employed).
Note
The standard PDBWriter can only write the whole system. In order to write a selection, use the PrimitivePDBWriter , which happens automatically when the write() method of a AtomGroup instance is used.
pdbwriter = PDBWriter(<pdbfilename>,universe=universe,**kwargs)
Arguments: |
universe supply a universe [really REQUIRED; optional only for compatibility]
|
---|
[Hamelryck2003] | Hamelryck, T., Manderick, B. (2003) PDB parser and structure class implemented in Python. Bioinformatics, 19, 2308-2310. http://biopython.org |