PDBModel
class biskit.PDBModel(source=None, pdbCode=None, noxyz=0, skipRes=None, headPatterns=[])
Bases: object
Store and manipulate coordinates and atom infos stemming from a PDB file. Coordinates are stored in the numpy array ‘xyz’; the additional atom infos from the PDB (name, residue_name, and many more) are efficiently stored in a PDBProfiles instance ‘atoms’ which can also be used to associate arbitrary other data with the atoms. Moreover, a similar collection ‘residues’ can hold data associated with residues (but is initially empty). A normal dictionary ‘info’ accepts any information about the whole model.
For detailed documentation, see http://biskit.pasteur.fr/doc/handling_structures/PDBModel
- @todo:
- outsource validSource into PDBParserFactory
- prevent repeated loading of test PDB for each test
Methods Overview
__init__: See the examples in the details section.
addChainFromSegid: Takes the last letter of the segment ID and adds it as chain ID.
addChainId: Assign consecutive chain identifiers A - Z to all atoms.
argsort: Prepare sorting atoms within residues according to comparison function.
atom2chainIndices: Convert atom indices to chain indices.
atom2chainMask: Mask (set to 0) chains for which all atoms are masked (0) in atomMask.
atom2resIndices: Get list of indices of residues for which any atom is in indices.
atom2resMask: Mask (set 0) residues for which all atoms are masked (0) in atomMask.
atom2resProfile: Get a residue profile where each residue has the value that its first atom has in the atom profile.
atomNames: Return a list of atom names from start to stop RESIDUE index.
atomRange: m.atomRange() == range( m.lenAtoms() )
atomkey: Create a string key encoding the atom content of this model independent of the order in which atoms appear within residues.
biomodel: Return the ‘biologically relevant assembly’ of this model according to the information in the PDB’s BIOMT record (captured in info[‘BIOMT’]).
center: Geometric center of model.
centerOfMass: Center of mass of PDBModel.
centered: Get model with centered coordinates.
chain2atomIndices: Convert chain indices into atom indices.
chain2atomMask: Convert chain mask to atom mask.
chainBreaks: Identify discontinuities in the molecule’s backbone.
chainEndIndex: Get the position of each chain’s last atom.
chainIndex: Get indices of first atom of each chain.
chainMap: Get chain index of each atom.
clone: Clone PDBModel.
compareAtoms: Get list of atom indices for this and reference model that converts both into 2 models with identical residue and atom content.
compareChains: Get list of corresponding chain indices for this and reference model.
compress: Compress PDBModel using mask.
concat: Concatenate atoms, coordinates and profiles.
disconnect: Disconnect this model from its source (if any).
equals: Compares the residue and atom sequence in the given range.
extendIndex: Translate a list of positions that is defined, e.g., on residues (/chains) to a list of atom positions AND also return the starting position of each residue (/chain) in the new sub-list of atoms.
extendMask: Translate a mask that is defined, e.g., on residues (/chains) to a mask that is defined on atoms.
filter: Extract atoms that match a combination of key=values.
filterIndex: Get atom positions that match a combination of key=values.
fit: Least-square fit this model onto refMode
getAtoms: Get atom CrossViews that can be used like dictionaries.
getPdbCode: Return pdb code of model.
getXyz: Get coordinates, fetch from source PDB or pickled PDBModel, if necessary.
index2map: Create a map of len_i length, giving the residue(/chain) number of each atom, from list of residue(/chain) starting positions.
indices: Get atom indices conforming condition.
indicesFrom: Get atom indices conforming condition applied to an atom profile.
keep: Replace atoms, coordinates, profiles of this(!) model with sub-set.
lenAtoms: Number of atoms in model.
lenBiounits: Number of biological assemblies defined in PDB BIOMT record, if any.
lenChains: Number of chains in model.
lenResidues: Number of residues in model.
magicFit: Superimpose this model onto a ref.
map2index: Identify the starting positions of each residue(/chain) from a map giving the residue(/chain) number of each atom.
mask: Get atom mask.
maskBB: Short cut for mask of all backbone atoms.
maskCA: Short cut for mask of all CA atoms.
maskCB: Short cut for mask of all CB and CA of GLY.
maskDNA: Short cut for mask of all atoms in DNA (based on residue name).
maskF: Create list with result of atomFunction( atom ) for each atom.
maskFrom: Create an atom mask from the values of a specific profile.
maskH: Short cut for mask of hydrogens.
maskH2O: Short cut for mask of all atoms in residues named TIP3, HOH and WAT.
maskHeavy: Short cut for mask of all heavy atoms.
maskHetatm: Short cut for mask of all HETATM.
maskNA: Short cut for mask of all atoms in DNA or RNA (based on residue name).
maskProtein: Short cut for mask containing all atoms of amino acids.
maskRNA: Short cut for mask of all atoms in RNA (based on residue name).
maskSolvent: Short cut for mask of all atoms in residues named TIP3, HOH, WAT, Na+, Cl-, CA, ZN.
mass: Molecular weight of PDBModel.
masses: Collect the molecular weight of all atoms in PDBModel.
mergeChains: Merge two adjacent chains.
mergeResidues: Merge two adjacent residues.
plot: Get a quick & dirty overview of the content of a PDBModel.
profile: Use: profile( name, updateMissing=0 ) -> atom or residue profile
profile2atomMask: Same as profile2mask, but converts residue mask to atom mask.
profile2mask: cutoff_min (float): low value cutoff (all values >= cutoff_min); cutoff_max (float): high value cutoff (all values < cutoff_max).
profile2resList: Group the profile values of each residue’s atoms into a separate list.
profileChangedFromDisc: Check if profile has changed compared to source.
profileInfo: Use: profileInfo( name ) -> dict with infos about profile
remove: Convenience access to the 3 different remove methods.
removeProfile: Remove residue or atom profile(s).
removeRes: Remove all atoms with a certain residue name.
renameAmberRes: Rename special residue names from Amber back into standard names (e.g. CYX -> CYS).
renumberResidues: Make all residue numbers consecutive and remove any insertion code letters.
report: Print (or return) a brief description of this model.
reportAtoms: Return a formatted string with atom and residue names similar to PDB; i ([ int ]) is an optional list of atom positions to report (default: all).
res2atomIndices: Convert residue indices to atom indices.
res2atomMask: Convert residue mask to atom mask.
res2atomProfile: Get an atom profile where each atom has the value its residue has in the residue profile.
resEndIndex: Get the position of each residue’s last atom.
resIndex: Get the position of each residue’s first atom.
resList: Return list of lists of atom pseudo dictionaries per residue, which allows iterating over residues and atoms of residues.
resMap: Get list to map from any atom to a continuous residue numbering (starting with 0).
resMapOriginal: Generate list to map from any atom to its ORIGINAL(!) PDB residue number.
resModels: Creates one new PDBModel for each residue in the parent PDBModel.
residusMaximus: Take list of value per atom, return list where all atoms of any residue are set to the highest value of any atom in that residue.
rms: Rmsd between two PDBModels.
saveAs: Pickle this PDBModel to a file, set the ‘source’ field to this file name and mark atoms, xyz, and profiles as unchanged.
sequence: Amino acid sequence in one letter code.
setPdbCode: Set model pdb code.
setSource: source: LocalPath OR PDBModel OR str
setXyz: Replace coordinates.
slim: Remove xyz array and profiles if they haven’t been changed and could hence be loaded from the source file (only if there is a source file…).
sort: Apply a given sort list to the atoms of this model.
sourceFile: Name of pickled source or PDB file.
structureFit: Structure-align this model onto a reference model using the external TM-Align program (which needs to be installed).
take: Extract a PDBModel with a subset of atoms.
takeChains: Get copy of this model with only the given chains.
takeResidues: Copy the given residues into a new model.
transform: Transform coordinates of PDBModel.
transformation: Get the transformation matrix which least-square fits this model onto the other model.
unequalAtoms: Identify atoms that are not matching between two models.
unsort: Undo a previous sorting on the model itself (no copy).
update: Read coordinates, atoms, fileName, etc.
validSource: Check for a valid source on disk.
version
writePdb: Save model as PDB file.
xplor2amber: Rename atoms so that tleap from Amber can read the PDB.
xyzChangedFromDisc: Tell whether xyz can currently be reconstructed from a source on disc.
xyzIsChanged: Tell if xyz or atoms have been changed compared to source file or source object (which can be still in memory).
Attributes Overview
PDB_KEYS: keys of all atom profiles that are read directly from the PDB file
PDBModel Method & Attribute Details
-
PDB_KEYS
= ['name', 'residue_number', 'insertion_code', 'alternate', 'name_original', 'chain_id', 'occupancy', 'element', 'segment_id', 'charge', 'residue_name', 'after_ter', 'serial_number', 'type', 'temperature_factor']¶ keys of all atom profiles that are read directly from the PDB file
-
__init__
(source=None, pdbCode=None, noxyz=0, skipRes=None, headPatterns=[])
Examples:
- PDBModel() creates an empty model to which coordinates (field xyz) and PDB records (atom profiles) still have to be added.
- PDBModel( file_name ) creates a complete model with coordinates and PDB records from file_name (pdb, pdb.gz, or pickled PDBModel).
- PDBModel( PDBModel ) creates a copy of the given model.
- PDBModel( PDBModel, noxyz=1 ) creates a copy without coordinates.
Parameters: - source (str or PDBModel) – str, file name of pdb/pdb.gz file OR pickled PDBModel OR PDBModel, template structure to copy atoms/xyz field from
- pdbCode (str or None) – PDB code, is extracted from file name otherwise
- noxyz (0||1) – 0 (default) || 1, create without coordinates
- headPatterns ([(str, str)]) – [(putIntoKey, regex)] extract given REMARK values
Raises: PDBError – if file exists but can’t be read
-
residues
= None¶ save atom-/residue-based values
-
xyzChanged
= None¶ monitor changes of coordinates
-
initVersion
= None¶ version as of creation of this object
-
info
= None¶ to collect further informations
-
report
(prnt=True, plot=False, clipseq=60)[source]¶ Print (or return) a brief description of this model.
Parameters: - prnt (bool) – directly print report to STDOUT (default True)
- plot (bool) – show simple 2-D line plot using gnuplot [False]
- clipseq (int) – clip chain sequences at this number of letters [60]
Returns: if prnt==True: None, else: formatted description of this model
Return type: None or str
-
plot
(hetatm=False) Get a quick & dirty overview of the content of a PDBModel. plot simply creates a 2-D plot of all x-coordinates versus all y-coordinates, colored by chain. This is obviously not publication-quality ;-). Use the Biskit.Pymoler class for real visualization.
Parameters: hetatm (bool) – include hetero & solvent atoms (default False)
-
update
(skipRes=None, updateMissing=0, force=0, headPatterns=[])[source]¶ Read coordinates, atoms, fileName, etc. from PDB or pickled PDBModel - but only if they are currently empty. The atomsChanged and xyzChanged flags are not changed.
Parameters: - skipRes (list of str) – names of residues to skip if updating from PDB
- updateMissing (0|1) – 0(default): update only existing profiles
- force (0|1) – ignore invalid source (0) or report error (1)
- headPatterns ([(str, str)]) – [(putIntoKey, regex)] extract given REMARKS
Raises: PDBError – if file can’t be unpickled or read:
-
setXyz
(xyz)[source]¶ Replace coordinates.
Parameters: xyz (array) – Numpy array ( 3 x N_atoms ) of float Returns: array( 3 x N_atoms ) or None, old coordinates Return type: array
-
getXyz
(mask=None)[source]¶ Get coordinates, fetch from source PDB or pickled PDBModel, if necessary.
Parameters: mask (list of int OR array of 1||0) – atom mask Returns: xyz-coordinates, array( 3 x N_atoms, Float32 ) Return type: array
-
getAtoms
(mask=None)[source]¶ Get atom CrossViews that can be used like dictionaries. Note that the direct manipulation of individual profiles is more efficient than the manipulation of CrossViews (on profiles)!
Parameters: mask (list of int OR array of 1||0) – atom mask Returns: list of CrossView dictionaries Return type: [ ProfileCollection.CrossView ]
-
profile
(name, default=None, update=True, updateMissing=False)
Use:
profile( name, updateMissing=0 ) -> atom or residue profile
Parameters: - name (str) – name to access profile
- default (any) – default result if no profile is found; if None, try to update from source and raise error [None]
- update (bool) – update from source before returning empty profile [True]
- updateMissing (bool) – update from source before reporting missing profile [False]
Raises: ProfileError – if neither atom- nor rProfiles contains |name|
-
profileInfo
(name, updateMissing=0)
Use:
profileInfo( name ) -> dict with infos about profile
Parameters: - name (str) – name to access profile
- updateMissing (0|1) – update from source before reporting missing profile
Guaranteed infos are:
- 'version' (str)
- 'comment' (str)
- 'changed' (1||0)
Raises: ProfileError – if neither atom- nor rProfiles contains |name|
-
removeProfile
(*names) Remove residue or atom profile(s).
Use:
removeProfile( str_name [,name2, name3] ) -> 1|0
Parameters: names (str OR list of str) – name or list of residue or atom profiles Returns: 1 if at least 1 profile has been deleted, 0 if none has been found Return type: int
-
xyzIsChanged
()[source]¶ Tell if xyz or atoms have been changed compared to source file or source object (which can be still in memory).
Returns: xyz field has been changed with respect to source Return type: (1||0, 1||0)
-
xyzChangedFromDisc
()[source]¶ Tell whether xyz can currently be reconstructed from a source on disc. Same as xyzChanged() unless source is another not yet saved PDBModel instance that made changes relative to its own source.
Returns: xyz has been changed Return type: bool
-
profileChangedFromDisc
(pname)[source]¶ Check if profile has changed compared to source.
Returns: 1, if profile |pname| can currently not be reconstructed from a source on disc. Return type: int Raises: ProfileError – if there is no atom or res profile with pname
-
slim
() Remove xyz array and profiles if they haven’t been changed and could hence be loaded from the source file (only if there is a source file…). AUTOMATICALLY CALLED BEFORE PICKLING. Currently also called by deepcopy via __getstate__.
-
validSource
()[source]¶ Check for a valid source on disk.
Returns: str or PDBModel, None if this model has no valid source Return type: str or PDBModel or None
-
sourceFile
()[source]¶ Name of pickled source or PDB file. If this model has another PDBModel as source, the request is passed on to this one.
Returns: file name of pickled source or PDB file Return type: str Raises: PDBError – if there is no valid source
-
disconnect
()[source]¶ Disconnect this model from its source (if any).
Note
If this model has an (in-memory) PDBModel instance as source, the entries of ‘atoms’ could still reference the same dictionaries.
-
sequence
(mask=None, xtable={'ca': '+', 'cl-': '-', 'hoh': '~', 'na+': '+', 'nap': 'X', 'ndp': 'X', 'tip3': '~', 'wat': '~'})[source]¶ Amino acid sequence in one letter code.
Parameters: - mask (list or array) – atom mask, to apply before (default None)
- xtable (dict) – dict {str:str}, additional residue:single_letter mapping for non-standard residues (default molUtils.xxDic) [currently not used]
Returns: 1-letter-code AA sequence (based on first atom of each res).
Return type: str
-
xplor2amber
(aatm=True, parm10=False)[source]¶ Rename atoms so that tleap from Amber can read the PDB. If HIS residues contain atoms named HE2 or/and HD2, the residue name is changed to HIE or HID or HIP, respectively. Disulfide bonds are not yet identified - CYS -> CYX renaming must be done manually (see AmberParmBuilder for an example). Internally amber uses H atom names ala HD21 while (old) standard pdb files use 1HD2. By default, ambpdb produces ‘standard’ pdb atom names but it can output the less ambiguous amber names with switch -aatm.
Parameters: - change (1|0) – change this model’s atoms directly (default:1)
- aatm (1|0) – use, for example, HG23 instead of 3HG2 (default:1)
- parm10 (1|0) – adapt nucleic acid atom names to 2010 Amber forcefield
Returns: [ {..} ], list of atom dictionaries
Return type: list of atom dictionaries
-
renameAmberRes
() Rename special residue names from Amber back into standard names (e.g. CYX -> CYS).
-
writePdb
(fname, ter=1, amber=0, original=0, left=0, wrap=0, headlines=None, taillines=None)[source]¶ Save model as PDB file.
Parameters: - fname (str) – name of new file
- ter (int) –
Option of how to treat the terminal record:
- 0 - don’t write any TER statements
- 1 - restore original TER statements (doesn’t work if the preceding atom has been deleted) [default]
- 2 - put TER between all detected chains
- 3 - as 2 but also detect and split discontinuous chains
- amber (1||0) – amber formatted atom names (implies ter=3, left=1, wrap=0) (default 0)
- original (1||0) – revert atom names to the ones parsed in from PDB (default 0)
- left (1||0) – left-align atom names (as in amber pdbs)(default 0)
- wrap (1||0) – write e.g. ‘NH12’ as ‘2NH1’ (default 0)
- headlines (list of tuples) – [( str, dict or str)], list of record / data tuples:: e.g. [ (‘SEQRES’, ‘ 1 A 22 ALA GLY ALA’), ]
- taillines (list of tuples) – same as headlines but appended at the end of file
-
saveAs
(path)[source]¶ Pickle this PDBModel to a file, set the ‘source’ field to this file name and mark atoms, xyz, and profiles as unchanged. Normal pickling of the object will only dump those data that can not be reconstructed from the source of this model (if any). saveAs creates a ‘new source’ without further dependencies.
Parameters: path (str OR LocalPath instance) – target file name
-
maskF
(atomFunction, numpy=1) Create list with result of atomFunction( atom ) for each atom. (Depending on the return value of atomFunction, the result is not necessarily a mask of 0 and 1. Creating masks is just the most common usage.)
Note:
This method is slow compared to maskFrom because the dictionaries that are given to the atomFunction have to be created from aProfiles on the fly. If performance matters, better combine the result from several maskFrom calls, e.g. instead of:
r = m.maskF( lambda a: a['name']=='CA' and a['residue_name']=='ALA' )
use:
r = m.maskFrom( 'name', 'CA' ) * m.maskFrom('residue_name', 'ALA')
Parameters: - atomFunction (1||0) – function( dict_from_aProfiles.toDict() ), true || false (Condition)
- numpy (int) – 1(default)||0, convert result to Numpy array of int
Returns: Numpy array( [0,1,1,0,0,0,1,0,..], Int) or list
Return type: array or list
-
maskFrom
(key, cond)[source]¶ Create an atom mask from the values of a specific profile. Example, the following three statements are equivalent:
>>> mask = m.maskFrom( 'name', 'CA' )
>>> mask = m.maskFrom( 'name', lambda a: a == 'CA' )
>>> mask = N0.array( [ a == 'CA' for a in m.atoms['name'] ] )
However, the same can be also achieved with standard numpy operators:
>>> mask = numpy.array(m.atoms['name']) == 'CA'
Parameters: - key (str) – the name of the profile to use
- cond (function OR any OR [ any ]) – either a function accepting a single value or a value or an iterable of values (to allow several alternatives)
Returns: array or list of indices where condition is met
Return type: list or array of int
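The three accepted forms of cond (a function, a single value, or an iterable of allowed values) can be illustrated with a small pure-Python sketch. This is not Biskit's implementation: mask_from is a hypothetical stand-alone helper that works on plain lists instead of atom profiles, and how the real method tells the three cases apart is an assumption here. The last lines show why multiplying two 0/1 masks, as suggested for maskF above, acts as a logical AND.

```python
def mask_from(values, cond):
    """Return a list of 0/1 flags, one per entry of `values` (hypothetical helper)."""
    if callable(cond):
        # cond is a function that accepts a single profile value
        return [1 if cond(v) else 0 for v in values]
    if isinstance(cond, (list, tuple, set)):
        # cond lists several allowed values
        return [1 if v in cond else 0 for v in values]
    # cond is a single allowed value
    return [1 if v == cond else 0 for v in values]


# Toy stand-ins for the 'name' and 'residue_name' atom profiles
names = ['N', 'CA', 'C', 'O', 'CB', 'N', 'CA']
res_names = ['ALA', 'ALA', 'ALA', 'ALA', 'ALA', 'GLY', 'GLY']

ca = mask_from(names, 'CA')
ca_fn = mask_from(names, lambda a: a == 'CA')
backbone_like = mask_from(names, ['N', 'CA', 'C'])

# Element-wise product of two 0/1 masks keeps atoms that satisfy both conditions
ala = mask_from(res_names, 'ALA')
ca_of_ala = [a * b for a, b in zip(ca, ala)]
```

With numpy arrays, as returned by the real method, the same combination is written mask1 * mask2.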
-
maskCA
(force=0)[source]¶ Short cut for mask of all CA atoms.
Parameters: force (0||1) – force calculation even if cached mask is available Returns: array( 1 x N_atoms ) of 0||1 Return type: array
-
maskBB
(force=0, solvent=0)[source]¶ Short cut for mask of all backbone atoms. Supports standard protein and DNA atom names. Any residues classified as solvent (water, ions) are filtered out.
Parameters: - force (0||1) – force calculation even if cached mask is available
- solvent (1||0) – include solvent residues (default: false)
Returns: array( 1 x N_atoms ) of 0||1
Return type: array
-
maskHeavy
(force=0)[source]¶ Short cut for mask of all heavy atoms. (‘element’ <> H)
Parameters: force (0||1) – force calculation even if cached mask is available Returns: array( 1 x N_atoms ) of 0||1 Return type: array
-
maskH
()[source]¶ Short cut for mask of hydrogens. (‘element’ == H)
Returns: array( 1 x N_atoms ) of 0||1 Return type: array
-
maskCB
() Short cut for mask of all CB and CA of GLY.
Returns: mask of all CB plus CA of GLY Return type: array
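As a rough illustration of the selection rule (all CB atoms, plus the CA atom of glycine, which has no CB), here is a pure-Python sketch. The atom dictionaries and the helper mask_cb are made up for this example and only mimic the 'name' and 'residue_name' keys listed in PDB_KEYS; this is not the library's code.

```python
# Toy atom records using the profile keys 'name' and 'residue_name'
atoms = [
    {'name': 'CA', 'residue_name': 'ALA'},
    {'name': 'CB', 'residue_name': 'ALA'},
    {'name': 'CA', 'residue_name': 'GLY'},
    {'name': 'O',  'residue_name': 'GLY'},
]


def mask_cb(atoms):
    """1 for every CB atom and for the CA atom of GLY residues, else 0."""
    return [
        1 if a['name'] == 'CB'
        or (a['name'] == 'CA' and a['residue_name'] == 'GLY')
        else 0
        for a in atoms
    ]


cb_mask = mask_cb(atoms)
```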
-
maskH2O
()[source]¶ Short cut for mask of all atoms in residues named TIP3, HOH and WAT
Returns: array( 1 x N_atoms ) of 0||1 Return type: array
-
maskSolvent
()[source]¶ Short cut for mask of all atoms in residues named TIP3, HOH, WAT, Na+, Cl-, CA, ZN
Returns: array( 1 x N_atoms ) of 0||1 Return type: array
-
maskHetatm
()[source]¶ Short cut for mask of all HETATM
Returns: array( 1 x N_atoms ) of 0||1 Return type: array
-
maskProtein
(standard=0)[source]¶ Short cut for mask containing all atoms of amino acids.
Parameters: standard (0|1) – only standard residue names (not CYX, NME,..) (default 0) Returns: array( 1 x N_atoms ) of 0||1, mask of all protein atoms (based on residue name) Return type: array
-
maskDNA
()[source]¶ Short cut for mask of all atoms in DNA (based on residue name).
Returns: array( 1 x N_atoms ) of 0||1 Return type: array
-
maskRNA
()[source]¶ Short cut for mask of all atoms in RNA (based on residue name).
Returns: array( 1 x N_atoms ) of 0||1 Return type: array
-
maskNA
()[source]¶ Short cut for mask of all atoms in DNA or RNA (based on residue name).
Returns: array( 1 x N_atoms ) of 0||1 Return type: array
-
indicesFrom
(key, cond)[source]¶ Get atom indices conforming condition applied to an atom profile. Corresponds to:
>>> numpy.nonzero( m.maskFrom( key, cond) )
Parameters: - key (str) – the name of the profile to use
- cond (function OR any OR [any]) – either a function accepting a single value or a value or an iterable of values
Returns: array of indices where condition is met
Return type: array of int
-
indices
(what) Get atom indices conforming condition. This is a convenience method to ‘normalize’ different kinds of selections (masks, atom names, indices, functions) to indices as they are e.g. required by PDBModel.take.
Parameters: what (function OR list of str or int OR int) – Selection:
- function applied to each atom entry, e.g. lambda a: a['residue_name']=='GLY'
- list of str, allowed atom names
- list of int, allowed atom indices OR mask with only 1 and 0
- int, single allowed atom index
Returns: N_atoms x 1 (0||1 ) Return type: Numeric array Raises: PDBError – if what is neither of above
-
mask
(what) Get atom mask. This is a convenience method to ‘normalize’ different kinds of selections (masks, atom names, indices, functions) to a mask as it is e.g. required by PDBModel.compress.
Parameters: what (function OR list of str or int OR int) – Selection:
- function applied to each atom entry, e.g. lambda a: a['residue_name']=='GLY'
- list of str, allowed atom names
- list of int, allowed atom indices OR mask with only 1 and 0
- int, single allowed atom index
Returns: N_atoms x 1 (0||1 ) Return type: Numeric array Raises: PDBError – if what is neither of above
-
index2map
(index, len_i) Create a map of len_i length, giving the residue(/chain) number of each atom, from list of residue(/chain) starting positions.
Parameters: - index ([ int ] or array of int) – list of starting positions, e.g. [0, 3, 8]
- len_i (int) – length of target map, e.g. 10
Returns: list mapping atom positions to residue(/chain) number, e.g. [0,0,0, 1,1,1,1,1, 2,2] from above example
Return type: array of int (and of len_i length)
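The mapping from starting positions to a per-atom residue (or chain) number can be sketched in plain Python. The function below only mirrors the documented input and output on lists; it is an illustration, not the numpy-based implementation, and it assumes that index is sorted and starts at 0.

```python
def index2map(index, len_i):
    """Expand starting positions into a per-atom residue(/chain) number."""
    result = []
    for n, start in enumerate(index):
        # each residue ends where the next one starts; the last ends at len_i
        stop = index[n + 1] if n + 1 < len(index) else len_i
        result.extend([n] * (stop - start))
    return result


atom_map = index2map([0, 3, 8], 10)
```

For the example from the parameter description, atom_map is [0, 0, 0, 1, 1, 1, 1, 1, 2, 2].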
-
map2index
(imap)[source]¶ Identify the starting positions of each residue(/chain) from a map giving the residue(/chain) number of each atom.
Parameters: imap ([ int ]) – something like [0,0,0,1,1,1,1,1,2,2,2,…] Returns: list of starting positions, e.g. [0, 3, 8, …] in above ex. Return type: array of int
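The inverse direction is equally short: a new residue (or chain) starts wherever the map value differs from the previous one. The following pure-Python sketch works on lists and is meant as an illustration of the idea only; the real method returns an array.

```python
def map2index(imap):
    """Return the positions at which the residue(/chain) number changes."""
    return [i for i, v in enumerate(imap) if i == 0 or v != imap[i - 1]]


starts = map2index([0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2])
```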
-
extendMask
(mask, index, len_i) Translate a mask that is defined, e.g., on residues (/chains) to a mask that is defined on atoms.
Parameters: - mask ([ bool ] or array of bool or of 1||0) – mask marking positions in the list of residues or chains
- index ([ int ] or array of int) – starting positions of all residues or chains
- len_i (int) – length of target mask
Returns: mask that blows up the residue / chain mask to an atom mask Return type: array of bool
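A minimal sketch of this "blow up" step, assuming index holds the sorted atom start position of every residue (or chain) and begins at 0. The name extend_mask and the list-based return value are choices made for this example; the method itself returns an array of bool.

```python
def extend_mask(mask, index, len_i):
    """Repeat each residue(/chain) flag once per atom of that residue(/chain)."""
    atom_mask = []
    for n, start in enumerate(index):
        stop = index[n + 1] if n + 1 < len(index) else len_i
        atom_mask.extend([bool(mask[n])] * (stop - start))
    return atom_mask


# Three residues starting at atoms 0, 3 and 8; ten atoms in total
atom_mask = extend_mask([1, 0, 1], [0, 3, 8], 10)
```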
-
extendIndex
(i, index, len_i)[source]¶ Translate a list of positions that is defined, e.g., on residues (/chains) to a list of atom positions AND also return the starting position of each residue (/chain) in the new sub-list of atoms.
Parameters: - i ([ int ] or array of int) – positions in higher level list of residues or chains
- index ([ int ] or array of int) – atomic starting positions of all residues or chains
- len_i (int) – length of atom index (total number of atoms)
Returns: (ri, rindex) - atom positions & new index Return type: array of int, array of int
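The two return values are easiest to see on a small example. The sketch below is a list-based illustration under the assumption that index is sorted and starts at 0; extend_index is a hypothetical stand-alone function, not the method itself.

```python
def extend_index(i, index, len_i):
    """Translate residue(/chain) positions into atom positions.

    Returns (ri, rindex): the atom positions in the requested order and the
    start of each requested residue(/chain) within that new atom list.
    """
    ri, rindex = [], []
    for n in i:
        start = index[n]
        stop = index[n + 1] if n + 1 < len(index) else len_i
        rindex.append(len(ri))          # where this residue starts in ri
        ri.extend(range(start, stop))   # all atoms of this residue
    return ri, rindex


# Residues start at atoms 0, 3 and 8 (10 atoms); ask for residue 2, then 0
ri, rindex = extend_index([2, 0], [0, 3, 8], 10)
```

Here ri lists the atoms of residue 2 followed by those of residue 0, and rindex records that the second requested residue starts at position 2 of the new atom list.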
-
atom2resMask
(atomMask)[source]¶ Mask (set 0) residues for which all atoms are masked (0) in atomMask.
Parameters: atomMask (list/array of int) – list/array of int, 1 x N_atoms Returns: 1 x N_residues (0||1 ) Return type: array of int
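In other words, a residue stays selected if at least one of its atoms is selected. A pure-Python sketch of that rule follows; the residue start positions are passed in explicitly as res_index here, whereas the method takes them from the model, and the function name is made up for the example.

```python
def atom2res_mask(atom_mask, res_index):
    """1 for each residue with at least one unmasked atom, else 0."""
    res_mask = []
    for n, start in enumerate(res_index):
        stop = res_index[n + 1] if n + 1 < len(res_index) else len(atom_mask)
        res_mask.append(1 if any(atom_mask[start:stop]) else 0)
    return res_mask


# Residue 0: all atoms masked; residue 1: one atom set; residue 2: all set
res_mask = atom2res_mask([0, 0, 0, 0, 1, 0, 0, 0, 1, 1], [0, 3, 8])
```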
-
atom2resIndices
(indices)[source]¶ Get list of indices of residues for which any atom is in indices.
Note: in the current implementation, the resulting residues are returned in their old order, regardless of the order of input positions.
Parameters: indices (list of int) – list of atom indices Returns: indices of residues Return type: list of int
-
res2atomMask
(resMask)[source]¶ Convert residue mask to atom mask.
Parameters: resMask (list/array of int) – list/array of int, 1 x N_residues Returns: 1 x N_atoms Return type: array of int
-
res2atomIndices
(indices)[source]¶ Convert residue indices to atom indices.
Parameters: indices (list/array of int) – list/array of residue indices Returns: array of atom positions Return type: array of int
-
atom2chainIndices
(indices, breaks=0)[source]¶ Convert atom indices to chain indices. Each chain is only returned once.
Parameters: - indices (list of int) – list of atom indices
- breaks (0||1) – look for chain breaks in backbone coordinates (def. 0)
Returns: indices of all chains that contain any atom listed in indices
Return type: list of int
-
atom2chainMask
(atomMask, breaks=0)[source]¶ Mask (set to 0) chains for which all atoms are masked (0) in atomMask. Put another way: Mark all chains that contain any atom that is marked ‘1’ in atomMask.
Parameters: atomMask (list/array of int) – list/array of int, 1 x N_atoms Returns: 1 x N_chains (0||1 ) Return type: array of int
-
chain2atomMask
(chainMask, breaks=0)[source]¶ Convert chain mask to atom mask.
Parameters: - chainMask (list/array of int) – list/array of int, 1 x N_chains
- breaks (0||1) – look for chain breaks in backbone coordinates (def. 0)
Returns: 1 x N_atoms
Return type: array of int
-
chain2atomIndices
(indices, breaks=0)[source]¶ Convert chain indices into atom indices.
Parameters: indices (list/array of int) – list/array of chain indices Returns: array of atom positions, new chain index Return type: array of int
-
res2atomProfile
(p)[source]¶ Get an atom profile where each atom has the value its residue has in the residue profile.
Parameters: p (str) – name of existing residue profile OR … [ any ], list of lenResidues() length Returns: [ any ] OR array, atom profile Return type: list or array
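Spreading a residue profile over atoms amounts to repeating each residue value once per atom of that residue. A list-based sketch, with the residue start positions and the atom count passed in explicitly (the method obtains both from the model); the function name and the example values are invented for illustration.

```python
def res2atom_profile(res_values, res_index, n_atoms):
    """Give every atom the value of the residue it belongs to."""
    atom_values = []
    for n, start in enumerate(res_index):
        stop = res_index[n + 1] if n + 1 < len(res_index) else n_atoms
        atom_values.extend([res_values[n]] * (stop - start))
    return atom_values


# One label per residue, expanded to 3 + 5 + 2 atoms
atom_profile = res2atom_profile(['H', 'E', 'C'], [0, 3, 8], 10)
```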
-
atom2resProfile
(p, f=None) Get a residue profile where each residue has the value that its first atom has in the atom profile.
Parameters: - p – name of existing atom profile OR [ any ], list of lenAtoms() length
- f (func) – function to calculate single residue from many atom values f( [atom_value1, atom_value2,…] ) -> res_value (default None, simply take value of first atom in each res.)
Returns: [ any ] OR array, residue profile Return type: list or array
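The effect of the optional function f can be sketched on plain lists. The helper below is an illustration only; it takes the residue start positions as an explicit argument, which the method reads from the model instead.

```python
def atom2res_profile(atom_values, res_index, f=None):
    """Collapse per-atom values into one value per residue."""
    result = []
    for n, start in enumerate(res_index):
        stop = res_index[n + 1] if n + 1 < len(res_index) else len(atom_values)
        values = atom_values[start:stop]
        # default: value of the first atom; otherwise f over all atom values
        result.append(values[0] if f is None else f(values))
    return result


atom_values = [1.0, 2.0, 3.0, 10.0, 0.0, 5.0, 5.0, 5.0, 7.0, 9.0]
first_atom = atom2res_profile(atom_values, [0, 3, 8])
highest = atom2res_profile(atom_values, [0, 3, 8], f=max)
```

Passing f=max gives each residue the highest value found among its atoms.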
-
profile2mask
(str_profname[, cutoff_min, cutoff_max=None])[source]¶ Parameters: - cutoff_min (float) – low value cutoff (all values >= cutoff_min)
- cutoff_max (float) – high value cutoff (all values < cutoff_max)
Returns: mask len( profile(profName) ) x 1||0
Return type: array
Raises: ProfileError – if no profile is found with name profName
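The two cutoffs describe a half-open interval: a value is kept if it is greater than or equal to cutoff_min and strictly smaller than cutoff_max. The sketch below applies that rule to a plain list of values. Treating None as "no bound on this side" is an assumption of this example, not something the documentation states.

```python
def profile2mask(values, cutoff_min=None, cutoff_max=None):
    """1 where cutoff_min <= value < cutoff_max, else 0 (None = no bound, assumed)."""
    mask = []
    for v in values:
        above = cutoff_min is None or v >= cutoff_min
        below = cutoff_max is None or v < cutoff_max
        mask.append(1 if above and below else 0)
    return mask


values = [0.0, 0.5, 1.0, 1.5, 2.0]
in_range = profile2mask(values, cutoff_min=0.5, cutoff_max=2.0)
```

Note that 0.5 is included while 2.0 is not.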
-
profile2atomMask
(str_profname[, cutoff_min, cutoff_max=None]) Same as profile2mask, but converts residue mask to atom mask.
Parameters: - cutoff_min (float) – low value cutoff
- cutoff_max (float) – high value cutoff
Returns: mask N_atoms x 1|0
Return type: array
Raises: ProfileError – if no profile is found with name profName
-
profile2resList
(p) Group the profile values of each residue’s atoms into a separate list.
Parameters: p – name of existing atom profile OR [ any ], list of lenAtoms() length
Returns: a list (one entry per residue) of lists (one entry per resatom) Return type: [ [ any ] ]
-
mergeChains
(c1, id='', segid='', rmOxt=True, renumberAtoms=False, renumberResidues=True)[source]¶ Merge two adjacent chains. This merely removes all internal markers for a chain boundary. Atom content or coordinates are not modified.
PDBModel tracks chain boundaries in an internal _chainIndex. However, there are cases when this chainIndex needs to be re-built and new chain boundaries are then inferred from jumps in chain or segment labelling or residue numbering. mergeChains automatically re-assigns PDB chain and segment IDs as well as residue numbering to prepare for this situation.
Parameters: - c1 (int) – first of the two chains to be merged
- id (str) – chain ID of the new chain (default: ID of first chain)
- segid (str) – new chain’s segid (default: SEGID of first chain)
- renumberAtoms – rewrite PDB serial numbering of the adjacent chain to be consecutive to the last atom of the first chain (default: False)
- renumberResidues (bool) – shift PDB residue numbering so that the first residue of the adjacent chain follows the previous residue. Other than for atom numbering, later jumps in residue numbering are preserved. (default: True)
-
mergeResidues
(r1, name='', residue_number=None, chain_id='', segment_id='', renumberAtoms=False) Merge two adjacent residues. Duplicate atoms are labelled with alternate codes ‘A’ (first occurrence) to ‘B’ or later.
Parameters: - r1 (int) – first of the two residues to be merged
- name (str) – name of the new residue (default: name of first residue)
-
concat
(*models, **kw)[source]¶ Concatenate atoms, coordinates and profiles. source and fileName are lost, so are profiles that are not available in all models. model0.concat( model1 [, model2, ..]) -> single PDBModel.
Parameters: - models (one or more PDBModel instances) – models to concatenate
- newRes (bool) – treat beginning of second model as new residue (True)
- newChain (bool) – treat beginning of second model as new chain (True)
Note: info records of given models are lost.
-
take
(i, rindex=None, cindex=None, *initArgs, **initKw)[source]¶ Extract a PDBModel with a subset of atoms:
take( atomIndices ) -> PDBModel
All other PDBModel methods that extract portions of the model (e.g. compress, takeChains, takeResidues, keep, clone, remove) ultimately use take() at their core.
Note: take employs fast numpy vector mapping methods to re-calculate the residue and chain index of the result model. The methods generally work but there is one scenario where this mechanism can fail: If take is used to create repetitions of residues or chains directly next to each other, these residues or chains can get accidentally merged. For this reason, calling methods can optionally pre-calculate and provide a correct version of the new residue or chain index (which will then be used as is).
Parameters: - i (list/array of int) – atomIndices, positions to take in the order to take
- rindex (array of int) – optional residue index for result model after extraction
- cindex (array of int) – optional chain index for result model after extraction
- initArgs – any number of additional arguments for constructor of result model
- initKw – any additional keyword arguments for constructor of result model
Returns: new PDBModel or sub-class
Return type:
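The merging pitfall described in the note above can be reproduced without Biskit. The following is a plain-Python illustration under simplifying assumptions, not Biskit's implementation; `residue_starts` is a hypothetical helper that rebuilds a residue index by detecting label changes between neighbouring atoms.

```python
def residue_starts(labels):
    """Positions where the residue label differs from the previous atom's."""
    return [i for i, lab in enumerate(labels) if i == 0 or lab != labels[i - 1]]

# Per-atom residue labels of a toy model: residue 0 has 2 atoms, residue 1 has 3.
res_of_atom = [0, 0, 1, 1, 1]

# Take residue 0 twice in a row: change detection sees only one residue.
picked = [0, 1, 0, 1]
labels = [res_of_atom[i] for i in picked]
naive_index = residue_starts(labels)   # the two copies are merged by accident

# A caller that knows the intent can supply the correct index itself.
explicit_index = [0, 2]
```

In PDBModel terms, the explicit index corresponds to what a calling method would pass as rindex (or cindex for chains).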
-
keep
(i)[source]¶ Replace atoms, coordinates and profiles of this(!) model with a sub-set (in-place version of N0.take() ).
Parameters: i (list or array of int) – atom positions to be kept
-
clone
()[source]¶ Clone PDBModel.
Returns: PDBModel / subclass, copy of this model, see comments to numpy.take() Return type: PDBModel
-
compress
(mask, *initArgs, **initKw)[source]¶ Compress PDBModel using mask.
compress( mask ) -> PDBModel
Parameters: mask (array) – array( 1 x N_atoms of 1 or 0 ):
- 1 .. keep this atom
Returns: compressed PDBModel using mask Return type: PDBModel
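A 0/1 mask and a list of atom indices describe the same selection, which is why a mask-based method can be expressed through an index-based one. The sketch below shows the conversion in plain Python; `mask_to_indices` and `indices_to_mask` are hypothetical helpers for illustration and are not part of PDBModel.

```python
def mask_to_indices(mask):
    """Positions of all non-zero entries of a 0/1 atom mask."""
    return [i for i, keep in enumerate(mask) if keep]

def indices_to_mask(indices, n_atoms):
    """0/1 mask of length n_atoms with 1 at every listed position."""
    chosen = set(indices)
    return [1 if i in chosen else 0 for i in range(n_atoms)]

mask = [1, 0, 0, 1, 1]
indices = mask_to_indices(mask)            # atoms 0, 3 and 4 are kept
round_trip = indices_to_mask(indices, len(mask))
```

Note that a mask cannot express reordering or repetition of atoms, whereas an index list can.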
-
remove
(what)[source]¶ Convenience access to the 3 different remove methods. The mask used to remove atoms is returned. This mask can be used to apply the same change to another array of same dimension as the old(!) xyz and atoms.
Parameters: what (list of int or int) – Description of what to remove:
- function( atom_dict ) -> 1 || 0 (1..remove) OR
- list of int [4, 5, 6, 200, 201..], indices of atoms to remove
- list of int [11111100001101011100..N_atoms], mask (1..remove)
- int, remove atom with this index
Returns: array(1 x N_atoms_old) of 0||1, mask used to compress the atoms and xyz arrays. Return type: array Raises: PDBError – if what is none of the above
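How the four accepted forms of what could be reduced to a single mask is sketched below. This is an assumption-laden illustration, not the Biskit code: `removal_mask` is a hypothetical function, atoms are plain dictionaries, and a list is treated as a mask only if it has one 0/1 entry per atom.

```python
def removal_mask(what, atoms):
    """Build a 0/1 mask (1 = remove) from a function, mask, index list or int."""
    n = len(atoms)
    if callable(what):
        return [1 if what(atom) else 0 for atom in atoms]
    if isinstance(what, int):
        what = [what]
    what = list(what)
    # Assumption: a list of length N_atoms holding only 0/1 is a mask.
    if len(what) == n and all(v in (0, 1) for v in what):
        return what
    if all(isinstance(v, int) and 0 <= v < n for v in what):
        chosen = set(what)
        return [1 if i in chosen else 0 for i in range(n)]
    raise ValueError("unsupported description of what to remove")

atoms = [{"name": "N"}, {"name": "CA"}, {"name": "C"}, {"name": "O"}]
by_function = removal_mask(lambda a: a["name"] == "CA", atoms)
by_indices = removal_mask([0, 3], atoms)
by_int = removal_mask(2, atoms)
```

The mask needed to keep the remaining atoms is the element-wise inverse of this removal mask.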
-
takeResidues
(i)[source]¶ Copy the given residues into a new model.
Parameters: i ([ int ]) – residue indices Returns: PDBModel with given residues in given order Return type: PDBModel
-
takeChains
(chains, breaks=0, force=0)[source]¶ Get copy of this model with only the given chains.
Note, there is one very special scenario where chain boundaries can get lost: If breaks=1 (chain positions are based on normal chain boundaries as well as structure-based chain break detection) AND one or more chains are extracted several times next to each other, for example chains=[0, 1, 1, 2], then the repeated chain will be merged. So in the given example, the new model would have chainLength()==3. This case is tested for and a PDBIndexError is raised. Override with force=1 and proceed at your own risk. Which, in this case, simply means you should re-calculate the chain index after takeChains(). Example:
>>> repeat = model.takeChains( [0,0,0], breaks=1, force=1 )
>>> repeat.chainIndex( force=1, cache=1 )
This works because the new model will have back-jumps in residue numbering.
Parameters: - chains (list of int) – list of chains, e.g. [0,2] for first and third
- breaks (0|1) – split chains at chain breaks (default 0)
- maxDist (float) – (if breaks=1) chain break threshold in Angstrom
- force (bool) – override check for chain repeats (only for breaks==1)
Returns: PDBModel consisting of the given chains in the given order
Return type:
-
addChainFromSegid
(verbose=1)[source]¶ Takes the last letter of the segment ID and adds it as chain ID.
-
addChainId
(first_id=None, keep_old=0, breaks=0)[source]¶ Assign consecutive chain identifiers A - Z to all atoms.
Parameters: - first_id (str) – str (A - Z), first letter instead of ‘A’
- keep_old (1|0) – don’t override existing chain IDs (default 0)
- breaks (1|0) – consider chain break as start of new chain (default 0)
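The letter assignment can be pictured as a lookup from each atom's chain number into the alphabet. The sketch below is a minimal illustration, assuming a per-atom chain map as input; `assign_chain_ids` is a hypothetical name, keep_old is not modelled, and wrapping around after ‘Z’ is an assumption rather than documented behaviour.

```python
import string

def assign_chain_ids(chain_of_atom, first_id="A"):
    """Map per-atom chain numbers (0, 1, 2, ..) to consecutive letters."""
    letters = string.ascii_uppercase
    offset = letters.index(first_id)
    # Assumption: more than 26 chains wrap around to 'A' again.
    return [letters[(offset + c) % len(letters)] for c in chain_of_atom]

ids = assign_chain_ids([0, 0, 1, 2])
shifted = assign_chain_ids([0, 1], first_id="C")
```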
-
renumberResidues
(mask=None, start=1, addChainId=1)[source]¶ Make all residue numbers consecutive and remove any insertion code letters. Note that a backward jump in residue numbering (among other things) is interpreted as end of chain by chainMap() and chainIndex() when a PDB file is loaded.
Parameters: - mask (list of int) – [ 0||1 x N_atoms ] atom mask to apply BEFORE
- start (int) – starting number (default 1)
- addChainId (1|0) – add chain IDs if they are missing
-
atomRange
()[source]¶ >>> m.atomRange() == range( m.lenAtoms() )
Returns: integer range for length of this model Return type: [ int ]
-
lenResidues
()[source]¶ Number of residues in model.
Returns: total number of residues Return type: int
-
lenChains
(breaks=0, maxDist=None, singleRes=0, solvent=0)[source]¶ Number of chains in model.
Parameters: - breaks (0||1) – detect chain breaks from backbone atom distances (def 0)
- maxDist (float) – maximal distance between consecutive residues [ None ] .. defaults to twice the average distance
- singleRes (1||0) – allow chains consisting of single residues (def 0)
- solvent (1||0) – also check solvent residues for “chain breaks” (def 0)
Returns: total number of chains
Return type: int
-
resList
(mask=None)[source]¶ Return a list of lists of atom pseudo-dictionaries per residue, which allows iterating over residues and the atoms of each residue.
Parameters: mask – [ 0||1 x N_atoms ] atom mask to apply BEFORE
Returns: a list (one per residue) of lists (one per atom) of dictionaries:
[ [ CrossView{'name':'N', 'residue_name':'LEU', ..}, CrossView{'name':'CA', 'residue_name':'LEU', ..} ], [ CrossView{'name':'CA', 'residue_name':'GLY', ..}, .. ] ]
Return type: [ [ biskit.ProfileCollection.CrossView
] ]
-
resModels
(i=None)[source]¶ Creates one new PDBModel for each residue in the parent PDBModel.
Parameters: i ([ int ] or array( int )) – range of residue positions (default: all residues) Returns: list of PDBModels, one for each residue Return type: [ PDBModel
]
-
resMapOriginal
(mask=None)[source]¶ Generate list to map from any atom to its ORIGINAL(!) PDB residue number.
Parameters: mask (list of int (1||0)) – [00111101011100111…] consider atom: yes or no len(mask) == N_atoms Returns: list all [000111111333344444..] with residue number for each atom Return type: list of int
-
resIndex
(mask=None, force=0, cache=1)[source]¶ Get the position of each residue’s first atom.
Parameters: - force (1||0) – re-calculate even if cached result is available (def 0)
- cache (1||0) – cache the result if new (def 1)
- mask (list of int (1||0)) – atom mask to apply before (i.e. result indices refer to compressed model)
Returns: index of the first atom of each residue
Return type: list of int
-
resMap
(force=0, cache=1)[source]¶ Get list to map from any atom to a continuous residue numbering (starting with 0). A new residue is assumed to start whenever the ‘residue_number’ or the ‘residue_name’ record changes between 2 atoms.
See resList() for an example of how to use the residue map.
Parameters: - force (0||1) – recalculate map even if cached one is available (def 0)
- cache (0||1) – cache new map (def 1)
Returns: array [00011111122223333..], residue index for each atom
Return type: list of int
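The rule stated above (a new residue starts whenever ‘residue_number’ or ‘residue_name’ changes between two atoms) can be sketched in a few lines. This is a plain-Python illustration with atoms as dictionaries, not the numpy-based Biskit code; `residue_map`, `first_atoms` and `last_atoms` are hypothetical names that loosely mirror resMap, resIndex and resEndIndex.

```python
def residue_map(atoms):
    """Continuous residue index (starting with 0) for each atom."""
    res_map, current, previous = [], -1, None
    for atom in atoms:
        key = (atom["residue_number"], atom["residue_name"])
        if key != previous:          # number or name changed: new residue
            current += 1
            previous = key
        res_map.append(current)
    return res_map

def first_atoms(res_map):
    """Position of the first atom of each residue."""
    return [i for i, r in enumerate(res_map) if i == 0 or r != res_map[i - 1]]

def last_atoms(res_map):
    """Position of the last atom of each residue."""
    end = len(res_map) - 1
    return [i for i, r in enumerate(res_map) if i == end or r != res_map[i + 1]]

atoms = ([{"residue_number": 1, "residue_name": "LEU"}] * 3
         + [{"residue_number": 2, "residue_name": "GLY"}] * 2
         + [{"residue_number": 3, "residue_name": "GLY"}])
rmap = residue_map(atoms)
```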
-
resEndIndex
()[source]¶ Get the position of each residue’s last atom.
Returns: index of the last atom of each residue Return type: list of int
-
chainIndex
(breaks=0, maxDist=None, force=0, cache=0, singleRes=0, solvent=0)[source]¶ Get indices of first atom of each chain.
Parameters: - breaks (1||0) – split chains at chain breaks (def 0)
- maxDist (float) – (if breaks=1) chain break threshold in Angstrom
- force (1||0) – re-analyze residue numbering, chain and segids to find chain boundaries, use with care! (def 0)
- cache (1||0) – cache new index even if it was derived from non-default parameters (def 0) Note: a simple m.chainIndex() will always cache
- singleRes (1||0) – allow chains consisting of single residues (def 0) Otherwise group consecutive residues with identical name into one chain.
- solvent (1||0) – also check solvent residues for “chain breaks” (default: false)
Returns: array (1 x N_chains) of int
Return type: list of int
-
chainEndIndex
(breaks=0, solvent=0)[source]¶ Get the position of each chain’s last atom.
Returns: index of the last atom of each chain Return type: list of int
-
chainMap
(breaks=0, maxDist=None)[source]¶ Get chain index of each atom. A new chain is started between 2 atoms if the chain_id or segment_id changes, the residue numbering jumps back or a TER record was found.
Parameters: - breaks (1||0) – split chains at chain breaks (def 0)
- maxDist (float) – (if breaks=1) chain break threshold in Angstrom
Returns: array 1 x N_atoms of int, e.g. [000000011111111111122222…]
Return type: list of int
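The chain boundary rule above can be written down directly. The following sketch covers three of the four criteria (change of chain_id, change of segment_id, back-jump in residue numbering); TER records and the breaks option are not modelled, and `chain_map` is a hypothetical stand-alone function operating on plain dictionaries.

```python
def chain_map(atoms):
    """Chain index for each atom, starting a new chain at each boundary."""
    chains, current = [], 0
    for i, atom in enumerate(atoms):
        if i > 0:
            prev = atoms[i - 1]
            if (atom["chain_id"] != prev["chain_id"]
                    or atom["segment_id"] != prev["segment_id"]
                    or atom["residue_number"] < prev["residue_number"]):
                current += 1
        chains.append(current)
    return chains

def atom(chain_id, residue_number, segment_id=""):
    """Tiny constructor for the toy atom records used here."""
    return {"chain_id": chain_id, "segment_id": segment_id,
            "residue_number": residue_number}

# Residue numbering jumps back at atom 2, chain ID changes at atom 3.
toy = [atom("A", 1), atom("A", 2), atom("A", 1), atom("B", 2)]
cmap = chain_map(toy)
```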
-
chainBreaks
(breaks_only=1, maxDist=None, force=0, solvent=0, z=6.0)[source]¶ Identify discontinuities in the molecule’s backbone. By default, breaks are identified from the distribution of distances between the last backbone atom of a residue and the first backbone atom of the next residue. The median distance and standard deviation are determined iteratively, and outliers (i.e. breaks) are identified as any pair of residues with a distance that is more than z standard deviations (default 6) above the median. This heuristic can be overridden by specifying a hard distance cutoff (maxDist).
Parameters: - breaks_only (1|0) – don’t report ends of regular chains (def 1)
- maxDist (float) – maximal distance between consecutive residues [ None ] .. defaults to median + z * standard deviation
- z (float) – z-score for outlier distances between residues (def 6.)
- solvent (1||0) – also check selected solvent residues (buggy!) (def 0)
- force (1||0) – force re-calculation, do not use cached positions (def 0)
Returns: atom indices of last atom before a probable chain break Return type: list of int
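One way to read the outlier heuristic is as a loop that recomputes median and standard deviation after each round of outlier removal. The sketch below follows that reading using only the standard library; it is an interpretation of the description, not the actual Biskit algorithm, and `find_breaks` works on a precomputed list of residue-to-residue distances rather than on coordinates.

```python
import statistics

def find_breaks(distances, z=6.0, max_dist=None):
    """Positions i where the distance between residue i and i+1 is an outlier."""
    if not distances:
        return []
    if max_dist is not None:                 # hard cutoff overrides heuristic
        return [i for i, d in enumerate(distances) if d > max_dist]
    inliers = list(range(len(distances)))
    while True:
        values = [distances[i] for i in inliers]
        cutoff = statistics.median(values) + z * statistics.pstdev(values)
        kept = [i for i in inliers if distances[i] <= cutoff]
        if len(kept) == len(inliers):        # nothing removed: converged
            break
        inliers = kept
    normal = set(inliers)
    return [i for i in range(len(distances)) if i not in normal]

# 100 regular peptide-bond-like distances with one large gap in the middle.
distances = [3.8] * 50 + [10.0] + [3.8] * 50
breaks = find_breaks(distances)
```

With only a few residues, a single gap inflates the standard deviation so much that a cutoff of 6 standard deviations is never exceeded; in such cases a hard cutoff is the more reliable choice.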
-
removeRes
(what)[source]¶ Remove all atoms with a certain residue name.
Parameters: what (str OR [ str ] OR int OR [ int ]) – indices or name(s) of residue to be removed
-
rms
(other, mask=None, mask_fit=None, fit=1, n_it=1)[source]¶ Rmsd between two PDBModels.
Parameters: - other (PDBModel) – other model to compare this one with
- mask (list of int) – atom mask for rmsd calculation
- mask_fit (list of int) – atom mask for superposition (default: same as mask)
- fit (1||0) – superimpose first (default 1)
- n_it (int) – number of fit iterations: 1 - classic single fit (default), 0 - until convergence, kicking out outliers on the way
Returns: rms in Angstrom
Return type: float
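For reference, the quantity reported without prior superposition (the fit=0 case) is the root of the mean squared distance between corresponding atoms. The sketch below computes it in plain Python; `rmsd` is a hypothetical stand-alone function, and the superposition step itself is not shown.

```python
import math

def rmsd(xyz_a, xyz_b, mask=None):
    """Root-mean-square deviation between two equally long coordinate lists."""
    if len(xyz_a) != len(xyz_b):
        raise ValueError("models must have the same number of atoms")
    if mask is None:
        mask = [1] * len(xyz_a)
    pairs = [(a, b) for a, b, m in zip(xyz_a, xyz_b, mask) if m]
    if not pairs:
        raise ValueError("mask selects no atoms")
    total = sum((p - q) ** 2 for a, b in pairs for p, q in zip(a, b))
    return math.sqrt(total / len(pairs))

a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
value = rmsd(a, b)                    # only the second atom deviates
masked = rmsd(a, b, mask=[1, 0])      # deviating atom excluded
```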
-
transformation
(refModel, mask=None, n_it=1, z=2, eps_rmsd=0.5, eps_stdv=0.05, profname='rms_outlier')[source]¶ Get the transformation matrix which least-square fits this model onto the other model.
Parameters: - refModel (PDBModel) – reference PDBModel
- mask (list of int) – atom mask for superposition
- n_it (int) – number of fit iterations: 1 - classic single fit (default), 0 - until convergence
- z (float) – number of standard deviations for outlier definition (default 2)
- eps_rmsd (float) – tolerance in rmsd (default 0.5)
- eps_stdv (float) – tolerance in standard deviations (default 0.05)
- profname (str) – name of new atom profile getting outlier flag
Returns: array(3 x 3), array(3 x 1) - rotation and translation matrices
Return type: array, array
-
transform
(*rt)[source]¶ Transform coordinates of PDBModel.
Parameters: rt (array OR array, array) – rotational and translation array: array( 4 x 4 ) OR array(3 x 3), array(3 x 1) Returns: PDBModel with transformed coordinates Return type: PDBModel
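The two accepted input forms carry the same information. The sketch below assumes the common convention that a 4 x 4 matrix holds the rotation in its upper-left 3 x 3 block and the translation in its last column, and that each point is mapped to R·x + t; whether Biskit uses exactly this layout is an assumption here. `split_matrix` and `apply_transform` are hypothetical helpers.

```python
def split_matrix(m4):
    """Split a 4 x 4 matrix into a 3 x 3 rotation and a translation vector."""
    rotation = [row[:3] for row in m4[:3]]
    translation = [row[3] for row in m4[:3]]
    return rotation, translation

def apply_transform(xyz, rotation, translation):
    """Rotate every point, then shift it by the translation vector."""
    return [
        tuple(sum(r * c for r, c in zip(row, point)) + t
              for row, t in zip(rotation, translation))
        for point in xyz
    ]

# 90 degree rotation around z, followed by a shift of 1 along x.
m4 = [[0, -1, 0, 1],
      [1,  0, 0, 0],
      [0,  0, 1, 0],
      [0,  0, 0, 1]]
rotation, translation = split_matrix(m4)
moved = apply_transform([(1, 0, 0)], rotation, translation)
```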
-
fit
(refModel, mask=None, n_it=1, z=2, eps_rmsd=0.5, eps_stdv=0.05, profname='rms_outlier')[source]¶ Least-square fit this model onto refModel.
Parameters: - refModel (PDBModel) – reference PDBModel
- mask (list of int (1||0)) – atom mask for superposition
- n_it (int) – number of fit iterations: 1 - classic single fit (default), 0 - until convergence
- z (float) – number of standard deviations for outlier definition (default 2)
- eps_rmsd (float) – tolerance in rmsd (default 0.5)
- eps_stdv (float) – tolerance in standard deviations (default 0.05)
- profname (str) – name of new atom profile containing outlier flag
Returns: PDBModel with transformed coordinates
Return type:
-
magicFit
(refModel, mask=None)[source]¶ Superimpose this model onto a ref. model with similar atom content. magicFit( refModel [, mask ] ) -> PDBModel (or subclass )
Parameters: - refModel (PDBModel) – reference PDBModel
- mask (list of int (1||0)) – atom mask to use for the fit
Returns: fitted PDBModel or sub-class
Return type:
-
structureFit
(refModel, mask=None)[source]¶ Structure-align this model onto a reference model using the external TM-Align program (which needs to be installed).
structureFit( refModel [, mask] ) -> PDBModel (or subclass)
The result model has additional TM-Align statistics in its info record:
>>> r = m.structureFit( ref )
>>> r.info['tm_score'] ## TM-Align score
The other keys are: 'tm_rmsd', 'tm_len', 'tm_id'.
See also
biskit.TMAlign
Parameters: - refModel (PDBModel) – reference PDBModel
- mask (list of int (1||0)) – atom mask to use for the fit
Returns: fitted PDBModel or sub-class
Return type:
-
centered
(mask=None)[source]¶ Get model with centered coordinates.
Parameters: mask (list of int (1||0)) – atom mask applied before calculating the center Returns: model with centered coordinates Return type: PDBModel
-
center
(mask=None)[source]¶ Geometric center of model.
Parameters: mask (list of int (1||0)) – atom mask applied before calculating the center Returns: xyz coordinates of center Return type: (float, float, float)
-
centerOfMass
()[source]¶ Center of mass of PDBModel.
Returns: array(Float32) Return type: (float, float, float)
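The difference between the geometric center and the center of mass is only the weighting: the former averages coordinates directly, the latter weights each atom by its mass. A minimal plain-Python sketch, with hypothetical function names and made-up masses:

```python
def geometric_center(xyz, mask=None):
    """Unweighted average position of all (masked-in) atoms."""
    if mask is None:
        mask = [1] * len(xyz)
    chosen = [p for p, m in zip(xyz, mask) if m]
    return tuple(sum(p[k] for p in chosen) / len(chosen) for k in range(3))

def center_of_mass(xyz, masses):
    """Mass-weighted average position of all atoms."""
    total = sum(masses)
    return tuple(sum(m * p[k] for p, m in zip(xyz, masses)) / total
                 for k in range(3))

xyz = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
center = geometric_center(xyz)
com = center_of_mass(xyz, [1.0, 3.0])   # the heavier atom pulls the center
```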
-
masses
()[source]¶ Collect the molecular weight of all atoms in PDBModel.
Returns: 1-D array with mass of every atom in 1/12 of C12 mass. Return type: array of floats Raises: PDBError – if the model contains elements of unknown mass
-
mass
()[source]¶ Molecular weight of PDBModel.
Returns: total mass in 1/12 of C12 mass Return type: float Raises: PDBError – if the model contains elements of unknown mass
-
residusMaximus
(atomValues, mask=None)[source]¶ Take list of value per atom, return list where all atoms of any residue are set to the highest value of any atom in that residue. (after applying mask)
Parameters: - atomValues (list) – values per atom
- mask (list of int (1||0)) – atom mask
Returns: array with values set to the maximal intra-residue value
Return type: array of float
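The operation amounts to a per-residue maximum that is then broadcast back to every atom of the residue. The sketch below shows this with a residue map as produced by resMap(); it ignores the mask argument, and `residue_maximum` is a hypothetical plain-Python stand-in.

```python
def residue_maximum(atom_values, res_map):
    """Give each atom the highest value found among atoms of its residue."""
    best = {}
    for value, res in zip(atom_values, res_map):
        best[res] = max(value, best.get(res, value))
    return [best[res] for res in res_map]

values = [1.0, 5.0, 2.0, 0.0, 3.0]
res_map = [0, 0, 0, 1, 1]               # residue index of each atom
spread = residue_maximum(values, res_map)
```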
-
argsort
(cmpfunc=None)[source]¶ Prepare sorting atoms within residues according to comparison function.
Parameters: - cmpfunc (function) – old style function(m.atoms[i], m.atoms[j]) -> -1, 0, +1
- key (function) – new style sort key function(m.atoms[i]) -> sortable
Returns: suggested position of each atom in re-sorted model ( e.g. [2,1,4,6,5,0,..] )
Return type: list of int
-
sort
(sortArg=None)[source]¶ Apply a given sort list to the atoms of this model.
Parameters: sortArg (function) – comparison function Returns: copy of this model with re-sorted atoms (see numpy.take() ) Return type: PDBModel
-
unsort
(sortList)[source]¶ Undo a previous sorting on the model itself (no copy).
Parameters: sortList (list of int) – sort list used for previous sorting. Returns: the (back)sort list used ( to undo the undo…) Return type: list of int Raises: PDBError – if sorting changed atom number
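Undoing a sort requires the inverse of the sort list. The sketch below assumes that a sort list is applied take-style (position k of the result holds the atom at sortList[k]), which matches the reference to numpy.take() above but is otherwise an assumption; `take` and `inverse_order` are hypothetical helpers on plain lists.

```python
def take(items, order):
    """Reorder items so that result[k] == items[order[k]]."""
    return [items[i] for i in order]

def inverse_order(order):
    """Sort list that restores the original order after take(items, order)."""
    back = [0] * len(order)
    for new_pos, old_pos in enumerate(order):
        back[old_pos] = new_pos
    return back

names = ["N", "CA", "C", "O"]
order = [2, 0, 3, 1]
resorted = take(names, order)
restored = take(resorted, inverse_order(order))
```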
-
atomNames
(start=None, stop=None)[source]¶ Return a list of atom names from start to stop RESIDUE index
Parameters: - start (int) – index of first residue
- stop (int) – index of last residue
Returns: [‘C’,’CA’,’CB’ …. ]
Return type: list of str
-
filterIndex
(mode=0, **kw)[source]¶ Get atom positions that match a combination of key=values. E.g. filterIndex( chain_id='A', name=['CA','CB'] ) -> index
Parameters: - mode (0||1) – 0 combine with AND (default), 1 combine with OR
- kw (filter options, see example) – combination of atom dictionary keys and values/list of values that will be used to filter
Returns: sort list
Return type: list of int
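The AND/OR combination of key=value conditions can be illustrated on a list of atom dictionaries. This is a simplified stand-in under stated assumptions, not the Biskit implementation: `filter_index` is a hypothetical function, and a condition given as a list is taken to match if the atom's value is any member of that list.

```python
def filter_index(atoms, mode=0, **kw):
    """Positions of atoms matching all (mode=0) or any (mode=1) conditions."""
    def matches(atom, key, wanted):
        if isinstance(wanted, (list, tuple, set)):
            return atom[key] in wanted
        return atom[key] == wanted

    combine = any if mode else all
    return [i for i, atom in enumerate(atoms)
            if combine(matches(atom, k, v) for k, v in kw.items())]

atoms = [{"chain_id": "A", "name": "CA"},
         {"chain_id": "A", "name": "N"},
         {"chain_id": "B", "name": "CB"}]
both = filter_index(atoms, chain_id="A", name=["CA", "CB"])
either = filter_index(atoms, mode=1, chain_id="A", name=["CA", "CB"])
```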
-
filter
(mode=0, **kw)[source]¶ Extract atoms that match a combination of key=values. E.g. filter( chain_id='A', name=['CA','CB'] ) -> PDBModel
Parameters: - mode (0||1) – 0 combine with AND (default), 1 combine with OR
- kw (filter options, see example) – combination of atom dictionary keys and values/list of values that will be used to filter
Returns: filtered PDBModel
Return type:
-
equals
(ref, start=None, stop=None)[source]¶ Compares the residue and atom sequence in the given range. Coordinates are not checked, other profiles are not checked.
Parameters: - start (int) – index of first residue
- stop (int) – index of last residue
Returns: [ 1||0, 1||0 ], first position sequence identity 0|1, second position atom identity 0|1
Return type: list of int
-
compareAtoms
(ref)[source]¶ Get list of atom indices for this and reference model that converts both into 2 models with identical residue and atom content.
E.g.:
>>> m2 = m1.sort() ## m2 has now different atom order
>>> i2, i1 = m2.compareAtoms( m1 )
>>> m1 = m1.take( i1 ); m2 = m2.take( i2 )
>>> m1.atomNames() == m2.atomNames() ## m2 has again same atom order
Returns: indices, indices_ref Return type: ([int], [int])
-
unequalAtoms
(ref, i=None, iref=None)[source]¶ Identify atoms that are not matching between two models. This method returns somewhat of the opposite of compareAtoms().
Not matching means: (1) residue is missing, (2) missing atom within a residue, (3) atom name is different. Differences in coordinates or other atom profiles are NOT evaluated and will be ignored.
(not speed-optimized)
Parameters: - ref (PDBModel) – reference model to compare to
- i (array( int ) or [ int ]) – pre-computed positions that are equal in this model (first value returned by compareAtoms() )
- iref – pre-computed positions that are equal in ref model (second value returned by compareAtoms() )
Returns: mismatching atoms of self, mismatching atoms of ref
Return type: array(int), array(int)
-
reportAtoms
(i=None, n=None)[source]¶ Parameters: i ([ int ]) – optional list of atom positions to report (default: all) Returns: formatted string with atom and residue names similar to PDB Return type: str
-
compareChains
(ref, breaks=0, fractLimit=0.2)[source]¶ Get list of corresponding chain indices for this and reference model. Use takeChains() to create two models with identical chain content and order from the result of this function.
Parameters: - ref (PDBModel) – reference PDBModel
- breaks (1||0) – look for chain breaks in backbone coordinates
- fractLimit (float) –
Returns: chainIndices, chainIndices_ref
Return type: ([int], [int])
-
biomodel
(assembly=0)[source]¶ Return the ‘biologically relevant assembly’ of this model according to the information in the PDB’s BIOMT record (captured in info[‘BIOMT’]).
This removes redundant chains and performs symmetry operations to complete multimeric structures. Some PDBs define several alternative biological units: usually (0) the author-defined one and (1) a software-defined one; see lenBiounits().
Note: The BIOMT data are currently not updated during take/compress calls, which may change chain indices and content. This method is therefore best run on an original PDB record before any other modifications are performed.
Parameters: assembly (int) – assembly index (default: 0 .. author-determined unit) Returns: PDBModel; biologically relevant assembly
-
lenBiounits
()[source]¶ Number of biological assemblies defined in PDB BIOMT record, if any.
Returns: number of alternative biological assemblies defined in PDB header Return type: int
-
atomkey
(compress=True)[source]¶ Create a string key encoding the atom content of this model independent of the order in which atoms appear within residues. Atom names are simply sorted alphabetically within residues and then concatenated.
Parameters: compress (bool) – compress key with zlib (default: true) Returns: key formed from sorted atom content of model Return type: str
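The key construction described above (sort atom names within each residue, concatenate, optionally compress with zlib) can be sketched as follows. `atom_key` is a hypothetical stand-alone function taking a list of atom-name lists, one per residue; note that in Python 3 zlib returns bytes, so the compressed form of this sketch is a bytes object rather than a str.

```python
import zlib

def atom_key(residues, compress=True):
    """Key that ignores the order of atoms within each residue."""
    key = "".join("".join(sorted(names)) for names in residues)
    return zlib.compress(key.encode("ascii")) if compress else key

model_a = [["N", "CA", "C"], ["CA", "N", "C"]]
model_b = [["C", "CA", "N"], ["N", "C", "CA"]]   # same atoms, other order
plain = atom_key(model_a, compress=False)
same = atom_key(model_a) == atom_key(model_b)
```

Two models with equal keys can then be assumed to have the same atom content per residue, although their atom order may still differ.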