ModelCraft

An automated model building pipeline for X-ray crystallography and cryo-EM

Installation

ModelCraft requires Python 3.7 or newer and can be installed using pip, for example:

python3 -m pip install --user modelcraft

Refer to the pip documentation if pip is not installed. ModelCraft also requires an installation of CCP4. The CCP4 environment needs to be set up so that programs such as Buccaneer and Refmac can be called from the command line.
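Before running ModelCraft it can be useful to confirm that the CCP4 programs are reachable from the current environment. The sketch below is only an illustration: the executable names cbuccaneer and refmac5 are assumptions about a typical CCP4 installation, and find_programs is a hypothetical helper, not part of ModelCraft.

```python
import shutil

# Assumed executable names; adjust them to match your CCP4 installation.
REQUIRED_PROGRAMS = ["cbuccaneer", "refmac5"]


def find_programs(names):
    """Map each program name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_programs(REQUIRED_PROGRAMS).items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

If any program is reported as not found, the CCP4 environment has probably not been set up in the current shell.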

Usage

The first argument must be either xray or em, for X-ray crystallography or cryo-EM respectively. The simplest execution for X-ray crystallography requires only a description of the asymmetric unit contents (see the next section) and a reflection data file in MTZ format (with observations, a free-R flag and starting phases):

modelcraft xray --contents contents.json --data data.mtz

Alternatively, a model can be provided (in PDB, mmCIF or mmJSON format), which will be refined and used as a starting point instead of starting from phases in the data file.

modelcraft xray --contents contents.json --data data.mtz --model model.cif

For cryo-EM, a map and a resolution must be provided instead of a reflection data file.

modelcraft em --contents contents.json --map map.mrc --resolution 2.5

The command line documentation has more detailed information on individual arguments.

modelcraft xray --help
modelcraft em --help
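When ModelCraft is run from a script, the commands above can be assembled programmatically. The following is a minimal sketch that uses only the options shown in this section; build_command and run are hypothetical helper names, and the sketch assumes the modelcraft executable is on the PATH.

```python
import subprocess


def build_command(mode, contents, data=None, model=None, map_path=None, resolution=None):
    """Build a modelcraft argument list using only the options shown above."""
    if mode not in ("xray", "em"):
        raise ValueError("mode must be 'xray' or 'em'")
    command = ["modelcraft", mode, "--contents", contents]
    if mode == "xray":
        if data is None:
            raise ValueError("xray mode needs a reflection data file")
        command += ["--data", data]
        # A starting model is only shown for xray above, so it is only added here.
        if model is not None:
            command += ["--model", model]
    else:
        if map_path is None or resolution is None:
            raise ValueError("em mode needs a map and a resolution")
        command += ["--map", map_path, "--resolution", str(resolution)]
    return command


def run(command):
    """Run the command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(command, check=True)
```

For example, run(build_command("em", "contents.json", map_path="map.mrc", resolution=2.5)) would execute the cryo-EM command shown earlier.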

ASU Contents Description

A description of the expected contents of the asymmetric unit must be provided as a FASTA sequence file or a JSON file using the --contents argument. A sequence file is simpler, but the JSON format allows a more detailed description, as in the example below.

To create a JSON file, it may be helpful to start from the contents of an existing PDB entry. The modelcraft-contents script creates a contents JSON file for a released PDB entry.

An example JSON file is shown below:

{
    "copies": 2,
    "proteins": [
        {
            "sequence": "LPGECSVNVIPKMNLDKAKFFSGTWYETHYLDMDPQATEKFCFSFAPRESGGTVMEALYHFNVDSKV",
            "stoichiometry": 1,
            "modifications": ["M->MSE"]
        },
        {
            "sequence": "GGG"
        }
    ],
    "rnas": [
        {
            "sequence": "GGUAACUGUUACAGUUACC",
            "stoichiometry": 2,
            "modifications": ["1->GTP", "19->CCC"]
        }
    ],
    "dnas": [],
    "carbs": [
        { "codes": { "NAG": 2 }, "stoichiometry": 1 },
        { "codes": { "MAN": 1, "NAG": 2 }, "stoichiometry": 1 }
    ],
    "ligands": [
        { "code": "HEM", "stoichiometry": 1 }
    ],
    "buffers": ["GOL", "NA", "CL"]
}

The file lists the proteins, rnas, dnas, carbs, ligands, and buffers that are in the crystal. Only a few items are mandatory: each protein, RNA or DNA chain must have a sequence, each carbohydrate must have a dictionary of codes giving the number of each sugar, and each ligand must have a single code.
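The mandatory items can be checked before a run with a few lines of Python. This is a minimal sketch based only on the rules stated above; check_contents is a hypothetical helper and does not reproduce ModelCraft's own validation.

```python
def check_contents(contents):
    """Return a list of problems found in a parsed contents dictionary."""
    problems = []
    # Every protein, RNA and DNA chain needs a non-empty sequence.
    for key in ("proteins", "rnas", "dnas"):
        for index, chain in enumerate(contents.get(key, [])):
            if not chain.get("sequence"):
                problems.append(f"{key}[{index}] has no sequence")
    # Every carbohydrate needs a non-empty dictionary of sugar codes.
    for index, carb in enumerate(contents.get("carbs", [])):
        if not isinstance(carb.get("codes"), dict) or not carb["codes"]:
            problems.append(f"carbs[{index}] has no codes dictionary")
    # Every ligand needs exactly one code, given as a string.
    for index, ligand in enumerate(contents.get("ligands", [])):
        if not isinstance(ligand.get("code"), str):
            problems.append(f"ligands[{index}] has no single code")
    return problems
```

The dictionary can be obtained by reading the contents file with json.load, and an empty list means that none of these problems were found.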

Each component (other than buffers) has an optional stoichiometry parameter, which is assumed to be 1 if not specified. In the example above there are 2 RNA chains for each protein chain. There is also a copies parameter for the whole file to specify how many copies of the contents are in the asymmetric unit. If this value is not known, the most likely number will be estimated. The modelcraft-copies script can be used to view the solvent fraction and probability for each number of copies given a contents file and an MTZ file. The number of ordered buffer molecules is assumed to be unknown, so they are not included in the solvent calculation.
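To illustrate how stoichiometry and copies combine, the sketch below counts the polymer chains in the asymmetric unit. The function chains_in_asu is a hypothetical helper. As a simplifying assumption it treats a missing copies value as 1, whereas ModelCraft estimates the most likely number.

```python
def chains_in_asu(contents, copies=None):
    """Count chains of each polymer type in the asymmetric unit."""
    if copies is None:
        # Assumption for this sketch only: default to 1 copy if not given.
        copies = contents.get("copies", 1)
    totals = {}
    for key in ("proteins", "rnas", "dnas"):
        # A missing stoichiometry counts as 1, as described above.
        per_copy = sum(item.get("stoichiometry", 1) for item in contents.get(key, []))
        totals[key] = per_copy * copies
    return totals
```

For the example file above, each copy holds 2 protein chains (one of each sequence) and 2 RNA chains, so with 2 copies the function returns 4 protein chains, 4 RNA chains and 0 DNA chains.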

Finally, protein, RNA and DNA chains may have a list of modifications, e.g. M->MSE to specify that all methionine residues are actually selenomethionine, or 1->GTP to specify that residue 1 is guanosine triphosphate.
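The two forms of modification can be read as "residue type to code" and "residue position to code". The sketch below shows one way to interpret them, assuming positions are numbered from 1 (consistent with 19->CCC on the 19-residue RNA in the example). Both function names are hypothetical and this is not ModelCraft's own parser.

```python
def parse_modification(text):
    """Split 'M->MSE' or '1->GTP' into a (target, code) pair."""
    target, separator, code = text.partition("->")
    if not separator or not target or not code:
        raise ValueError(f"not a modification: {text!r}")
    # A numeric target is read as a 1-based residue position;
    # anything else is read as a one-letter residue type.
    if target.isdigit():
        return int(target), code
    return target, code


def apply_modifications(sequence, modifications):
    """Return one code per residue, with None where the residue is unmodified."""
    codes = [None] * len(sequence)
    for text in modifications:
        target, code = parse_modification(text)
        for index, letter in enumerate(sequence):
            if target == index + 1 or target == letter:
                codes[index] = code
    return codes
```

For example, apply_modifications("GMAM", ["M->MSE", "1->GTP"]) marks both methionines as MSE and the first residue as GTP.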

Note: ModelCraft does not yet build carbohydrates, ligands, or modified residues (other than selenomethionine derivatives). However, this is planned for the future and inclusion of these components in the contents allows for more accurate calculation of the solvent fraction during density modification.

Citations

ModelCraft
P Bond. Next generation software for placing atoms into electron density maps. PhD thesis, University of York (2021)
Buccaneer
K Cowtan. Acta Cryst. D, 62, 1002 (2006)
Coot
P Emsley, B Lohkamp, WG Scott, K Cowtan. Acta Cryst. D, 66, 486 (2010)
Nautilus
K Cowtan. IUCrJ, 1, 387 (2014)
Parrot
K Cowtan. Acta Cryst. D, 66, 470 (2010)
Refmac
O Kovalevskiy, RA Nicholls, F Long, A Carlon, GN Murshudov. Acta Cryst. D, 74, 215 (2018)
Sheetbend
K Cowtan, S Metcalfe, P Bond. Acta Cryst. D, 76, 1192 (2020)