Adding Missing Residues
Briefs
Protein structures are often missing residues from flexible loop regions. This is a problem for various types
of calculations, including molecular dynamics (MD) and quantum mechanics (QM) simulations. The fill_missing_residues() function
aids in high-throughput (HTP) workflow design by adding missing Residue() objects into a Structure(). Missing residues
can be defined by the user or, when possible, pulled from the Protein Data Bank (PDB).
Note
Using this functionality requires the Modeller Python package to be installed. For more information on installing the package, see the Sali Lab website.
Note
Science APIs are a special concept in EnzyHTP. They are the top-layer APIs that are meant to be used when assembling a workflow.
Warning
Some loop refinement methods are known to slightly move sidechain positions. It may be necessary to minimize a Structure after filling in missing loops.
A detailed tutorial on how to use this Science API follows.
Input/Output
Input
The stru, missing_residues, and method arguments are required inputs.
stru: A Structure() object that is missing residues.
Getting a Structure
A Structure() object can be obtained using these APIs.
missing_residues: A List[SeqRes] whose entries describe the chain/index location and identity of each missing residue.
Getting the missing residues
A List[SeqRes] can be created by passing the four-letter PDB code of a structure to the identify_missing_residues() function from the preparation module. (See tutorial here)
method: A str specifying the method to use for filling missing residues.
work_dir: Working directory that holds all the files produced during the calculation. Optional argument.
inplace: Whether the missing residues should be added to the supplied Structure or to a new, copied one. Optional argument, True by default.
Output
A Union[None, Structure]: None when the supplied Structure is modified in place, otherwise a copied Structure with the missing residues added.
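The inplace and return-value convention described above can be sketched with a toy stand-in. ToyStructure and fill_toy are hypothetical names used only for illustration and are not part of EnzyHTP. The sketch also assumes that None is returned exactly when inplace is True, which is inferred from the argument descriptions rather than confirmed by the source.

```python
import copy
from typing import List, Optional


class ToyStructure:
    """Hypothetical stand-in for a Structure; it only tracks residue indices."""

    def __init__(self, residue_indices: List[int]) -> None:
        self.residue_indices = sorted(residue_indices)


def fill_toy(
    stru: ToyStructure, missing: List[int], inplace: bool = True
) -> Optional[ToyStructure]:
    """Add missing indices, following the assumed inplace/return convention."""
    # Work on the caller's object, or on a deep copy when inplace is False.
    target = stru if inplace else copy.deepcopy(stru)
    target.residue_indices = sorted(set(target.residue_indices) | set(missing))
    # Nothing is returned when the caller's object was modified directly.
    return None if inplace else target


original = ToyStructure([4, 5, 7])
filled_copy = fill_toy(original, [6], inplace=False)
print(original.residue_indices)     # [4, 5, 7]
print(filled_copy.residue_indices)  # [4, 5, 6, 7]
```

With inplace=True the caller keeps using the object it passed in, so checking the return value is only meaningful when a copy was requested.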
Examples
Example Code
    from enzy_htp import (
        PDBParser,
        identify_missing_residues,
        fill_missing_residues)

    # Parse the local PDB file into a Structure.
    sp = PDBParser()
    stru = sp.get_structure("./3r3v_.pdb")
    print(stru)

    # Look up the missing residues by PDB code and add them to stru in place.
    fill_missing_residues(stru, identify_missing_residues("3R3V"))
    print(stru)

The output from the above code is listed below:
    <Structure object at 0x7f8576baa370>
    Structure(
    chains: (sorted, original ['A', 'B'])
        A(polypeptide): residue: 4-254,256-300 atom_count: 2339
        B(polypeptide): residue: 4-252,259-300 atom_count: 2310
    )
    <Structure object at 0x7f8576baa370>
    Structure(
    chains: (sorted, original ['A', 'B'])
        A(polypeptide): residue: -1-304 atom_count: 2408
        B(polypeptide): residue: -1-304 atom_count: 2408
    )
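The residue ranges printed above show which indices were filled in. As a rough check, the before and after range strings can be compared with a small helper. This is a minimal sketch in plain Python: expand_ranges and added_indices are hypothetical helpers, not EnzyHTP functions, and the sketch assumes each range covers every integer between its bounds.

```python
import re
from typing import List, Set

# Matches "start-end" where either bound may be negative, e.g. "-1-304".
_RANGE = re.compile(r"^(-?\d+)-(-?\d+)$")


def expand_ranges(text: str) -> Set[int]:
    """Expand a string such as "4-254,256-300" into a set of residue indices."""
    indices: Set[int] = set()
    for part in text.split(","):
        part = part.strip()
        match = _RANGE.match(part)
        if match:
            start, end = int(match.group(1)), int(match.group(2))
            indices.update(range(start, end + 1))
        else:
            indices.add(int(part))  # a single index such as "255"
    return indices


def added_indices(before: str, after: str) -> List[int]:
    """Return the indices present in `after` but absent from `before`."""
    return sorted(expand_ranges(after) - expand_ranges(before))


# Range strings copied from the chain A lines of the example output.
chain_a = added_indices("4-254,256-300", "-1-304")
print(chain_a)  # [-1, 0, 1, 2, 3, 255, 301, 302, 303, 304]
```

Applied to chain B ("4-252,259-300" against "-1-304"), the same helper reports 15 added indices, covering both termini and the 253-258 loop.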
Author: Chris Jurich <chris.jurich@vanderbilt.edu>