Structure

Briefs

This class define the core data structure of EnzyHTP: Structure. Structure stands for the single enzyme structure. As the core data class, Structure will be solely designed for storing, accessing and editing structural data.

Input/Output

Highlighted Methods

Method: get

The get method is designed to retrieve specific components—either a Chain, a Residue, or an Atom—from a biological structure based on a given key. The key must follow a specific format, and the method performs differently based on the level of detail in the key.

Input

key

A string that specifies the chain, residue, and atom in a biological structure using the format <chain_name>.<residue_index>.<atom_name>. The levels of detail can vary:

How to obtain

  • Chain Only: Input just the chain name (e.g., A).

  • Residue: Input the chain name and residue index (e.g., A.1).

  • Atom: Input the chain name, residue index, and atom name (e.g., A.1.CA).

Output

The function returns an object of type Chain, Residue, or Atom depending on the input key’s specificity. If the input key does not conform to the expected format or if the specified item does not exist within the structure, the function raises a ValueError.

Example Usage

Here’s how you might use the get method in your Python code:

chain = structure.get("A")
residue = structure.get("A.1")
atom = structure.get("A.1.CA")

Example Code

In this example, we load a protein from PDB file and perform Structrure() on this protein.

How input is prepared

8k68.pdb

Download example protein from Protein Data Bank, and load it via PDBParser

import enzy_htp
#Loading a structure from PDB
structure : enzy_htp.Structure = enzy_htp.PDBParser().get_structure("./8k68.pdb")

#Highlighted Methods '.get'
structure.get()
#TypeError: get() missing 1 required positional argument: 'key'
structure.get('A')
#<enzy_htp.structure.chain.Chain object at 0x2ad5c219a9d0>
structure.get('A.1')
#Residue(1, SER, atom:6, Chain(A, residue: 1-45,51-301))
structure.get('A.1.CA')
#<Atom(CA, 2, (32.589, -23.487, -14.414), Residue(1, SER, atom:6, Chain(A, residue: 1-45,51-301)), 25.32, C, None ) at 0x2ad5c2171b80>
structure.get('B.401')
#Solvent(401, HOH, atom:1, Chain(B, residue: 401-658))

#Surveying basic information
structure.num_chains
#2
structure.sequence
#{'A': 'SGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVYCPRHVICTNPNYEDLLIRKSNHNFLVQAGNVQLRVIGHSMQNCVLKLKVDTANPKTPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNFTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDLEGNFYGPFVDRQTAQAAGTDTTITVNVLAWLYAAVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCASLKELLQNGMNGRTILGSALLEDEFTPFDVVRQCS', 'B': 'HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH  HOH'}
structure.num_residues
#554

#Interfacing with Chain()
structure.chains
#[<enzy_htp.structure.chain.Chain object at 0x2b82b42e2880>, <enzy_htp.structure.chain.Chain object at 0x2b82b42e2970>]
structure.chain_names
#['A', 'B']
chain_cpy : enzy_htp.Chain = structure.get_chain( "B" )

#Interfacing with Residue()
structure.num_residues
#554

Method: assign_ncaa_chargespin

The assign_ncaa_chargespin method is designed to assign net charges and spin states to non-canonical amino acids (NCAAs) in Structure(). It specifically targets residues identified as NCAAs, ligands, or modified amino acids, based on their three-letter codes or general categories.

Input

net_charge_mapper

A dictionary where keys are the three-letter codes of NCAAs (or categories like “LIGAND” or “MODAA” for all of that kind), and values are tuples containing the net charge and multiplicity (spin state, the 2S+1 number for multiplicity) for these residues. The format is {"RES" : (charge, spin), ...}.

How to construct

  • Specific NCAA: Use the three-letter code (e.g., HEM) with its charge and spin.

  • General Category: Use identifiers like LIGAND or MODAA to apply properties uniformly to all residues within these categories.

Output

The function does not return a value but modifies the properties of the residues within the structure directly. If a specified NCAA does not exist or if the residue type is not appropriate for the charge and spin assignment, an error is raised.

Example Usage

Here’s how you might use the assign_ncaa_chargespin method in your Python code:

net_charge_mapper = {
    'MOL': (0, 1),  # Substrate molecule with a charge of 0 and a singlet spin state
    'LIGAND': (1, 1),  # All ligands with a charge of +1 and a singlet spin state
    'MODAA': (-1, 2)  # All modified amino acids with a charge of -1 and a doublet spin state
}
structure.assign_ncaa_chargespin(net_charge_mapper)

Example Code

In this example, we load a protein from PDB file and perform Structrure() and assign_ncaa_chargespin on this protein.

How input is prepared

4bf4.pdb

Download example protein from Protein Data Bank, and load it via PDBParser

import enzy_htp
#Loading a structure from PDB
structure : enzy_htp.Structure = enzy_htp.PDBParser().get_structure("./4bf4.pdb")

net_charge_mapper = {
    'HEM': (0, 1),  # Assuming HEM is neutral overall with iron in +2 state, spin state as singlet
    '17Q': (0, 1),  # Neutral organic compound
    'SO4': (-2, 1)  # Sulfate ion with a charge of -2 and a singlet state
}

structure.assign_ncaa_chargespin(net_charge_mapper)

Author: Xingyu Ouyang <ouyangxingyu913@gmail.com>