Assign Mutant

Briefs

This tutorial shows how to mutate a target protein at defined residue(s). The main functions provided are assign_mutant() and mutate_stru(). assign_mutant() assigns the mutants targeted in the study: it decodes the user-assigned {pattern} based on {stru} and returns a list of mutants, each defined by a list of Mutation objects.

Input/Output

Input

stru: the target protein structure intended for mutation, represented as a Structure() object.

How to obtain

A Structure() object can be obtained by these APIs.
pattern: specifies the residue(s) targeted for mutation. Must be specified by the user.

How to obtain

The pattern is written following the Pattern Syntax.
chain_sync_list: a list that identifies the homo-chains (identical subunits) of a multimeric protein. Must be specified when mutations should be copied between chains.

How to obtain

The chain_sync_list is written as a list such as [("A", "C"), ("B", "D")] to indicate the homo-chains in an enzyme polymer (such as a dimer). Mutations are copied to the corresponding homo-chains, because it may be experimentally impossible to mutate only one chain of a homodimeric enzyme.
chain_index_mapper: a dictionary mapping the residue indices of each chain in the multimeric protein. Must be specified when the chains are numbered differently.

How to obtain

The chain_index_mapper needs to be specified when the residue indices of the chains are not aligned. For example, for the homodimer pair below:
"A": ABCDEFG (starts from 7)
"B": BCDEFGH (starts from 14)
the chain_index_mapper should be {"A": 0, "B": 6}, and the index conversion is done by A_res_idx - 0 + 6 = B_res_idx.
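
The chain synchronization and index conversion described above can be sketched in plain Python. This is only an illustration under stated assumptions, not EnzyHTP's implementation: the helpers convert_index and sync_mutation are hypothetical names, and mutations are represented as plain (orig, target, chain, index) tuples.

```python
# Illustrative sketch only: convert_index and sync_mutation are hypothetical
# helpers and are NOT part of the enzy_htp API.

def convert_index(res_idx, from_chain, to_chain, chain_index_mapper):
    """Map a residue index on from_chain to the aligned index on to_chain."""
    return res_idx - chain_index_mapper[from_chain] + chain_index_mapper[to_chain]


def sync_mutation(mutation, chain_sync_list, chain_index_mapper):
    """Return the mutation followed by its copies on every synced chain."""
    orig, target, chain, idx = mutation
    result = [mutation]
    for group in chain_sync_list:
        if chain not in group:
            continue
        for other in group:
            if other != chain:
                new_idx = convert_index(idx, chain, other, chain_index_mapper)
                result.append((orig, target, other, new_idx))
    return result


mapper = {"A": 0, "B": 6}
# Residue 8 on chain A aligns with residue 14 on chain B: 8 - 0 + 6 = 14
print(convert_index(8, "A", "B", mapper))
# The mutation on chain A is copied to the aligned position on chain B
print(sync_mutation(("LEU", "HIS", "A", 8), [("A", "B")], mapper))
```

Keeping the offsets in a dictionary means the same subtraction and addition works in both directions, so a mutation written on chain B can be mapped back to chain A without a second table.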

Output

mutation

A list of mutants, each defined by a list of Mutation objects.

NOTE: this function represents the wild type (WT) as [] or [Mutation(None, "WT", None, None)] unless directly indicated otherwise. Handle these entries accordingly.
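
One way to handle the two wild-type representations mentioned in the note is to filter them out before further processing. The sketch below is an assumption-laden illustration: Mutation is a local stand-in namedtuple whose field names are assumed, and is_wild_type is a hypothetical helper rather than an EnzyHTP function.

```python
from collections import namedtuple

# Local stand-in for illustration only. The real Mutation class comes from
# enzy_htp; the field names used here are an assumption.
Mutation = namedtuple("Mutation", ["orig", "target", "chain_id", "res_idx"])


def is_wild_type(mutant):
    """Return True if a mutant (a list of Mutation) represents the wild type."""
    if not mutant:  # WT written as an empty list
        return True
    # WT written as [Mutation(None, "WT", None, None)]
    return all(mut.target == "WT" for mut in mutant)


mutants = [
    [],
    [Mutation(None, "WT", None, None)],
    [Mutation("ARG", "GLU", "A", 378)],
]
# Keep only the mutants that actually change the sequence
non_wt = [mutant for mutant in mutants if not is_wild_type(mutant)]
print(non_wt)
```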

Mutant Pattern

Basic usage example

Generate a mutant with a single-point mutation: R378E

from enzy_htp.structure import PDBParser
import enzy_htp.mutation.api as mapi

test_A = "test_A.pdb"
test_A_stru = PDBParser().get_structure(test_A)
test_mutation_pattern_A = "R378E"
mutants_A = mapi.assign_mutant(test_A_stru, test_mutation_pattern_A)
print(mutants_A)
#[[('ARG','GLU','A',378)]]
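
To make the relation between a direct pattern such as "R378E" and the printed tuple more concrete, here is a simplified decoder for single-point patterns only. It is a sketch, not the parser used by assign_mutant(): decode_direct_mutation is a hypothetical helper, it ignores the {stru} check, and it assumes that an omitted chain ID means chain A, as the output above suggests.

```python
import re

# One-letter to three-letter codes for the 20 canonical amino acids.
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}

# Groups: original residue, optional chain ID, residue index, target residue
DIRECT_PATTERN = re.compile(r"^([A-Z])([A-Z])?(\d+)([A-Z])$")


def decode_direct_mutation(pattern, default_chain="A"):
    """Decode e.g. 'R378E' or 'TD76D' into (orig, target, chain, index)."""
    match = DIRECT_PATTERN.match(pattern.strip())
    if match is None:
        raise ValueError(f"not a single-point mutation pattern: {pattern!r}")
    orig, chain, idx, target = match.groups()
    return (ONE_TO_THREE[orig], ONE_TO_THREE[target], chain or default_chain, int(idx))


print(decode_direct_mutation("R378E"))
print(decode_direct_mutation("TD76D"))
```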

Chemistry-inspired example

Mutate residues near the binding pocket of a protein bound to the substrate LIG, defined as residues within 5 Å of LIG, to smaller residues in order to increase the pocket volume. Mutation is not forced at every residue.

test_A = "test_A.pdb"
test_A_stru = PDBParser().get_structure(test_A)
test_mutation_pattern_A = "a:[byres resn LIG around 5:smaller]"
mutants_A = mapi.assign_mutant(test_A_stru, test_mutation_pattern_A)
print(mutants_A)

More homodimeric protein examples

Generate two mutants for a homodimeric protein: L383H and N363E.

test_A_B = "test_A_B.pdb"
test_A_B_stru = PDBParser().get_structure(test_A_B)
test_mutation_pattern_A_B = "LA383H,NB363E"
mutation_pattern_A_B = mapi.assign_mutant(test_A_B_stru,
                                          test_mutation_pattern_A_B,
                                          chain_sync_list=[("A", "B")],
                                          chain_index_mapper={"A": 0, "B": 0})
print(mutation_pattern_A_B)
#[[('LEU','HIS','B',383), ('LEU','HIS','A',383)], [('ASN','GLU','B',363), ('ASN','GLU','A',363)]]

Randomly generate 3 double mutants by mutating residues at the dimer interface to smaller amino acids for a homodimeric protein.

test_A_B = "test_A_B.pdb"
test_A_B_stru = PDBParser().get_structure(test_A_B)
test_mutation_pattern_A_B = "r:2[byres chain A around 5.0 and chain B:smaller]*3"
mutation_pattern_A_B = mapi.assign_mutant(test_A_B_stru,
                                          test_mutation_pattern_A_B,
                                          chain_sync_list=[("A", "B")],
                                          chain_index_mapper={"A": 0, "B": 0})
print(mutation_pattern_A_B)
#[[('PHE','GLY','B',179), ('ALA','GLY','A',332), ('PHE','GLY','A',179), ('ALA','GLY','B',332)],
#[('ARG','THR','A',272), ('ASP','ALA','B',275), ('ARG','THR','B',272), ('ASP','ALA','A',275)],
#[('ASN','CYS','A',178), ('GLU','ASP','B',340), ('GLU','ASP','A',340), ('ASN','CYS','B',178)]]

Randomly generate 4 triple mutants by mutating residues at the dimer interface to neutral amino acids for a homodimeric protein.

test_A_B = "test_A_B.pdb"
test_A_B_stru = PDBParser().get_structure(test_A_B)
test_mutation_pattern_A_B = "r:3[byres chain A around 5.0 and chain B:neutral]*4"
mutation_pattern_A_B = mapi.assign_mutant(test_A_B_stru,
                                          test_mutation_pattern_A_B,
                                          chain_sync_list=[("A", "B")],
                                          chain_index_mapper={"A": 0, "B": 0})
print(mutation_pattern_A_B)
#[[('PRO','MET','A',344), ('GLY','PHE','B',181), ('ASP','ALA','A',211), ('PRO','MET','B',344), ('ASP','ALA','B',211), ('GLY','PHE','A',181)],
#[('ARG','CYS','A',276), ('ARG','VAL','B',351), ('ASP','CYS','A',364), ('ARG','VAL','A',351), ('ASP','CYS','B',364), ('ARG','CYS','B',276)],
#[('ARG','CYS','B',336), ('ARG','CYS','A',336), ('LYS','GLY','A',357), ('PRO','ALA','A',358), ('LYS','GLY','B',357), ('PRO','ALA','B',358)],
#[('ILE','TYR','A',182), ('ASP','THR','A',211), ('ALA','TRP','B',332), ('ASP','THR','B',211), ('ILE','TYR','B',182), ('ALA','TRP','A',332)]]

More heterodimeric protein examples

Generate the following mutations on a heterodimeric protein with chains A and D: P151F on chain A and T76D on chain D.

test_A_D = "4nb9_A_D.pdb"
test_A_D_stru = PDBParser().get_structure(test_A_D)
test_mutation_pattern_A_D = "{PA151F,TD76D}"
mutation_pattern_A_D = mapi.assign_mutant(test_A_D_stru,
                                          test_mutation_pattern_A_D,
                                          chain_sync_list=[("A"), ("D")],
                                          chain_index_mapper={"A": 0, "D": 0})
print(mutation_pattern_A_D)
#[[('THR','ASP','D',76), ('PRO','PHE','A',151)]]

Generate two separate mutants for a heterodimeric protein containing chains A and D: P151F on chain A and T76D on chain D.

test_A_D = "4nb9_A_D.pdb"
test_A_D_stru = PDBParser().get_structure(test_A_D)
test_mutation_pattern_A_D = "PA151F,TD76D"
mutation_pattern_A_D = mapi.assign_mutant(test_A_D_stru,
                                          test_mutation_pattern_A_D,
                                          chain_sync_list=[("A"), ("D")],
                                          chain_index_mapper={"A": 0, "D": 0})
print(mutation_pattern_A_D)
#[[('PRO','PHE','A',151)], [('THR','ASP','D',76)]]

Use a heterodimeric protein composed of chains A and D, where chain A contains the cofactor FE2 and chain D contains the cofactor FES. Generate 3 single mutants that each raise the charge by 1 at a residue within 3 Å of the FE2 cofactor in chain A. For each of these chain A mutants, also mutate residues within 4 Å of the FES cofactor in chain D to smaller residues to create 2 double mutants. The result is 6 mutants, each with a single point mutation in chain A and a double point mutation in chain D.

test_A_D = "4nb9_A_D.pdb"
test_A_D_stru = PDBParser().get_structure(test_A_D)
test_mutation_pattern_A_D = "{r:1[byres resn FE2 around 3 and chain A:charge+1]*3, r:2[byres resn FES around 4 and chain D:smaller]*2}"
mutation_pattern_A_D = mapi.assign_mutant(test_A_D_stru,
                                          test_mutation_pattern_A_D,
                                          chain_sync_list=[("A"), ("D")],
                                          chain_index_mapper={"A": 0, "D": 0})
print(mutation_pattern_A_D)
#[[('HIS','ASP','D',48), ('ASP','PHE','A',333), ('CYS','GLY','D',84)],
#[('ASP','PHE','A',333), ('PHE','THR','D',67), ('CYS','GLY','D',84)],
#[('ASP','SER','A',333), ('HIS','ASP','D',48), ('CYS','GLY','D',84)],
#[('ASP','SER','A',333), ('PHE','THR','D',67), ('CYS','GLY','D',84)],
#[('HIS','ASP','D',48), ('HIS','ARG','A',183), ('CYS','GLY','D',84)],
#[('PHE','THR','D',67), ('HIS','ARG','A',183), ('CYS','GLY','D',84)]]
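
The count of 6 mutants can be reproduced by hand if the two sections inside the braces are assumed to be combined as a Cartesian product, which is consistent with the output above. The sketch below only re-combines the mutations printed in that output using the standard library; it does not call EnzyHTP and does not reflect how the library performs the combination internally.

```python
from itertools import product

# The 3 single mutants on chain A, taken from the output above
chain_a_singles = [
    [("ASP", "PHE", "A", 333)],
    [("ASP", "SER", "A", 333)],
    [("HIS", "ARG", "A", 183)],
]
# The 2 double mutants on chain D, taken from the output above
chain_d_doubles = [
    [("HIS", "ASP", "D", 48), ("CYS", "GLY", "D", 84)],
    [("PHE", "THR", "D", 67), ("CYS", "GLY", "D", 84)],
]

# Every chain A single is paired with every chain D double: 3 * 2 = 6
combined = [single + double
            for single, double in product(chain_a_singles, chain_d_doubles)]
print(len(combined))
for mutant in combined:
    print(mutant)
```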

Arguments

stru: the target protein structure for mutation, represented as a Structure() object.
pattern: the pattern that defines the mutation. (See Mutant Pattern section)
chain_sync_list: a list that indicates the homo-chains in an enzyme polymer (such as a dimer). (See Input/Output section)
random_state: the int() seed for the random number generator. Default value is 100.
chain_index_mapper: a dictionary that needs to be specified when the residue indices of the chains are not aligned. (See Input/Output section)
if_check: whether to check that each mutation is valid. (This can be quite slow when the number of mutants reaches the 10^7 level.)
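
The role of random_state can be illustrated with a small seeded sampler. This is a sketch under assumptions: sample_mutants is a hypothetical helper, the standard-library random module stands in for whatever generator EnzyHTP uses internally, and positions and targets are simplified to plain lists.

```python
import random


def sample_mutants(positions, targets, n_points, n_mutants, random_state=100):
    """Draw n_mutants mutants, each with n_points distinct mutated positions."""
    rng = random.Random(random_state)  # a fixed seed gives reproducible draws
    mutants = []
    for _ in range(n_mutants):
        chosen = rng.sample(positions, n_points)
        mutants.append([(pos, rng.choice(targets)) for pos in chosen])
    return mutants


positions = [178, 179, 272, 275, 332, 340]
targets = ["GLY", "ALA", "SER"]
first = sample_mutants(positions, targets, n_points=2, n_mutants=3)
second = sample_mutants(positions, targets, n_points=2, n_mutants=3)
# The same seed yields the same set of mutants on every run
print(first == second)
```

Fixing the seed is what makes a randomly generated mutant library repeatable between runs, which matters when results need to be compared or re-generated later.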

Example Code

1. Assign mutants for a monomer protein

In this example, we assign mutations on a monomeric protein structure.

How input is prepared

stru: obtained by reading from a PDB file using PDBParser().get_structure() (See Details)
pattern: defined as pattern syntax (See Details)

from enzy_htp.structure import PDBParser
import enzy_htp.mutation.api as mapi
test_A = "test_A.pdb"
test_A_stru = PDBParser().get_structure(test_A)
test_mutation_pattern_A = (
        "GA11A, {NA176W, PA51A},"
        " {L56A, r:2[resi 254 around 3:all not self]*5}"
        )
mutants_A = mapi.assign_mutant(test_A_stru, test_mutation_pattern_A)
print(mutants_A)

2. Assign mutants for a two-chain protein

In this example, we assign mutations on a two-chain protein structure, in which A and B are homo-chains.

How input is prepared

stru: obtained by reading from a PDB file using PDBParser().get_structure() (See Details)
pattern: defined as pattern syntax (See Details)
chain_sync_list: defined according to the structure, there are two chains (A and B) (See Details)
chain_index_mapper: defined according to the structure, there are two chains (A and B) both start from the same residue index (See Details)

from enzy_htp.structure import PDBParser
import enzy_htp.mutation.api as mapi
test_A_B = "test_A_B.pdb"
test_A_B_stru = PDBParser().get_structure(test_A_B)
test_mutation_pattern_A_B = "{GA11A, NB176W, PB51A}"
mutation_pattern_A_B = mapi.assign_mutant(test_A_B_stru,
                                          test_mutation_pattern_A_B,
                                          chain_sync_list=[("A", "B")],
                                          chain_index_mapper={"A": 0, "B": 0})
print(mutation_pattern_A_B)

3. Assign mutants for a four-chain protein

In this example, we assign mutations on a four-chain protein structure, in which A and B are homo-chains, and C and D are homo-chains distinct from A and B.

How input is prepared

stru: obtained by reading from a PDB file using PDBParser().get_structure() (See Details)
pattern: defined as pattern syntax (See Details)
chain_sync_list: defined according to the structure, there are four chains (A, B, C, and D), A and B are same subunits, C and D are same subunits (See Details)
chain_index_mapper: defined according to the structure, A & B and C & D start from the same residue index (See Details)

from enzy_htp.structure import PDBParser
import enzy_htp.mutation.api as mapi
test_A_B_C_D = "test_A_B_C_D.pdb"
test_A_B_C_D_stru = PDBParser().get_structure(test_A_B_C_D)
test_mutation_pattern_A_B_C_D = "{TA391A, RC58A}"
mutation_pattern_A_B_C_D = mapi.assign_mutant(test_A_B_C_D_stru,
                                              test_mutation_pattern_A_B_C_D,
                                              chain_sync_list=[("A", "B"), ("C","D")],
                                              chain_index_mapper={"A": 0, "B": 0, "C": 0, "D": 0})
print(mutation_pattern_A_B_C_D)

Author: Xingyu Ouyang <ouyangxingyu913@gmail.com>