Post-2 Rosetta-Ligand Docking

With Rosetta compiled, we can move on to molecular docking. In this post I record the whole process of running Ligand Docking.

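1 Task description

The protein in 3r0r.pdb has enzymatic activity, but several small molecules bind to it and inhibit that activity. The inhibitory activities of these molecules are already known; the goal here is to check them against computed binding affinities.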
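One way to make "check known activity against computed affinity" concrete is a rank correlation between the measured activities and the docking scores. The sketch below is my own illustration and not part of the Rosetta workflow; the function names are hypothetical and the numbers are placeholders, not results from this post.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all values are tied")
    return cov / (var_x * var_y) ** 0.5


# Placeholder numbers for illustration only, NOT results from this post.
example_ic50_um = [0.5, 2.0, 10.0, 40.0]      # lower = stronger inhibitor
example_scores = [-14.0, -12.5, -13.0, -9.0]  # lower = better predicted binding
rho = spearman(example_ic50_um, example_scores)
```

A value of rho close to 1 would mean the docking scores rank the inhibitors in the same order as the measured activities.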

2 Results

The command that produces the final result is:

/home/gt-lck/software/rosetta.binary.ubuntu.release-371/main/source/bin/rosetta_scripts.default.linuxgccrelease @ options -nstruct 5

This command reads the options file, which in turn references the merged protein-ligand PDB file (3r0r_A_LIG.pdb), the ligand .params file that Rosetta can read, and dock.xml. The options file is:

#Pound signs indicate comments
#-in:file:s option imports the protein and ligand PDB structures
#-in:file:extra_res_fa option imports the parameters for the ligand
-in
    -file
        -s 3r0r_A_LIG.pdb
        -extra_res_fa LIG.params
#the packing options allow Rosetta to sample additional rotamers for
#protein sidechain angles chi 1 (ex1) and chi 2 (ex2)
#no_optH false tells Rosetta to optimize hydrogen placements
#flip_HNQ tells Rosetta to consider HIS,ASN,GLN hydrogen flips
#ignore_ligand_chi prevents Rosetta from adding additional ligand rotamers
-packing
    -ex1
    -ex2
    -no_optH false
    -flip_HNQ true
    -ignore_ligand_chi true
#parser:protocol locates the XML file for RosettaScripts
-parser
    -protocol dock.xml
#overwrite allows Rosetta to write over previous structures and scores
-overwrite
#Ligand docking is not yet benchmarked with the updated scoring function
#This flag restores certain parameters to previously published values
-mistakes
    -restore_pre_talaris_2013_behavior true
-nstruct 5
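Since the options file pulls in three other files, a quick pre-flight check can save a failed run. The following is a minimal sketch of my own, not a Rosetta tool: it only understands the short flag names used in the options file above (`-s`, `-extra_res_fa`, `-protocol`), and it assumes the referenced files sit in the same directory as the options file, as they do in this post. The function names are hypothetical.

```python
from pathlib import Path

# Flags from the options file above whose value is an input file name.
FILE_FLAGS = {"-s", "-extra_res_fa", "-protocol"}


def referenced_files(options_text):
    """Return the file names given to FILE_FLAGS, skipping '#' comment lines."""
    files = []
    for raw_line in options_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if parts[0] in FILE_FLAGS:
            files.extend(parts[1:])
    return files


def missing_files(options_path):
    """List files named in the options file that are absent from its directory."""
    options_path = Path(options_path)
    names = referenced_files(options_path.read_text())
    return [n for n in names if not (options_path.parent / n).exists()]


# Usage (run inside the docking directory):
#   print(missing_files("options"))   # an empty list means all inputs are present
```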

The dock.xml file in turn references native.pdb, the native protein-ligand complex. dock.xml is:

<ROSETTASCRIPTS>
<SCOREFXNS>
<ScoreFunction name="ligand_soft_rep" weights="ligand_soft_rep">
</ScoreFunction>
<ScoreFunction name="hard_rep" weights="ligand">
</ScoreFunction>
</SCOREFXNS>
<LIGAND_AREAS>
<LigandArea name="inhibitor_dock_sc" chain="X" cutoff="6.0" add_nbr_radius="true" all_atom_mode="false"/>
<LigandArea name="inhibitor_final_sc" chain="X" cutoff="6.0" add_nbr_radius="true" all_atom_mode="false"/>
<LigandArea name="inhibitor_final_bb" chain="X" cutoff="7.0" add_nbr_radius="false" all_atom_mode="true" Calpha_restraints="0.3"/>
</LIGAND_AREAS>
<INTERFACE_BUILDERS>
<InterfaceBuilder name="side_chain_for_docking" ligand_areas="inhibitor_dock_sc"/>
<InterfaceBuilder name="side_chain_for_final" ligand_areas="inhibitor_final_sc"/>
<InterfaceBuilder name="backbone" ligand_areas="inhibitor_final_bb" extension_window="3"/>
</INTERFACE_BUILDERS>
<MOVEMAP_BUILDERS>
<MoveMapBuilder name="docking" sc_interface="side_chain_for_docking" minimize_water="false"/>
<MoveMapBuilder name="final" sc_interface="side_chain_for_final" bb_interface="backbone" minimize_water="false"/>
</MOVEMAP_BUILDERS>
<SCORINGGRIDS ligand_chain="X" width="15">
<ClassicGrid grid_name="classic" weight="1.0"/>
</SCORINGGRIDS>
<MOVERS>
<Transform name="transform" chain="X" box_size="7.0" move_distance="0.2" angle="20" cycles="500" repeats="1" temperature="5"/>
<HighResDocker name="high_res_docker" cycles="6" repack_every_Nth="3" scorefxn="ligand_soft_rep" movemap_builder="docking"/>
<FinalMinimizer name="final" scorefxn="hard_rep" movemap_builder="final"/>
<InterfaceScoreCalculator name="add_scores" chains="X" scorefxn="hard_rep" native="native.pdb"/>
</MOVERS>
<PROTOCOLS>
<Add mover_name="transform"/>
<Add mover_name="high_res_docker"/>
<Add mover_name="final"/>
<Add mover_name="add_scores"/>
</PROTOCOLS>
</ROSETTASCRIPTS>
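A typo in a mover name is an easy mistake when editing this file. As a small sanity check, the sketch below uses Python's standard `xml.etree.ElementTree` to confirm that every `mover_name` listed under PROTOCOLS is declared under MOVERS. It is my own helper (the function name is hypothetical), it only covers this one consistency rule, and Rosetta's own parser remains the final authority on whether the script is valid.

```python
import xml.etree.ElementTree as ET


def undefined_protocol_movers(xml_text):
    """Return mover names used in <PROTOCOLS> but never declared in <MOVERS>."""
    root = ET.fromstring(xml_text)
    movers = root.find("MOVERS")
    protocols = root.find("PROTOCOLS")
    # Iterating an Element yields its direct children.
    declared = {m.get("name") for m in (movers if movers is not None else [])}
    used = [
        step.get("mover_name")
        for step in (protocols if protocols is not None else [])
        if step.tag == "Add" and step.get("mover_name")
    ]
    return [name for name in used if name not in declared]


# Usage:
#   with open("dock.xml") as handle:
#       print(undefined_protocol_movers(handle.read()))   # [] means consistent
```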

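Because I am docking inhibitors into the protein, I do not have a native.pdb. Instead, I took the top-ranked complex from docking one of the ligands and used it as native.pdb, so any RMSD-to-native values that InterfaceScoreCalculator reports are measured against that docked pose rather than an experimental structure.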

3 Procedure

step1: Create the task directories

In the chosen working directory, create three directories named 'docking', 'ligand_prep' and 'protein_prep':

mkdir docking
mkdir ligand_prep
mkdir protein_prep

step2: Prepare the protein (protein_prep)

Change into the protein_prep directory and place the protein PDB file (3r0r.pdb) there:

cd protein_prep
ls

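Next, use the clean_pdb.py script to strip everything from the PDB file except the coordinates of the protein we need. The 'A' argument tells the script to keep only chain A: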

/home/gt-lck/software/rosetta.binary.ubuntu.release-371/main/tools/protein_tools/scripts/clean_pdb.py 3r0r.pdb A
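To show what "keep only chain A" means at the file level, here is a simplified sketch of the core idea. It is not the actual clean_pdb.py, which does more than this; it only illustrates that the chain ID sits in column 22 of each ATOM record in the PDB format. The function name is hypothetical.

```python
def keep_chain_atoms(pdb_text, chain_id):
    """Keep only ATOM records of one chain; the chain ID is column 22 (index 21)."""
    kept = [
        line
        for line in pdb_text.splitlines()
        if line.startswith("ATOM") and len(line) > 21 and line[21] == chain_id
    ]
    return "\n".join(kept) + ("\n" if kept else "")


# Usage:
#   with open("3r0r.pdb") as handle:
#       chain_a = keep_chain_atoms(handle.read(), "A")
```

HETATM records (waters, ions, bound ligands) are dropped by this filter because they do not start with ATOM.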

Then copy the resulting 3r0r_A.pdb into the docking directory:

cp 3r0r_A.pdb ../docking

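step3: Prepare the ligand (ligand_prep)

Next, conformers need to be generated for the ligand Ligand.sdf (V2000 format). This can be done at http://carbon.structbio.vanderbilt.edu/index.php/bclconf. The result, Ligand_conformers.sdf, also goes into the ligand_prep directory (in the command further down, the conformer file is named Ligand6_conformers.sdf).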
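Before handing the conformer file to the next step, it can be worth confirming how many conformers it holds and that they are in V2000 format. The sketch below is my own check, relying only on two facts about SDF files: each record ends with a `$$$$` line, and the fourth line of each record (the counts line) ends with the molfile version. The function name is hypothetical.

```python
def sdf_summary(sdf_text):
    """Return (number of records, set of version tags found on the counts lines)."""
    count = 0
    versions = set()
    line_in_record = 0
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":  # record terminator
            count += 1
            line_in_record = 0
            continue
        if line_in_record == 3:  # counts line: 4th line of each record
            tokens = line.split()
            versions.add(tokens[-1] if tokens else "")
        line_in_record += 1
    return count, versions


# Usage:
#   with open("Ligand_conformers.sdf") as handle:
#       n_conformers, versions = sdf_summary(handle.read())
```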

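Then generate a .params file and the associated PDB conformers with Rosetta atom types. The params file is required for ligand docking because Rosetta's database has no entry for custom small molecules. The command is: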

python /home/gt-lck/software/rosetta.binary.ubuntu.release-371/main/source/scripts/python/public/molfile_to_params.py -n LIG -p LIG --conformers-in-one-file Ligand6_conformers.sdf

Then copy the generated files into the docking directory, change into it, and merge 3r0r_A.pdb and LIG.pdb into 3r0r_A_LIG.pdb:

# copy the files generated by molfile_to_params.py (their names all start with LIG)
cp LIG* ../docking/
cd ../docking/
# append the ligand coordinates to the cleaned protein chain
cat 3r0r_A.pdb LIG.pdb > 3r0r_A_LIG.pdb
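After the merge it is worth confirming that the combined file really contains both the protein chain and the ligand chain, since dock.xml addresses the ligand as chain X. The sketch below is my own check with a hypothetical function name; it assumes the ligand PDB was written with chain ID X, which I believe is the molfile_to_params.py default.

```python
from collections import Counter


def atoms_per_chain(pdb_text):
    """Count ATOM/HETATM records per chain ID (column 22 of each record)."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            counts[line[21]] += 1
    return counts


# Usage:
#   with open("3r0r_A_LIG.pdb") as handle:
#       counts = atoms_per_chain(handle.read())
#   # expect non-zero counts for both "A" (protein) and "X" (ligand)
```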

step4: Final calculation

Copy options, dock.xml and native.pdb into the docking directory. The complete contents of the docking directory are shown in the figure:

[Figure: listing of the docking directory]

Finally, run the calculation command to produce the result file score.sc:

/home/gt-lck/software/rosetta.binary.ubuntu.release-371/main/source/bin/rosetta_scripts.default.linuxgccrelease @ options -nstruct 5
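To pick the best of the generated models from score.sc, the file can be read as a whitespace-separated table: lines start with `SCORE:`, and the first such line holds the column names. The sketch below is a minimal reader of my own with hypothetical function names. It assumes the interface energy column is called `interface_delta_X`, the name I would expect InterfaceScoreCalculator to write for chain X; check the header of your own score.sc before relying on it.

```python
def read_score_file(text):
    """Parse Rosetta score-file text into a list of {column: value} dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        if not line.startswith("SCORE:"):
            continue  # skips the leading "SEQUENCE:" line
        fields = line.split()[1:]
        if header is None:
            header = fields
            continue
        if fields == header or len(fields) != len(header):
            continue  # repeated header or truncated line
        rows.append(dict(zip(header, fields)))
    return rows


def best_by(rows, column):
    """Return the row with the lowest numeric value in the given column."""
    return min(rows, key=lambda row: float(row[column]))


# Usage:
#   with open("score.sc") as handle:
#       rows = read_score_file(handle.read())
#   best = best_by(rows, "interface_delta_X")
#   print(best["description"], best["interface_delta_X"])
```

Lower (more negative) interface energies indicate a better predicted binding pose, which is why `best_by` takes the minimum.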
