| Dynamics of the WPD Loop of the Yersinia Protein Tyrosine Phosphatase Biophysical Journal, Volume 91, Issue 3, 1 August 2006, Pages 948-956 Xin Hu and C. Erec Stebbins Abstract The bacterial protein tyrosine phosphatase YopH is an essential virulence determinant in spp., causing gastrointestinal diseases and the plague. Like eukaryotic PTPases, YopH catalyzes the hydrolysis of the phosphate moiety of phosphotyrosine within a highly conserved binding pocket, which is also characterized by the closure of the so-called “WPD loop” upon ligand binding. In this study, we investigate the conformational changes and dynamics of the WPD loop by molecular dynamics simulations. Consistent with experimental observations, our simulations show that the WPD loop of YopH is intrinsically flexible and fluctuates between the open and closed conformation with a frequency of ∼4ns for the apo, native protein. The region of helix 4 spanning loop 384-392, which has been revealed experimentally as a second substrate-binding site in YopH, is found to be highly associated with the WPD loop, stabilizing it in the closed, active conformation, and providing a structural basis for the cooperation of the second-substrate binding site in substrate recognition. Loop L4 (residues 323–327) is shown to be involved in a parallel, correlated motion mode with the WPD loop that contributes the stabilization of a more extended open conformation. In addition, we have simulated the loop reopening in the ligand-bound protein complex by applying the locally enhanced sampling method. Finally, the dynamic behavior of the WPD loop for the C403S mutant differs from the wild-type YopH remarkably. These results shed light on the role of the WPD loop in PTPase-mediated catalysis, and are useful in structure-based design for novel, selective YopH inhibitors as antibacterial drugs. Abstract | Full Text | PDF (382 kb) |
| Modulation of Host Signaling by a Bacterial Mimic Molecular Cell, Volume 6, Issue 6, 1 December 2000, Pages 1449-1460 C.Erec Stebbins and Jorge E Galán Summary spp. utilize a specialized protein secretion system to deliver a battery of effector proteins into host cells. Several of these effectors stimulate Cdc42- and Rac1-dependent cytoskeletal changes that promote bacterial internalization. These potentially cytotoxic alterations are rapidly reversed by the effector SptP, a tyrosine phosphatase and GTPase activating protein (GAP) that targets Cdc42 and Rac1. The 2.3 Å resolution crystal structure of an SptP–Rac1 transition state complex reveals an unusual GAP architecture that mimics host functional homologs. The phosphatase domain possesses a conserved active site but distinct surface properties. Binding to Rac1 induces a dramatic stabilization in SptP of a four-helix bundle that makes extensive contacts with the Switch I and Switch II regions of the GTPase. Summary | Full Text | PDF (1782 kb) |
| Yop effectors of Yersinia spp. and actin rearrangements Trends in Microbiology, Volume 8, Issue 5, 1 May 2000, Pages 205-208 James B Bliska Full Text | PDF (145 kb) |
Copyright © 2007 The Biophysical Society. All rights reserved.
Biophysical Journal, Volume 93, Issue 12, 4141-4150, 15 December 2007
doi:10.1529/biophysj.107.113860
Biophysical Theory and Modeling
Zunnan Huang and Chung F. Wong
, 
Address reprint requests to Chung F. Wong.Molecular docking has become a popular research area partly because of its relevance to computer-aided drug design. Earlier methods focused on docking rigid ligands to rigid receptors. With the rapid advance of high-performance computing technology and the development of new docking algorithms, it has become increasingly feasible to treat both the ligands and their receptors as flexible molecules during docking. Although most methods still do not account for the coupling between ligand and receptor motion (e.g., docking ligands to an ensemble of structures that were generated in the absence of the ligand), there were exceptions. For example, Mangoni et al. 1 directly docked a flexible ligand to a flexible protein during a molecular dynamics (MD) simulation in which the translational motion of the center of mass of the ligand was assigned a high temperature. On the other hand, Nakajima et al. 2 employed the multicanonical MD simulation method to dock a flexible peptide into the SH3 domain of Abl tyrosine kinase. These simulations, however, are still very computationally expensive. If one also wants to gain insights into the docking pathways in addition to obtaining correct docking poses, the simulations can be even more costly.
There are methods that can facilitate the simulation of ligand loading into or unloading from a protein. For example, the steered 3 and biased 4 MD approaches apply biases to steer a ligand to enter or release from a protein. However, these simulations are still expensive to do if one wants to carry out many runs to identify representative pathways and to estimate activation barriers and to perform long runs to minimize the artificial effects of the applied biases. It is therefore useful to explore alternative methods that utilize different approximations.
In this work, we took advantage of the well-known complicated energy landscape of large protein-ligand systems 5. The energy landscape is not smooth but contains many local minima. If one can mine the global minimum along with many of these local minima and utilize a method to properly connect them to form pathways, useful insights into the molecular mechanisms of docking can be gained.
In this study, we used simulated annealing 6 as a tool to help mine energy minima. We achieved this by running many short simulated annealing cycles instead of running one or only a few slow simulated annealing simulations. Each cycle only lasted for 8ps. The temperature was near 0K at the end of each cycle so that the resulting structure was near a local or global energy minimum. Many cycles generated many local minima. We could then connect these minima by performing clustering and similarity analysis to form pathways.
In this work, we tested this idea by applying it to study the docking of p-nitrocatechol sulfate (pNCS) to Yersinia protein tyrosine phosphatase YopH. This bacterial protein tyrosine phosphatase is responsible for causing human diseases ranging from gastrointestinal syndromes to bubonic plague 7,8,9. pNCS, whose structure is shown in Fig. 1, is a specific inhibitor of YopH and displays >10× selectivity toward YopH over a panel of mammalian protein tyrosine phosphatases 10. In addition to demonstrating that our method can identify the correct docking structure, we also show that it can generate useful insights into the docking pathways. Crude estimates of the activation barriers for ligand entry and release have also been estimated. In addition to elucidating the molecular mechanisms of docking, information gained from such a study might also be useful for drug design as it provides a semiquantitative means to compare the relative ease of different ligands to enter their target protein and the relative durations in which different drug candidates would stay in the binding pocket once they get there.
One way to mine many energy minima is to perform many short simulated annealing cycles in an MD simulation. We performed such a simulation as follows:
We continued the above loop until a prescribed simulation length was reached. Each short simulated annealing cycle covered the following temperatures: 500K, 250K, 125K, 62.5K, 31.2K, 15.6K, 7.8K, and 3.9K. The structure at the end of the 3.9K run was near a local energy minimum. By performing many short simulated annealing cycles, one could identify many minima. The ones with the lowest energy corresponded to the correct docking pose, whereas the higher energy minima helped to construct approximate docking pathways.
We performed the simulated annealing cycling simulation by modifying the MMTSB toolkit 12 and by using the CHARMM package 13. Usually, each run contained four independent trajectories. We started these trajectories from the same structure and temperature but with different random number seed to initiate the atomic velocities. Different runs could also start from different initial structures.
We took the initial coordinates of YopH from the Protein Data Bank (PDB), PDB id 1PA9. This structure contained a bound pNCS ligand 10. We constructed the coordinates of four missing residues (Ser-389, Ala-390, Val-391, and Ala-392) by using their coordinates form another structure (PDB id 1YTW) 14 after superposing them by Visual Molecular Dynamics 15. The four hydrogens of the pNCS ligand were added by Open Babel (http://openbabel.sourceforge.net), and the partial charges of the ligand atoms were obtained by performing Hartree-Fock calculations using the 6-31G* basis set, the Gaussian03 package 16, and the Merz-Kollman scheme 17 for deriving atomic partial charges from electrostatic potential. We started the simulated annealing cycling simulations by putting the ligand pNCS at or near the surface of the protein. At each location, we used two ligand orientations that were flipped relative to each other. Thus, four different ligand structures were used to start the simulations.
Before each MD simulation, we performed 500 steps of steepest descent minimization of the pNCS-YopH complex to remove bad contacts. During the energy minimization, the α-carbons were held fixed. For each starting position and orientation of the ligand, we carried out five runs, each containing four simulated annealing cycling simulations. Therefore, we had 20 trajectories for each starting structure and 80 trajectories total for four different starting locations/orientations of the ligand. Each trajectory lasted 1ns. Therefore, the aggregate simulation time was 80ns for the whole set of ligand-docking simulations. We used a time step of 1 fs in these simulations, and we collected structures every 1ps for analysis.
We conducted the MD simulations using the CHARMM param27 force field 18. In the docking simulations, we used a simple but inexpensive distance-dependent dielectric model with ɛ(r)=4r where r is the distance between two atoms. However, we also rescored the docking poses by the more sophisticated implicit solvent generalized Born molecular volume (GBMV) model 19,20,21. During the simulations, we used a nonbonded cutoff distance of 14Å, a switching function for the electrostatic interactions that began at 10Å and ended at 12Å, and a shifting function for the Lennard-Jones potential. In rescoring the docking pose with the GBMV model, we used the GBMV1 parameters of Chocholoušová and Feig 21. The corresponding cutoff distances were 12Å, 8Å, and 10Å, respectively.
To examine whether the ligands were docked correctly, we used two measures. One was the distance (Rpath) between the center of the ligand pNCS and that of nine residues (Phe-229, Ile-232, Asp-356, Gln-357, Arg-404, Ala-405, Val-407, Arg-409, and Gln-446) around the binding pocket. For the four starting structures, this distance was 6.7Å, 7.6Å, 11.4Å, or 12.5Å. We also used the distance Rpath to monitor the docking of the ligand to the binding pocket of the protein and as one degree of freedom to define the reaction coordinate in the construction of docking pathways. The second measure was the root mean-square deviation (RMSD) between the docking structure and the crystal structure of the ligand, after superposition of the coordinates of the α-carbons of the protein.
We allowed the protein to move in the docking simulations but with an appropriate constraint to prevent the protein conformation from deviating too far away from the crystal structure. This was because we used a relatively simple distance-dependent dielectric model for the protein and current force field models are still approximate; we did not want to generate conformations that were too far away from reality. We achieved this by applying the restraint 1000kcal/mol/A2×D2 where D was the RMSD of a dynamics snapshot from the crystal structure (an option in CHARMM). Only the α-carbons were used in calculating the RMSD so that the side chains and other backbone atoms were completely free to move. With this constraint, individual residues could still experience large structural variations, as much as ∼7Å from the crystal structure, during the docking simulations.
On a dual core dual processor cluster node with 2.8GHz Intel Xeon EM64T processors, it took ∼104h for each simulated annealing cycling run containing four trajectories.
We constructed docking pathways by clustering similar structures and connecting representative structures between clusters. First we classified the structures near local minima (structures below 10K) according to Rpath, using a bin size of 0.2Å. Structures within each bin were then clustered using a self-organizing neural net approach 12,22,23,24 that has been incorporated into CHARMM and accessible from the MMTSB toolkit. This algorithm optimizes cluster assignment by minimizing the distance between members and their centroid structure within each cluster and by requiring this distance to be within a user-predefined cluster radius. One does not need to specify the number of clusters, as the algorithm determines the optimal number that satisfies the above criteria. In this work, the distance between two ligand structures was measured by the RMSD between their Cartesian coordinates after their protein structures were superimposed with the crystal structure. We set the cluster radius to be 3Å.
After clustering, we extracted the centroid structures from each bin. We then connected centroid structures between bins as described below in the section Ligand-docking pathways to deduce docking pathways. The structures closest to their corresponding centroid structures were used to visualize the structural change along the docking pathways. We also constructed an energy profile along the reaction coordinate by using the averaged binding energy of all cluster members at a given value of the reaction coordinate.
All four sets of simulations (each containing five runs with four trajectories each) starting from one of the four initial ligand structures described above had some trajectories that identified the correct docking pose. The best docking structures for the simulations starting from four different initial structures had RMSDs of 0.19Å, 0.75Å, 0.24Å, and 0.73Å from the crystal structure. Table 1 shows the trajectories that reached the binding pocket. Here, we considered the ligand to be reaching the binding pocket if Rpath<2Å. If it docked correctly, it also satisfied the condition that RMSD<2Å. From the table, one can see that 20 out of 80 trajectories correctly located the experimental docking structure. Forty-seven trajectories reached the binding pocket. We further used these 47 trajectories to construct docking pathways as discussed in the section Ligand-docking pathways below.
| Table 1 Number of trajectories reaching the binding pocket |
| Initial placement of ligand | |||||||
|---|---|---|---|---|---|---|---|
| Trajectories | Near surface 6.7Å* | Near surface, flipped 7.6Å | At surface 11.4Å | At surface, flipped 12.5Å | All 80 trajectories | ||
| reaching binding pocket | 14 | 17 | 7 | 9 | 47 | ||
| Docked correctly | 10 | 4 | 3 | 3 | 20 | ||
| Twenty trajectories were run for each initial placement of the ligand. The total number of trajectories was 80. Here we considered the ligand to reach the binding pocket when Rpath, <2Å. It docked correctly if its RMSD from the crystal tructure was,<2Å. |
| * Distance Rpath described in the text. |
To examine whether one could pick out the correctly docked structure based on energy alone, we plotted thebinding energy versus Rpath or RMSD (Fig. 2). As previous work has shown, including the energy of the protein would introduce data that were too noisy for such an analysis 25. Therefore, we defined the binding energy as the total energy of the complex minus that of the isolated protein. This was done for both the ɛ(r)=4r and the GBMV 21 models. In the figure, we only included data points for structures near local minima, i.e., structures obtained below 10K. From the RMSD plots, one can clearly see that the structures having the lowest energies were correct docking structures for both energy models. The more sophisticated GBMV model gave somewhat better results. The lowest energy structure in the ɛ(r)=4r model had an RMSD of 1.66Å from the x-ray structure and the corresponding structure for the GBMV model had an RMSD of 1.34Å. This gave us some confidence that the energy models used here were adequate to yield semiquantitative insights into protein-ligand docking for this system.
To construct the docking pathways, we first took 11,750 structures sampled below 10K from the 47 trajectories mentioned before. These structures represented those near global or local energy minima. All the structures were superimposed with the x-ray structure using the α-carbons of the protein. We then calculated Rpath and grouped the structures into bins with a width of 0.2Å. The averaged binding energy for structures within each bin was then calculated and plotted against Rpath (Fig. 3). This resembles a potential-of-mean-force calculation; it is much more approximate but significantly cheaper to obtain to give preliminary insights into the energy profile along the reaction coordinate defined by the single variable Rpath. From Fig. 3, one can see that the energy barrier for ligand entry was only ∼4kcal/mol whereas the energy barrier for ligand release was significantly larger at ∼23kcal/mol.
Using only one variable, Rpath, to define the reaction coordinate does not yield much insight into the structural adjustment of the ligand that was required for successful docking. To gain further insight, we used all the degrees of freedom of the ligand as well. To obtain the reaction coordinate in terms of all these degrees of freedom, we performed cluster analysis on structures within bins and then connected representative structures (chosen to be the centroid structures) in each bin with its adjacent bins. In this analysis, we allowed for the possibility of multiple pathways. For example, if the structures in some of the bins were better grouped into several clusters rather than one, multiple pathways were feasible at these locations. To determine which centroid structure in one bin should be connected to which centroid structure in adjacent bins, we performed similarity analysis. The similarity measure was simply the RMSD between centroid structures in adjacent bins. Table 2 gives two examples of such an analysis. From the first example, one can see that clusters 1, 2, and 3 in the 1.1Å bin were connected to clusters 1, 2, and 3 in the 1.3Å bin, respectively. In the second example, cluster 1 in the 9.3Å bin was connected to cluster 5 in the 9.5Å bin. On the other hand, it was sometimes difficult to make connections only between adjacent groups. In such cases, we also considered connections with next-next and next-next-next-nearest neighbors.
| Table 2 Examples of connecting clusters to form pathways |
| First example (full data) | Second example (partial data) | ||||
|---|---|---|---|---|---|
| 1.1Å bin | 1.3Å bin | 9.3Å bin | 9.5Å bin | ||
| Cluster 1 | 1*: 0.31† | Cluster 1 | 1: 14.71 | ||
| 2: 5.03 | 2: 4.08 | ||||
| 3: 4.40 | 3: 11.07 | ||||
| Cluster 2 | 1: 5.13 | 4: 7.03 | |||
| 2: 0.47 | 5: 1.12 | ||||
| 3: 2.90 | 6: 9.40 | ||||
| Cluster 3 | 1: 4.40 | 7: 13.35 | |||
| 2: 2.84 | 8: 8.86 | ||||
| 3: 0.38 | 9: 14.47 | ||||
| Bold indicates that this cluster was connected to the cluster on its adjacent bin on the left to form a portion of a pathway. |
| * Denotes cluster 1 in the 1.3Å bin. † Indicates the RMSD between the centroid structure of cluster 1 in the 1.1Å bin and that in the 1.3Å bin was 0.31Å. |
The following summarize the criteria used to construct the entire connections:
Fig. 4 shows several major pathways derived from this analysis. The small Rpath regions demonstrated that there were three pathways leading to the binding pocket. One pathway resulted from the merging of two pathways at large values of Rpath. However, only the pathway colored blue led to the correct docking pose. The cluster containing this structure in the 0.7Å bin had its centroid structure only 0.67Å away from the crystal structure. The other two pathways yielded two incorrect docking structures. Note that the pathways could not cross each other except at the point colored red or between points connected by red lines. Thus, for the incorrectly docked structures to reach the correctly docked structures, they needed to move out to a larger value of Rpath, at which it could switch to the pathway leading to the correct docking pose. For example, the incorrect docking pose in the 0.7Å bin for the pathway colored yellow could move out to 4.7Å, followed the red line to cross to the pathway labeled blue so that it could travel to the correct docking structure. From Fig. 4, one can also see that the pathways labeled green and cyan led to the same incorrectly docked structure because they converged into one pathway when Rpath was below ∼1.7Å. This incorrect docking structure had a much more unfavorable binding energy (−0.59kcal/mol) in comparison to the correct docking pose (−11.27kcal/mol). The pathway labeled yellow led to another incorrectly docked structure. However, this structure had a much more favorable binding energy, only a little more unfavorable (−9.75kcal/mol) than that of the correctly docked structure.
Beginning from the smallest value of Rpath, the correct docking pathway did not separate into branches until it reached 3.9Å. The inset of the figure showed these branches more clearly after removing the other pathways from the figure. By following the structural change along these pathways, we knew that the branch line colored blue was associated with the ligand initially placed near the surface; the black line was analogous except with the ligand flipped. The yellow line was associated with the two structures of the ligand, flipped relative to each other, that were initially placed at the surface of the protein. The purple line was a pathway resulting from the failure of the ligand to enter the binding pocket and drifted away from the protein surface. From these pathways, one can see that there are multiple pathways leading from the surface of the protein to the entrance of the binding pocket. On the other hand, there was only one pathway that led to successful docking after the ligand entered the pocket.
Fig. 5 demonstrates the structural change associated with the docking pathways. Each structure was a representative structure of a cluster that was closest to the centroid of the cluster. Four major pathways leading from the binding pocket to the surface of the protein are shown (Rpath from 0.7Å to 6.3Å). Each pathway is shown as a column in the figure. The structures at the smallest values of Rpath illustrate the correct docking pose and the two incorrect ones described above. Pathway I (column 1) led to the correct docking pose, and the energy profile was shown by the blue line in Fig. 3. Pathways II, III, and IV (columns 2, 3, and 4 in the figure) correspond to the yellow, cyan, and green lines in Fig. 4, respectively. The incorrectly docking structure in Pathway II was closer to the correct docking structure than that from Pathway III (and IV). Its aromatic ring and sulfate group were located almost the same way as the correct docking structure, except that the ring was flipped such that its nitro and hydroxyl substituents were in opposite sides of the ring in comparison to the correct docking structure. As shown in the energy plot in Fig. 4, these two docking poses had very similar energy, probably because the sulfate group provided the dominant contributions to binding with relatively minor assistance from the nitro and hydroxyl groups. On the other hand, the incorrectly docked structure from Pathway III was very different from those of Pathways I and II. The nitro group now occupied the same location as the sulfate group adopted by the above two structures. From the energy plot of Fig. 4, this structure bound much less favorably to the protein than the other two structures. This is probably because the nitro group did not interact as favorably as the sulfate group with this portion of the protein.
The strong interactions between the sulfate group and the protein also appeared to play an important role in determining how the ligand enters or leaves the binding pocket. By analyzing these pathways, the sulfate group was always the last part of pNCS to leave the pocket during unbinding. On the other hand, the sulfate group always entered the binding pocket first during binding. Two of our starting ligand structures at or near the protein surface were intentionally flipped such that the sulfate group pointed away from the binding pocket. But our simulation showed that the ligand had to flip around first before it could enter the binding pocket with the sulfate group heading the way.
Fig. 5 also shows the side-chain movement of nine residues (Phe-229, Ile-232, Asp-356, Gln-357, Arg-404, Ala-405, Val-407, Arg-409, and Gln-446) lying in or near the binding pocket. These residues were all involved in interacting with the ligand at one point or another. Table 3 shows the short-ranged interactions between the ligand and the protein along the correct docking pathways (from yellow to blue in the inset of Fig. 4). One can see that the sulfate group was interacting with the protein all the way from the surface of the protein to the binding pocket. This is consistent with our earlier discussions that the sulfate group led the ligand in docking into the protein. However, the residues that it interacted with changed as it moved into the binding pocket. At 6.7Å, it interacted with Arg-230 and Gln-446. At 5.5Å, Arg-404 took the place of Arg-230 although Gln-446 was still involved. At 4.3Å, Gln-357 also entered the picture. However, at 3.5Å, only Arg-404 maintained contact with the sulfate. At 2.7Å, Gln-446 and Gln-357 regained their interactions and Cys-403 and Ala-405 also became involved. The ligand had the largest number of short-ranged interactions at 0.7Å, where the ligand reached the binding pocket. Arg-409, in particular, formed many interactions with the sulfate group. The backbone of Gly-406, Val-407, and Gly-408 also participated in binding the ligand. The fact that the ligand formed many more interactions when it was at the binding pocket (Rpath=0.7Å) than when it was along the docking pathway and that most of the interactions were formed with the sulfate group again reinforces the notion that the sulfate group provided the dominant binding and steering force. The only interaction that did not involve the sulfate group was the one between the Arg-404 side chain and the hydroxyl group of pNCS. The hydroxyl group might also play a minor role in leading the ligand to the docking site. Table 3 shows that it interacted with Asp-356 and Arg-404 at 2.7Å, also with Gln-357 at 3.5Å, and with Gln-357 and Gln-446 only at 5.5Å and 6.7Å. On the other hand, the nitro group only participated when Rpath=4.3Å and did not have any short-range contact with the protein at shorter distances. Although Arg-409 formed tight interactions with pNCS in the binding pocket, it was not involved in the ligand-docking pathway. Arg-404, on the other hand, started interacting with the ligand at the surface and led it all the way to the binding pocket. The large movement of some residues observed during docking suggested that protein flexibility played an important role in facilitating docking. The flexibility of Ile-223 might also aid binding by moving away from the binding pocket to open up space for the ligand.
| Table 3 Interactions between pNCS and YopH along correct docking pathways (blue line or Pathway I and yellow line or Pathway II in the inset of Fig. 4) |
| 0.7Å (I) | 2.7Å (I) | 3.5Å (I) | 4.3Å (I) | 5.5Å (II) | 6.7Å (II) | ||
|---|---|---|---|---|---|---|---|
| C403-SG:OS2;3.51* | C403-SG:OS4;3.44 | R404-NH2:OS1;3.87 | R404-NH2:OS3;3.47 | R404-NH1:OS2;2.87 | R230-NH2:OS2;3.38 | ||
| C403-SG:OS3;3.62 | Q446-NE2:OS2;2.93 | R404-NH2:OS3;2.97 | R404-NE:OS3;2.86 | R404-NH2:OS2;3.39 | R230-NH2:OS3;3.20 | ||
| C403-SG:OS4;3.32 | Q446-OE1:OS2:3.78 | R404-NE:OS4;2.82 | R404-NH2:OS2;2.91 | Q446-NE2:OS1;3.15 | Q446-NE2:OS1;3.62 | ||
| R409-NH2:OS2;2.98 | D356-OD2:OS4:3.53 | R404-NH2:OS4;3.82 | Q357-NE2:OS2;3.70 | Q446-OE1:OS1;3.97 | Q446-NE2:OH;3.85 | ||
| R409-NE:OS2;3.82 | D356-OD1:OS4;3.61 | R404-NH2:OH;3.31 | Q357-NE2:OS4;3.12 | Q357-OE1:OH;3.97 | Q357-OE1:OH;3.8 | ||
| R409-NE:OS4;3.05 | D356-OD2:OS1;3.25 | D356-OD1:OH;3.01 | Q446-NE2:OS4;2.87 | Q357-NE2:OH;3.40 | R205-NH1:ON2;3.37 | ||
| R409-NH2:OS4;3.84 | D356-OD1:OS3;2.95 | Q357-NE2:OH;3.81 | Q446-OE1:OS4;3.36 | R205-NH2:ON2;2.98 | R205-NH1:ON1;3.50 | ||
| R409-NH:OS4;3.16 | D356-OD2:OS3;3.26 | D356-OD1:OS2;3.46 | R205-NH2:ON1;3.46 | ||||
| R404-NH1:OH;3.33 | D356-OD1:OH;3.92 | D356-OD1:OS3;3.29 | |||||
| R404-NH:OS2;3.40 | D356-OD2:OH;2.71 | Q446-NE2:ON2;3.41 | |||||
| A405-NH:OS2;3.06 | R404-NH1:OH;3.14 | A405-NH:OS3;3.95 | |||||
| V407-NH:OS3;3.34 | A405-NH:OS1;3.84 | ||||||
| G406-NH:OS3;3.51 | |||||||
| G408-NH:OS3;3.03 | |||||||
| G408-NH:OS4;3.69 | |||||||
| Interactions between two groups within 4.0Å are shown. OS2, OS3, and OS4 are the nonbridging oxygens of the sulphate group of pNCS. OS1 is the bridging oxygen. OH is the hydroxyl group of pNCS. ON1 and ON2 are the oxygens of the nitro group of pNCS. Residue and atom names are those used in CHARMM. |
| * Distances are in Å. |
The amino acids that interacted with the sulfate and the nitro group in the correct and incorrect docking structures adopted a similar conformation in the binding pocket. On the other hand, the residues that interacted with the other side of the ligand had a somewhat different conformation for the three different docking poses, consistent with the proposition that this part of the protein did not contribute as much to binding the ligand. Therefore, this portion of the protein might react passively according to whichever portions of the ligand were presented to them after the major interactions on the other side of the ligand were established.
Fig. 6 shows that the structure of the protein surrounding the predicted docking pose agreed well with the crystal structure. The side chains of 15 residues (Phe-229, Ile-232, Gln-290, Trp-354, Pro-355, Asp-356, Gln-357, Cys-403, Arg-404, Ala-405, Gly-406, Val-407, Gly-408, Arg-409, and Gln-446) are shown. They all agreed well with the crystal structure with the exception of three residues: Arg-404, Asp-356, and Gln-446. One reason that our model did not predict the structure of these three residues as well might be the involvement of water molecules in mediating the interactions between the ligand and the protein, as observed in the crystal structure 10. Because we used an implicit solvent model, such effects could not be accounted for.
The hydrogen bonding and polar interactions between the ligand and the protein for the crystal and predicted docking structure are shown in Figure 7AB, respectively, for comparison. It is evident that most of the interactions appearing in the crystal structure were also present in the docking structure. The strongest interactions presented by the sulfate group with the P-loop and Arg-409 of the protein were well described by the predicted docking structure. The bidentate hydrogen bonds between Arg-409 and Glu-290 were also tightly formed in the docking structure as in the crystal structure. On the other hand, the tight interactions between Arg-404, Asp-356, or Gln-446 with the ligand were lost in the predicted structure, as indicated by the increased distances. These interactions included the hydrogen bonds between Nɛ of Gln-446 and the nitro oxygen of pNCS, between Nɛ of Gln-357 and the phenolic oxygen of pNCS, and the polar interactions between the carboxyl oxygen of Asp-356 and the hydroxyl oxygen of pNCS. These distances increased by at least 1Å from the corresponding distances in the crystal structure. The tighter interactions in the crystal structure were maintained by two bound water molecules—Wat74 and Wat185—through water-mediated hydrogen bonds (Figure 7A). These water-mediated interactions could not be described by the implicit solvent model used in our docking simulation (Figure 7B). Because Asp-356 serves as a general acid in protein tyrosene phosphatase catalysis 26, these bound water molecules might also play critical roles in phosphotyrosine hydrolysis 27.
A mining minima approach utilizing simulated annealing cycling to explore ligand-docking pathways was tested in the docking of pNCS to the Yersinia protein tyrosine phosphatase YopH. To reduce computational costs, we found that a simple distance-dependent dielectric (ɛ(r)=4r) model was sufficiently reliable to identify the correct docking pose, as long as we applied weak RMSD constraints to the α-carbons of the protein. The structure of the protein in the binding pocket was also mostly recovered by our docking study. However, three residues in the binding pocket were not described as well by our model. This was because of the lack of explicit water molecules that could form water-mediated interactions as those observed in the crystal structure.
Clustering the structures near the energy minima and examining how they might be connected has yielded useful insights into the docking pathways. Our analysis demonstrated three different docking poses in the binding pocket; one of them was the correct docking structure and the other two were incorrect docking structures. We also observed four docking pathways leading from the surface of the protein to the correct docking pathway. The centroid structure from the cluster containing the lowest energy structure had an RMSD of only 0.67Å from the crystal structure. This structure was much better than that (1.66Å) determined by binding energy alone. Thus, using both binding energy and clustering could better identify the correct docking pose in docking simulations.
The sulfate group of pNCS appeared to play the most important role in contributing to binding and in directing the ligand toward the binding site during docking. The flexibility of the protein was also found to play an important role in facilitating ligand entry into or exit from the binding pocket.
We thank the University of Missouri Bioinformatics Consortium and the University of Missouri-St. Louis Information Technology Services for providing computational resources.
This research was supported by a research award from the University of Missouri-St. Louis, by a research board award from the University of Missouri System, and by the National Institutes of Health.
1. (1999). Docking of flexible ligands to flexible receptors in solution by molecular dynamics simulation. Proteins 35, 153–162. CrossRef | PubMed
2. (1997). Flexible docking of a ligand peptide to a receptor protein by multicanonical molecular dynamics simulation. Chem. Phys. Lett. 278, 297–301. PubMed
3. (1997). Binding pathway of retinal to bacterio-opsin: a prediction by molecular dynamics simulations. Biophys. J. 73, 2972–2979. Abstract | | PubMed
4. (1999). Forced unfolding of fibronectin type 3 modules: an analysis by biased molecular dynamics simulations. J. Mol. Biol. 288, 441–459. CrossRef | PubMed
5. (1991). The energy landscapes and motions of proteins. Science 254, 1598–1603. PubMed
6. (1983). Optimization by simulated annealing. Science 220, 671–680. PubMed
7. (1990). Protein tyrosine phosphatase activity of an essential virulence determinant in Yersinia. Science 249, 553–556. PubMed
8. (1985). Textbook of Medicine. (Philadelphia, PA: W. B. Saunders). PubMed
9. (1994). Crystal structure of Yersinia protein tyrosine phosphatase at 2.5Å and the complex with tungstate. Nature 370, 571–575. CrossRef | PubMed
10. (2003). Crystal structure of the Yersinia protein-tyrosine phosphatase YopH complexed with a specific small molecule inhibitor. J. Biol. Chem. 278, 33392–33399. CrossRef | PubMed
11. (1992). Nosé-Hoover chains—the canonical ensemble via continuous dynamics. J. Chem. Phys. 97, 2635–2643. CrossRef | PubMed
12. (2004). MMTSB tool set: enhanced sampling and multiscale modeling methods for applications in structural biology. J. Mol. Graph. Model. 22, 377–395. CrossRef | PubMed
13. (1983). CHARMM: a program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4, 187–217. CrossRef | PubMed
14. (1996). The x-ray crystal structures of Yersinia tyrosine phosphatase with bound tungstate and nitrate. J. Biol. Chem. 271, 18780–18788. CrossRef | PubMed
15. (1996). VMD: visual molecular dynamics. J. Mol. Graph. 14, 33–38. CrossRef | PubMed
16. (2004). Gaussian 03 Revision C.02. (Wallingford, CT: Gaussian Inc). PubMed
17. (1990). Atomic charges derived from semiempirical methods. J. Comput. Chem. 11, 431–439. CrossRef | PubMed
18. (2004). Extending the treatment of backbone energetics in protein force fields: limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations. J. Comput. Chem. 25, 1400–1415. CrossRef | PubMed
19. (2002). Novel generalized Born methods. J. Chem. Phys. 116, 10606–10614. CrossRef | PubMed
20. (2003). New analytic approximation to the standard molecular volume definition and its application to generalized Born calculations. J. Comput. Chem. 24, 1348–1356. CrossRef | PubMed
21. (2006). Balancing an accurate representation of the molecular surface in generalized Born formalisms with integrator stability in molecular dynamics simulations. J. Comput. Chem. 27, 719–729. CrossRef | PubMed
22. (1987). ART 2: self-organization of stable category recognition codes for analog input patterns. Appl. Opt. 26, 4919–4930. PubMed
23. (1993). Statistical clustering techniques for the analysis of long molecular dynamics trajectories: analysis of 2.2-ns trajectories of YPGDV. Biochemistry 32, 412–420. PubMed
24. (1989). Adaptive Pattern Recognition and Neural Networks. (New York: Addison Wesley). PubMed
25. (2005). Representing receptor flexibility in ligand docking through relevant normal modes. J. Am. Chem. Soc. 127, 9632–9640. CrossRef | PubMed
26. (1994). Dissecting the catalytic mechanism of protein-tyrosin phosphatases. Proc. Natl. Acad. Sci. USA 91, 1624–1627. CrossRef | PubMed
27. (1995). A ligand-induced conformational change in the Yersinia protein tyrosine phosphatase. Protein Sci. 4, 1904–1913. PubMed