This work represents an extensive MD simulation / water-dynamics studies on a series of complexes of inhibitors (leupeptin, E-64, E-64-C, ZPACK) and plant cysteine proteases (actinidin, caricain, chymopapain, calotropin DI) of papain family to understand the various interactions, water binding mode, factors influencing it and the structural basis of differential inhibition.
The tertiary structure of the enzyme-inhibitor complexes were built by visual interactive modeling and energy minimization followed by dynamic simulation of 120 ps in water environment. DASA study with and without the inhibitor revealed the potential subsite residues involved in inhibition. Though the interaction involving main chain atoms are similar, critical inspection of the complexes reveal significant differences in the side chain interactions in S2-P2 and S3-P3 pairs due to sequence differences in the equivalent positions of respective subsites leading to differential inhibition.
The key finding of the study is a conserved site of a water molecule near oxyanion hole of the enzyme active site, which is found in all the modeled complexes and in most crystal structures of papain family either native or complexed. Conserved water molecules at the ligand binding sites of these homologous proteins suggest the structural importance of the water, which changes the conventional definition of chemical geometry of inhibitor binding domain, its shape and complimentarity. The water mediated recognition of inhibitor to enzyme subsites (Pn...H2O....Sn) of leupeptin acetyl oxygen to caricain, chymopapain and calotropinDI is an additional information and offer valuable insight to potent inhibitor design.
Recently the cysteine proteases of both the plant and animals have received a considerable attention because of their broad ranges of activities and critical role in the different intracellular / biological / pathological processes or disorders [1–7]. Uncontrolled proteolysis of certain proteases e.g. Cathepsins [8, 9], Caspases [10, 11], Cruzain [12, 13] causes several pathological disorders. Therefore the development of potential inhibitors which can modulate or can moderately control the proteolytic activity has become a challenge in drug design. For combating that challenge, some stimulating investigation on the structural aspects of the complexes on those kind of cysteine protease with different organic inhibitors or ligands e.g. E-64, E-64-C, leupeptin etc have been done by x-ray methods, which revealed the 3D-interactional information about the substrate binding chemistry and the topological requisite for substrate-mimicking inhibitors. However, due to the lacking of detail inhibitor binding information, we are interested to carry out those studies by MD-simulation methods on plant cysteine protease-inhibitor complex of papain [14, 15] superfamily.
In this regard, beside the common Sn-Pn (n = 1–3) interactional events of the inhibitor with the main chain of the active site residues in the respective complexes, the role of the water molecules can not be ignored as they are sometimes present as conserved at the ligand binding sites of homologous proteins. These structurally conserved water molecule can change or influence the shape and complimentarity of the inhibitor site, thus affect strategies for therapeutic design. So, in order to get valuable insight of the characteristics of the inhibitor, the modeled inhibitor-enzyme complex structures are analyzed in detail and compared with the crystallographic information. The model structures of the enzyme-inhibitor complexes are built for the present study using template of available x-ray structure of enzyme-inhibitor complexes from papain family. In each complex structure, either from a x-ray or model study, it is evident that binding and catalysis is a two fold mechanism and in each case the subsite (Sn-Pn) interactions  are the main features for consideration to understand their differences in specificity.
The chemical structures of the selected inhibitors E-64, E-64-C, leupeptin, ZPACK for the present study are shown in Fig. 1. The inhibitor, E-64, a trans epoxysuccinic acid attached to a modified dipeptide [(leucylamino)-4-guanidinobutane] [17, 18] is a potent irreversible (covalent type) inhibitor for cysteine proteases in general and its binding modes with papain , actinidin , caricain  and other lysozomal cysteine proteases  have been reviewed. The other analogue, E-64-C , consisting of epoxysuccinyl, leucyl, and an isoamyl group, is also being studied for its mode of different subsite binding [23–25]. The efficacy of this kind of inhibitor, makes the epoxide and its derivatives potential candidates as drug for the suppression of elevated levels of cysteine protease activity associated with certain disease states .
Leupeptin is another biologically important peptidyl aldehyde (AC-leu-leu-Arginal), a naturally occurring protease inhibitor  of microbial origin. The arginal residue at the C-terminus of leupeptin is found to be essential for its inhibitory activity which inhibits cysteine proteases as well as serine proteases . Aldehyde inhibitors such as leupeptin are promising lead compounds in the design of anticancer protease inhibitors [28, 29]. They are mostly composed of amino acids and therefore have few metabolic degradation products that are toxic .
ZPACK (N-benzoyloxycarbonyl-L-phenylalanine-L-alanylchloromethyl ketone) [17, 18], a substrate – like synthetic inhibitor , is specifically designed for cysteine proteases.
We have done model enzyme-inhibitor complexes for which no crystal structure is available with a view that, these structural study of the complexes by modeling along with the x-ray structures will afford further insight and complementary information in this area.
Results and discussion
All the inhibitors selected for the present study (E-64, E-64-C, Leupeptin, ZPACK) can form the covalent bond with active site Cys 25 Sγ atom (with distance ~1.80 Å) of each cysteine proteases. This crucial interactional step which is mainly manifested in their inhibitory activity seems to remain conserved in all of our considered thiol proteases (papain, caricain, chymopapain, actinidin, calotropin DI) belonging to papain superfamily. The study reveals that each of the inhibitor molecule forms hydrogen bonds with the S1 subsite residues of enzyme molecule almost in a same fashion (Table I) as were observed in x-ray structures. The main residues that are involved in the conserved interactions are Cys 25 Sγ, Cys 25 amide nitrogen, the side chain of Gln 19 Nε2H, His 159 Nδ1H and in some cases carbonyl oxygen of Asp 158. The other residue Gly 66 offers its carbonyl oxygen and amide nitrogen to form hydrogen bond with the backbone of inhibitor molecule in most of the cases which are shown in Table I. It is observed that all these interactions help to orient the inhibitor molecule suitably in the active site cleft for the required nucleophilic attack by the active Cys 25 Sγ atom in the complex structure . The S1-P1interactions of each of the inhibitor with the proteases papain, actinidin, caricain, chymopapain and calotropin DI are included in Table I. Both the MD simulation study and solved enzyme-inhibitor crystal structure of papain superfamily reveal, although S1-P1 interaction remains conserved / similar in all cases, however essential differences arise mainly at the S2-P2 and S3-P3 side chain interaction sites, depending on the amino acid residues constituting the respective sites of the proteases. The reaction propensities of the enzyme residues of the model enzyme-inhibitor complexes can be looked from DASA results (Fig. 6 – Fig. 9) which show distinctly the residues involved in electrostatic and hydrophobic interaction with the E-64, E-64-C, Leupeptin, ZPACK. The interaction pattern obtained from MD-simulation study indicates that the differential interaction in the S2 subsite is a dominant factor in defining the substrate specificity of cysteine proteases of papain family. This S2 subsite, which is generally hydrophobic in nature, is the best defined substrate binding site that deserved the term pocket and involves the residues both from -R and -L domain of enzyme . Table II shows the involved residues of the selected proteases in S2 subsite region. Another residue (205, papain numbering) Ser for papain, Met (211) for actinidin, Gln (205) for calotropin DI at the back of S2 pocket is found to play a significant role in deciding the subtrate specificity of these enzymes at the P2 position of an inhibitor. The residues involved in S3 binding site (Table II) are generally found to be present on the enzyme surface and the interaction seems to prevail only in the side chain region with no main chain interaction.
Probable interaction of the inhibitors with the active site residues ofActinidin, Chymopapain, Caricain and Calotropin DI
Binding mode of E-64 and E-64-C to proteases (Caricain, Chymopapain, Actinidin, Calotropin DI)
All the main chain hydrogen bonding are similar to those found in papain complexes (Table I). The P1 carbonyl oxygens of E-64 and E-64-C forms hydrogen bonds with Cys-25 N, Sγ and Gln 19 Nε2H. The P2 main chain keto O and amide NH (Table III) make two hydrogen bonds with NH and O of Gly-66 respectively in most of the cases forming one residue stretch parallel β sheet structure. Although the interaction at the S1 subsite of enzymes are similar in nature for both of the inhibitors, significant differences arise at the P2 position while making interaction to S2 pocket of the enzymes. P2 residue (Leu) of E-64 extends inside into the pocket (Fig. 10) whereas corresponding P2 residue of E-64-C is located just at the entry of the S2 hydrophobic pocket (Fig. 11) interacting weakly. This point also gets support from DASA study (Fig. 6 – Fig. 9) for E-64-C-protease complexes where the peak for one of the leading residue at the S2 pocket is either very low or absent : Gly-130 (for calotropin DI), Ile-133 (for chymopapain), Val-133 (for caricain), Ala-136 (for actinidin). Another significant feature arising from MD simulation study that also get support from DASA study, is the residues of proteases especially Asp-158 in model complexes with E-64, E-64-C show a considerable decrease of their accessibilities suggesting that the Sn (n = 1–3) subsites of the modeled complexes are perfectly shielded from the solvent phase by binding with the inhibitors. The large difference in accessibility may occur due to many interaction with the inhibitor molecule. There is a second mode of binding for E-64-C to protease where isoamylamide and leu group of E-64-C involved in hydrophobic interaction with S2 and S3 site (contrary to S3 and S2 sites)of proteases respectively, but it does not change any basic interaction characters at these sites. In both cases the attack on the active site thiol group of the cysteine proteases occur from the backside of the inhibitor resulting in opening of the epoxide ring and at the same time the formation of a covalent bond between inhibitor and enzyme [19, 20, 32, 33].
Binding mode of ZPACK and Leupeptin to proteases (Caricain, Chymopapain, Actinidin, Calotropin DI)
Both ZPACK (N-benzoyloxycarbonyl-L-phenylalanyl choromethyl ketone) and leupeptin form hydrogen bonds with the proteases in model complexes in usual fashion (Table I). But the important feature of β sheet nature of hydrogen bonding pair formed between the P2 moiety of ZPACK (phe) and Leupeptin (Leu) (Table III) with protease Gly-66 is antiparallel, different from E-64/E-64-C-protease complexes. The bulky P2 sidechain of ZPACK, which is Phe, is comfortably accommodated in the space available making hydrophobic interaction with all the residues Leu-133, Leu-157 (for chymopapain), Ala-136, Ile-160, Met-211 (for actinidin), Val-133 and Val-157 (for caricain), Gly-130, Val-154, Gln-205 (for calotropin DI). The P2 moiety (Table III) extends itself far inside the S2 pocket and thus away from carbonyl oxygen atom of Asp 158 (Fig. 12) which makes P2 backbone unable to form hydrogen bond with this residue. DASA study done on the model complexes with ZPACK shows the peak for Asp-158 for respective enzymes notably less prominent compared to other complexes. Thus it can be argued that the absence of this particular H-bond makes this protease-inhibitor complex less stable compared to other protease-inhibitor complexes. The residues involved in S2 subsite interaction for ZPACK-protease model are also same and responsible for preference of substrate splitting in leupeptin-protease complexes (Fig. 13). P2 moiety of leupeptin and E-64, despite being inside the S2 pocket, orients its backbone in such a way that they are able to form the additional H-bond (Table I) with carbonyl oxygen atom of Asp-158 giving extra stability to the respective complexes. There is another pocket on the left of the cleft called S3 subsite defined by the segment 61, 67 (papain numbering). The CBZ (benzoyloxycarbonyl) group of ZPACK simulating its P3 residue in the present complex makes aromatic-aromatic interaction with phenyl ring of S3 subsite with the shortest distance from aromatic to aromatic group > 4Å (Fig. 12), whereas the P3 moiety (Leu) of Leupeptin extends its sidechain towards the surface of S3 site interacting with its constituent residues (Fig. 13) of proteases.
Role of water in inhibitor binding to the respective proteases
Water molecules have endowed several important role in the recognition and stabilization of the interaction between the ligand and its site. Table IV(A,B,C) gives a generalized view of interaction with the water molecules in active site cleft of the proteases. Extensive MD simulation / water dynamics study done on a series of enzyme-inhibitor complexes from papain plant family demonstrates some unique features of interesting nature. Conformational and dynamic properties of the model complexes are compared among themselves and with respect to x-ray structure. Solvent networks found in the x-ray structure were reproduced by the simulation which was unbiased with respect to the crystalline hydration structure. These networks seem to play an important role in the stability of the complexes, evidence of this is found in the structure of the ligand water interaction of the active site. We have identified water molecules, by studying a series of MD simulated complexes, in the active site cleft that may be essential in binding of inhibitor in the cleft.
In all Leupeptin-protease model complexes we found P1 side chain (Arg) of leupeptin is extensively involved in water interaction network (Table IV(A,B,C)). These water interactions though less in number, are also found to be present in x-ray structure of papain-leupeptin complex . These interactions support the fact that P1 side chain of inhibitor, leupeptin, makes few interactions with S1 site to avoid steric hindrance with carbonyl group of Gly-23 of protease molecule and extend straight up out of the cleft towards solvent  making a number of interactions. Moreover the keto oxygen (O3) of ZPACK, epoxy and carbonyl oxygens of E-64, E-64-C seem to stabilize through strong and weak H-bond with water molecules within the groove of the enzymes. The amide nitrogen (N11) of ZPACK and keto oxygen (O9) of E-64, E-64-C are observed to form H-bonds with water sites thus assisting the chemical potentiality of P2 site. MD simulations also characterize some other important stereochemically potential interaction sites of inhibitors in complexation with different proteases associated with P3 site of E-64, leupeptin, ZPACK and involved in solvation network within the active site. Guanidinobutane group of Arg of E-64, keto oxygen (O13, O14) of CBZ of ZPACK and acetyl keto oxygen of leupeptin involved in water interaction at S3 site with each protease in model complexes (Table IV(A,B,C)). The water molecules easily access the surface region of S3 site of enzyme to interact with the polar group involved in P3 site of those inhibitor molecules. Solved x-ray structure of complexes from papain family also shows this solvation network of P3 site of inhibitor only for papain-E-64  and actinidin-E-64  complexes but few in number. The inhibitor bonded water molecular bridging H-bonds of leupeptin acetyl oxygen directly to protein S2 / S3 subsite residues at amide oxygen of Val-157 (in caricain) and Gly-66 (in chymopapain) and at side chain Tyr-67-OH group of calotropinDI are more conducive allowing the ligand binding by Pn...H2O...Sn type of interaction which are evidenced and are supported from our studies. Thus the hydration dynamic study on series of modeled complexes helps us to understand the role of water involved in stabilizing the inhibitor molecules in S2 / S3 subsite of proteases, where polar group involved in P3 site of inhibitor may add extra stability to enzyme-inhibitor complexes by water mediating interactions of ligand and enzyme residues involved in this segment.
The other key feature and the most important one that has emerged out from the study is the participation of a key water molecule (marked by ** in Table IV(A,B,C)) in the vicinity of oxyanion hole within the active site of proteases which seemingly play a vital role in inhibitor binding. This particular water molecule found near the region of oxyanion hole, comprising of Cys 25 NH and Gln 19 Nε2H, forms hydrogen bond with the carbonyl / carboxyl oxygen of P1 moiety of inhibitor at a distance of ~2.63 Å to 3.47 Å [O18 of E-64 and E-64-C, O3 of ZPACK and mainchainO of Arg of leupeptin] (Fig. 10 – Fig. 13). This observation when extended to the solved x-ray structure of papain family either native or complexed [15, 19–21, 33, 35, 36], we found that this particular water is also present in several of these cases (PDB codes are 1PPN, 2ACT, 2AEC, 1MEG, 1GEC, 1YAL, 1PPP, 1PE6). This important observation lead us to suggest that the water molecule present in the solved structure and model complexes may be involved in stabilizing the inhibitor molecule during nucleophilic attack of cysteine protease. And perhaps this water which behaves like a conserved water, is involved in acylation / deacylation step during catalysis by proteases in papain family. Table IV(A,B,C) highlights that for each inhibitor the interactions are not random, but rather highly specific, usually involving specific residues / regions of the ligand. These water molecules contribute directly to the stability of the complex by holding themselves in the right position through network of hydrogen bonds. These water networks are probably crucial for the stability of protein-ligand complexes and may be important for any site directed drug design strategies. Thus our observation demonstrates that incorporation of water molecules into the system improves interpretation and predictive ability of the models and are in good agreements with the x-ray structural analysis of some complexes and offers valuable insights into new characteristics of the ligands which may be exploited for design of more potent inhibitors.
Material and Method
The different inhibitors (Fig. 1) were docked to the active site cleft of the selected proteases from the papain family using chemical intuition and viable interaction strategy. The position and orientation of these inhibitors in the active site cleft was primarily based on the knowledge of x-ray structures [19–21, 30–34, 37–39] of different cysteine proteases with the inhibitors.
Each of the modeled structures was subjected to constrained energy minimization to relieve residual strain maintaining the integrity of the model. The modeling and energy minimization was done with the InsightII software package (Molecular Simulations Inc., San Diego, CA, version 95.0) on a Silicon Graphics Indigo workstation. The minimization was done with the Discover Platform of the InsightII program with a consistent valence force field (CVFF). The algorithms utilized for minimization were typically the conjugate gradient options in vacuum and were chosen to be distance dependent. The covalently linked ligand was minimized while the enzyme backbone was kept rigid using the Discover option constraint. During energy minimization, in addition to the usual energy terms (bond, angle, dihedral, improper dihedral, electrostatic and van der Waals), distance constraints were also introduced initially. Different selected inhibitors were docked into the caricain [21, 40], chymopapain , actinidin , calotropinDI  environments lead by the information of x-ray crystallographically solved complex structures of proteases from the papain family. The nonbonded interactions included van der Waal's and electrostatic forms which used the Lenard-Jones and Coulomb functions respectively. The cut off radius for nonbonded interactions was 30 Å.
Molecular dynamics (MD) simulations of the modeled enzyme-inhibitor complex structure
Each of the energy minimized structure was solvated by a 10 Å layer of water molecules using SOAK option of insight II (Molecular Simulations Inc., San Diego, CA, version 95.0). During molecular dynamic simulations and minimization, a distance dependent dielectric constant of 1.0 was used. The parameter used was the consistent valence forcefield (CVFF). Prior to simulations, the co-ordinates of all the enzyme-inhibitor atoms were fixed and solvent molecules were minimized using steepest decent method in all constrained minimizations for 1000 iterations. Finally, the conjugate gradient minimizations were performed with all atoms free to move until maximum derivative was < 0.5 Kcal/mol/Å. The minimized co-ordinates of the whole system were used as a starting point for NVT (constant volume and temperature) at 300K to generate possible stable conformation. After 20 ps equilibrium stage, the simulations were carried out using velocity verlet algorithm for another 100 ps. Only the amino acid residues and solvent within a 12 Å radius of the inhibitor was permitted to move during simulation; the remainder of protein was fixed throughout the simulation. Surrounding the sphere of moving residue and solvent was a layer of waters whose positions were fixed in order to ensure that none of the moving solvent molecules escaped from the vicinity of the active site simulation. The average structures over the last 50 ps of each kind of MD simulations were taken to find out probable interactions of the inhibitors with the active site residues (Fig 2 – Fig. 5).
All the optimized structures were finally analyzed by PROCHECK .
Solvent accessibility study
The difference accessible surface area (DASA) study of a protein with and without inhibitor reveals the potential interactional sites of the protein. In order to estimate the fit of contact between the protein and the inhibitor, the difference accessible surface area (DASA) was calculated using DASA = [ASA (enzyme) – ASA (complex)] where ASA represents the accessible surface area given by Lee and Richard  (1971). ASA (complex) and ASA (enzyme) are the accessible surface area of each residue in a protease structure with and without the inhibitor respectively. The DASA study, performed on each of the modeled complexes using Biosym (MSI, San Diego, CA, InsightII package, 95.0 version), shows distinctly the protease residues involved in electrostatic and hydrophobic interaction with the inhibitors. Difference plots of solvent accessibilities to indicate the interacting residues of different numbers of papain-family cysteine proteases are shown in Fig. (6,7,8,9).
Barrett AJ: Proteinases in Mammalian Cells and Tissues, Elsevier /North-Holland, Amsterdam, 1977.
We are thankful to the members of the Biophysics Department, Bose Institute. Prof. N. K. Sinha for his assistance and Prof. W. Saenger for providing us with Calotropin DI co-ordinate at 1.9 Å in 1992 through personal communication.
Authors and Affiliations
Biophysics Division, Bose Institute, P 1/12, C.I.T. Scheme VIIM, Calcutta, 700054, India
Suparna Bhattacharya, Sreya Ghosh, Sibani Chakraborty, Asim K Bera, Bishnu P Mukhopadhayay, Indrani Dey & Asok Banerjee
Bhattacharya, S., Ghosh, S., Chakraborty, S. et al. Insight to structural subsite recognition in plant thiol protease-inhibitor complexes : Understanding the basis of differential inhibition and the role of water.
BMC Struct Biol1, 4 (2001). https://doi.org/10.1186/1472-6807-1-4