ProFace: a server for the analysis of the physicochemical features of protein-protein interfaces
© Saha et al; licensee BioMed Central Ltd. 2006
Received: 20 February 2006
Accepted: 07 June 2006
Published: 07 June 2006
Molecular recognition is all pervasive in biology. Protein molecules are involved in enzyme regulation, immune response, signal transduction, oligomer assembly, etc. Delineation of physical and chemical features of the interface formed by protein-protein association would allow us to better understand protein interaction networks on one hand, and to design molecules that can engage a given interface and thereby control protein function on the other hand.
ProFace is a suite of programs that takes as input a file containing the atomic coordinates of a multi-chain molecule and analyzes the interface between any two or more subunits. The interface residues are shown segregated into spatial patches (if such a clustering is possible based on an input threshold distance) and/or core and rim regions. A number of physicochemical parameters defining the interface are tabulated. Among the different output files, one contains the list of interacting residues across the interface. Results can be used to infer if a particular interface belongs to a homodimeric molecule.
A web-server, ProFace (available at http://www.boseinst.ernet.in/resources/bioinfo/stag.html) has been developed for dissecting protein-protein interfaces and deriving various physicochemical parameters.
Most proteins function by interacting with other molecules; the binding sites have evolved for achieving specific interactions and avoiding undesirable associations that would be deleterious to the normal functioning of the cell. Thus the interfaces between two protein subunits provide context for understanding the principles of molecular recognition. A large volume of structural data on protein interactions, either complexes between independent polypeptide chains or oligomeric assemblies of subunits, is available in the Protein Data Bank (PDB), which has been used to generate diverse datasets of protein-protein interfaces. The physical and chemical features of the interfaces have been analyzed [3–8], and software and websites such as Protein-Protein Interaction Server, MolSurfer, SPIN-PP, etc., are available for their calculation. Nevertheless, our understanding of biomolecular interactions is not yet adequate, for example, to infer unambiguously the arrangement of the subunits in an oligomeric protein from crystallographic studies, or to ensure a high success rate in predicting models of protein-protein complexes through docking methods.
Recently, protein-protein interfaces have been dissected from new perspectives [13, 14]. It has been shown that many large interfaces are not contiguous, but built of spatially demarcated surface patches. Such segregation into patches is also indicative of the location and distribution of water molecules held in the interface. Additionally, one can divide the interface into core and rim regions using the difference in solvent accessibilities of residues; the chemical properties of the two regions are quite distinct. Interestingly, this division also mirrors the degree of conservation of interface residues in a family of homologous proteins, and this represents an important signature of protein interaction sites. Various other physicochemical parameters have also been developed [17, 18], which, in combination, can distinguish the true oligomeric state (dimer, in particular) from the lattice contacts observed in protein crystals. In this article we describe a web-server, ProFace, that dissects a given protein-protein interface and obtains various parameters to characterize it.
Implementation and results
Input file and parameters
All the protein chains should be contained in the input file in PDB format, and the user must indicate which chains (a maximum of three) constitute each of the two components forming the interface. The user also specifies how the dissected interface is to be displayed, i.e., whether the residues are shown as core and rim and/or in spatial patches. For clustering into patches a threshold distance has to be supplied. This distance should typically be half the maximum distance between any two interface atoms on a given protein chain; the latter distance is listed along with the other parameters in the output. Ideally, the number of patches should be the same on both components; if it is not, the threshold may have to be changed slightly (increased to reduce the number of patches, and vice versa). The suggested values are 15 Å for protein-protein complexes and 22 Å for homodimers, as these gave patches that were visually meaningful in the vast majority of cases.
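The threshold rule above can be illustrated with a minimal Python sketch. It computes half the maximum pairwise distance between interface atoms as a starting threshold and then groups the atoms into patches. The clustering scheme is an assumption: the text does not state which algorithm ProFace uses, so a simple average-linkage agglomerative procedure is used here for illustration, and the function names are hypothetical.

```python
import math
from itertools import combinations


def suggest_threshold(coords):
    """Half the maximum distance between any two interface atoms (in Å)."""
    longest = max(
        (math.dist(a, b) for a, b in combinations(coords, 2)), default=0.0
    )
    return 0.5 * longest


def cluster_patches(coords, threshold):
    """Group atom indices into patches.

    Assumption: average-linkage agglomerative clustering, stopped when the
    closest pair of clusters is farther apart than `threshold`. This is an
    illustrative choice, not necessarily the method used by ProFace.
    """
    clusters = [[i] for i in range(len(coords))]

    def average_distance(c1, c2):
        total = sum(math.dist(coords[i], coords[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters (naive search, fine for a sketch).
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = average_distance(clusters[a], clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters
```

In this sketch a larger threshold merges more clusters and so yields fewer patches, which matches the adjustment rule described in the text.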
Output files and parameters
There are five types of output: a) plot of interface residues with secondary structural information; b) statistics of interface parameters; c) coordinates of interface atoms and the PDB files in which the interface residues are tagged; d) list of residue contacts across interface; and e) the view of the interface atoms.
Plot of interface residues with secondary structural information
Statistics of interface parameters
Table: Interface parameters of the cAMP-dependent protein kinase complex (PDB code 1ydr). Rows: interface area (Å²); interface area/surface area; number of atoms; number of residues; fraction of non-polar atoms; non-polar interface area (Å²); fraction of fully buried atoms; residue propensity score.
Table: Statistics on the core and rim regions of the interface in the file 1ydr.
Table: Areas of individual patches in the interface of the two components in 1ydr. Rows: number of patches; number of residues in patches; patch area (Å²).
The 4-digit code used to name the output files is randomly generated and bears no correspondence to the input file name. The coordinates are stored in two types of files (with extensions .pdb and .int), and there are two files of each type, one per component. In the .pdb file the interface residues are distinguished from the remaining atoms in the structure by the content of two columns, the occupancy factor and the B-factor. Non-interface residues have a value of 0.00 in both columns. For interface residues, a) the occupancy is replaced by -5.00 (if it is a core residue) or 5.00 (if it is a rim residue); b) the B-factor is replaced by a value from 1.00 through 9.00, depending on the patch to which the residue belongs. In the .int file only the interface atoms are kept, with the occupancy and B-factor columns modified as above; patch membership is additionally indicated by appending the labels a, b, c, ... to the keyword ATOM, corresponding to patch numbers 1, 2, 3, .... Moreover, there are two additional columns giving the solvent-accessible surface areas (ASAs) of the constituent atoms in the individual component and in the complex. One can use this information to calculate the interface area contributed by individual residues and, for example, correlate it with thermodynamic data on the free energy of binding.
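As an illustration of how the tagged .pdb output might be read back, the following Python sketch extracts the core/rim label and patch number of each interface residue from the occupancy and B-factor fields. It assumes the output keeps the standard fixed-column PDB layout (occupancy in columns 55-60, B-factor in columns 61-66); the function name is hypothetical, and the handling of residues tagged in only one of the two columns is a guess rather than documented ProFace behaviour.

```python
def read_tagged_residues(pdb_lines):
    """Map (chain, residue number, residue name) -> (region, patch).

    Assumes standard fixed-column ATOM records in which ProFace has
    overwritten occupancy (-5.00 core, 5.00 rim, 0.00 non-interface)
    and B-factor (patch number 1.00-9.00, 0.00 non-interface).
    """
    tagged = {}
    for line in pdb_lines:
        # Only ATOM records long enough to hold the B-factor field.
        if not line.startswith("ATOM") or len(line) < 66:
            continue
        occupancy = float(line[54:60])  # columns 55-60
        b_factor = float(line[60:66])   # columns 61-66
        if occupancy == 0.0 and b_factor == 0.0:
            continue  # non-interface residue
        if occupancy < 0:
            region = "core"
        elif occupancy > 0:
            region = "rim"
        else:
            region = None  # assumption: core/rim display was not requested
        patch = int(round(b_factor)) if b_factor > 0 else None
        key = (line[21], int(line[22:26]), line[17:20].strip())
        tagged[key] = (region, patch)
    return tagged
```

The resulting dictionary can then be filtered, for example, to list all core residues of patch 1 on one component.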
View of the interface atoms
This can be done using either RasMol or CHIME, depending on which program has been configured by the user on the machine. Clicking on the RasMol link first lets the user download the PDB file (with interface atoms), which can then be viewed with either program. Clicking on the CHIME link loads the PDB file directly in CHIME. As the B-factor column of the PDB file has been replaced by number codes indicating the patch to which the atoms belong, the interface atoms can be colored on the basis of patches using RasMol. Also, the PDB file generated by the program can be used in GRASP to color the molecular surface according to patch or core/rim region.
ProFace can be used to dissect a protein-protein interface and derive its physicochemical parameters. The output can be used to display the interface with standard software and to understand the biological significance of the interaction.
Availability and requirements
Project name: ProFace
Project home page: http://www.boseinst.ernet.in/resources/bioinfo/stag.html
Operating system(s): Platform independent
Programming language: Java, C++
Other requirements: JRE 1.4.2.04 or higher, Chime plug-in 2.6 or higher; all of them are available for download at the above web address
Any restrictions to use by non-academics: None
We are grateful to Prof. J. Janin and Dr. F. Rodier, with whose collaboration many of the parameters discussed here were developed. The computational facility has been provided by the Department of Biotechnology, India, and the Council of Scientific and Industrial Research provided fellowships to RPS, RPB, AP and SM.
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res 2000, 28: 235–242. [http://www.rcsb.org/] doi:10.1093/nar/28.1.235
- Keskin O, Tsai CJ, Wolfson H, Nussinov R: A new, structurally nonredundant, diverse data set of protein-protein interfaces and its implications. Protein Sci 2004, 13: 1043–1055. doi:10.1110/ps.03484604
- Chothia C, Janin J: Principles of protein-protein recognition. Nature 1975, 256: 705–708. doi:10.1038/256705a0
- Argos P: An investigation of protein subunits and domain interfaces. Protein Eng 1988, 2: 101–113.
- Lawrence MC, Colman PM: Shape complementarity at protein/protein interfaces. J Mol Biol 1993, 234: 946–50. doi:10.1006/jmbi.1993.1648
- Jones S, Thornton JM: Principles of protein-protein interactions. Proc Natl Acad Sci USA 1996, 93: 13–20. [http://www.biochem.ucl.ac.uk/bsm/PP/server/] doi:10.1073/pnas.93.1.13
- Tsai CJ, Nussinov R: Hydrophobic folding units at protein-protein interfaces: implications to protein folding and to protein-protein association. Protein Sci 1997, 6: 1426–1437.
- Lo Conte L, Chothia C, Janin J: The atomic structure of protein-protein recognition sites. J Mol Biol 1999, 285: 2177–2198. doi:10.1006/jmbi.1998.2439
- Gabdoulline RR, Wade RC, Walther D: MolSurfer: a macromolecular interface navigator. Nucleic Acids Res 2003, 31: 3349–3351. [http://projects.villa-bosch.de/mcm/software/molsurfer] doi:10.1093/nar/gkg588
- Ponstingl H, Kabir T, Thornton JM: Automatic inference of protein quaternary structure from crystals. J Appl Crystallogr 2003, 36: 1116–1122. doi:10.1107/S0021889803012421
- Janin J: Assessing predictions of protein-protein interaction: the CAPRI experiment. Protein Sci 2005, 14: 278–283. doi:10.1110/ps.041081905
- Chakrabarti P, Janin J: Dissecting protein-protein recognition sites. Proteins 2002, 47: 334–343. doi:10.1002/prot.10085
- Bahadur RP, Chakrabarti P, Rodier F, Janin J: Dissecting subunit interfaces in homodimeric proteins. Proteins 2003, 53: 708–719. doi:10.1002/prot.10461
- Rodier F, Bahadur RP, Chakrabarti P, Janin J: Hydration of protein-protein interfaces. Proteins 2005, 60: 36–45. doi:10.1002/prot.20478
- Guharoy M, Chakrabarti P: Conservation and relative importance of residues across protein-protein interfaces. Proc Natl Acad Sci USA 2005, 102: 15447–15452. doi:10.1073/pnas.0505425102
- Bahadur RP, Chakrabarti P, Rodier F, Janin J: A dissection of specific and non-specific protein-protein interfaces. J Mol Biol 2004, 336: 943–955. doi:10.1016/j.jmb.2003.12.073
- Saha RP, Bahadur RP, Chakrabarti P: Interresidue contacts in proteins and protein-protein interfaces and their use in characterizing the interface. J Proteome Res 2005, 4: 1600–1609. doi:10.1021/pr050118k
- Kabsch W, Sander C: Dictionary of protein secondary structure. Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983, 22: 2577–2637. [http://www.cmbi.kun.nl/gv/dssp] doi:10.1002/bip.360221211
- Hubbard SJ: NACCESS: A program for calculating accessibilities. Department of Biochemistry and Molecular Biology, University College of London; 1992. [http://wolf.bms.umist.ac.uk/naccess/]
- Sayle RA, Milner-White EJ: RasMol: biomolecular graphics for all. Trends Biochem Sci 1995, 20: 374–375. [http://www.bernstein-plus-sons.com/software/rasmol/] doi:10.1016/S0968-0004(00)89080-5
- Nicholls A, Sharp K, Honig B: Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins 1991, 11: 281–296. [http://trantor.bioc.columbia.edu/grasp/] doi:10.1002/prot.340110407
- Engh RA, Girod A, Kinzel V, Huber R, Bossemeyer D: Crystal structures of catalytic subunit of cAMP-dependent protein kinase in complex with isoquinolinesulfonyl protein kinase inhibitors H7, H8, and H89. Structural implications for selectivity. J Biol Chem 1996, 271: 26157–26164. doi:10.1074/jbc.271.42.26157
- Wright CS: 2.2 Å resolution structure analysis of two refined N-acetylneuraminyl-lactose-wheat germ agglutinin isolectin complexes. J Mol Biol 1990, 215: 635–651.