- Research article
- Open Access
Three dimensional shape comparison of flexible proteins using the local-diameter descriptor
© Fang et al; licensee BioMed Central Ltd. 2009
- Received: 22 December 2008
- Accepted: 12 May 2009
- Published: 12 May 2009
Techniques for inferring the functions of the protein by comparing their shape similarity have been receiving a lot of attention. Proteins are functional units and their shape flexibility occupies an essential role in various biological processes. Several shape descriptors have demonstrated the capability of protein shape comparison by treating them as rigid bodies. But this may give rise to an incorrect comparison of flexible protein shapes.
We introduce an efficient approach for comparing flexible protein shapes by adapting a local diameter (LD) descriptor. The LD descriptor, developed recently to handle skeleton based shape deformations , is adapted in this work to capture the invariant properties of shape deformations caused by the motion of the protein backbone. Every sampled point on the protein surface is assigned a value measuring the diameter of the 3D shape in the neighborhood of that point. The LD descriptor is built in the form of a one dimensional histogram from the distribution of the diameter values. The histogram based shape representation reduces the shape comparison problem of the flexible protein to a simple distance calculation between 1D feature vectors. Experimental results indicate how the LD descriptor accurately treats the protein shape deformation. In addition, we use the LD descriptor for protein shape retrieval and compare it to the effectiveness of conventional shape descriptors. A sensitivity-specificity plot shows that the LD descriptor performs much better than the conventional shape descriptors in terms of consistency over a family of proteins and discernibility across families of different proteins.
Our study provides an effective technique for comparing the shape of flexible proteins. The experimental results demonstrate the insensitivity of the LD descriptor to protein shape deformation. The proposed method will be potentially useful for molecule retrieval with similar shapes and rapid structure retrieval for proteins. The demos and supplemental materials are available on https://engineering.purdue.edu/PRECISE/LDD.
- Boundary Point
- Shape Descriptor
- Shape Deformation
- Query Protein
- Flexible Protein
The importance of protein molecule shape has been recognized in many structural biological applications such as computer aided molecular design, drug discovery, and protein structure retrieval [2–4]. The similarity comparison of protein molecule shape plays a central role in understanding of protein functions [4–8] of molecular systems, and leads to an alternative way to discover the specific nature of proteins efficiently besides the similarity measure from the alignment of sequence [9, 10] and secondary structure [11–16]. The underlying assumption is that the geometrically similar molecules are likely to share similar properties. The shape based methods could be made independently from chemical structural knowledge. The attractive advantage of shape based methods make protein comparison alignment-free and therefore very efficient. Some methods show their effectiveness for comparing proteins using the their surface shape [4–7, 17], but they cannot handle the deformation of the flexible protein well because proteins are treated as rigid bodies. This intrinsic weakness leads to significant comparison errors when flexible proteins in different conformation states are compared to each other.
Our work concentrates on a shape-based method for the deformation invariant shape representation using the local diameter (LD) descriptor for flexible protein comparison. The proposed method will mainly be used in any type of molecular shape comparison (MSC) based applications [2, 3, 18, 19], for example, virtual screening of compounds for similar shapes for drug discovery, and protein structure retrieval based on shape similarity .
The importance of a comparison of the shape of 3D objects' shape is increasing in the areas of computer vision, robotics, molecular biology, and others. The shape is usually expressed by a descriptor, which is a feature vector capturing some essence of a given shape [20–22]. A shape descriptor generally carries a number of desirable properties: transformation invariant, succinctness for representation, low computation expense, expressiveness of shape, etc. The concise form of the shape descriptor is of great benefit in shape comparison. Instead of alignment and superposition, a shape descriptor allows for the efficient shape comparison since only a simple calculation of some distance between feature vectors is required .
Additionally, the shape descriptor can be stored as an index in a database of shapes to enable real time query and retrieval. Protein shape analysis has been developing as an indispensable way to overcome the challenge of reaching the goal of disclosing the unknown functions of proteins. In the next part we review several recent works related to a protein shape.
The global protein shape descriptor proposed in  introduces a method to compare and classify proteins based on their topological properties. In , spherical harmonic expansion was applied to compare the protein binding pocket and ligand. A 3D shape is decomposed via spherical harmonics, and followed by computation of the distance in the coefficient space to evaluate the similarity among proteins. Petros  applied the spherical trace transform to a protein shape to produce a rotation invariant shape descriptor. Ingolf  successfully brought the moment invariants to the comparison of protein binding sites. In [26, 27] the authors introduced a technique to generate compact descriptors of the shapes of molecules and of macromolecular receptor sites. The ray-tracing technique is used to explore the volume enclosed by a ligand molecule, or the volume exterior to the active site of a protein. The shape descriptor is expressed as a histogram of the ray segment lengths obtained from the ray tracing within the volume of the molecule. Ballester et al. developed a shape comparison method named USR for screening a huge database compound to find out the similar molecular shapes [2, 3]. Furthermore, Lee  used a 3D Zernike descriptor to represent the global surface shape. The protein surface is compactly expressed as a series expansion of three-dimensional functions. The authors used the global surface shape similarity based technique for fast protein tertiary structure retrieval. By using this simple descriptor for indexing the database, it only takes seconds to search against a few thousand proteins.
Local diameter descriptor
Proteins undergo structural changes and shape deformation upon protein-protein or protein-ligand interaction [28–30]. A particular conformation state carries out a specific function of the protein. Beginning with the native conformation, the protein usually adapts its structure to changes in its surrounding environment [29, 30]. It is of importance to represent shapes of different protein conformations in an invariant form. Invariancy is associated with the shape descriptor not being effected by the deformation of the protein shape. None of the descriptors introduced above can support deformation in shape for comparison and matching. This is because those shape descriptors change significantly with deformation as well as topological changes.
Local diameter (LD) scalar function is defined on the boundary surface of a 3D shape. LD has been recently developed for a pose-oblivious shape descriptor in . Furthermore, the authors demonstrated that the local diameter descriptor is invariant of skeleton based shape deformation in . We applied the LD descriptor to capture the invariant properties of the shape deformation due to the motion of the protein backbone. The proposed novel approach includes three steps for characterizing the protein shape. First, a certain number of points is sampled from the protein surface. Then, every sampled point is assigned a value measuring the local diameter of the 3D shape in the neighborhood of that point. Lastly, a shape descriptor in the form of a 1D vector is built from the distribution of the local diameters. The LD descriptor captures the invariant essentials of the shape deformation and is expressed in a 1D vector. By using the LD descriptor, not only could we represent the shapes of flexible proteins consistently, but also we could characterize the protein shape independent of its atom coordinates and structural details. This approach is very computationally efficient since it is free of structural alignment and superposition.
In this section, we first provide a definition of the LD descriptor and then the procedure to compute the LD descriptor to represent the protein shape. The procedure starts with the description of the approach to extract the boundary points, followed by a comparison of sampling techniques. Next we present LD descriptor calculation algorithms, and finally, we explain the similarity measurement methods for shape descriptors.
The LD descriptor is a type of statistic based descriptor. It starts with sampling boundary points, measures the shape diameters on every sample point, and generates the one-dimensional histogram. In , the local diameter is defined as the distance from one surface point to the antipodal surface point using the inward-normal direction. The authors also proved that the shape diameter is an approximation of twice the shape radius . Shape radius measures the distance from a surface point to the object medial axis (skeleton). The shape radius is demonstrated to be invariant to skeleton based shape deformation.
However, the existing approaches suffer from the complexity of computing the skeletons . Local diameter is an alternative way to simplify and approximate the shape radius without any computation of the object's skeleton. Therefore, while preserving the properties of shape radius, the definition of local diameter is invariant to rigid body transformation (rotation, translation, and uniform scale), articulated deformations, and skeleton based movements . We could simulate the protein hinge-bending motion as one type of skeleton-based movement [33–35]. Therefore, the LD descriptor is reliable for capturing the invariancy associated with the shape deformation caused by the protein backbone's motions.
Extract the boundary from the whole volumetric model and sample boundary points,
Measure the local diameter for every sample point, and
Build the LD descriptor and score the similarity between descriptors.
Detection of boundary points
The volumetric representation, popularly used in biological research fields [4, 5, 36], is used to represent protein shape in this paper. A volumetric model is composed of a set of cubic units named voxels, which is the 3D counterpart of the 2D pixel of an image. A voxel is centered at an integral grid point and assigned a numeric value representing some measurement, for example, the density [37, 38]. The volumetric model is built in three steps as follows. First, we compute the Connolly surface (triangle mesh) of the molecule using the MSROLL program . Second, we place the triangle mesh in a 3D cubic grid of n3 (e.g. n = 65) compactly. Third, each lattice point is assigned either 1 or 0; 1 for point inside the surface and 0 for outside. The inside point is denoted as object point and the outside point is denoted as background point. For each lattice point, there is a set of 26 neighbor points. An object point lies on the boundary if at least one of its 26 neighbors is a background point.
Subset of boundary points
Local diameter calculation
Estimate the normal of the boundary point,
Define a virtual cone using this point as a vertex, centered around the opposite direction of the point's normal with a proper opening angle, and shoot a certain number of rays inside the cone to the opposite side, and
Intersect a ray with the opposite boundary side and record the distance from the point to the antipodal intersection point, and
Sort the distances and use the median as the local diameter.
The normal of the boundary point is unknown in the volumetric model. We introduce an efficient technique based on visibility to estimate the normal of the boundary points . The basic idea of the normal estimation is illustrated in Figure 2(C). The visible direction for a boundary point is defined as the vector linking it to one of the outside points in its neighbor. The green arrows in Figure 2(C) denote three visible vectors of the center point. The normal of a point is measured by the vector sum of all the visible directions. In the case of Figure 2(C), the normal for the center point is the vector sum of D 1, D 2, and D 3. The calculation of the normal performs the same procedure in the 2D and 3D cases except that the number of neighbors of one point is 8 in the 2D case while it is 26 in the 3D case.
Consider a virtual cylinder around the ray shot from the cone vertices,
Detect whether an intersection has occurred between that cylinder and boundary points and collect the inclusion points, and
Cluster the inclusion points, and
The nearest cluster is picked up as the final intersection point.
Note that the reader can find the algorithm details in . We provide a 2D illustration to demonstrate the framework for the algorithm. See Figure 2(D). The protein surface is represented as a solid dot contour; the black dots represent the boundary points. In Figure 2(D), an orange ray is shot from red point P 1, and a blue virtual cylinder is around the ray. We can see there are three clusters of intersection points, highlighted by three purple circles. The number of points included in the three clusters is one, two, and two respectively. The intersection algorithm picks up the nearest cluster as the final intersection point of the ray with the opposite boundary side.
Similarity measurement of shape descriptor
The last step is to express the values of the local diameters in the form of a one-dimensional histogram vector. After obtaining the local diameter of every boundary point, we compute the distribution of those values and build the histogram of values using 128 bins to define the descriptor. In order to compare the LD descriptor with popular shape descriptors, we introduce a distance descriptor, called Euclidean distance (ED) . The ED descriptor is formed by three steps: 1) sampling points from the shape surface, 2) computing the Euclidean distance between the pairwise sampled points, and 3) computing the distribution of distance values to build the histogram given the number of bins. Note that the distribution is estimated by counting the number of observations of distance values falling into a specific range adjusted by the number of bins. The histogram built from the distribution indicates the number of distance values per unit bin.
where N bin is the number of bins of histogram, and I A and I B denote shape descriptors of the two proteins, A and B, respectively.
We have implemented the LD descriptor and assessed its performance from the experimental results. The experimental comparison of the LD descriptor and existing structure based and shape-based methods will be studied in future work while applying the LD descriptor in different applications. The algorithms presented in the paper are implemented on a Pentium D 3.2 GHz computer with 1 G RAM running Windows XP. The proteins for the experiments have been chosen from the Database of Macromolecular Movement (MolMovDB) , which categorizes conformational changes of protein and allows the user to animate and visualize a particular motion through the Morph server. This database has been used in predicting protein structures and hinge predictor [43, 44]. There are two critical parameters, the opening angle of the cone and the number of rays in the cone that have to be carefully chosen in the experiments. The importance of cone angle has been discussed in [1, 31]. A small cone angle would make the LD descriptor too sensitive to local features and would lead to poor discrimination between different objects as well. On the other hand, a large opening angle would bring extra noise and errors due to the rays intersecting to unrelated parts of the object . The number of rays indicates the sampling density for rays shooting from the vertex within the cone. The number of rays would not affect much the performance of the LD descriptor. The authors in  tested the effect of various parameter settings and found that the opening angle of 120° and 50 rays are reasonable settings. In the experiments, we followed the setting in  choosing 120° and 50 rays. Here, we are not focused on a detailed analysis of the biological significance of the proteins, but the robust performance of the shape descriptor and we analyze the possible reasons causing incorrect comparison.
Visualization of Local Diameters
To summarize this section, the LD descriptor can measure the local diameter very well. The body of the domain usually has large diameters.
Shape Deformation Invariance
Two illustrative examples of proteins (PDB: 1BPB and 2BPG) are shown in the top part of Figure 4. The two conformations have a large shape change and even the shape topology is changed due to a small contact (highlighted by an ellipse in the figure) between the top portion with the bottom domain. The LD descriptor is shown in Figure 4(C). The plot shows that the histograms for the two shapes match well, indicating that the shape descriptor is effective enough to be invariant to the complex shape deformation even with topology changes. In contrast, the ED shape descriptor method fails to represent the shape deformation invariantly as shown in Figure 4(D) since there is a large deviation between the two shape descriptors.
Database Retrieval Performance
Before searching, we pre-calculated all shape descriptors of queries in the database and stored them as index files. Therefore, every protein is transformed to a compact 1D vector (e.g. 128 numbers). With the new form of representation, proteins can be retrieved extremely fast since only the distance between 1D vectors needs to be compared. If the query protein is already transformed into a shape signature, it only takes seconds to finish a query retrieval process. Meanwhile, if the query protein is not pre-calculated, the search engine would first build the shape descriptor and search against the database.
Specificity Sensitivity Analysis
where TP, the number of true positives, is the number of proteins included in the group that are the same as the query protein correctly retrieved in the search; FN, the number of false negatives, is the number of proteins that are included in the group the same as the query protein but missed in the search; FP, the number of false positives, is the number of proteins that are included in a different group from the query protein but inaccurately retrieved in the search. Thus, the denominator in Eq. 3 is the total number of all members in a group, and the denominator in Eq. 4 is the retrieval size.
Analysis of experimental results
We found there are a few cases where the local diameter descriptor cannot express the shape deformations of the shape protein appropriately. Here we list potential reasons causing the inaccuracy. First, the volumetric data is a low resolution record of protein shape. Some details of protein surface are missed during the voxelization process, which ultimately causes the improper representation. Second, the sampling of boundary points would cause the loss of some shape information. Third, the LD descriptor deals well with the shape deformation caused by the motion of the protein backbone. However, in some cases, the hinge or loop is elongated to a large extent, which harms the performance of the descriptor. Furthermore, we found that a few proteins of highly similar shape are assigned to different groups in the Molmovdb database. Our method likely fails to distinguish the proteins from those groups. For instance, there are two groups in the Molmovdb database, which include proteins functioning as ovotransferrin and lactoferrin, respectively. Our method fails to differentiate between these two groups due to the high similarity of the protein shape.
As the size of molecular databases increase exponentially, the development of the efficient screening methods has been receiving a lot of attention. We are working on two potential applications of the LD descriptor including the molecular shape similarity search for drug discovery and protein structure retrieval based on the shape surface similarity. There are two aspects of advantages for the LD descriptor in those applications. First, the LD descriptor can handle the flexible molecular shape comparison. Second, the LD descriptor is able to search against a large database rapidly.
Molecular shape comparison based drug design
One of the foundations for the rational drug design is the use of shape similarity identified compounds are expected to be active against a given target [2, 3, 27]. The virtual screening of compounds based shape matching has been recently developed for the drug discovery process [3, 18]. The underlying assumption is that the molecular shape has been widely acknowledged as a critical factor for biological activity and geometrical resemblance of molecular is closely associated with the functional similarity [2, 3, 27]. The recent two works are reviewed as follows. The authors in  proposed their shape signature method and applied it to the Tripos fragment database and the NCI database (113,331 compounds) under two different metrics. Ballester et al. [2, 3] introduced a ultrafast shape recognition method to finish a inter-database compounds search for similar molecular shapes. The databases used in their experiments include the Vendor Database (2,433,493 commercially available compounds) and an independent benchmark from DrugBank.
The existing shape based techniques first pre-compute a shape signature of the compounds in a large database, and a list of similar shape molecules can be ranked based on some distance metrics with a given lead molecule. In future work we will apply the the LD descriptor to some large and diverse molecular databases for the drug discovery. The LD shape descriptor could provide the alternative to existing molecular shape descriptor with a potentially better performance in terms of searching by matching the 3D conformation of molecules.
Hybrid scheme for protein structure retrieval
Fast protein structural retrieval techniques are necessary to deal with the increasing number of known protein structural data. Over the past years, many structural comparison methods have been proposed to solve the structure retrieval problem [11–16]. The widely used structural alignment methods, such as DALI  and CE , have been proposed to identify the defined best alignment. The structural alignment generally produces a superposition of corresponding atom pairs and the RMSD distance is calculated as the similarity metric. A structural alignment, FATCAT , is recently proposed for the flexible structure alignment. In addition, Liu  presented a new algorithm based on least median of squares to achieve a flexible structure alignment. The reader can refer  for comprehensive review of protein structure alignment methods. Generally, before comparing a pair of structures, an optimal sequence alignment, which has been shown to be NP-hard problem , should be first conducted to provide the corresponding residues. Therefore, the structural alignment methods generally have critical weak points in terms of computational complexity although they are able to provide good structure retrieval results.
The LD descriptor is capable of real time retrieving because it is an alignment-free technique, and the descriptor vector could be pre-computed and stored as an index on a hard disk. In the future work we will develop a hybrid scheme which includes both the structure alignment and shape based methods. The LD descriptor could work as a rapid primary filter to obtain the initial candidates and followed with an option to combine structure alignment based methods to refine the initial results. The structural comparison is only applied to the highly ranked protein candidates to reduce the time of a whole retrieval process. With this hybrid scheme, we can balance well between the searching time and the accuracy of structure retrieval results.
Capturing the molecule flexibility is important for understanding the properties of proteins. This paper shows how to represent the protein molecular shape by a deformation invariant shape descriptor and demonstrate its efficiency to search against a protein motion database. This descriptor is robust in recognizing the protein conformations since it has a number of attractive properties, namely, insensitivity to topology changes of protein shape, articulated deformation, and skeleton based movement. The experimental results show that the LD descriptor achieved a good performance in the retrieval of a protein motion database. The LD descriptor failed to recognize the large topological changes such as a large contact of the flexible portion with the stable domain and the deformation of the big elongation of the hinge region. Our work contributes to providing an alternative way for protein similarity measures. The proposed shape descriptor is a fast and efficient technique for comparing protein shape, and will be useful for potential applications, such as the drug discovery and structure retrieval.
The authors thanks S. Flores and M. Gerstein for providing the database. The authors also thank the Editor and the anonymous reviewers for their constructive feedback on an earlier version of this paper. This material is partly based upon work supported by the National Institute of Health (GM-075004). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Institute of Health.
- Gal R, Shamir A, Cohen-Or D: Pose-Oblivious Shape Signature. IEEE Transcations on Visualization and Computer Graphics 2007, 13: 261–271. 10.1109/TVCG.2007.45View ArticlePubMedGoogle Scholar
- Ballester PJ, Richards WG: Ultrafast shape recognition for similarity search in molecular databases. Proceedings of the Royal Society A 2007, 463: 1307–1321. 10.1098/rspa.2007.1823View ArticleGoogle Scholar
- Ballester PJ, Richards WG: Ultrafast shape recognition to search compound databases for similar molecular shapes. Journal of Computational Chemistry 2007, 28: 1711–1723. 10.1002/jcc.20681View ArticlePubMedGoogle Scholar
- Lee S, Li B, La D, Fang Y, Ramani K, Rustamov R, Kihara D: Fast Protein Tertiary Structure Retrieval Based on Global Surface Shape Similarity. Protein: Structure, Function and Bioinformatics 2008, 73: 1–10. 10.1002/prot.22141View ArticleGoogle Scholar
- Li B, Turuvekere S, Agrawal M, La D, Ramani K, Kihara D: Characterization of local geometry of protein surfaces with the visibility criterion. Proteins: Structure, Function, and Bioinformatics 2008, 71: 670–683. 10.1002/prot.21732View ArticleGoogle Scholar
- Sommer I, Muller O, Domingues FS, Sander O, Weickert J, Lengauer T: Moment invariants as shape recognition technique for comparing protein binding sites. Bioinformatics 2007, 23: 3139–3146. 10.1093/bioinformatics/btm503View ArticlePubMedGoogle Scholar
- Daras P, Zarpalas D, Axenopoulos A, Tzovaras D, Strintzis MG: Three-Dimensional Shape-Structure Comparison Method for Protein Classification. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2006, 3: 193–207. 10.1109/TCBB.2006.43View ArticlePubMedGoogle Scholar
- Connolly M: Measurement of protein surface shape by solid angles. Journal of Molecular Graphics 1986, 4: 3–6. 10.1016/0263-7855(86)80086-8View ArticleGoogle Scholar
- Chothia C, Lesk A: The relation between the divergence of sequence and structure in proteins. The EMBO Journal 1986, 5: 823–826.PubMed CentralPubMedGoogle Scholar
- Wilson CA, Kreychman1 J, Gerstein M: Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores. Journal of Molecular Biology 2000, 297: 233–249. 10.1006/jmbi.2000.3550View ArticlePubMedGoogle Scholar
- Ye Y, Godzik A: Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics 2003, 19: 246–255.Google Scholar
- Shindyalov I, Bourne P: Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Engineering 1998, 11: 739–747. 10.1093/protein/11.9.739View ArticlePubMedGoogle Scholar
- K Mizuguchi NG: Comparison of spatial arrangements of secondary structural elements in proteins. Protein Engineering 1995, 8: 353–362. 10.1093/protein/8.4.353View ArticleGoogle Scholar
- Gerstein M, Levitt M: Using Iterative Dynamic Programming to Obtain Accurate Pairwise and Multiple Alignments of Protein Structures. Proceedings of the Fourth International Conference on Intelligent Systems for Molecular Biology 1996, 59–67.Google Scholar
- Holm L, Sander C: Touring protein fold space with Dali/FSSP. Nucleic Acids Research 1998, 26: 316–319. 10.1093/nar/26.1.316PubMed CentralView ArticlePubMedGoogle Scholar
- Krissinel E, Henrick K: Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallographica Section D Biological Crystallography 2004, 60: 2256–2268. 10.1107/S0907444904026460View ArticleGoogle Scholar
- Mak L, Grandisona S, Morris RJ: An extension of spherical harmonics to region-based rotationally invariant descriptors for molecular shape description and comparison. Journal of Molecular Graphics and Modelling 2008, 26: 1035–1045. 10.1016/j.jmgm.2007.08.009View ArticlePubMedGoogle Scholar
- Meek PJ, Liu Z, Tian L, Wang CY, Welsh WJ, Zauhar RJ: Shape Signatures: speeding up computer aided drug discovery. Drug Discovery Today 2006, 11(19–20):895–904. 10.1016/j.drudis.2006.08.014View ArticlePubMedGoogle Scholar
- Nilakantan R, Bauman N, Venkataraghavan R: New method for rapid characterisation of molecular shapes: applications in drug design. Journal of Chemical Information and Computer Sciences 1993, 33: 79–85.PubMedGoogle Scholar
- Osada R, Funkhouser T, Chazelle B, Dokin D: Shape distributions. ACM Transactions on Grphics 2002, 33: 133–154.Google Scholar
- Vranic DV: 3D Model Retrieval. PhD thesis. Universitat Leipzig Institut fur Informatik; 2004.Google Scholar
- Pan JS, Luo H, Lu ZM, Chang JCH: A New 3D Shape Descriptor Based on Rotation. Proceedings of the Sixth International Conference on Intelligent Systems Design and Applications 2006.Google Scholar
- Jayanti S, Kalyanaraman Y, Iyer N, Ramani K: Developing an engineering shape benchmark for CAD models. Computer Aided Design 2006, 38: 939–953. 10.1016/j.cad.2006.06.007View ArticleGoogle Scholar
- Grant J, Pickup B: A Gaussian Description of Molecular Shape. Journal of Physical Chemistry 1995, 99: 3503–3510. 10.1021/j100011a016View ArticleGoogle Scholar
- Morris RJ, Najmanovich RJ, Kahraman A, Thornton JM: Real spherical harmonic expansion coefficients as 3D shape descriptors for protein binding pocket and ligand comparisons. Bioinformatics 2005, 21: 2347–2355. 10.1093/bioinformatics/bti337View ArticlePubMedGoogle Scholar
- Nagarajan K, Zauhar R, Welsh WJ: Enrichment of ligands for the serotonin receptor using the shape signatures approach. Journal of Chemical Information and Modeling (JCIM) 2005, 45: 49–57. 10.1021/ci049746xView ArticleGoogle Scholar
- Zauhar R, Moyna G, Tian L, Li Z, Welsh W: Shape signatures: A new approach to computer-aided ligand and receptor-based drug designe. Journal of Medicinal Chemistry 2003, 46: 5674–5490. 10.1021/jm030242kView ArticlePubMedGoogle Scholar
- Socci ND, Bialek WS, Onuchic JN: Properties and origins of protein secondary structure. Physical Review E 1994, 49: 3440–3443. 10.1103/PhysRevE.49.3440View ArticleGoogle Scholar
- Anfinsen C: Principles that Govern the Folding of Protein Chains. Science 1973, 181: 223–230. 10.1126/science.181.4096.223View ArticlePubMedGoogle Scholar
- Prosinecki V, Faisca PF, Gomes CM: Conformational States and Protein Stability from a Proteomic Perspective. Current Proteomics 2007, 4: 44–52. 10.2174/157016407781387375View ArticleGoogle Scholar
- Shapira L, Shamir A, Cohen-Or D: Consistent mesh partitioning and skeletonisation using the shape diameter function. The Visual Computer 2008, 24: 249–259. 10.1007/s00371-007-0197-5View ArticleGoogle Scholar
- Choi HI, Choi SW, Moon HP: Mathematical Theory of Medial Axis Transform. Pacific Journal of Mathematics 1997, 181: 57–88.View ArticleGoogle Scholar
- Wriggers W, Schulten K: Protein domain movements: detection of rigid domains and visualization of hinges in comparisons of atomic coordinates. Proteins 1997, 29: 1–14. 10.1002/(SICI)1097-0134(199709)29:1<1::AID-PROT1>3.0.CO;2-JView ArticlePubMedGoogle Scholar
- Jacobs DJ, Rader A, Kuhn LA, Thorpe M: Protein Flexibility Predictions Using Graph Theory. PROTEINS: Structure, Function, and Genetics 2001, 44: 150–165. 10.1002/prot.1081View ArticleGoogle Scholar
- Janin J, Wodak S: Structural domains in proteins and their role in the dynamics of protein function. Progress in Biophysics and Molecular Biology 1983, 42: 21–78. 10.1016/0079-6107(83)90003-2View ArticlePubMedGoogle Scholar
- Baker ML, Ju T, Chiu W: Identification of Secondary Structure Elements in Intermediate Resolution Density Maps. Structure 2007, 15: 7–19. 10.1016/j.str.2006.11.008PubMed CentralView ArticlePubMedGoogle Scholar
- Kaufman A, Cohen D, Yagel R: Volume Graphics. IEEE Computer 1993, 26: 51–64.View ArticleGoogle Scholar
- Kaufman A: Volume Visualization. ACM Computing Surveys 1996, 28: 165–167. 10.1145/234313.234383View ArticleGoogle Scholar
- Connolly M: Solvent-accessible surfaces of proteins and nucleic acids. Science 1983, 221: 709–713. 10.1126/science.6879170View ArticlePubMedGoogle Scholar
- Kanungo T, Mount DM, Netanyahu NS, Piatko CD, Silverman R, Wu AY: An Efficient k-Means Clustering Algorithm:Analysis and Implementation. IEEE Transcations on Pattern Analysis and Machine Imtelligence 2002, 24: 881–892. 10.1109/TPAMI.2002.1017616View ArticleGoogle Scholar
- Liu YS, Yong JH, Zhang H, Yan DM, Sun JG: A quasi-Monte Carlo method for computing areas of point-sampled surfaces. Computer Aided Design 2006, 38: 55–68.View ArticleGoogle Scholar
- Echols N, Milburn D, Gerstein M: MolMovDB: analysis and visualization of conformational change and structural flexibility. Nucleic Acids Research 2003, 31: 478–482. 10.1093/nar/gkg104PubMed CentralView ArticlePubMedGoogle Scholar
- Flores S, Gerstein M: FlexOracle: predicting flexible hinges by identification of stable domains. BMC Bioinformatics 2007, 8: 215. 10.1186/1471-2105-8-215PubMed CentralView ArticlePubMedGoogle Scholar
- Damm K, Carlson H: Gaussian-Weighted RMSD Superposition of Proteins: A Structural Comparison for Flexible Proteins and Predicted Protein Structures. Biophysical Journal 2006, 90: 4558–4573. 10.1529/biophysj.105.066654PubMed CentralView ArticlePubMedGoogle Scholar
- Liu YS, Fang Y, Ramani K: Using least median of squares for structural superposition of flexible proteins. BMC Bioinformatics 2009, 10: 29. 10.1186/1471-2105-10-29PubMed CentralView ArticlePubMedGoogle Scholar
- Kolodny R, Koehl P, Levitt M: Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures. Journal of Molecular Biology 2005, 346: 1173–1188. 10.1016/j.jmb.2004.12.032PubMed CentralView ArticlePubMedGoogle Scholar
- Lathrop R: The protein threading problem with sequence amino acid interaction preferences is NP-complete. Protein Engineering 1994, 7: 1059–1068. 10.1093/protein/7.9.1059View ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.