A model of tripeptidyl-peptidase I (CLN2), a ubiquitous and highly conserved member of the sedolisin family of serine-carboxyl peptidases
© Wlodawer et al; licensee BioMed Central Ltd 2003
Received: 23 July 2003
Accepted: 11 November 2003
Published: 11 November 2003
Tripeptidyl-peptidase I, also known as CLN2, is a member of the family of sedolisins (serine-carboxyl peptidases). In humans, defects in expression of this enzyme lead to a fatal neurodegenerative disease, classical late-infantile neuronal ceroid lipofuscinosis. Similar enzymes have been found in the genomic sequences of several species, but neither systematic analyses of their distribution nor modeling of their structures have been previously attempted.
We have analyzed the presence of orthologs of human CLN2 in the genomic sequences of a number of eukaryotic species. Enzymes with sequences sharing over 80% identity have been found in the genomes of macaque, mouse, rat, dog, and cow. Closely related, although clearly distinct, enzymes are present in fish (fugu and zebra), as well as in frogs (Xenopus tropicalis). A three-dimensional model of human CLN2 was built based mainly on the homology with Pseudomonas sp. 101 sedolisin.
CLN2 is very highly conserved and widely distributed among higher organisms and may play an important role in their life cycles. The model presented here indicates a very open and accessible active site that is almost completely conserved among all known CLN2 enzymes. This result is somehow surprising for a tripeptidase where the presence of a more constrained binding pocket was anticipated. This structural model should be useful in the search for the physiological substrates of these enzymes and in the design of more specific inhibitors of CLN2.
Although the existence of tripeptidyl-peptidase I (TPP-I) was first noted over 40 years ago , the structural and mechanistic basis of its activity has been largely misunderstood until quite recently. The situation changed after it was shown that TPP-I is identical to an independently characterized enzyme named CLN2. It was also demonstrated that mutations leading to abolishment of the enzymatic activity of CLN2 were the direct cause of a fatal inherited neurodegenerative disease, classical late-infantile neuronal ceroid lipofuscinosis . This important observation was followed by the identification of CLN2 as a serine peptidase [3, 4], without, however, specifying its structural fold and the details of the catalytic site. More accurate placement of CLN2 within the context of a family of related enzymes became possible only after high-resolution crystal structures of two bacterial enzymes with a limited sequence similarity to CLN2, sedolisin and kumamolisin, became available [5–7]. These structures defined a novel family of enzymes, now called sedolisins or serine-carboxyl peptidases, that is characterized by the utilization of a fully conserved catalytic triad (Ser, Glu, Asp) and by the presence of an Asp in the oxyanion hole . Sedolisin and its several variants (e.g., kumamolisin, aorsin , and physarolisin ) have now been found in archaea, bacteria, fungi and amoebae, whereas the higher organisms seem to contain only variants of CLN2 . The physiological role of sedolisins in the lower organisms has not yet been elucidated.
Despite the potential medical importance of CLN2 and related enzymes, no systematic studies of their genomic distribution have been published to date. There are also no published reports of the crystallization of this enzyme. In the absence of an experimental structure obtained by crystallography or NMR it is sometimes necessary to resort to molecular modeling in order to provide a structural basis for the explanation of the biological properties of an enzyme, and, in particular, to initiate design of its inhibitors. Examples of such very successful and useful modeling efforts are provided by HIV protease , or very recently by the peptidase from a coronavirus involved in the severe acute respiratory syndrome , among others. We have now applied the tools of molecular homology modeling to predicting a structure of CLN2 that could be used as a basis for a search for the biological substrates of this family of enzymes and for the design of specific inhibitors.
Results and discussion
Sequence comparisons of the CLN2-like enzymes
CLN2 is present not only in fish, but also in amphibians, in particular in Xenopus tropicalis (a species of frog). A partial sequence of its sedolisin-TPP (AL594774) found in the EST database http://www.sanger.ac.uk/Projects/X_tropicalis/blast_server.shtml spans the middle part of the catalytic domain, without reaching the part of the active site closer to the N terminus that contains the aspartic and glutamic acids that belong to the catalytic triad. However, the sequenced part of the enzyme shows 75% identity with the fugu sedolisin-TPP, and 69% identity with human CLN2 (Figure 3). Sequence similarity to bacterial or fungal sedolisins is much lower, indicating that the enzyme found in frogs might also share the functional properties of the CLN2 subfamily.
The discovery of highly conserved CLN2-like enzymes not only in mammals but also in two fish species and one of frogs may indicate that these peptidases are universally present in the vertebrates, and that their important role identified in humans  and mice  might be a more general feature.
Modeling of the structure of human CLN2
Comparisons of the substrate binding pockets of CLN2 and sedolisin
Considerable differences are seen, however, at the bottom of the pocket, where the side chains of Asp179 in kumamolisin and the equivalent Ser190 in sedolisin make hydrogen bonds to the P1 tyrosine of the inhibitor (if present). The equivalent residue in CLN2 is Thr182, but it is very unlikely that it can assume an orientation that would allow it to make a hydrogen bond to a P1 tyrosine. Other polar residues in the vicinity are Glu175 in sedolisin and the corresponding Asp169 in kumamolisin. However, the residue found here in CLN2 is Cys170, much less polar than its counterparts. There is also no equivalent to the polar interaction between Glu175 and Glu171 in sedolisin, since the equivalent of the latter residue in CLN2 is Ser166, much smaller and pointing away.
The side of the S1 pocket that is created by the very flexible side chain of Arg179 in sedolisin contains only a much smaller Ser174 in CLN2, and thus is much more open in the latter protein. This part of the pocket, with the main chain of the protein quite distant from the substrate, is indeed not well conserved in these proteins, with kumamolisin missing it entirely due to a deletion in the corresponding sequence position. In summary, the S1 pocket in CLN2 has less polar character than the equivalent pocket in the related proteins, and is lacking direct polar anchors for any side chains that might be present in the substrate.
The S2 pocket in CLN2 is also quite open and accessible to solvent. It is most likely larger than the equivalent pockets in either sedolisin or kumamolisin, since these are limited by Trp81 in the former and Trp129 in the latter (these residues originate from different parts of the backbone in the two enzymes and are not topologically related). Tyr130, an equivalent of the latter residue in CLN2, is unlikely to come into direct contact with the P2 residue of the substrate due to its greater distance (almost 4 Å for the closest atoms).
An important remaining puzzle is that the predicted structure of CLN2 does not show any clear limitations of the S3 pocket that could explain the tripeptidase activity of this enzyme. The location of the P3 side chain of the substrate is ambiguous, since it could point in either one of two directions by exchange with the N-terminal amine. The only negatively charged residue of CLN2 that is found in this vicinity is Asp132. Although in the current model the distance between its carboxylate and the nitrogen of the N-terminus of the modeled substrate is about 6 Å, these two groups could be brought into hydrogen-bonding range by some allowed changes in the torsion angles of the protein. Such a conformational change would involve breaking of the hydrogen bond between Asp132 and Ser139. However, this latter interaction is not likely to be structurally crucial since the serine is not absolutely conserved in all CLN2-like enzymes.
Location of mutations implicated in disease
Location of mutations found through a genetic survey of families of classic late-infantile neuronal ceroid lipofuscinosis patients has been described previously . Most such mutations result in expression of either truncated enzyme or in incorrect intron-exon splices. However, some of the mutations lead to single amino acid substitutions in the mature enzyme. Such mutations include I92N, E148K, C170R and C170Y, V190D, G194E, Q227H, R252H, A259E, and S280L (Figure 5). Only the role of the latter mutation is completely clear, since it replaces the catalytic serine of CLN2 with a side chain that cannot support its enzymatic activity. No other residues appear to be located in the immediate vicinity of the substrate. Residues Val190, Gly194, and Arg252 are very highly conserved not only in CLN2 but also in other sedolisins and must play an important structural role. The reasons why the remaining mutations would lead to the loss of enzymatic activity are much less clear, but the wide distribution of these mutations in the structure supports the conclusion that any modifications to CLN2 that would abolish or impair its function could lead to the development of the disease.
Substrates and inhibitors of CLN2
Little is known at this time about biologically-relevant substrates of CLN2. Various defects that include truncations and single-site mutations in CLN2 have been found in the genes of patients that display symptoms of late-infantile neuronal ceroid lipofuscinosis . One of the symptoms of the disease is the accumulation of an autofluorescent material, ceroid-lipofuscin, in lysosomal storage bodies in various cell types, primarily in the nervous system. Since a major component of such bodies appears to be intact subunit c of mitochondrial ATP synthase, this protein has been implicated as a potential biological target of the protease. It has been shown recently that CLN2 can indeed degrade this subunit on its N terminus , but the unambiguous proof that this is indeed the most important target is still lacking. CLN2 is capable of processing a number of different angiotensin-derived peptides , with the efficiency of cleavage dependent on the length of such peptides. The most efficiently processed peptide consisted of 14 amino acids, with the tripeptide Asp-Arg-Val removed from its N terminus. The model of CLN2 presented here can easily accommodate this peptide on the P side of the substrate-binding site, although the exact mode of binding of the long P' portion of the substrate remains obscure. The observation that an analogous peptide acetylated on its N terminus cannot be processed supports the postulate that the interactions of the N-terminal amino group with the side chain of Asp132 may be the most important feature defining the tripeptidase specificity of CLN2. A number of different tripeptides can be serially processed from glucagon, with their sequences varying widely . Again, however, all of these tripeptides can be easily accommodated in the substrate-binding site of the CLN2 model. Other potentially biologically relevant substrates include cholecystokinin and possibly other neuropeptides .
An intriguing property of CLN2 is its reported ability to cleave collagen-related peptides . The tripeptides resulting from such processing include Gly-Pro-Met, Gly-Pro-Arg, and Gly-Pro-Ala. It has been recently reported that kumamolisin, and particularly a closely related protein from Alicyclobacillus sendaiensis (kumamolisin-As) can efficiently cleave not only collagen-related peptides, but also native type I collagen . With the substrate-binding site of CLN2 resembling that of kumamolisin more than sedolisin (the latter enzyme has low, if any, collagenase activity), the potential collagen-processing role of CLN2 might warrant further investigation.
Since the catalytic machinery of CLN2 matches closely that of sedolisin, kumamolisin, and other members of the family of serine-carboxyl peptidases, the enzymatic mechanism of all these enzymes is most likely the same. Design of inhibitors specific for CLN2 should incorporate the features that have been proven to be important for the related enzymes, such as the placement of an aldehyde functionality capable of making covalent interactions with the catalytic serine, or the utilization of chloromethyl ketone for the same purpose. Since the few inhibitors that have been successfully used in the studies of sedolisins are either longer than tripeptides or contain blocking groups on their N termini, new tripeptide-based inhibitors with free N termini are now being synthesized (Oda, unpublished). It will be necessary to test the binding properties of different substrates in order to determine the most promising peptide sequences. Analysis of the model of CLN2 suggests that the size of the S1 subsite is much larger than in either sedolisin or kumamolisin, and thus the use of a large P1 group might be indicated. Of course, the availability of an experimental crystal structure will make the design of inhibitors easier and we are continuing our efforts to crystallize CLN2 from different sources.
Three-dimensional, atomic-scale models of CLN2 were developed by exploiting the sequence similarity to the sedolisin and kumamolisin proteins (r.m.s. deviation of 1.0 Å for 273 pairs of Cα atoms in the core of the enzymes). Presently, these two enzymes are the only members of the newly-defined sedolisin/serine-carboxyl peptidase family  for which the crystal structures have been published [5–7]. The actual Protein Data Bank : http://www.pdb.org/) entries used in the modeling were 1GA4.pdb and 1GT9.pdb for sedolisin and kumamolisin, respectively.
The first step was to form a global, multiple sequence alignment between all known members of the sedolisin family. Studies have shown that incorporating the specific patterns of amino acid residue-type variation and conservation among a family of homologous proteins provides superior results over simple, pair-wise sequence alignment . Sequence files representing the different subfamilies were extracted from the non-redundant GenBank database  using sedolisin, kumamolisin, and the human CLN2 sequences as queries to the web-based version of the BLAST program : http://www.ncbi.nlm.nih.gov/BLAST/. Initial multiple sequence alignments were formed with the ClustalX computer program . As is expected for a family of proteins, highly-conserved segments were found aligned to the crystal structure-identified core regions of the sedolisin and kumamolisin sequences. Subsequently, the sequences were divided into two groups: those closer to sedolisin than kumamolisin and vise-a-versa. The alignment of these two groups was then manually set by the observed structural alignment of the sedolisin and kumamolisin proteins. Finally, some additional adjustment was required to correct the few places where highly conserved residues of the core regions were slightly out of alignment among different subfamilies of sequences.
The model of human CLN2 was built using the structure of sedolisin complexed with the inhibitor pseudo-tyrostatin [5, 6] as a template. The reason for this choice is that while different protein models were generally comparable, the chosen inhibitor was most compatible with the tripeptidase character of CLN2. With the correspondence of residues specified in the alignments, atomic coordinates were transferred to the target sequence by a variety of methods, including the homology modeling modules of the Look/GeneMine  and DeepView  computer program packages. For the core and active site of the protein, coordinates for identical residues were simply transferred unchanged; whereas, special care was required to position the side chains of residues differing from the template. This was first accomplished automatically by the two computer packages, then manually adjusted in the Quanta molecular modeling package (Accelrys, Inc.) to better mimic the templates and optimize the interactions with surrounding residues. A similar two-step approach was used to manifest the insertions and deletions in the variable, loop regions of the protein, where it was necessary to create new backbone as well as side chain coordinates for the models. It should be noted that, for obvious reasons, the conformation of poorly conserved loop regions is generally the least accurate aspect of a homology model. Fortunately, these problematic loops will not significantly affect the active site of the model, since only two of them impinge on the boundary of this highly conserved, functional region.
Refinement and analysis of the model
The model was finished by performing energy minimization in vacuo with the computer program CHARMM . This refined the structure by bringing the covalent geometry and non-bonded interactions into agreement with experimentally observed and calculated values. Such optimizations included adjusting bond lengths, 3-point angles and 4-point dihedral angles, as well as eliminating atomic overlap and forming salt-bridges and hydrogen bonds. Since presently the potential energy functions used to describe the atomic-scale models are not sufficiently comprehensive and accurate, the final energy of the model was not used as an indicator of the realistic quality of the structure. The final quality of the structure was analyzed with the computer program PROCHECK . The structure was deposited at the PDB under accession code 1R60.
Extensive discussions with Dr. A. Barrett (Wellcome Trust Sanger Institute, Hinxton, UK) are gratefully acknowledged. Modeling efforts were facilitated by Dr. Robert Jernigan (Iowa State University). This work was supported in part by a Grant-in-Aid for Scientific Research (B), 15380072, from the Ministry of Education, Culture, Sports, Science and Technology of Japan (to K.O.); by NIH grants DK18865 and AI39211 (to B.M.D.); and in part with Federal funds from the National Cancer Institute, National Institutes of Health, under contract No. NO1-CO-12400. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does the mention of trade names, commercial products or organizations imply endorsement by the U. S. Government.
- Ellis S: Studies on the serial extraction of pituitary proteins. Endocrinology 1961, 69: 554–570.View ArticlePubMedGoogle Scholar
- Sleat DE, Donnelly RJ, Lackland H, Liu CG, Sohar I, Pullarkat RK, Lobel P: Association of mutations in a lysosomal protein with classical late- infantile neuronal ceroid lipofuscinosis. Science 1997, 277: 1802–1805. 10.1126/science.277.5333.1802View ArticlePubMedGoogle Scholar
- Rawlings ND, Barrett AJ: Tripeptidyl-peptidase I is apparently the CLN2 protein absent in classical late-infantile neuronal ceroid lipofuscinosis. Biochim Biophys Acta 1999, 1429: 496–500. 10.1016/S0167-4838(98)00238-6View ArticlePubMedGoogle Scholar
- Lin L,, Sohar I,, Lackland H,, Lobel P: The human CLN2 protein/tripeptidyl-peptidase I is a serine protease that autoactivates at acidic pH. J Biol Chem 2001, 276: 2249–2255.View ArticlePubMedGoogle Scholar
- Wlodawer A, Li M, Dauter Z, Gustchina A, Uchida K, Oyama H, Dunn BM, Oda K: Carboxyl proteinase from Pseudomonas defines a novel family of subtilisin-like enzymes. Nature Struct Biol 2001, 8: 442–446. 10.1038/87610View ArticlePubMedGoogle Scholar
- Wlodawer A, Li M, Gustchina A, Dauter Z, Uchida K, Oyama H, Goldfarb NE, Dunn BM, Oda K: Inhibitor complexes of the Pseudomonas serine-carboxyl proteinase. Biochemistry 2001, 40: 15602–15611. 10.1021/bi011817nView ArticlePubMedGoogle Scholar
- Comellas-Bigler M, Fuentes-Prior P, Maskos K, Huber R, Oyama H, Uchida K, Dunn BM, Oda K, Bode W: The 1.4 Å crystal structure of kumamolysin: a thermostable serine- carboxyl-type proteinase. Structure 2002, 10: 865–876. 10.1016/S0969-2126(02)00772-4View ArticlePubMedGoogle Scholar
- Wlodawer A,, Li M,, Gustchina A,, Oyama H,, Dunn BM,, Oda K: Structural and enzymatic properties of the sedolisin family of serine-carboxyl peptidases. Acta Biochim Polon 2003, 50: 81–102.PubMedGoogle Scholar
- Lee BR, Furukawa M, Yamashita K, Kanasugi Y, Kawabata C, Hirano K, Ando K, Ichishima E: Aorsin, a novel serine proteinase with trypsin-like specificity at acidic pH. Biochem J 2003, 371: 541–548. 10.1042/BJ20021691PubMed CentralView ArticlePubMedGoogle Scholar
- Nishii W, Ueki T, Miyashita R, Kojima M, Kim YT, Sasaki N, Murakami-Murofushi K, Takahashi K: Structural and enzymatic characterization of physarolisin (formerly physaropepsin) proves that it is a unique serine-carboxyl proteinase. Biochem Biophys Res Commun 2003, 301: 1023–1029. 10.1016/S0006-291X(03)00083-4View ArticlePubMedGoogle Scholar
- Weber IT, Miller M, Jaskólski M, Leis J, Skalka AM, Wlodawer A: Molecular modeling of the HIV-1 protease and its substrate binding site. Science 1989, 243: 928–931.View ArticlePubMedGoogle Scholar
- Anand K,, Ziebuhr J,, Wadhwani P,, Mesters JR,, Hilgenfeld R: Coronavirus main proteinase (3CLpro) structure: basis for design of anti-SARS drugs. Science 2003, 300: 1763–1767. 10.1126/science.1085658View ArticlePubMedGoogle Scholar
- Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, An P, Antonarakis SE, Attwood J, Baertsch R, Bailey J, Barlow K, Beck S, Berry E, Birren B, Bloom T, Bork P, Botcherby M, Bray N, Brent MR, Brown DG, Brown SD, Bult C, Burton J, Butler J, Campbell RD, Carninci P, Cawley S, Chiaromonte F, Chinwalla AT, Church DM, Clamp M, Clee C, Collins FS, Cook LL, Copley RR, Coulson A, Couronne O, Cuff J, Curwen V, Cutts T, Daly M, David R, Davies J, Delehaunty KD, Deri J, Dermitzakis ET, Dewey C, Dickens NJ, Diekhans M, Dodge S, Dubchak I, Dunn DM, Eddy SR, Elnitski L, Emes RD, Eswara P, Eyras E, Felsenfeld A, Fewell GA, Flicek P, Foley K, Frankel WN, Fulton LA, Fulton RS, Furey TS, Gage D, Gibbs RA, Glusman G, Gnerre S, Goldman N, Goodstadt L, Grafham D, Graves TA, Green ED, Gregory S, Guigo R, Guyer M, Hardison RC, Haussler D, Hayashizaki Y, Hillier LW, Hinrichs A, Hlavina W, Holzer T, Hsu F, Hua A, Hubbard T, Hunt A, Jackson I, Jaffe DB, Johnson LS, Jones M, Jones TA, Joy A, Kamal M, Karlsson EK, Karolchik D, Kasprzyk A, Kawai J, Keibler E, Kells C, Kent WJ, Kirby A, Kolbe DL, Korf I, Kucherlapati RS, Kulbokas EJ, Kulp D, Landers T, Leger JP, Leonard S, Letunic I, Levine R, Li J, Li M, Lloyd C, Lucas S, Ma B, Maglott DR, Mardis ER, Matthews L, Mauceli E, Mayer JH, McCarthy M, McCombie WR, McLaren S, McLay K, McPherson JD, Meldrim J, Meredith B, Mesirov JP, Miller W, Miner TL, Mongin E, Montgomery KT, Morgan M, Mott R, Mullikin JC, Muzny DM, Nash WE, Nelson JO, Nhan MN, Nicol R, Ning Z, Nusbaum C, O'Connor MJ, Okazaki Y, Oliver K, Overton-Larty E, Pachter L, Parra G, Pepin KH, Peterson J, Pevzner P, Plumb R, Pohl CS, Poliakov A, Ponce TC, Ponting CP, Potter S, Quail M, Reymond A, Roe BA, Roskin KM, Rubin EM, Rust AG, Santos R, Sapojnikov V, Schultz B, Schultz J, Schwartz MS, Schwartz S, Scott C, Seaman S, Searle S, Sharpe T, Sheridan A, Shownkeen R, Sims S, Singer JB, Slater G, Smit A, Smith DR, Spencer B, Stabenau A, Stange-Thomann N, Sugnet C, Suyama M, Tesler G, Thompson J, Torrents D, Trevaskis E, Tromp J, Ucla C, Ureta-Vidal A, Vinson JP, Von Niederhausern AC, Wade CM, Wall M, Weber RJ, Weiss RB, Wendl MC, West AP, Wetterstrand K, Wheeler R, Whelan S, Wierzbowski J, Willey D, Williams S, Wilson RK, Winter E, Worley KC, Wyman D, Yang S, Yang SP, Zdobnov EM, Zody MC, Lander ES: Initial sequencing and comparative analysis of the mouse genome. Nature 2002, 420: 520–562. 10.1038/nature01262View ArticlePubMedGoogle Scholar
- Katz ML, Johnson GS: Mouse gene knockout models for the CLN2 and CLN3 forms of ceroid lipofuscinosis. Eur J Paediatr Neurol 2001, 5 Suppl A: 109–114.View ArticlePubMedGoogle Scholar
- Orry A,, Wallace BA: A proposed model for the late-infantile neuronal ceroid lipofuscinosis (Batten Disease) protein CLN2. Protein Pept Lett 1999, 6: 1–5.Google Scholar
- Schechter I, Berger A: On the size of the active site in proteases. I. Papain. Biochem Biophys Res Commun 1967, 27: 157–162.View ArticlePubMedGoogle Scholar
- Sleat DE, Gin RM, Sohar I, Wisniewski K, Sklower-Brooks S, Pullarkat RK, Palmer DN, Lerner TJ, Boustany RM, Uldall P, Siakotos AN, Donnelly RJ, Lobel P: Mutational analysis of the defective protease in classic late-infantile neuronal ceroid lipofuscinosis, a neurodegenerative lysosomal storage disorder. Am J Hum Genet 1999, 64: 1511–1523. 10.1086/302427PubMed CentralView ArticlePubMedGoogle Scholar
- Ezaki J, Takeda-Ezaki M, Kominami E: Tripeptidyl peptidase I, the late infantile neuronal ceroid lipofuscinosis gene product, initiates the lysosomal degradation of subunit c of ATP synthase. J Biochem (Tokyo) 2000, 128: 509–516.View ArticleGoogle Scholar
- Warburton MJ, Bernardini F: The specificity of lysosomal tripeptidyl peptidase-I determined by its action on angiotensin-II analogues. FEBS Lett 2001, 500: 145–148. 10.1016/S0014-5793(01)02608-4View ArticlePubMedGoogle Scholar
- Vines D, Warburton MJ: Purification and characterisation of a tripeptidyl aminopeptidase I from rat spleen. Biochim Biophys Acta 1998, 1384: 233–242. 10.1016/S0167-4838(98)00012-0View ArticlePubMedGoogle Scholar
- Bernardini F,, Warburton MJ: Lysosomal degradation of cholecystokinin-(29–33)-amide in mouse brain is dependent on tripeptidyl peptidase-1: implications for the degradation and storage of peptides in classical late-infantile neuronal ceroid lipofuscinosis. Biochem J 2002, 366: 521–529. 10.1042/BJ20020467PubMed CentralView ArticlePubMedGoogle Scholar
- McDonald JK, Hoisington AR, Eisenhauer DA: Partial purification and characterization of an ovarian tripeptidyl peptidase: a lysosomal exopeptidase that sequentially releases collagen-related (Gly-Pro-X) triplets. Biochem Biophys Res Commun 1985, 126: 63–71.View ArticlePubMedGoogle Scholar
- Tsuruoka N, Nakayama T, Ashida M, Hemmi H, Nakao M, Minakata H, Oyama H, Oda K, Nishino T: Collagenolytic serine-carboxyl proteinase from Alicyclobacillus sendaiensis strain NTAP-1: Purification, characterization, gene cloning, and heterologous expression. Appl Environ Microbiol 2003, 69: 162–169. 10.1128/AEM.69.1.162-169.2003PubMed CentralView ArticlePubMedGoogle Scholar
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res 2000, 28: 235–242. 10.1093/nar/28.1.235PubMed CentralView ArticlePubMedGoogle Scholar
- Park J, Karplus K, Barrett C, Hughey R, Haussler D, Hubbard T, Chothia C: Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J Mol Biol 1998, 284: 1201–1210. 10.1006/jmbi.1998.2221View ArticlePubMedGoogle Scholar
- Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Rapp BA, Wheeler DL: GenBank. Nucleic Acids Res 2002, 30: 17–20. 10.1093/nar/30.1.17PubMed CentralView ArticlePubMedGoogle Scholar
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25: 3389–3402. 10.1093/nar/25.17.3389PubMed CentralView ArticlePubMedGoogle Scholar
- Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 1997, 25: 4876–4882. 10.1093/nar/25.24.4876PubMed CentralView ArticlePubMedGoogle Scholar
- Lee C,, Irizarry K: The GeneMine System for genome/proteome annotation and collaborative data mining. IBM Systems Journal 2001, 40: 592–603.View ArticleGoogle Scholar
- Guex N, Peitsch MC: SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 1997, 18: 2714–2723.View ArticlePubMedGoogle Scholar
- Brooks BR,, Bruccoleri RE,, Olafson BD,, States DJ,, Swaminathan S,, Karplus M: CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J Comp Chem 1983, 4: 187–217.View ArticleGoogle Scholar
- Laskowski RA, MacArthur MW, Moss DS, Thornton JM: PROCHECK: program to check the stereochemical quality of protein structures. J Appl Crystallogr 1993, 26: 283–291. 10.1107/S0021889892009944View ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.