Position-specific propensities of amino acids in the β-strand
BMC Structural Biology volume 10, Article number: 29 (2010)
Despite the importance of β-strands as main building blocks in proteins, the propensities of amino acids in β-strands are not well understood, as they have been more difficult to determine experimentally than those in α-helices. Recent studies have shown that most amino acids have significantly high or low propensities towards both ends of β-strands. However, a comprehensive analysis of the sequence-dependent amino acid propensities at positions between the ends of the β-strand has not yet been carried out.
The propensities of the amino acids calculated from a large non-redundant database of proteins are found to be highly position-specific and vary continuously throughout the length of the β-strand. They follow an unexpected characteristic periodic pattern in inner positions with respect to the cap residues in both termini of β-strands; this periodic nature is markedly different from that of the α-helices with respect to the strength and pattern in periodicity. This periodicity is not only different for different amino acids but it also varies considerably for the amino acids belonging to the same physico-chemical group. Average hydrophobicity is also found to be periodic with respect to the positions from both termini of β-strands.
The results contradict the earlier perception that amino acid propensities are isotropic in the middle region of β-strands. These position-specific propensities should help in understanding the factors responsible for β-strand design and in the efficient prediction of β-strand structure in unknown proteins.
Secondary structural elements such as α-helices and β-strands are important determinants of folded protein structure and topology. Helices and strands are regular repetitive structures; while α-helices are quasi-one-dimensional structures formed by local interactions [1, 2], long β-strands self-assemble into complex hydrogen-bonded β-sheets through long-range and inter-chain interactions [3–5]. Secondary structures are predicted on the basis of statistical analysis of known protein structures, fold recognition and multiple sequence alignments. Various close-packing arrangements of these strands and helices are systematically optimized to test the resultant tertiary structure or a specific fold. It is therefore important to understand the factors dictating the intrinsic preferences of amino acid residues for a particular secondary structure.
Statistical analysis of known proteins [7, 8] clearly reveals that amino acids have definite conformational preferences for one or another type of secondary structure. Secondary structure prediction methods [9–13] systematically analyze how these preferences determine whether a given sequence will adopt an α-helical topology, a β-sheet topology or neither. Even the frequencies of occurrence of amino acid residues at the N-terminal end of a helix (N-cap), at the C-terminal end (C-cap) and at interior positions are very different [14–25]. This non-equivalence of different positions around the helix termini with respect to amino acid preferences is also supported by experimental results [26–36]. Although early studies established distinct differences in the propensities of the amino acids at the N-cap, N1, N2 and N3 positions [14, 15, 37–39], it was assumed that beyond the first few residues from either terminus the individual propensities average out, leading to essentially isotropic environments. An unexpected recent finding shows that helical propensities at positions between the ends of helices differ markedly from one position to another and exhibit a distinct pattern throughout the helix length.
Despite the importance of β-strands as main building blocks in proteins, the propensities of amino acids for β-strands are not well understood, as they have been more difficult to determine experimentally than those for α-helices. This is attributed to the fact that β-sheets do not fold independently. Another reason may be the dependence of β-sheet formation on the structural context of the amino acids. A statistical survey of the protein structure database correlates well with an average of the experimental scales used to determine β-sheet propensity and supports the idea that intrinsic β-sheet propensity plays a pivotal role in protein stability. Various factors, such as side-chain-dependent steric interactions and solvent screening of backbone electrostatic interactions, dictate the preference of amino acids for β-sheet formation. Conformational entropy analysis also quantitatively establishes steric clashes between the side chain and the local backbone of an amino acid as the dominant cause of intrinsic β-sheet propensity. Recently, it has been shown that even the conformational topology of the backbone influences the propensity of amino acids in β-sheets [46, 47]. These extensive analyses, however, do not yield a clear and concise rationale for the distribution of β-sheet propensities and are far from conclusive.
A recent study demonstrates that the different positions in the β-sheet are not isotropic with respect to amino acid propensities. There is a marked variation in the pattern of amino acid preferences at different positions around the N-cap and C-cap regions of the β-sheet. However, a comprehensive analysis of the sequence-dependent β-strand propensities at positions between the ends of the β-strand has not yet been carried out. In this article, we present a detailed and systematic position-wise dissection of the amino acid propensities in the different subregions of β-strands. We note that the inner positions of the β-strand exhibit an unexpected characteristic periodicity in the sequence-dependent propensities, which is distinctly different from that of α-helices with respect to both the strength and the pattern of periodicity. Average hydrophobicity follows a similar position-dependent periodic pattern throughout the different regions of the β-strand. This work may have far-reaching implications for the formation and stability of β-strands and may provide the necessary foundation both for devising new secondary structure prediction algorithms and for de novo protein design.
A non-redundant database of β-strand sequences was compiled from the May 2008 release of PDB-select. All protein chains in this database have sequence identity of ≤ 25%. High-resolution protein structures determined by X-ray crystallography with a resolution of 3 Å or better and an R-factor ≤ 0.3 were selected. The database consisted of a total of 2586 non-redundant protein chains from 2466 proteins. The secondary structure of these protein chains was assigned with DSSP, which is the most widely used method for defining the secondary structures of proteins from experimentally determined tertiary structures. DSSP classifies the amino acids of a protein chain into 8 secondary structure classes: H (α-helix), G (3₁₀-helix), I (π-helix), E (β-strand), B (isolated β-bridge), T (turn), S (bend) and - (rest). In this work, residues assigned E or B were annotated as being in β-strand conformation. The final database consisted of 15579 β-strands. Isolated β-bridges (annotated as B in DSSP) with length greater than 3 residues were not found in the database. In this work, β-strands were not differentiated into constituent strands of β-sheets and isolated β-bridges. DSSP reports the residue numbers of the complementary strands that are hydrogen bonded to a given β-strand to form the β-sheet structure. Each β-strand was designated as part of a parallel or an antiparallel β-sheet if both complementary strands are parallel or both are antiparallel, respectively. If one complementary strand is parallel and the other antiparallel, the strand was designated as part of a mixed β-sheet structure. To verify that the results of this study are independent of any database bias, an additional database of β-strands from β-barrel proteins was also compiled (with sequence identity ≤ 25%, resolution ≤ 3 Å and crystallographic R-factor ≤ 0.3).
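The strand-extraction step described above can be sketched in a few lines. This is only a minimal illustration, assuming the DSSP assignment is already available as a string of one-letter codes aligned with the sequence; the function name `extract_strands` and the toy chain are hypothetical and are not taken from the original analysis.

```python
from itertools import groupby

# DSSP codes that the text treats as β-strand conformation.
BETA_CODES = {"E", "B"}


def extract_strands(sequence, dssp):
    """Return (start, end, residues) for each maximal run of E/B codes.

    `start` and `end` are 0-based indices into `sequence`; `end` is exclusive.
    """
    if len(sequence) != len(dssp):
        raise ValueError("sequence and DSSP string must have equal length")
    strands = []
    index = 0
    for is_beta, run in groupby(dssp, key=lambda code: code in BETA_CODES):
        length = len(list(run))
        if is_beta:
            strands.append((index, index + length, sequence[index:index + length]))
        index += length
    return strands


# Hypothetical toy chain and assignment, for illustration only.
seq = "MKTVYIAGSPDEVKVTW"
ss = "--EEEEE-TT--EEEE-"
for start, end, residues in extract_strands(seq, ss):
    print(start, end, residues)
```

Adjacent E and B codes are merged into a single run here, which mirrors the statement that strands were not differentiated by sheet membership; a real pipeline would also need to handle chain breaks.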
DSSP uses H-bonds to assign helices and sheets. For helices, the N-cap refers to the first residue preceding the helix that is in a non-helical conformation, while the C-cap is the first residue following the helix that is in a non-helical conformation [15, 40]. Similarly, for β-strands the N-cap is the residue preceding the first strand residue (N1), while the C-cap is the residue following the last strand residue (C1) [37, 52]. In this analysis, the N-cap and C-cap residues are numbered zero in the figures, while the inner residues of the assigned β-strand range from N1 to N10 and from C1 to C10.
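The numbering convention can be made concrete with a short sketch. It assumes 0-based chain indices and a strand given as a half-open interval; the function `label_positions` and its parameters are illustrative names, not part of the original work.

```python
def label_positions(strand_start, strand_end, chain_length, max_inner=10):
    """Map chain indices to N-cap/N1..N10 and C-cap/C1..C10 labels.

    `strand_start` is the 0-based index of the first strand residue and
    `strand_end` is the exclusive end index. Caps that would fall outside
    the chain are simply omitted.
    """
    length = strand_end - strand_start
    n_labels, c_labels = {}, {}
    if strand_start - 1 >= 0:
        n_labels[strand_start - 1] = "N-cap"  # residue preceding N1
    if strand_end < chain_length:
        c_labels[strand_end] = "C-cap"  # residue following C1
    for k in range(1, min(max_inner, length) + 1):
        n_labels[strand_start + k - 1] = f"N{k}"  # counted from the N-terminus
        c_labels[strand_end - k] = f"C{k}"  # counted from the C-terminus
    return n_labels, c_labels


# A five-residue strand occupying indices 2..6 of a 17-residue chain.
n_labels, c_labels = label_positions(2, 7, 17)
print(n_labels)
print(c_labels)
```

Each strand residue receives two labels, one counted from each terminus, which is why the propensities are reported separately from the N-cap and from the C-cap.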
Propensity, also referred to as the conformational parameter, is used to quantify the intrinsic preference of a given amino acid for a specific position in a particular secondary structure. A position-wise analysis of the amino acid propensities from both the N- and C-termini of the β-strand is performed. The position-specific propensity of an amino acid is defined as [16, 40]
$$P_{ij} = \frac{f_{ij}}{f_i} = \frac{n_{ij} / \sum_i n_{ij}}{n_i / \sum_i n_i} \qquad (1)$$
where $n_{ij}$ and $f_{ij}$ are the number and fraction of the ith residue at the jth position, while $n_i$ and $f_i$ are the number and fraction of the ith residue in the whole database of 2586 non-redundant protein chains. The summations run over i for the 20 amino acids. It follows from this equation that a propensity greater than one indicates a higher preference of the amino acid for that position, whereas a value less than one implies a lower preference.
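As a rough illustration of this definition, the propensity at one position is the fraction of a residue type at that position divided by its fraction in the whole database. The sketch below assumes the residues observed at a position are supplied as a list and the database composition as a dictionary of counts; the function name and the toy numbers are hypothetical.

```python
from collections import Counter


def position_propensities(residues_at_position, background_counts):
    """Propensity of each amino acid at one position.

    residues_at_position: one-letter codes observed at position j,
        one entry per strand.
    background_counts: {amino acid: count in the whole database}.
    """
    n_j = len(residues_at_position)
    total = sum(background_counts.values())
    if n_j == 0 or total == 0:
        raise ValueError("need at least one observation and a non-empty background")
    counts = Counter(residues_at_position)
    return {
        aa: (counts[aa] / n_j) / (n_i / total)  # f_ij / f_i
        for aa, n_i in background_counts.items()
        if n_i > 0
    }


# Toy numbers, for illustration only (not data from the study).
background = {"V": 20, "I": 20, "A": 40, "G": 20}
propensities = position_propensities(list("VVIA"), background)
for aa, value in sorted(propensities.items()):
    print(aa, round(value, 3))
```

In this toy case valine makes up half of the residues at the position but only a fifth of the background, so its propensity is well above one, while glycine, which is absent from the position, gets a propensity of zero.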
χ²-values
The χ² value at the jth position of the β-strand is defined as
$$\chi^2_j = \sum_{i=1}^{20} \frac{\left(n_{ij} - n^{\mathrm{expec}}_{ij}\right)^2}{n^{\mathrm{expec}}_{ij}} \qquad (2)$$
where $n_{ij}$ is the observed number of the ith amino acid at the jth position, while $n^{\mathrm{expec}}_{ij}$ is the expected number of the ith amino acid at the jth position. The expected number $n^{\mathrm{expec}}_{ij}$ at a given position j in the strand is evaluated as
$$n^{\mathrm{expec}}_{ij} = \frac{r_i N_j}{R} \qquad (3)$$
where $r_i$ is the number of the ith amino acid in the reference distribution, $N_j$ is the number of amino acids at the jth position of the β-strand and $R$ is the total number of amino acids in the reference distribution (the distribution of amino acids in β-strands).
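The χ² computation described here compares the observed counts at a position with the counts expected from the reference composition scaled to the same number of residues. A minimal sketch follows, assuming both distributions are given as dictionaries of counts; the function name and the two-letter toy alphabet are illustrative only.

```python
def chi_square_at_position(observed, reference):
    """χ² of the amino acid distribution at one position.

    observed: {amino acid: n_ij} counts at position j.
    reference: {amino acid: r_i} counts in the reference distribution.
    """
    n_j = sum(observed.values())  # N_j, residues at this position
    r_total = sum(reference.values())  # R, residues in the reference
    if n_j == 0 or r_total == 0:
        raise ValueError("observed and reference counts must be non-empty")
    chi2 = 0.0
    for aa, r_i in reference.items():
        expected = r_i * n_j / r_total
        if expected > 0:
            chi2 += (observed.get(aa, 0) - expected) ** 2 / expected
    return chi2


# Toy two-letter example: 40 residues, reference composition 50:50.
print(chi_square_at_position({"V": 30, "G": 10}, {"V": 50, "G": 50}))
```

With all 20 amino acids the statistic has 19 degrees of freedom, so it would be compared against the corresponding critical value of the χ² distribution.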
Local hydrophobicity is found to play a dominant role in the stabilization of secondary structures in proteins. Different scales predict significantly different hydrophobicities, with residues being strongly hydrophobic on one scale and mildly hydrophobic on another. In most cases, hydrophobicity is measured by the free energy change due to the transfer of a non-polar solute into aqueous solution at a particular characteristic temperature [54, 55]. The average hydrophobicity of the jth position in the β-strand, $Hb_j$, is calculated as
$$Hb_j = \frac{\sum_i n_{ij}\,\Delta G_i}{n_j} \qquad (4)$$
where $n_{ij}$ and $n_j$ are the number of ith residues at the jth position and the total number of residues at the jth position, respectively, and $\Delta G_i$ is the experimentally measured free energy change resulting from the transfer of the ith amino acid from octanol to water.
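The position-wise average is simply a count-weighted mean of per-residue transfer free energies. The sketch below assumes the counts at a position and a hydrophobicity scale are both available as dictionaries; the scale values used here are placeholders chosen for easy arithmetic and are not the experimental octanol-to-water values.

```python
def average_hydrophobicity(counts_at_position, transfer_free_energy):
    """Count-weighted mean hydrophobicity at one strand position.

    counts_at_position: {amino acid: n_ij}.
    transfer_free_energy: {amino acid: free energy of transfer}, in the
        units of whichever hydrophobicity scale is supplied.
    """
    n_j = sum(counts_at_position.values())
    if n_j == 0:
        raise ValueError("no residues observed at this position")
    weighted = sum(
        n * transfer_free_energy[aa] for aa, n in counts_at_position.items()
    )
    return weighted / n_j


# Placeholder scale, NOT experimental data: +1 for V, -1 for D.
toy_scale = {"V": 1.0, "D": -1.0}
print(average_hydrophobicity({"V": 3, "D": 1}, toy_scale))
```

Evaluating this function at each of the positions N1 to N10 and C1 to C10 would give the kind of profile in which the periodicity of average hydrophobicity is examined.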
Most of the experimental propensity scales are based on free energy differences. The propensities of amino acids may be converted into a free-energy-like term by the following equation
where E(i, j) is the free energy of the ith amino acid residue at the jth position and $P_{ij}$ is the propensity calculated by eqn. (1). A very similar free energy criterion was used earlier to study the pairing of residues in neighbouring strands of β-sheets as well as to rank the various factors that contribute to β-sheet folding.
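One common way to turn a statistical propensity into a free-energy-like quantity is a Boltzmann-type inversion, E = −RT ln P. The sketch below assumes that form, a temperature of 298.15 K and kcal/mol units; these are assumptions made for illustration, and the exact expression and constants used in the original equation may differ.

```python
import math

# Gas constant in kcal/(mol·K); the choice of units is an assumption.
GAS_CONSTANT_KCAL = 1.987204e-3


def propensity_to_free_energy(propensity, temperature=298.15):
    """Free-energy-like term from a propensity, assuming E = -RT ln P."""
    if propensity <= 0:
        raise ValueError("propensity must be positive to take its logarithm")
    return -GAS_CONSTANT_KCAL * temperature * math.log(propensity)


# Cysteine propensities at N1 and N2 as quoted in the text.
for label, p in (("N1", 0.62), ("N2", 1.58)):
    print(label, round(propensity_to_free_energy(p), 3))
```

Under this convention a propensity above one maps to a negative (favourable) value and a propensity below one to a positive (unfavourable) value, with P = 1 corresponding to zero.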
Results and Discussion
Number distribution of β-strands
Figure 1 illustrates the frequency of occurrence of strands as a function of strand length in the chosen database. In agreement with earlier studies, the peak is observed near a strand length of N = 5 residues. The number of occurrences falls off very sharply with increasing length; strands longer than 15 residues constitute less than 1% of the total strand population. Unlike helices, the number of occurrences of β-strands of a given length shows only minimal deviation from the fitted Gaussian curve.
Propensities of amino acids show significant deviation from the isotropic nature at middle region of β-strands
The propensities of the 20 amino acids are examined for the 10 inner positions from the cap residues of the β-strand. Strands of length 10 residues or more are used for this study. A higher cut-off length is not possible, as the number of strands decreases significantly beyond 15 residues, as mentioned above. There are 1634 such strands in the chosen database. The position-specific physico-chemical properties are calculated from this data-set of strands. The position-specific propensities of the amino acids in these β-strands are calculated according to equation 1. Figures 2 and 3 illustrate these position-specific propensities from both termini of β-strands. The figures are generated by dividing the 20 amino acids into five groups in accordance with their physico-chemical properties: (i) long polar (E, K, Q, R), (ii) short polar (D, N, S), (iii) hydrophobic aromatics (F, W, Y), (iv) aliphatics + cysteine (C, I, L, M, V) and (v) other (A, G, H, P, T). The error bars in the figures depict the standard errors obtained by calculating the position-specific propensities of the amino acids in individual β-strands.
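The five-group classification is easy to encode as a lookup table, which is convenient when per-residue profiles need to be plotted or summarized group by group. In the sketch below the group memberships are those listed above, while the names `GROUPS`, `AA_TO_GROUP` and `group_profiles` are illustrative.

```python
# Group memberships as listed in the text (one-letter amino acid codes).
GROUPS = {
    "long polar": "EKQR",
    "short polar": "DNS",
    "hydrophobic aromatics": "FWY",
    "aliphatics + cysteine": "CILMV",
    "other": "AGHPT",
}

# Reverse lookup: amino acid -> group name.
AA_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def group_profiles(profiles):
    """Split {amino acid: propensity profile} into one dictionary per group."""
    grouped = {name: {} for name in GROUPS}
    for aa, profile in profiles.items():
        grouped[AA_TO_GROUP[aa]][aa] = profile
    return grouped


print(len(AA_TO_GROUP), AA_TO_GROUP["R"], AA_TO_GROUP["C"])
```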
A recent study reported that the propensities of amino acids are independent of position in the middle region of β-strands. The results of the present study stand in striking contrast to such an isotropic pattern of propensities. Figures 2 and 3 show significant variation of the propensities at the inner positions of β-strands. Even neighbouring positions differ appreciably in their propensity values for a given amino acid. For example, methionine has a propensity of 0.67 ± 0.3 at the N3 position and 1.25 ± 0.3 at N4. The corresponding values for glutamine are 0.64 ± 0.09 and 1.00 ± 0.1, and those for tryptophan are 1.76 ± 0.4 and 0.70 ± 0.4. Another significant feature of Figures 2 and 3 is periodicity. The propensities of the amino acids display a characteristic periodic behaviour with respect to position from the cap residues of β-strands. However, unlike helices, where all amino acids show a similar periodicity approximately equal to the structural repeat unit of the α-helix, i.e. 3.6 residues, the periodicity of the propensities in the inner regions of strands varies from one amino acid to another. This variation may be noticed even for amino acids belonging to the same physico-chemical group. For example, the propensities of arginine are 1.07 ± 0.06, 0.86 ± 0.04, 1.13 ± 0.05 and 0.84 ± 0.07 at the 2nd, 4th, 6th and 8th inner positions from the N-cap residue, respectively, while glutamine, which belongs to the same physico-chemical group, has propensity values of 1.02 ± 0.1, 0.78 ± 0.1 and 1.06 ± 0.1 at the 2nd, 5th and 8th inner positions from the N-cap residue, respectively. Hence the peaks in the propensity values of arginine are separated by four residues, while those of glutamine are separated by six residues. In other words, the periodicity of the position-specific propensities does not follow the structural repeat of β-strands for all amino acids.
To validate the robustness of the propensity values of Figures 2 and 3, two additional data-sets of β-strands are considered: the first consists of strands of length 5 or more residues, while the second comprises strands of length 5 to 9 residues. Additional file 1, Figures S1, S2, S3 and S4 depict the position-specific propensities of the amino acids in these data-sets from both termini of β-strands. The results suggest propensity trends with respect to position in strands similar to those shown in Figures 2 and 3. The propensity values of Figures 2 and 3 (up to the fifth residue from the cap positions) correlate well with those of the data-set of strands of length 5 residues or more (R = 0.86 from the N-cap and R = 0.88 from the C-cap) and of the data-set of strands of length 5 to 9 residues (R = 0.82 from the N-cap and R = 0.84 from the C-cap). Detailed position-wise correlation coefficients are provided in Additional file 1, Tables S1 and S2.
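The comparison between data-sets rests on a correlation coefficient between two sets of propensity values. A minimal pure-Python sketch of the Pearson coefficient is given below; it assumes the two data-sets have already been reduced to equal-length lists of propensities in the same order, and the toy profiles are hypothetical rather than values from the supplementary tables.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical propensities at N1..N5 in two data-sets (illustration only).
long_strands = [0.8, 1.2, 0.7, 1.3, 0.9]
all_strands = [0.9, 1.1, 0.8, 1.2, 1.0]
print(round(pearson_r(long_strands, all_strands), 2))
```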
It may be observed that for arginine and glutamine, as for most of the amino acids, the positions of the peak propensity values are separated by multiples of two, the structural repeat unit of β-strands. Although position-wise periodicity in β-strands was explained previously in terms of binary (polar-nonpolar) patterning of amino acids [51, 60], neither the propensities of individual amino acids nor the periodicity with respect to the cap positions of β-strands was considered. The present work, with the help of a considerably large database of longer strands, shows that there are significant differences in the propensities of amino acids at inner positions with respect to the cap residues of strands.
Although long polar residues are considered to be unfavourable in β-strand structures [7, 41, 61], some of these residues show higher preferences for more than one inner position ($P_{ij}$ > 1). The formation of β-strands even with frequent occurrences of amino acids that have low intrinsic preferences for these structures can be attributed to the fact that secondary structure formation is driven more by the periodic occurrence of amino acids than by their intrinsic preferences. As shown in Figures 2 and 3, the position-specific propensities of these residues follow a periodic trend, though the pattern and strength of the periodicity vary from residue to residue.
Short polar amino acids exhibit a distinctly different distribution from that of long polar residues. These amino acids are known to have low preferences for β-strands [7, 41, 61], and the results (Figures 2 and 3) demonstrate only a very weak periodic dependence of these residues on position in β-strands. Asn and Asp at the C-cap can form an H-bond between their Oδ atom and the NH group of the residue following this position and hence terminate β-sheet formation. Towards the N-terminus, Asn and Asp turn the backbone of the strand, preventing β-sheet extension from the N direction. The under-representation of aspartic acid and asparagine in the interior of β-strands is due to the destabilizing effect arising from the removal of backbone-backbone H-bonding between the partner strands of the β-sheet. Among the residues of this group, serine shows the strongest periodicity in its propensity values.
Unlike the residues in the polar groups, amino acids of the hydrophobic aromatic group are considered to be preferred in β-strand structures [7, 41, 61]. The present results confirm this trend. Yet a few inner positions of β-strands are under-represented in these residues, e.g. the N4 (= 0.70 ± 0.4), N6 (= 0.90 ± 0.4), C5 (= 0.94 ± 0.4) and C7 (= 0.86 ± 0.4) positions for tryptophan, the N5 (= 0.99 ± 0.1) and N8 (= 0.92 ± 0.1) positions for tyrosine, and the C1 (= 0.86 ± 0.09), C6 (= 0.99 ± 0.09) and C8 (= 0.97 ± 0.08) positions for phenylalanine. This leads to a weak periodic pattern in the position-specific propensities of hydrophobic aromatic amino acids, as shown in Figures 2 and 3. Since these amino acids usually have a very high preference for β-strand structures, this under-representation may be explained in terms of the non-polar residue periodicity that initiates strand formation. In agreement with earlier studies, a very low preference of hydrophobic aromatic amino acids is observed for both the N-cap and C-cap positions.
Amino acids of the aliphatics + cysteine group are also considered to be hydrophobic and are highly preferred in strand structures [7, 41, 61]. However, in contrast to the hydrophobic aromatics, residues of this group show strong positional preferences in β-strands. For example, the propensity values of cysteine from the N1 to the N10 position are 0.62 ± 0.2, 1.58 ± 0.2, 0.54 ± 0.2, 1.27 ± 0.2, 1.46 ± 0.2, 1.27 ± 0.2, 0.84 ± 0.2, 1.27 ± 0.2, 1.04 ± 0.2 and 0.77 ± 0.3. A similar trend for cysteine is also observed from the C-terminus. This position-specific periodic nature of cysteine can be rationalized by the fact that cysteine pairs at non-hydrogen-bonded positions in antiparallel sheets favour disulphide bridge formation. Methionine also exhibits a strong position-specific propensity pattern. The propensity values of methionine from the N1 to the N10 position are 0.84 ± 0.3, 0.91 ± 0.2, 0.67 ± 0.2, 1.25 ± 0.3, 0.88 ± 0.3, 1.18 ± 0.3, 0.84 ± 0.3, 1.14 ± 0.3, 0.88 ± 0.3 and 1.11 ± 0.3. The periodicity in the position-specific propensities of methionine thus resembles the structural repeat of β-strands.
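The statement about methionine can be checked directly from the quoted values by locating the local maxima of the profile and measuring the spacing between them. The sketch below is a deliberately simple illustration that uses only the central values and ignores the standard errors; the helper name `peak_positions` is hypothetical.

```python
def peak_positions(values):
    """1-based positions of strict interior local maxima in a profile."""
    return [
        i + 1
        for i in range(1, len(values) - 1)
        if values[i] > values[i - 1] and values[i] > values[i + 1]
    ]


# Methionine propensities at N1..N10 as quoted in the text.
methionine = [0.84, 0.91, 0.67, 1.25, 0.88, 1.18, 0.84, 1.14, 0.88, 1.11]

peaks = peak_positions(methionine)
spacings = [b - a for a, b in zip(peaks, peaks[1:])]
print(peaks, spacings)  # [2, 4, 6, 8] [2, 2, 2]
```

The interior peaks fall at N2, N4, N6 and N8, i.e. two residues apart, which matches the two-residue structural repeat of β-strands. Given the size of the quoted error bars, a proper assessment would need to take the uncertainties into account.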
Analogous to helices, the amino acids belonging to the other group do not show a characteristic positional dependence in their propensity values in β-strands. Among the five residues of this group, only threonine is preferred in β-strands [7, 41, 61]. The results in Figures 2 and 3 show that threonine is highly preferred at all positions in strand structures. Another member of this group, glycine, is the only non-chiral amino acid. The small volume of its hydrogen side chain imparts local flexibility to the peptide structure. In the present work, glycine is found to have very high preferences for both the N-cap and C-cap positions (see Figures 2 and 3). In agreement with earlier results, glycine is confirmed to be a strand terminator. A notable difference from the earlier work is the under-representation of glycine in the middle region of β-strands. It is observed from Figures 2 and 3 that proline has a higher preference for both cap positions and is scarcely found within β-strands. Proline rarely fits into the regular part of helices or sheets, because it lacks a backbone NH group to participate in H-bonding and has restricted torsion angles. The low occurrence of proline in strands is due to the fact that it has only one rotatable angle and so it loses less entropy in forming regular structures. Surprisingly, histidine is found to show a periodic nature in its propensity values with respect to position from the cap residues. Previous studies have demonstrated that histidine is under-represented in the middle region of β-strands, which is in accordance with the low preference of histidine for strands [7, 41, 61]. However, in this work we find, quite unexpectedly, that histidine is preferred at inner positions such as N1 (= 1.30 ± 0.2), N3 (= 1.07 ± 0.2), N7 (= 1.17 ± 0.2) and N10 (= 1.27 ± 0.2). Together with the other positions, where histidine is under-represented ($P_{ij}$ < 1), this shows a weak periodic nature in the position-specific propensity of histidine from the N-cap position. A similar trend for histidine is also found for inner positions with respect to the C-cap position.
Propensities are independent of orientations of partner strands as well as class of proteins
Among the 1634 β-strands (length ≥ 10 residues) considered in this study, 1250 are antiparallel, 51 are parallel and 333 are mixed. Additional file 1, Figures S5-S10 depict the position-specific propensities of amino acids in all three types of strands from both termini. The correlations of the propensities of each amino acid in these strands with the results given in the main text are shown in Additional file 1, Table S3. Excellent correlation is observed for the propensities of all amino acids between the antiparallel β-strands and the original database. Owing to the small number of parallel and mixed strands in the database, some positions show fluctuating propensity values. When the positions at which the propensity deviates by more than 50% from its original value are neglected, the numbers of amino acids that have either moderate or good correlation with the original database are 13 and 19 for the parallel and mixed β-strand databases, respectively.
As mentioned in the methods section, an additional database of β-strands from β-barrel proteins is also considered in this study. This additional database consists of 274 β-strands with length ≥ 10 residues. The position-specific propensities of amino acids in these strands from both termini are shown in Additional file 1, Figures S11 and S12. A similar analysis shows that 11 amino acids have moderate or good correlation in propensity values with those of the original database. This shows that the propensity values are not biased towards any particular class of proteins. Moreover, the database comprising 1634 β-strands is compiled from various proteins belonging to different classes and folds of the SCOP classification (class and fold annotations of these protein chains are provided in Additional file 2).
χ2-values depict significant difference in propensities at different positions in strands
To evaluate the significance of the anisotropic propensities at different positions in strands, χ² values for all positions are calculated according to equation 2. All 1634 β-strands are included in the reference distribution for calculating these χ² values. For a system with 19 degrees of freedom, such as the distribution of the 20 amino acids, the χ² value must exceed 30.14 to reject the null hypothesis at the 95% level of confidence (P < 0.05). Table 1 shows the χ² values for different positions in β-strands from both termini. Except at N7, N10 and C8 (even though the N7 position has a > 90% and C8 a > 80% level of confidence), the differences in the distributions of amino acids at each position in the β-strand are found to be highly significant. χ² values are also calculated by considering the 1634 β-strand sequences excluding the cap positions. These results are shown in parentheses in Table 1 for the N1 to N10 and C1 to C10 positions. The trend in these χ² values is more or less similar to that obtained by including the cap positions.
The periodic propensity values are not an artefact of the amino acid composition of β-strands. To verify this, the 1634 β-strand sequences (including the cap residues) are randomly scrambled to obtain 1634 random peptide sequences. The position-specific propensities of amino acids in these random peptide sequences from both termini are shown in Additional file 1, Figures S13 and S14. It may be observed that the position-specific propensity curves are almost flat, without any periodicity in the peak propensity values. The featureless curves of the random peptide sequences lack marked periodicity, especially when the propensities of the amino acids are viewed with respect to the C-terminus. Hence it may be emphasized that the periodicity is position-specific and not a consequence of the amino acid composition of β-strands.
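A scrambling control of this kind can be sketched as follows. The sketch assumes that the residues are permuted within each sequence, so that every scrambled peptide keeps the length and composition of its parent strand; the text does not specify the exact shuffling scheme, and the function name and toy sequences are hypothetical.

```python
import random


def scramble_sequences(sequences, seed=0):
    """Randomly permute the residues within each sequence.

    Each output sequence has the same length and composition as its
    input; only the order of the residues changes.
    """
    rng = random.Random(seed)  # fixed seed so the control is reproducible
    scrambled = []
    for seq in sequences:
        residues = list(seq)
        rng.shuffle(residues)
        scrambled.append("".join(residues))
    return scrambled


# Toy sequences standing in for strands with their cap residues.
strands = ["GTVYIAVKVTD", "NKIEVRFTVWG"]
for original, shuffled in zip(strands, scramble_sequences(strands)):
    print(original, shuffled)
```

Because composition is preserved, any positional pattern that survives the shuffle would have to come from composition alone, which is why flat propensity curves for the scrambled set support a genuinely position-specific effect.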
Average hydrophobicity is periodic in nature from cap position in β-strands
The propensity results suggest that both the polar and the hydrophobic residues have specific positional preferences in β-strands. To explore this further, the position-specific average hydrophobicity is calculated for the 10 inner positions from both termini of β-strands according to equation 4. Figure 4 shows this average hydrophobicity plotted against position in β-strands. The graphs show that the average hydrophobicity is periodic with respect to position in β-strands. The periodicity of average hydrophobicity with respect to position in secondary structures has been observed earlier and is thought to play an important role in stabilizing these structures [53, 65]. However, it was not investigated in relation to the cap positions of β-strands. The present investigation, with the help of a much larger database, shows that the average hydrophobicity is periodic at inner positions with respect to the cap residues of strands. A similar result was obtained earlier for helices. The periodicity of the position-specific average hydrophobicity differs in strength between β-strands and helices. While for helices the position-specific average hydrophobicity ranges from about 0.2 to 0.8 kcal/mol, for strands it ranges from 0.6 to 0.8 kcal/mol. The very low hydrophobicity of the N-cap (0.08 kcal/mol) and C-cap (0.1 kcal/mol) residues also confirms the presence of a hydrophilic barrier at both termini of β-strands (data not shown in Figure 4).
Free energy values are different at different positions
The χ² values from equation 2 clearly indicate that the inner positions from both termini of β-strands have their own characteristic amino acid requirements. The differences in amino acid propensities at each of these 10 positions are markedly pronounced, especially at the cap positions and in the middle positions, i.e. across N4-N10 at the N-terminus and C3-C8 at the C-terminus. Most propensity scales reflect the free energy differences of the different amino acid residues at the respective sequence positions. This free energy difference is calculated by equation 5. The free energy values (Additional file 1, Tables S4 and S5) clearly distinguish the specific positional preferences of the respective amino acids, which may be used for designing β-strands. In agreement with previous studies, Ala is found to be more stable than Gly in sheets. The position-specific propensities of amino acids are found to correlate well with the position-specific hydrophobicities. The position-specific propensity of valine is very high and correlates strongly with the position-specific hydrophobicity at both the N- and C-termini, with correlation coefficients of 0.96 and 0.92, respectively. For proline, the hydrophobicity trend is exactly opposite to that of the position-specific propensity (R = -0.85 and -0.87 from the N- and C-terminus, respectively). In general, the propensity values of the hydrophobic amino acids correlate better with the position-specific hydrophobicity than those of the polar ones (Additional file 1, Table S6). This trend in position-specific propensity, and its correlation with position-specific hydrophobicity, can be a suitable input for secondary structure prediction algorithms and de novo protein design.
In contrast to the earlier findings, amino acid propensities are found to be position-specific throughout β-strands. Periodicity plays the role of an important stabilizing factor for secondary structures, especially for α-helices. This work presents, for the first time, a detailed analysis of the position-specific propensities of amino acids in β-strands using a large database of non-redundant proteins. Analogous to α-helices, the position-specific propensities of amino acids in β-strands are found to exhibit an unusual characteristic periodic behaviour with respect to the cap residues at both termini of β-strands. This periodic nature is different for different amino acids; even amino acids belonging to the same physico-chemical group display different patterns in their position-specific propensities. In a nutshell, amino acids belonging to the aliphatics + cysteine group (particularly cysteine and methionine) show strong periodicity in their propensity values; amino acids of the long polar (particularly arginine and glutamine) and hydrophobic aromatic (particularly phenylalanine) groups show very mild periodicity in their position-specific propensity values; while no periodic pattern is found for amino acids belonging to the short polar and other groups. The position dependence of these residues may be attributed to the fact that different residues (e.g. polar, aromatic, etc.) may have different tendencies to appear inside the protein core as opposed to on the surface. The positions of the peak propensity values displayed by most of the amino acids are separated by multiples of two, the structural repeat unit of β-strands. Average hydrophobicity also shows a position-specific periodic nature in strands. The physico-chemical characteristics of the amino acids, combined with the position-specific propensity and hydrophobicity measures, may direct the de novo design of proteins composed primarily of β-strands (β-sheet proteins), whose structures are difficult to determine.
Yang AS, Honig B: Free energy determinants of secondary structure formation: I. α -Helices. J Mol Biol 1995, 252: 351–365. 10.1006/jmbi.1995.0502
Emberly EG, Mukhopadhyay R, Wingreen NS, Tang C: Flexibility of α -Helices: Results of a Statistical Analysis of Database Protein Structures. J Mol Biol 2003, 327: 229–237. 10.1016/S0022-2836(03)00097-4
Chothia CJ: Conformation of twisted β -sheets in proteins. J Mol Biol 1973, 75: 295–302. 10.1016/0022-2836(73)90022-3
Yang AS, Honig B: Free energy determinants of secondary structure formation. II. Antiparallel β -sheets. J Mol Biol 1995, 252: 366–376. 10.1006/jmbi.1995.0503
Emberly EG, Mukhopadhyay R, Tang C, Wingreen NS: Flexibility of β -sheets: Principal component analysis of database protein structures. Proteins 2004, 55: 91–98. 10.1002/prot.10618
Koehl P, Levitt M: Structure-based conformational preferences of amino acids. Proc Natl Acad Sci USA 1999, 96: 12524–12529. 10.1073/pnas.96.22.12524
Chou PY, Fasman GD: Conformational Parameters for Amino Acids in Helical, β -Sheet, and Random Coil Regions Calculated from Proteins. Biochemistry 1974, 13: 211–222. 10.1021/bi00699a001
Malkov SN, Zivkovic MV, Beljanski MV, Hall MB, Zaric SD: A reexamination of the propensities of amino acids towards a particular secondary structure: classification of amino acids based on their chemical structure. J Mol Model 2008, 14: 769–775. 10.1007/s00894-008-0313-0
Baker D, Sali A: Protein structure prediction and structural genomics. Science 2001, 294: 93–96. 10.1126/science.1065659
Hardin C, Pogorelov TV, Luthey-Schulten Z: Ab initio protein structure prediction. Curr Opin Struct Biol 2002, 12: 176–181. 10.1016/S0959-440X(02)00306-8
Moult J: Predicting protein three-dimensional structure. Curr Opin Biotechnol 1999, 10: 583–588. 10.1016/S0958-1669(99)00037-3
Rost B: Review: protein secondary structure prediction continues to rise. J Struct Biol 2001, 134: 204–218. 10.1006/jsbi.2001.4336
Schonbrun J, Wedemeyer WJ, Baker D: Protein structure prediction in 2002. Curr Opin Struct Biol 2002, 12: 348–354. 10.1016/S0959-440X(02)00336-6
Argos P, Palau J: Amino acid distribution in protein secondary structures. Int J Prot Pept Res 1982, 19: 380–393. 10.1111/j.1399-3011.1982.tb02619.x
Richardson JS, Richardson DC: Amino acid preferences for specific locations at the ends of α -helices. Science 1988, 240: 1648–1652. 10.1126/science.3381086
Kumar S, Bansal M: Dissecting α -Helices: Position-Specific Analysis of α -Helices in Globular Proteins. Proteins 1998, 31: 460–476. 10.1002/(SICI)1097-0134(19980601)31:4<460::AID-PROT12>3.0.CO;2-D
Presta LG, Rose GD: Helix signals in proteins. Science 1988, 240: 1632–1641. 10.1126/science.2837824
Petukhov M, Munoz V, Yumoto N, Yoshikawa S, Serrano L: Position dependence of non-polar amino acid intrinsic helical propensities. J Mol Biol 1998, 278: 279–289. 10.1006/jmbi.1998.1682
Petukhov M, Uegaki K, Yumoto N, Yoshikawa S, Serrano L: Position dependence of amino acid intrinsic helical propensities. II. Non-charged polar residues: Ser, Thr, Asn, and Gln. Protein Sci 1999, 8: 2144–2150. 10.1110/ps.8.10.2144
Penel S, Hughes E, Doig AJ: Side chain structures in the first turn of the α -helix. J Mol Biol 1999, 287: 127–143. 10.1006/jmbi.1998.2549
Petukhov M, Uegaki K, Yumoto N, Serrano L: Amino acid intrinsic α -helix dependence at several positions of C terminus. Protein Sci 2002, 11: 766–777. 10.1110/ps.2610102
Cochran DAE, Penel S, Doig AJ: Effect of the N1 residue on the stability of the α -helix for all 20 amino acids. Protein Sci 2001, 10: 463–470. 10.1110/ps.31001
Wilson CL, Boardman PE, Doig AJ, Hubbard SJ: Improved prediction for N-termini of α -helices using empirical information. Proteins 2004, 57: 322–330. 10.1002/prot.20218
Cochran DAE, Doig AJ: Effect of the N2 residue on the stability of the α -helix for all 20 amino acids. Protein Sci 2001, 10: 1305–1311. 10.1110/ps.50701
Fonseca NA, Camacho R, Magalhaes AL: Amino acid pairing at the N- and C-termini of helical segments in proteins. Proteins 2008, 70: 188–196. 10.1002/prot.21525
Serrano L, Fersht AR: Capping and α -helix stability. Nature 1989, 342: 296–299. 10.1038/342296a0
Serrano L, Neira JL, Sancho J, Fersht AR: Effect of alanine versus glycine in α -helices on protein stability. Nature 1992, 356: 453–456. 10.1038/356453a0
Lecomte JTJ, Moore CD: The role of neutral histidine at the N-cap position. J Am Chem Soc 1991, 113: 9663–9665. 10.1021/ja00025a037
Bell JA, Becktel WJ, Sauer C, Baase WA, Matthews BW: Dissection of helix capping in T4 lysozyme by structural and thermodynamic analysis of 6 amino acid substitutions at Thr 59. Biochemistry 1992, 31: 3590–3596. 10.1021/bi00129a006
Chakrabartty A, Doig AJ, Baldwin RL: Helix capping propensities in peptides parallel those in proteins. Proc Natl Acad Sci USA 1993, 90: 11332–11336. 10.1073/pnas.90.23.11332
Forood B, Feliciano EJ, Nambiar KP: Stabilisation of α -helical structures in short peptides via end capping. Proc Natl Acad Sci USA 1993, 90: 838–842. 10.1073/pnas.90.3.838
Lyu PC, Wemmer DE, Zhou HX, Pinker RJ, Kallenbach NR: Capping interactions in isolated α -helices: Position-dependent substitution effects and structure of a serine-capped helix. Biochemistry 1993, 32: 421–425. 10.1021/bi00053a006
Yumoto N, Murase S, Hattori T, Yamamoto H, Tatsu Y, Yoshikawa S: Stabilisation of α -helix in C-terminal fragments of neuropeptide-Y. Biochem Biophys Res Commun 1993, 196: 1490–1495. 10.1006/bbrc.1993.2420
Doig AJ, Chakrabartty A, Klingler TM, Baldwin RL: Determination of free energies of N-capping in α -helices by modification of Lifson-Roig helix-coil theory to include N- and C-capping. Biochemistry 1994, 33: 3396–3403. 10.1021/bi00177a033
Doig AJ, Baldwin RL: N- and C-capping preferences for all 20 amino acids in α helical peptides. Protein Sci 1995, 4: 2247–2251. 10.1002/pro.5560041101
Petukhov M, Yumoto N, Murase S, Onmura R, Yoshikawa S: Factors that affect the stabilisation of α -helices in short peptides by a capping box. Biochemistry 1996, 35: 387–397. 10.1021/bi9513766
Aurora R, Srinivasan R, Rose GD: Rules for α -helix termination by glycine. Science 1994, 264: 1126–1130. 10.1126/science.8178170
Blaber M, Zhang XJ, Matthews BW: Structural basis of amino acid α -helix propensity. Science 1993, 260: 1637–1640. 10.1126/science.8503008
Padmanabhan S, Marqusee S, Ridgeway T, Laue TM, Baldwin RL: Relative helix-forming tendencies of nonpolar amino acids. Nature 1990, 344: 268–270. 10.1038/344268a0
Engel DE, DeGrado WF: Amino Acid Propensities are Position-dependent Throughout the Length of α -Helices. J Mol Biol 2004, 337: 1195–1205. 10.1016/j.jmb.2004.02.004
Koehl P, Levitt M: Structure-based conformational preferences of amino acids. Proc Natl Acad Sci USA 1999, 96: 12524–12529. 10.1073/pnas.96.22.12524
Munoz V, Serrano L: Intrinsic Secondary Structure Propensities of the Amino Acids, Using Statistical ϕ-ψ Matrices: Comparison With Experimental Scales. Proteins 1994, 20: 301–311. 10.1002/prot.340200403
Bai Y, Englander SW: Hydrogen bond strength and β -sheet propensities: the role of a side chain blocking effect. Proteins 1994, 18: 262–266. 10.1002/prot.340180307
Avbelj F, Moult J: Role of electrostatic screening in determining protein main chain conformational preferences. Biochemistry 1995, 34: 755–764. 10.1021/bi00003a008
Street AG, Mayo SL: Intrinsic β -sheet propensities result from van der Waals interactions between side chains and the local backbone. Proc Natl Acad Sci USA 1999, 96: 9074–9076. 10.1073/pnas.96.16.9074
Koh E, Kim T, Cho HS: Mean curvature as a major determinant of β -sheet propensity. Bioinformatics 2006, 22: 297–302. 10.1093/bioinformatics/bti775
Pal D, Chakrabarti P: Beta-sheet propensity and its correlation with parameters based on conformation. Acta Crystallogr D, Biol Crystallogr 2000, 56: 589–594. 10.1107/S090744490000367X
FarzadFard F, Gharaei N, Pezeshk H, Marashi S: β -Sheet capping: Signals that initiate and terminate β -sheet formation. J Struct Biol 2008, 161: 101–110. 10.1016/j.jsb.2007.09.024
Hobohm U, Sander C: Enlarged representative set of protein structures. Protein Sci 1994, 3: 522–524. 10.1002/pro.5560030317
Kabsch W, Sander C: Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features. Biopolymers 1983, 22: 2577–2637. 10.1002/bip.360221211
Mandel-Gutfreund Y, Gregoret LM: On the Significance of Alternating Patterns of Polar and Non-polar Residues in Beta-strands. J Mol Biol 2002, 323: 453–461. 10.1016/S0022-2836(02)00973-7
Duan M, Huang M, Ma C, Li L, Zhou Y: Position-specific residue preference features around the ends of helices and strands and a novel strategy for the prediction of secondary structures. Protein Sci 2008, 17: 1505–1512. 10.1110/ps.035691.108
Kanehisa MI, Tsong TY: Local Hydrophobicity Stabilizes Secondary Structures in Proteins. Biopolymers 1980, 19: 1617–1628. 10.1002/bip.1980.360190906
Tanford C: The Hydrophobic Effect. 2nd edition. Wiley, New York; 1979.
Ha JH, Spolar RS, Record MT Jr: Role of the hydrophobic effect in stability of site-specific protein-DNA complexes. J Mol Biol 1989, 209: 801–816. 10.1016/0022-2836(89)90608-6
Wimley WC, Creamer TP, White SH: Solvation energies of amino acid side chains and backbone in a family of host-guest pentapeptides. Biochemistry 1996, 35: 5109–5124. 10.1021/bi9600153
Zhu H, Braun W: Sequence specificity, statistical potentials, and three-dimensional structure prediction with self-correcting distance geometry calculations of β -sheet formation in proteins. Protein Sci 1999, 8: 326–342. 10.1110/ps.8.2.326
Parisien M, Major F: Ranking the factors that contribute to protein β -sheet folding. Proteins 2007, 68: 824–829. 10.1002/prot.21475
Sreerama N, Venyaminov SY, Woody RW: Estimation of the number of α -helical and β -strand segments in proteins using circular dichroism spectroscopy. Protein Sci 1999, 8: 370–380. 10.1110/ps.8.2.370
West MW, Hecht MH: Binary patterning of polar and nonpolar amino acids in the sequences and structures of native proteins. Protein Sci 1995, 4: 2032–2039. 10.1002/pro.5560041008
Creighton TE: Proteins: Structure and Molecular Properties. 2nd edition. W. H. Freeman and Company; 1984.
Xiong H, Buckwalter BL, Shieh HM, Hecht MH: Periodicity of polar and nonpolar amino acids is the major determinant of secondary structure in self-assembling oligomeric peptides. Proc Natl Acad Sci USA 1995, 92: 6349–6353. 10.1073/pnas.92.14.6349
Fooks HM, Martin ACR, Woolfson DN, Sessions RB, Hutchinson EG: Amino Acid Pairing Preferences in Parallel β -Sheets in Proteins. J Mol Biol 2006, 356: 32–44. 10.1016/j.jmb.2005.11.008
Murzin AG, Brenner SE, Hubbard T, Chothia C: SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 1995, 247: 536–540.
Cid H, Bunster M, Arriagada E, Campos M: Prediction of secondary structure of proteins by means of hydrophobicity profiles. FEBS Lett 1982, 150: 247–254. 10.1016/0014-5793(82)81344-6
Hu X, Wang H, Ke H, Kuhlman B: Computer-Based Redesign of a β Sandwich Protein Suggests that Extensive Negative Design Is Not Required for De Novo β Sheet Design. Structure 2008, 16: 1799–1805. 10.1016/j.str.2008.09.013
The authors gratefully acknowledge financial support from DST (project no. SR/S1/PC-07/06), India. Nicholus Bhattacharjee acknowledges CSIR, India for providing financial assistance in the form of a JRF.
NB and PB designed the research. NB performed the research. PB and NB analyzed the data, wrote the manuscript and approved the final version.