 Research article
 Open Access
SELECTpro: effective protein model selection using a structure-based energy function resistant to BLUNDERs
Arlo Randall^{1, 2} and
Pierre Baldi^{1, 2}
https://doi.org/10.1186/1472-6807-8-52
© Randall and Baldi; licensee BioMed Central Ltd. 2008
 Received: 26 June 2008
 Accepted: 03 December 2008
 Published: 03 December 2008
Abstract
Background
Protein tertiary structure prediction is a fundamental problem in computational biology, and identifying the most native-like model from a set of predicted models is a key subproblem. Consensus methods work well when the redundant models in the set are the most native-like, but fail when the most native-like model is unique. In contrast, structure-based methods score models independently and can be applied to model sets of any size and redundancy level. Additionally, structure-based methods have a variety of important applications, including analogous fold recognition, refinement of sequence-structure alignments, and de novo prediction. The purpose of this work was to develop a structure-based model selection method based on predicted structural features that could be applied successfully to any set of models.
Results
Here we introduce SELECTpro, a novel structure-based model selection method derived from an energy function comprising physical, statistical, and predicted structural terms. Novel and unique energy terms include predicted secondary structure, predicted solvent accessibility, predicted contact map, β-strand pairing, and side-chain hydrogen bonding.
SELECTpro participated in the new model quality assessment (QA) category in CASP7, submitting predictions for all 95 targets and achieving top results. The average difference in GDT-TS between models ranked first by SELECTpro and the most native-like model was 5.07. This GDT-TS difference was less than 1% of the GDT-TS of the most native-like model for 18 targets, and less than 10% for 66 targets. SELECTpro also ranked the single most native-like model first for 15 targets, in the top five for 39 targets, and in the top ten for 53 targets, more often than any other method. Because the ranking metric is skewed by model redundancy and ignores poor models that are ranked better than the most native-like model, the BLUNDER metric is introduced to overcome these limitations. SELECTpro is also evaluated on a recent benchmark set of 16 small proteins with large decoy sets of 12,500 to 20,000 models for each protein, where it outperforms the benchmarked method (I-TASSER).
Conclusion
SELECTpro is an effective model selection method that scores models independently and is appropriate for use on any model set. SELECTpro is available for download as a stand-alone application at: http://www.igb.uci.edu/~baldig/selectpro.html. SELECTpro is also available as a public server at the same site.
Keywords
 Solvent Accessibility
 Protein Structure Prediction
 Consensus Method
 Acceptor Atom
 Model Selection Method
Background
Selecting the most native-like model from a set of possible models is a crucial task in protein structure prediction. A variety of Model Quality Assessment Programs (MQAPs) have been developed that assign numeric scores to models in a set, and then use the scores to rank the models and ultimately select a single model. MQAP methods can be divided roughly into three categories based on the type of information they use: evolutionary methods use sequence or profile similarity between target sequence and template, consensus methods use similarity between models, and structure-based methods use model coordinates [1]. Each category of methods has inherent strengths and weaknesses.
Evolutionary methods can provide quality scores that have been shown to correlate with structural similarity to native [2]. However, for lower confidence alignments the scores do not correlate well with structural similarity. Furthermore, identification of the best template and specific alignment can be difficult. In addition, models built from multiple templates or by template-free methods cannot be scored appropriately by evolutionary methods alone.
Consensus methods take advantage of the observation that similar models produced by different predictors tend to be more accurate than those that are structural outliers. In practice, consensus methods outperform the methods they draw from, and they rarely pick a very poor model. The disadvantage, however, is that when the best model is a structural outlier it will be overlooked for lack of popularity [1]. Also, consensus methods are not appropriate for selecting from small sets of structurally diverse models, especially in the extreme case of a two-model set.
While consensus methods depend on similarity between models, structure-based methods calculate scores on each model independently. For this reason, structure-based methods can be applied to model sets of any size and diversity, and will produce the same score for a model regardless of the other models in the set. Structure-based methods can also be used for template-free modeling [3–6] and model refinement procedures [7, 8]. One weakness of high-resolution structure-based methods, including protein free energy approximation functions [9–12] and physics-based approaches [13, 14], is their sensitivity to local structural irregularities such as steric clashes and chain breaks, which can significantly bias scores on otherwise accurate models. Even slight differences in model backbones can produce significantly different scores [15]. Lower-resolution structure-based methods, such as statistical potentials [6, 16, 17], are more robust to backbone variation, but are sensitive to extended low contact-order regions in the models.
Here we describe SELECTpro, a novel structure-based MQAP that combines high- and low-resolution energy terms into a model selection method that is effective on model sets of variable size, diversity, and target difficulty. Most of our assessment is calculated from the CASP7 model quality assessment (QA) category results published online [18]. The QA category provides a framework for the unbiased evaluation of MQAPs on ensembles of models produced by diverse automated prediction methods.
Results and discussion
Table 1. Quality of Model Ranked First (M_{QA1}) Relative to Most Native-Like Model (M_{max})

Summary Results  |  Common Subset Results

Group  Targets ^{a}  ΔGDT_{QA1} = 0  ΔGDT_{QA1%} < 1%  ΔGDT_{QA1%} < 10%  $\overline{\Delta GDT_{QA1}}$  ΔGDT_{QA1} = 0  ΔGDT_{QA1%} < 1%  ΔGDT_{QA1%} < 10%  $\overline{\Delta GDT_{QA1}}$  p-value
699_1  95 (124)  12  18  66  5.07
713_1  95 (124)  7  11  63  5.44  12  18  66  5.07  2.5E-01
634_1  95 (124)  7  15  53  7.75  12  18  66  5.07  1.6E-03
704_1  95 (124)  5  8  49  7.76  12  18  66  5.07  3.5E-04
178_1  95 (124)  8  12  59  8.44  12  18  66  5.07  3.0E-03
633_1  95 (124)  6  9  52  10.12  12  18  66  5.07  1.8E-06
692_1  95 (124)  6  9  52  10.16  12  18  66  5.07  1.2E-06
657_1  95 (124)  1  5  40  12.71  12  18  66  5.07  1.8E-08
691_1  95 (124)  0  1  24  15.10  12  18  66  5.07  2.2E-13
091_1  94 (123)  11  18  61  7.93  12  18  65  5.10  2.1E-03
026_1  94 (123)  1  2  40  9.30  12  18  65  5.10  1.2E-07
338_5  93 (122)  2  3  37  15.10  12  18  65  5.05  1.3E-09
556_1  93 (121)  10  15  51  6.83  12  18  64  5.15  1.8E-02
734_1  92 (120)  4  4  36  16.16  12  18  64  5.10  5.6E-11
718_1  92 (119)  1  3  32  14.04  11  17  64  5.19  1.6E-10
717_1  87 (112)  3  7  36  10.15  10  15  59  5.31  4.3E-08
016_1  86 (111)  5  9  49  7.93  10  16  58  5.26  1.4E-03
038_1  85 (108)  3  7  60  5.75  11  16  58  5.34  1.2E-01
276_1  80 (104)  5  5  39  8.94  11  17  54  5.21  7.7E-07
013_1  78 (100)  4  6  41  9.86  10  15  56  4.87  2.0E-05
703_1  69 (86)  3  6  35  8.74  9  15  45  5.35  1.2E-04
191_1  61 (78)  2  5  32  9.35  7  10  39  6.04  2.3E-03
066_1  55 (72)  1  2  14  23.19  7  10  45  4.09  4.3E-10
Table 2. Recovery of Top GDT-TS Model (M_{max})

SetAll (Summary Results | Common Subset Results)  |  SetComplete (Summary Results | Common Subset Results)

Group  Targets ^{a}  $\overline{rank}$  $\overline{\Delta GDT_{BLUNDER}}$  $\overline{rank}$  $\overline{\Delta GDT_{BLUNDER}}$  p-value  Group  Targets  $\overline{rank}$  $\overline{\Delta GDT_{BLUNDER}}$  $\overline{rank}$  $\overline{\Delta GDT_{BLUNDER}}$  p-value
699_1 ^{b}  95 (124)  29.8  11.8        699_1  95 (124)  17.8  10.4
704_1  95 (124)  46.5  17.8  29.8  11.8  2.7E-06  633_1  95 (124)  20.7  11.8  17.8  10.4  4.7E-02
178_1  95 (124)  42.3  19.6  29.8  11.8  2.9E-04  634_1  95 (124)  29.5  12.7  17.8  10.4  5.7E-02
657_1  95 (124)  78.5  37.0  29.8  11.8  3.9E-20  704_1  95 (124)  24.1  13.1  17.8  10.4  1.1E-02
634_1  94 (121)  52.0  16.5  29.2  11.7  1.3E-02  178_1  95 (124)  24.1  13.7  17.8  10.4  6.5E-03
091_1  94 (123)  27.2  17.4  29.3  11.9  2.2E-05  657_1  95 (124)  53.5  32.0  17.8  10.4  8.6E-18
633_1  94 (121)  39.0  20.6  29.2  11.7  1.3E-08  713_1  94 (122)  18.3  10.9  17.9  10.4  2.0E-01
026_1  94 (123)  55.9  22.7  29.4  11.6  3.2E-10  692_1  94 (122)  20.6  11.6  17.7  10.3  6.7E-02
556_1  93 (121)  33.8  11.7  29.0  11.7  *  091_1  94 (123)  16.8  12.3  17.4  10.4  2.4E-02
692_1  93 (119)  38.7  20.6  29.2  11.6  1.1E-08  026_1  94 (123)  37.3  18.3  17.6  10.2  1.5E-07
691_1  93 (120)  98.1  28.6  28.6  11.7  9.6E-19  691_1  94 (123)  54.4  22.2  17.4  10.4  2.4E-14
338_2  93 (122)  60.4  30.2  30.2  11.9  2.7E-15  556_1  93 (121)  21.2  10.3  17.2  10.2  4.9E-01
713_1  92 (116)  26.4  12.8  29.6  11.8  3.2E-01  338_2  93 (122)  28.2  16.8  18.0  10.4  1.5E-08
734_1  89 (116)  55.2  31.5  29.3  11.2  1.6E-15  734_1  88 (115)  28.9  18.1  17.3  9.6  7.0E-09
718_1  83 (105)  81.6  31.9  30.5  12.0  1.6E-14  718_1  83 (105)  46.4  26.9  17.6  10.4  4.5E-13
717_1  78 (98)  46.8  22.8  30.9  12.0  3.4E-09  717_1  78 (98)  28.4  16.4  17.6  10.3  2.1E-05
013_1  78 (100)  60.1  27.5  30.2  12.0  1.5E-09  013_1  78 (100)  32.4  17.6  18.5  10.3  3.7E-06
276_1  78 (102)  52.9  28.9  29.6  11.6  3.3E-12  276_1  78 (102)  29.0  18.7  17.5  10.2  8.5E-10
038_1  70 (87)  25.9  11.9  27.6  11.8  4.6E-01  038_1  74 (95)  19.8  10.7  17.4  10.4  3.6E-01
703_1  69 (86)  37.2  20.6  31.5  11.9  5.1E-07  703_1  69 (86)  20.6  14.5  17.6  10.5  4.7E-04
191_1  61 (78)  45.5  21.9  26.2  12.6  1.0E-06  191_1  61 (78)  27.6  15.2  16.7  11.1  2.8E-03
066_1  55 (72)  91.1  54.6  30.5  10.5  5.1E-24  066_1  55 (72)  48.0  46.5  18.5  9.1  1.8E-18
016_1  53 (72)  30.9  20.0  31.2  12.5  2.0E-05  016_1  53 (70)  18.2  18.5  17.8  11.0  1.4E-05
Table 3. Correlation of Selected Groups

Group  Targets  SetAll $\overline{PC}$  SetComplete $\overline{PC}$  ΔPC

634_1 (Pcons) ^{a}  95  0.811  0.847  0.036
713_1 (CircleQA) ^{b}  95  0.765  0.823  0.058
633_1 (ProQ) ^{b}  95  0.716  0.781  0.064
699_1 (SELECTpro) ^{b}  95  0.676  0.763  0.087
556_1 (LEE) ^{c}  93  0.814  0.792  -0.023
To make fair comparisons to groups participating on only a subset of targets, common subset comparisons between SELECTpro and each of these groups are included in Tables 1 and 2. Only groups participating on at least half of the targets are included, and for groups with multiple submissions only the best one is shown. In the results tables any value that is better than SELECTpro is underlined.
For multiple domain targets, the sum of GDT-TS over all domains is used as the GDT-TS of the model. Since the QA predictions correspond to the entire structures, it is impossible to fairly assess the domains independently.
To assess the significance of the summary statistics compared in Table 1, Table 2, and Figure 2, we performed paired t-tests between SELECTpro and each other group on common subsets of targets (or targets and models when appropriate). All p-values from the tests appear in the tables and figure, but only statistically significant p-values (p < .05) are shown in bold.
The following notations are used throughout the results section:

M_{max}: The model with the highest GDT-TS among all server models.

M_{QA 1}: The model with the highest QA score.

N_{ T }: The number of targets a group made valid predictions on.

N_{ D }: The number of domains a group made valid predictions on.
The recovery of M_{max} by a QA predictor can only be evaluated if M_{max} was scored by the predictor. In most cases QA predictors did not provide scores for all available server models, and frequently there is no score for M_{max}. For example, predictor 016_1 (AMBER/PB) made submissions on 86 targets, but M_{max} is only scored for 53 of these targets – so only these targets (N_{T} = 53) can be evaluated for this predictor.
Quality of Model Ranked First (M_{QA1}) Relative to Most Native-Like Model (M_{max})
In this section on the assessment of the model ranked first, and the corresponding Table 1, we use the following three metrics:

ΔGDT_{QA1} = GDT-TS(M_{max}) - GDT-TS(M_{QA1}): The GDT-TS difference between M_{max} and M_{QA1} measures how much is lost by selecting M_{QA1} rather than M_{max} for a single target.

$\overline{\Delta GDT_{QA1}}$ = Σ ΔGDT_{QA1} / N_{D}: The average ΔGDT_{QA1} is a simple way of assessing the quality of M_{QA1} over all targets.

ΔGDT_{QA1%} = ΔGDT_{QA1} / GDT-TS(M_{max}): The GDT-TS difference percentage allows for comparison across targets with different numbers of domains and difficulty levels.
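As an illustration, the first and third metrics can be computed for a single target from (QA score, GDT-TS) pairs. This is a minimal sketch, not the official CASP evaluation code; the input representation is an assumption:

```python
def qa1_metrics(models):
    """Table 1 metrics for one target.

    models: list of (qa_score, gdt_ts) pairs, one per scored model.
    Returns (delta_gdt_qa1, delta_gdt_qa1_pct).
    """
    gdt_max = max(gdt for _, gdt in models)          # GDT-TS(M_max)
    gdt_qa1 = max(models, key=lambda m: m[0])[1]     # GDT-TS of model ranked first
    delta = gdt_max - gdt_qa1                        # ΔGDT_QA1
    return delta, 100.0 * delta / gdt_max            # ΔGDT_QA1% (as a percentage)
```

For example, if the best model in the set has GDT-TS 60.0 but the top-scored model has GDT-TS 50.0, then ΔGDT_QA1 is 10.0 and ΔGDT_QA1% is roughly 16.7%.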
The columns of Table 1 are: (1) group number; (2) number of targets the group made predictions on; (3) number of targets such that ΔGDT_{QA1} = 0; (4) number of targets such that ΔGDT_{QA1%} < 1%; (5) number of targets such that ΔGDT_{QA1%} < 10%; and (6) $\overline{\Delta GDT_{QA1}}$. The common subset results section has an additional column for the p-value of the paired t-test using ΔGDT_{QA1}. The rows are sorted first by the number of targets and then by $\overline{\Delta GDT_{QA1}}$. Of the groups participating on all 95 targets, SELECTpro has the lowest average ΔGDT_{QA1}, with a value of 5.07, followed closely by group 713_1 (CircleQA), with a value of 5.44. Predictor 038_1 (GeneSilico) has an average ΔGDT_{QA1} of 5.75, with predictions on 85 targets. In common subset comparisons with these two groups SELECTpro is not significantly better, with p-values of 0.25 and 0.12 respectively. In common subset comparisons with all remaining groups SELECTpro is significantly better.
Another way to assess the quality of M_{QA1} over many targets is to count the number of targets for which M_{QA1} is the best model, or nearly the best, in the set. A method that performs very well on most targets, but very poorly on a few, would still be recognized by this criterion. SELECTpro recovers the best model for 12 targets, selects a model with ΔGDT_{QA1%} < 1% for 18 targets, and selects a model with ΔGDT_{QA1%} < 10% for 66 targets. Group 091_1 (MaOPUS) also performs well, with 11, 18, and 61 targets in the respective categories. Only the 60 targets with ΔGDT_{QA1%} < 10% achieved by predictor 038_1 (GeneSilico) on its 85-target subset are better than SELECTpro in common subset comparison (58 for SELECTpro).
The BLUNDER Metric: Recovery of M_{max}
How well does a QA predictor recover M_{max}? The traditional metric to assess M_{max} recovery is the rank of M_{max}, and the average rank over many targets ($\overline{rank}$). While rank captures some important information, it ignores the redundancy of models and the quality of models ranked better than M_{max}. Consider the following hypothetical situation: group A ranks M_{max} 10^{th}, and all nine models ranked above it are redundant with ΔGDT of ~2.0; group B ranks M_{max} 5^{th}, and the four models ranked above it are diverse with ΔGDT between 10.0 and 20.0. Which group has done a better job of recovering M_{max}? In this example, the rank metric favors group B, although group A effectively ranks only a single redundant model above M_{max}. In addition, the models ranked better than M_{max} by group A have only slightly lower GDT-TS than M_{max}, while the models ranked better than M_{max} by group B are significantly worse. To address these weaknesses of the rank metric, we introduce the BLUNDER metric, which focuses on the worst model ranked better than M_{max} (the most embarrassing blunder). This measure is not affected by model redundancy and captures the quality of models ranked above M_{max}. The BLUNDER metric is defined using the following notation, and is used in the assessment of the recovery of M_{max} and the corresponding Table 2 and Figure 1:

M_{BLUNDER}: The model with the minimum GDT-TS among models ranked better than M_{max}.

ΔGDT_{BLUNDER} = GDT-TS(M_{max}) - GDT-TS(M_{BLUNDER}): The GDT-TS difference between M_{max} and M_{BLUNDER} measures the size of the worst blunder.

$\overline{\Delta GDT_{BLUNDER}}$ = Σ ΔGDT_{BLUNDER} / N_{D}: The average ΔGDT_{BLUNDER} measures how robustly a method recovers M_{max} over many targets.

ΔGDT_{BLUNDER%} = ΔGDT_{BLUNDER} / GDT-TS(M_{max}): The ΔGDT_{BLUNDER} percentage allows for comparison across targets with different numbers of domains and difficulty levels.
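The rank and BLUNDER measures can be sketched together for a single target; as above, the (QA score, GDT-TS) input representation is an assumption for illustration only:

```python
def blunder_metrics(models):
    """Rank of M_max and ΔGDT_BLUNDER for one target.

    models: list of (qa_score, gdt_ts) pairs; a higher QA score means a
    better rank. Returns (rank of M_max, delta_gdt_blunder).
    """
    gdt_max = max(gdt for _, gdt in models)
    qa_of_max = max(models, key=lambda m: m[1])[0]   # QA score given to M_max
    # GDT-TS of every model the predictor ranked strictly above M_max
    above = [gdt for qa, gdt in models if qa > qa_of_max]
    if not above:
        return 1, 0.0                                # M_max ranked first: no blunder
    # The blunder is the *worst* model ranked above M_max, so redundant
    # near-best models above M_max do not inflate the penalty.
    return len(above) + 1, gdt_max - min(above)
```

In the hypothetical example above, group A's nine redundant models above M_{max} would yield a small ΔGDT_{BLUNDER} (~2.0) despite the worse rank, while group B's diverse blunders would yield a ΔGDT_{BLUNDER} of up to 20.0.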
Figure 1 contains graphs of the frequency of recovering M_{max} using the rank (A) and ΔGDT_{BLUNDER%} (B) measures on SetComplete. SELECTpro ranks M_{max} first for 15 targets, in the top five for 39 targets, and in the top ten for 53 targets. SELECTpro's ΔGDT_{BLUNDER%} values are less than 10% of GDT-TS(M_{max}) for 40 targets and less than 20% for 63 targets. These results are the best among all QA participants. The average M_{max} recovery results are summarized in Table 2. The results columns are (1) average rank ($\overline{rank}$) and (2) average ΔGDT_{BLUNDER} ($\overline{\Delta GDT_{BLUNDER}}$) on SetAll and SetComplete. The common subset results section also includes a column for the p-value of a paired t-test using ΔGDT_{BLUNDER}. Rows are sorted separately for each dataset, first by N_{T} and then by $\overline{\Delta GDT_{BLUNDER}}$. On SetComplete SELECTpro has a $\overline{\Delta GDT_{BLUNDER}}$ of 10.4. In common subset comparisons one group has a lower $\overline{rank}$: group 091_1 (MaOPUS), with a $\overline{rank}$ of 16.8 on 94 targets compared to 17.4 for SELECTpro. On SetAll SELECTpro did not submit a score for M_{max} of target T0356 (HHpred2_TS1) due to a processing error. In order to make complete common subset comparisons where possible, we added in the SELECTpro score for HHpred2_TS1: SELECTpro ranks it 86^{th}, with ΔGDT_{BLUNDER} = 50.0. Both results are significantly worse than the SELECTpro averages.
Pearson Correlation for Individual Proteins
The assessor evaluation of the quality assessment category [18] focused on the Pearson Correlation between the QA scores and GDT-TS. Here we use the Pearson Correlation only to highlight some of the difficulties structure-based methods face in dealing with incomplete models, as well as with basic non-protein-like structural features. Approximately half of the models in SetAll are incomplete, with backbone coordinates missing for one or more residues.
Incomplete models present a challenge to SELECTpro and other structure-based methods because the scores for each model are only comparable when calculated on coordinates for the same set of residues. Another issue is that some complete models have severe chain breaks, severe steric clashes, or significant portions modeled only as extended chains. These local problems can overwhelm the energy of what may otherwise be a good model. Consensus methods do not suffer from these local structure problems. Given this rationale, one would expect structure-based methods to see the largest improvement in average Pearson Correlation on SetComplete relative to SetAll. Table 3 shows the average Pearson Correlation of five selected groups. Predictors 713_1 (CircleQA), 633_1 (ProQ), and SELECTpro are structure-based MQAPs, while 634_1 (Pcons) is a consensus method and 556_1 (LEE) scored structures based on their GDT-TS similarity to the group's human Model 1 CASP7 prediction [18]. As expected, the structure-based MQAPs improve more than the structural similarity-based methods. The even greater increase in Pearson Correlation for SELECTpro is explained by its failure to generate appropriate complete models for some of the incomplete models, which resulted in QA scores calculated on extended chains.
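The per-target statistic used throughout this comparison is the plain Pearson correlation between a predictor's QA scores and the models' GDT-TS values; it can be computed without external libraries:

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences,
    e.g. QA scores and GDT-TS values for the models of one target."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

The table averages in Table 3 are then simply the mean of this coefficient over targets.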
Reranking Top Server Group Models
Predictors in CASP may submit up to five models, but CASP evaluation focuses on the model designated as Model 1. Clearly, the selection of Model 1 is critical in the CASP setting and for protein structure prediction in general. Figure 2 contains the results when SELECTpro is used to re-rank the five models submitted by each of the top ten servers from CASP7, compared to each server's own results. In the following assessment M_{maxg} is the model with the highest GDT-TS of the five models submitted by a server. Figure 2 (A) shows that SELECTpro recovers M_{maxg} more frequently than 8 of the top 10 server groups; in addition, when SELECTpro is used to select Model 1 the average GDT-TS increases for 7 of 10 server groups, although the increase is only statistically significant for 3 groups. SELECTpro improves on both criteria for the top 3 server groups (Zhang-Server, Pmodeller6, and ROBETTA). These results highlight the utility of SELECTpro for the task of model selection. The comparisons made here are fair because structure-based methods can be applied in the server setting to any number of models.
Large Decoy Set Model Selection
Here we analyze SELECTpro's model selection capability on the large decoy sets for 16 small proteins from a recent I-TASSER benchmark set [19]. The I-TASSER prediction method generates 12,500 to 20,000 different backbone conformations per protein. The complete decoy sets can be downloaded from [20]. The consensus method SPICKER [21] is used to cluster the models, and a centroid model is built from the first cluster. A second round of simulation resolves the steric clashes in the centroid model and produces the final predicted model. The centroid model and final model are not part of the decoy set, so to make a fair model selection comparison the decoy model closest to the centroid is used as I-TASSER's M_{QA1}.
On the benchmark set SELECTpro has an average GDT-TS of 63.7, while I-TASSER has an average GDT-TS of 62.1. SELECTpro's average ΔGDT_{QA1} is 9.2 and I-TASSER's is 10.7. Figure 3 displays the GDT-TS results for the individual proteins in the benchmark set. Different symbols indicate the GDT-TS of M_{max} (□), of SELECTpro's M_{QA1} (×), and of I-TASSER's M_{QA1} (+) for each protein. A paired t-test of the hypothesis that SELECTpro's and I-TASSER's mean performances are equal produces a p-value of 0.19; this is not statistically significant, but it does give some evidence that SELECTpro can select a very good model from a large set of decoys at least as well as an established method that utilizes consensus.
Conclusion
An MQAP that can select the most native-like model from a set of possibilities has a variety of applications in protein structure prediction. The new quality assessment category introduced in CASP7 allows for the unbiased assessment of MQAPs on the models produced by automated predictors. This category allows researchers to focus on the model scoring aspect of protein structure prediction.
The results presented in this work demonstrate that SELECTpro, a structure-based model selection method, consistently selects one of the best models from the large, diverse sets of models produced by automated predictors, across all levels of target difficulty. On these large, diverse model sets, SELECTpro also recovers the single most native-like model well compared to other methods. On the small sets of five models submitted for each target by the top automated predictors, in most cases SELECTpro selects better models than the predictors themselves.
Since SELECTpro and other structure-based methods score models independently, they can be incorporated into the model selection pipelines of individual protein structure prediction servers. For this reason, it may help predictors if the CASP organizers distinguished methods that score models independently from those that do not.
Consensus and structure-based methods can be combined to achieve improved results. For example, the meta-server method Pmodeller [22] combines a consensus method (Pcons [23]) and a structure-based method (ProQ [24]) to predict protein structures more accurately than either method in isolation. The assessment of the QA category by the CASP assessors recognized the consensus method Pcons (group 634_1) for the high Pearson Correlation between its scores and model GDT-TS on most targets [18]. In their own assessment, the authors of Pcons recognized that while consensus methods perform well in most cases, "when most of the models are incorrect and the few correct models are outliers a consensus based approach cannot be expected to make an optimal choice" [1]. For instance, they identified three particular targets in CASP7 where their consensus method failed: T0283, T0350, and T0351 [1]. The Pcons average ΔGDT_{QA1} on these three targets is 30.8, and the same research group's structure-based method ProQ (group 633_1) has an average ΔGDT_{QA1} of 17.2. In contrast, on these three targets SELECTpro has an average ΔGDT_{QA1} of only 7.1. This example highlights the potential of combining SELECTpro with existing model selection methods.
SELECTpro has been made publicly available as a server, where users may submit from 2 to 100 models for evaluation. In addition to the global confidence scores, the scores of individual energy terms are also returned to the user by email for each model submitted. SELECTpro is one of several protein structure tools in the SCRATCH suite of predictors [25], and is available through: http://www.igb.uci.edu/~baldig/selectpro.html.
Methods
Datasets
All of the comparative analysis in this work is performed on the server models and quality assessment predictions submitted in the CASP7 [26] experiment. The CASP QA experiment is particularly relevant for the evaluation of model selection methods for several reasons: (1) the QA predictors were blind to the true structures at the time of prediction, making it impossible for methods to be tuned to improve results; (2) the set of proteins is diverse: the 95 targets range in size from 68 to 530 amino acids, come from a variety of organisms, and span the full range of prediction difficulty; (3) each target has more than 200 predicted models that contain the types of errors that occur in automated structure prediction; (4) the protein set was not selected by any of the participating QA groups; (5) the models are scored by a variety of methods and the results are publicly available. We perform analysis on the set of all models (SetAll) and a subset of models (SetComplete) that are complete and free of gross structural irregularities, as described below. All of the ABIpro models and some of the 3Dpro models were optimized using the exact energy function of SELECTpro; these models are removed because of the obvious bias. In recent CASP experiments GDT-TS [27] has been used as the primary automatic structural similarity measure, and the published GDT-TS values from the CASP7 website are the only structural similarity measure used in this work.
SetAll
SetComplete
The scores produced by SELECTpro are comparable on complete models of the same sequence. There is no standard for the handling of incomplete models and we assume that participating groups took a variety of approaches. Using only complete models ensures that the MQAP scores are calculated from the same coordinates. Thus, the models retained in SetComplete are screened first for completeness. Models missing backbone coordinates for one or more residues are removed. This leaves 14,611 models.
Structure-based MQAPs are susceptible to local structural irregularities in models, and will tend to score such models poorly. This is why methods developed to select near-native models from sets of decoys remove such models from consideration [31]. We apply additional filters (described below) for C_{α}-C_{α} clashes, C_{α}-C_{α} chain breaks, and expanded termini to remove an additional 1,217 models, leaving 13,494 more plausible models in SetComplete.
The expanded termini filter removes models where a large portion of the structure is modeled as expanded chain with no non-local interactions. The screening procedure is: scan from the N-terminus until three consecutive residues have a contact number of at least 10, and repeat from the C-terminus. The contact number of a residue is defined here as the number of other C_{β} atoms within 10 Å of the residue's C_{β} [3]. If the number of low contact-number termini residues is at least 20% of N, the model is filtered out.
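The screening procedure above can be sketched directly. This is an illustrative reading of the filter, not the SELECTpro source: C_{β} coordinates are assumed to be (x, y, z) tuples, and the thresholds (contact number 10, three consecutive residues, 20% of N) are taken from the text:

```python
def contact_number(cb_coords, i, cutoff=10.0):
    """Number of other C-beta atoms within `cutoff` Å of residue i's C-beta."""
    xi, yi, zi = cb_coords[i]
    return sum(
        1
        for j, (xj, yj, zj) in enumerate(cb_coords)
        if j != i and ((xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2) ** 0.5 < cutoff
    )

def expanded_termini(cb_coords, min_cn=10, frac=0.20):
    """Return True if the model should be filtered out: scan from each
    terminus until three consecutive residues reach contact number
    >= min_cn; if the low contact-number termini residues make up at
    least `frac` of the chain, reject the model."""
    n = len(cb_coords)
    cn = [contact_number(cb_coords, i) for i in range(n)]

    def low_run(indices):
        consec = 0
        for k, i in enumerate(indices):
            consec = consec + 1 if cn[i] >= min_cn else 0
            if consec == 3:
                return k - 2          # residues preceding the 3-residue run
        return n                      # no such run: whole scan is low-contact
    low = low_run(range(n)) + low_run(range(n - 1, -1, -1))
    return low >= frac * n
```

A compact globule passes the filter immediately, while a chain modeled as a fully extended strand never reaches a contact number of 10 and is rejected.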
Model Representations
Reduced representation
In the reduced representation the heavy backbone atoms, carbonyl oxygen, amide hydrogen (N, C_{α}, C, O, H), and C_{β} are represented explicitly. For glycine residues a pseudo-C_{β} is calculated. The side-chain atoms are represented by a single united point (centroid) [32, 33]. The centroid is calculated as the mean of the positions of the heavy side-chain atoms. For glycine and alanine the centroid (CT) is set to the C_{β} atom. Only the heavy backbone atoms (N, C_{α}, C) are used as input to SELECTpro; the positions of the additional atoms and centroids are calculated from these.
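A sketch of the centroid construction, assuming heavy side-chain atom positions are available as (x, y, z) tuples (the pseudo-C_{β} construction for glycine is omitted here):

```python
def sidechain_centroid(aa, sidechain_atoms, cb):
    """Reduced-representation centroid: the mean of the heavy side-chain
    atom positions; for glycine ('G') and alanine ('A') the centroid is
    set to the C-beta position, per the text."""
    if aa in ("G", "A") or not sidechain_atoms:
        return cb
    n = len(sidechain_atoms)
    return tuple(sum(atom[k] for atom in sidechain_atoms) / n for k in range(3))
```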
All heavyatom representation
In the all heavy-atom representation the centroid is removed and the heavy side-chain atoms are represented explicitly. The side-chains are initially placed onto the backbone of the reduced representation in their most likely conformation according to the SCWRL backbone-dependent rotamer library [34]. The side-chain placements are then optimized using the SELECTpro all-atom energy terms (described below) in conjunction with the rotamer library.
Energy Functions Overview
E_{REDUCED} is the combined energy calculated from the reduced representation. E_{REDUCED} is a linear combination of predicted (E_{PRED-SS}, E_{PRED-SA}, E_{PRED-CM}), physical (E_{VDW-REP}), and statistical (E_{CT-REP}, E_{STAT-ENV}, E_{STAT-PW-CI}, E_{STAT-PW-CD}, E_{ROG}) terms:

E_{REDUCED} = w_{1}E_{PRED-SS} + w_{2}E_{PRED-SA} + w_{3}E_{PRED-CM} + w_{4}E_{BETA} + w_{5}E_{VDW-REP} + w_{6}E_{CT-REP} + w_{7}E_{STAT-ENV} + w_{8}E_{STAT-PW-CI} + w_{9}E_{STAT-PW-CD} + w_{10}E_{ROG}

E_{ALL-ATOM} consists of the energy terms that depend on the all heavy-atom representation. E_{ALL-ATOM} is a linear combination of the following physical terms:

E_{ALL-ATOM} = w_{11}E_{SC-HB} + w_{12}E_{LEN-JONES} + w_{13}E_{SOLVATION} + w_{14}E_{ELECTRO}

E_{FINAL} is the sum of E_{REDUCED} and E_{ALL-ATOM}, and is used for the final scoring of models by SELECTpro. The individual energy terms are outlined briefly below, and detailed descriptions of the novel terms follow in the remainder of this section. Underlined terms are adapted from previously described energy terms; their details are included in the Appendix.
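The final score is simply a weighted sum of the individual term values. A minimal sketch (the term vectors and weights are placeholders, not the fitted SELECTpro weights):

```python
def e_final(reduced_terms, allatom_terms, w_reduced, w_allatom):
    """E_FINAL = E_REDUCED + E_ALLATOM, each a linear combination of
    energy terms. The model with the lowest E_FINAL is selected."""
    e_reduced = sum(w * e for w, e in zip(w_reduced, reduced_terms))
    e_allatom = sum(w * e for w, e in zip(w_allatom, allatom_terms))
    return e_reduced + e_allatom
```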
Parameter Weights
The parameter weights were determined by repeatedly varying individual weights and maximizing the sum of the GDT-TS of the lowest-E_{FINAL} models on a training set built from CASP6 protein domains. For each CASP6 protein domain, a set of 500 decoy models was generated using fragment assembly with the RMSD to native as the dominant term in the objective function [3].
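The fitting procedure described above amounts to a coordinate search over the weight vector. A sketch under the assumption that each training decoy is stored as a (term_vector, gdt_ts) pair; the step size and round count are illustrative, not the values used by the authors:

```python
import random

def tune_weights(weights, decoy_sets, step=0.1, rounds=20):
    """Coordinate-search sketch of the weight fitting: perturb one weight
    at a time and keep the change if the summed GDT-TS of the
    lowest-energy models over all training sets improves.

    decoy_sets: one list of (term_vector, gdt_ts) pairs per training domain.
    """
    def objective(w):
        total = 0.0
        for decoys in decoy_sets:
            # Decoy selected by the current weights (lowest weighted energy)
            best = min(decoys, key=lambda d: sum(wi * ti for wi, ti in zip(w, d[0])))
            total += best[1]
        return total

    score = objective(weights)
    for _ in range(rounds):
        i = random.randrange(len(weights))
        for delta in (step, -step):
            trial = list(weights)
            trial[i] += delta
            s = objective(trial)
            if s > score:                  # keep only improving moves
                weights, score = trial, s
                break
    return weights, score
```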
E_{REDUCED}
E_{PRED-SS}: predicted secondary structure
E_{PRED-ACC}: predicted solvent accessibility
E_{PRED-CM}: predicted contact map
E_{BETA}: β-sheet formation
E_{BB-REP}: backbone repulsion
E_{CT-REP}: centroid repulsion
E_{STAT-ENV}: residue environment potential [3]
E_{STAT-PW-CI}: context-independent pairwise potential [3, 16]
E_{STAT-PW-CD}: context-dependent pairwise potential [6]
E_{ROG}: compactness
E_{ALL-ATOM}
E_{SC-HB}: side-chain hydrogen bonding
E_{LEN-JONES}: van der Waals forces [10]
E_{SOLVATION}: solvation effects [35]
E_{ELECTRO}: electrostatic interactions
Throughout this work, the convention is that subscripts in all capital letters refer to global energies and subscripts in all lower case refer to local energies. For instance, E_{PRED-CM} refers to the global contact map energy and E_{predcm}(i,j) refers to the contact map energy between residues i and j.
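In code form, the convention amounts to the global term being a sum of local pairwise contributions. The `e_predcm` callable here is a stand-in for the per-pair contact-map energy, whose definition is not reproduced in this excerpt.

```python
# Global/local convention as code: the global (all-caps) contact-map
# energy is the sum of the local (lower-case) per-pair energies.
# `e_predcm` is a hypothetical stand-in for the paper's pairwise term.

def global_energy_predcm(n_residues, e_predcm):
    """Sum e_predcm(i, j) over all residue pairs with i < j."""
    return sum(e_predcm(i, j)
               for i in range(n_residues)
               for j in range(i + 1, n_residues))
```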
Parameter notation used in energy equations
Model variables
r_{i,x,j,y}: distance between atom x of residue i and atom y of residue j
r_{x,y}: distance between atom x and atom y
v_{i,x,j,y}: vector from atom x of residue i to atom y of residue j
u_{i,x,j,y}: unit vector calculated from v_{i,x,j,y}
N_{i}: number of residues in contact with residue i, with contact defined as r_{i,Cβ,j,Cβ} < 10 Å
phi_{i}: Phi angle of residue i
psi_{i}: Psi angle of residue i
Protein-specific input parameters
aa_{i}: amino acid type of residue i
ss_{i}: predicted secondary structure of residue i (H, E, C)
acc_{i}: predicted solvent accessibility of residue i ('e': exposed, '-': buried)
cmap_{i,j}: predicted contact/non-contact between residues i and j, with contact defined as r_{i,Cα,j,Cα} < 12 Å
Protein-independent parameters
I_{value}: ideal parameter value for a given calculation
σ_{value}: standard deviation value for a given calculation
vdw_{x}: van der Waals radius of atom x
vdw_{x+y}: vdw_{x} + vdw_{y}
Ω_{statenv}: precalculated statistics used in E_{STAT-ENV}
Ω_{statpwci}: precalculated statistics used in E_{STAT-PW-CI}
Ω_{statpwcd}: precalculated statistics used in E_{STAT-PW-CD}
D_{min,pwcd}: minimum interaction distance for centroid pairs used in E_{STAT-PW-CD}
D_{max,pwcd}: maximum interaction distance for centroid pairs used in E_{STAT-PW-CD}
D_{minCT}: minimum distances between centroids of amino acid pairs observed in pdb_select25 [36].
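The two contact definitions above (Cβ-Cβ < 10 Å for N_{i}, Cα-Cα < 12 Å for cmap_{i,j}) can be written directly as code. This sketch assumes coordinates are plain (x, y, z) tuples and ignores the special case of glycine, which lacks a Cβ atom.

```python
import math

# Contact definitions from the parameter list. Coordinate handling is a
# simplification; residue indices correspond to list positions.

def dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def contact_number(cb_coords, i, cutoff=10.0):
    """N_i: count of residues j != i with Cβ-Cβ distance under 10 Å."""
    return sum(1 for j, cb in enumerate(cb_coords)
               if j != i and dist(cb_coords[i], cb) < cutoff)

def ca_contact(ca_i, ca_j, cutoff=12.0):
    """cmap contact definition: Cα-Cα distance under 12 Å."""
    return dist(ca_i, ca_j) < cutoff
```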
Reduced Representation Energy Term Details
The details of how the novel reduced representation energy terms are calculated are presented in this section. The predicted structural terms E_{PRED-SS}, E_{PRED-ACC}, and E_{PRED-CM} and the β-strand pairing term E_{BETA} are novel and unique to SELECTpro. Additional reduced representation terms are adapted from previously published work and their details are included in the Appendix.
Predicted structural features overview
The structural feature predictions used in E_{PRED-SS}, E_{PRED-ACC}, and E_{PRED-CM} come from the SCRATCH suite of predictors [25]. Each predictor is trained in a supervised fashion on curated non-redundant datasets extracted from the PDB [37]. The secondary structure (SSpro [38]) and solvent accessibility (ACCpro [39]) predictors use ensembles of 1D-RNN (one-dimensional recursive neural network) architectures [40]. The contact map predictor (CMAPpro [41]) uses ensembles of 2D-RNN architectures [40].
E_{PRED-SS}: predicted secondary structure
The definition of E_{predstrand}(j) is equivalent to that of E_{predhelix}(i), but with I_{Eφ}, σ_{Eφ}, I_{Eψ}, and σ_{Eψ} in place of the corresponding helical values.
E_{PRED-ACC}: predicted solvent accessibility
E_{PRED-CM}: predicted contact map
E_{BETA}: strand pairing
The formation of hydrogen bonds between the residues of β-strand partners is a major determinant of the tertiary structure of β and α/β proteins. The β hydrogen bonding treatment described here favors realistic strand pairing and sheet formation. The treatment also efficiently accommodates bulges in strands because it does not force the register between two paired strands. E_{BETA} is the global strand pairing energy that scores the hydrogen bonding of β residues between strand pairs. E_{betasp}(β_{k}→β_{w}) is the strand pairing energy of strand β_{k} to strand β_{w}; E_{betasp} is commutative only if the two strands have the same length. E_{betahb}(i,j) is the hydrogen bonding penalty between residues i and j.
Between two antiparallel strand partners, only every other pair of residues is hydrogen bonded. For the pairs that are not hydrogen bonded, a pseudo-bonding calculation is used instead. The hydrogen bonding energy and the pseudo-bonding energy are both calculated, and the minimum of the two is used in E_{betahb}(i,j).
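The min() selection between the two bonding modes can be sketched as below. This is illustrative only: the excerpt does not give the functional forms of the hydrogen-bond and pseudo-bond energies, so simple squared-deviation penalties with invented ideal distances and widths stand in for them.

```python
# Hypothetical sketch of the min(hydrogen-bond, pseudo-bond) selection.
# The penalty form and all numeric parameters are assumptions, not the
# paper's actual energies.

def bond_penalty(r, ideal, sigma):
    """Squared deviation from an ideal distance, scaled by a width."""
    return ((r - ideal) / sigma) ** 2

def e_betahb(r_ij, hb_ideal=2.9, hb_sigma=0.3, ps_ideal=4.5, ps_sigma=0.5):
    """Score a residue pair under both bonding modes and keep the minimum,
    so pairs that are not hydrogen bonded in an antiparallel pairing are
    not forced into hydrogen-bond geometry."""
    return min(bond_penalty(r_ij, hb_ideal, hb_sigma),
               bond_penalty(r_ij, ps_ideal, ps_sigma))
```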
All-Atom Energy Term Details
The all-atom energy terms depend on atom-atom interactions when all heavy atoms are included in the model. In the all-atom energy equations, x and y refer to atoms in the model and residue positions are not referenced. The van der Waals radii and well depths (ε_{x}, used in E_{LEN-JONES}) come from the CHARMM19 parameter set [43]. The side-chain hydrogen bonding term, E_{SC-HB}, is described in detail here because it is unique to SELECTpro. The details of E_{LEN-JONES}, E_{SOLVATION}, and E_{ELECTRO} are provided in the Appendix.
E_{SC-HB}: side-chain hydrogen bonding
Appendix
In the interest of completeness and reproducibility we include the details of the energy terms that are adapted from previous work.
Reduced Representation Energy Term Details
E_{BB-REP}: backbone repulsion
E_{CT-REP}: centroid repulsion
E_{STAT-ENV}: residue environment potential
E_{STAT-PW-CI}: context-independent pairwise interactions
E_{STAT-PW-CD}: context-dependent pairwise potential
E_{ROG}: compactness
All-Atom Energy Term Details
E_{LEN-JONES}: van der Waals forces
E_{SOLVATION}: solvation effects
E_{ELECTRO}: electrostatics
Availability and requirements
Project home page: http://www.igb.uci.edu/~baldig/selectpro.html
Operating system: Linux for the stand-alone version; the server is platform independent
Programming languages: C++ and Perl
Software requirements: Perl
Disk space requirements: 1.6 GB for the full version, 13 MB without the feature predictors
Declarations
Acknowledgements
Work supported by NIH grant LM-07443-01, NSF grants EIA-0321390 and IIS-0513376, and a Microsoft Faculty Research Award to PFB.
Authors’ Affiliations
References
1. Wallner B, Elofsson A: Prediction of global and local model quality in CASP7 using Pcons and ProQ. Proteins 2007, 69(Suppl 8):184–193. 10.1002/prot.21774
2. Cozzetto D, Tramontano A: Relationship between multiple sequence alignments and quality of protein comparative models. Proteins 2005, 58:151–157. 10.1002/prot.20284
3. Simons KT, Kooperberg C, Huang E, Baker D: Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J Mol Biol 1997, 268:209–225. 10.1006/jmbi.1997.0959
4. Kihara D, Lu H, Kolinski A, Skolnick J: TOUCHSTONE: An ab initio protein structure prediction method that uses threading-based tertiary restraints. Proc Natl Acad Sci USA 2001, 98:10125–10130. 10.1073/pnas.181328398
5. Boniecki M, Rotkiewicz P, Skolnick J, Kolinski A: Protein fragment reconstruction using various modeling techniques. J Comput Aided Mol Des 2003, 17:725–738. 10.1023/B:JCAM.0000017486.83645.a0
6. Kolinski A: Protein modeling and structure prediction with a reduced representation. Acta Biochim Pol 2004, 51:349–371.
7. Sanchez R, Sali A: Comparative protein structure modeling. Introduction and practical examples with MODELLER. Methods Mol Biol 2000, 143:97–129.
8. Qian B, Ortiz A, Baker D: Improvement of comparative model accuracy by free-energy optimization along principal components of natural structural variation. Proc Natl Acad Sci USA 2004, 101:15346–15351. 10.1073/pnas.0404703101
9. Lazaridis T, Karplus M: Discrimination of the native from misfolded protein models with an energy function including implicit solvation. J Mol Biol 1999, 288:477–487. 10.1006/jmbi.1999.2685
10. Kuhlman B, Baker D: Native protein sequences are close to optimal for their structures. Proc Natl Acad Sci USA 2000, 97:10383–10388. 10.1073/pnas.97.19.10383
11. Vorobjev Y, Hermans J: Free energies of protein decoys provide insight into determinants of protein stability. Protein Sci 2001, 10:2498–2506. 10.1110/ps.ps.15501
12. Felts A, Gallicchio E, Wallqvist A, Levy R: Distinguishing native conformations of proteins from decoys with an effective free energy estimator based on the OPLS all-atom force field and the Surface Generalized Born solvent model. Proteins 2002, 48:404–422. 10.1002/prot.10171
13. Dominy B, Brooks C: Identifying native-like protein structures using physics-based potentials. J Comput Chem 2002, 23:147–160. 10.1002/jcc.10018
14. Oldziej S, Czaplewski C, Liwo A, Chinchio M, Nanias M, Vila JA, Khalili M, Arnautova YA, Jagielska A, Makowski M, Schafroth HD, Kazmierkiewicz R, Ripoll DR, Pillardy J, Saunders JA, Kang YK, Gibson KD, Scheraga HA: Physics-based protein-structure prediction using a hierarchical protocol based on the UNRES force field: Assessment in two blind tests. Proc Natl Acad Sci USA 2005, 102:7547–7552. 10.1073/pnas.0502655102
15. Shortle D, Simons KT, Baker D: Clustering of low-energy conformations near the native structures of small proteins. Proc Natl Acad Sci USA 1998, 95:11158–11162. 10.1073/pnas.95.19.11158
16. Simons KT, Ruczinski I, Kooperberg C, Fox BA, Bystroff C, Baker D: Improved recognition of native-like protein structures using a combination of sequence-dependent and sequence-independent features of proteins. Proteins 1999, 34:82–95. 10.1002/(SICI)1097-0134(19990101)34:1<82::AID-PROT7>3.0.CO;2-A
17. Vendruscolo M, Najmanovich R, Domany E: Can a pairwise contact potential stabilize native protein folds against decoys obtained by threading? Proteins 2000, 38:134–148. 10.1002/(SICI)1097-0134(20000201)38:2<134::AID-PROT3>3.0.CO;2-A
18. Cozzetto D, Kryshtafovych A, Ceriani M, Tramontano A: Assessment of predictions in the model quality assessment category. Proteins 2007, 69:175–183. 10.1002/prot.21669
19. Wu S, Skolnick J, Zhang Y: Ab initio modeling of small proteins by iterative TASSER simulations. BMC Biol 2007, 5:17. 10.1186/1741-7007-5-17
20. Zhang 2007 Decoy Sets [http://zhang.bioinformatics.ku.edu/ITASSER/decoys/]
21. Zhang Y, Skolnick J: SPICKER: A clustering approach to identify near-native protein folds. J Comput Chem 2004, 25:865–871. 10.1002/jcc.20011
22. Wallner B, Fang H, Elofsson A: Automatic consensus-based fold recognition using Pcons, ProQ, and Pmodeller. Proteins 2003, 53(Suppl 6):534–541. 10.1002/prot.10536
23. Lundstrom J, Rychlewski L, Bujnicki J, Elofsson A: Pcons: a neural-network-based consensus predictor that improves fold recognition. Protein Sci 2001, 10:2354–2362. 10.1110/ps.08501
24. Wallner B, Elofsson A: Can correct protein models be identified? Protein Sci 2003, 12:1073–1086. 10.1110/ps.0236803
25. Cheng J, Randall AZ, Sweredoski M, Baldi P: SCRATCH: a protein structure and structural feature prediction server. Nucleic Acids Res 2005, 33(Web Server issue):W72–W76. 10.1093/nar/gki396
26. Moult J, Fidelis K, Kryshtafovych A, Rost B, Hubbard T, Tramontano A: Critical assessment of methods of protein structure prediction - Round VII. Proteins 2007, 69(Suppl 8):3–9. 10.1002/prot.21767
27. Zemla A, Venclovas C, Moult J, Fidelis K: Processing and analysis of CASP3 protein structure predictions. Proteins 1999, 37(Suppl 3):22–29. 10.1002/(SICI)1097-0134(1999)37:3+<22::AID-PROT5>3.0.CO;2-W
28. Sali A, Blundell TL: Comparative protein modeling by satisfaction of spatial restraints. J Mol Biol 1993, 234:779–815. 10.1006/jmbi.1993.1626
29. Marti-Renom MA, Stuart A, Fiser A, Sanchez R, Melo F, Sali A: Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 2000, 29:291–325. 10.1146/annurev.biophys.29.1.291
30. Fiser A, Do RK, Sali A: Modeling of loops in protein structures. Protein Sci 2000, 9:1753–1773.
31. Tsai J, Bonneau R, Morozov AV, Kuhlman B, Rohl CA, Baker D: An improved protein decoy set for testing energy functions for protein structure prediction. Proteins 2003, 53:76–87. 10.1002/prot.10454
32. Baker D, Bystroff C, Fletterick RJ, Agard DA: PRISM: topologically constrained phased refinement for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr 1993, 49:429–439. 10.1107/S0907444993004032
33. Sun S: Reduced representation approach to protein tertiary structure prediction: statistical potential and simulated annealing. J Theor Biol 1995, 172:13–32. 10.1006/jtbi.1995.0002
34. Canutescu AA, Shelenkov AA, Dunbrack RL: A graph-theory algorithm for rapid protein side-chain prediction. Protein Sci 2003, 12:2001–2014. 10.1110/ps.03154503
35. Lazaridis T, Karplus M: Effective energy function for proteins in solution. Proteins 1999, 35:133–152. 10.1002/(SICI)1097-0134(19990501)35:2<133::AID-PROT1>3.0.CO;2-N
36. Hobohm U, Sander C: Enlarged representative set of protein structures. Protein Sci 1994, 3:522–524.
37. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res 2000, 28:235–242. 10.1093/nar/28.1.235
38. Pollastri G, Przybylski D, Rost B, Baldi P: Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins 2002, 47:228–235. 10.1002/prot.10082
39. Pollastri G, Baldi P, Fariselli P, Casadio R: Prediction of coordination number and relative solvent accessibility in proteins. Proteins 2002, 47:142–153. 10.1002/prot.10069
40. Baldi PF, Pollastri G: The principled design of large-scale recursive neural network architectures - DAG-RNNs and the protein structure prediction problem. J Mach Learn Res 2003, 4:575–602.
41. Pollastri G, Baldi P: Prediction of contact maps by GIOHMMs and recurrent neural networks using lateral propagation from all four cardinal corners. Bioinformatics 2002, 18:S62–S70.
42. Kortemme T, Morozov AV, Baker D: An orientation-dependent hydrogen bonding potential improves prediction of specificity and structure for proteins and protein-protein complexes. J Mol Biol 2003, 326:1239–1259. 10.1016/S0022-2836(03)00021-4
43. Neria E, Fischer S, Karplus M: Simulation of activation free energies in molecular systems. J Chem Phys 1996, 105:1902–1921. 10.1063/1.472061
44. Skolnick J, Kolinski A, Ortiz AR: MONSSTER: A method for folding globular proteins with a small number of distance restraints. J Mol Biol 1997, 265:217–241. 10.1006/jmbi.1996.0720
45. Privalov PL, Makhatadze GI: Contribution of hydration to protein folding thermodynamics II. The entropy and Gibbs energy of hydration. J Mol Biol 1993, 232:660–679. 10.1006/jmbi.1993.1417
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.