| Sample | Size | Averaging method | SSEA | DSSP (NCL) | DSSP (CL) | Ours (NCL) | Ours (CL) |
|---|---|---|---|---|---|---|---|
| ALL | 1183 | U | 2.30 | 2.27 | 2.49 | 2.36 | 2.50 |
| | | R | 2.08 | 2.07 | 2.27 | 2.09 | 2.26 |
| | | L | 1.71 | 1.70 | 1.84 | 1.68 | 1.85 |
| MEDIUM | 631 | U | 1.82 | 1.87 | 1.98 | 1.81 | 2.04 |
| | | R | 1.62 | 1.66 | 1.77 | 1.59 | 1.78 |
| | | L | 1.18 | 1.18 | 1.27 | 1.11 | 1.26 |
| LONG | 475 | U | 1.96 | 2.03 | 2.05 | 1.92 | 2.00 |
| | | R | 1.81 | 1.85 | 1.90 | 1.76 | 1.86 |
| | | L | 1.64 | 1.68 | 1.73 | 1.61 | 1.71 |
| RANDOM | 591 | U | 1.76 | 1.77 | 1.87 | 1.88 | 1.98 |
| | | R | 1.64 | 1.63 | 1.73 | 1.71 | 1.81 |
| | | L | 1.42 | 1.37 | 1.47 | 1.43 | 1.53 |
- Average log-odds scores of various clustering functions. Sample MEDIUM consists of the protein domains in ALL that have between 70 and 140 residues; LONG consists of those that are longer. RANDOM is the average over 40 samples obtained by randomly splitting ALL into parts that are equal in size on average. Averaging methods: U is unweighted, R is weighted by the root of the fold size, and L is weighted by the fold size (within a sample); in each case, folds with fewer than 2 representatives in a sample are excluded. SSEA is the score computed by the SSEA program from DSSP output; DSSP is the score obtained from DSSP output and our alignment program; Ours uses our structure determination and alignment programs. Our closed-loop annotations were transferred to the DSSP output to obtain the CL version of that score.
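
The three averaging methods can be illustrated with a minimal sketch. It assumes that each fold already has a single score, that "root" means square root, and that the weighted sum is normalized by the sum of the weights; none of these details is stated in the caption. The function name `average_score` and the toy fold data are hypothetical.

```python
import math


def average_score(fold_scores, fold_sizes, method="U", min_size=2):
    """Average per-fold scores using one of the U, R, or L weightings.

    fold_scores: dict mapping fold name -> score for that fold (assumed given)
    fold_sizes:  dict mapping fold name -> number of representatives in the sample
    method:      "U" (unweighted), "R" (sqrt of fold size), "L" (fold size)
    min_size:    folds with fewer representatives than this are excluded
    """
    weight_fns = {
        "U": lambda n: 1.0,
        "R": lambda n: math.sqrt(n),  # assumption: "root" is the square root
        "L": lambda n: float(n),
    }
    if method not in weight_fns:
        raise ValueError(f"unknown averaging method: {method!r}")
    weight_fn = weight_fns[method]

    weighted_total = 0.0
    weight_sum = 0.0
    for fold, score in fold_scores.items():
        size = fold_sizes[fold]
        if size < min_size:
            continue  # skip folds with fewer than min_size representatives
        weight = weight_fn(size)
        weighted_total += weight * score
        weight_sum += weight

    if weight_sum == 0.0:
        raise ValueError("no fold has enough representatives")
    # Assumption: normalize by the total weight to obtain a weighted mean.
    return weighted_total / weight_sum


# Toy data (hypothetical): fold "c" has one representative and is excluded.
example_scores = {"a": 2.0, "b": 1.0, "c": 5.0}
example_sizes = {"a": 4, "b": 16, "c": 1}

for m in ("U", "R", "L"):
    print(m, round(average_score(example_scores, example_sizes, m), 4))
```

In this toy case the larger fold has the lower score, so the average drops as the weighting shifts from U to R to L.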