Table 4 Comparison of prediction quality measured via accuracy, MCC and AROC between the proposed method that uses the set of 88 features (including composition, collocation, pI and hydrophobicity), a method that uses the 86 composition and collocation features, and a method that uses only pI and hydrophobicity features.

From: CRYSTALP2: sequence-based protein crystallization propensity prediction

| Dataset  | Method (# features)                            | Accuracy (%) | MCC  | AROC |
|----------|------------------------------------------------|--------------|------|------|
| TEST-RL  | only pI and hydrophobicity (2 features)        | 67.4         | 0.38 | 0.63 |
|          | only composition and collocation (86 features) | 62.8         | 0.26 | 0.66 |
|          | CRYSTALP2 (88 features)                        | 69.8         | 0.40 | 0.72 |
| TEST     | only pI and hydrophobicity (2 features)        | 66.0         | 0.37 | 0.66 |
|          | only composition and collocation (86 features) | 63.2         | 0.26 | 0.69 |
|          | CRYSTALP2 (88 features)                        | 75.7         | 0.52 | 0.79 |
| TEST-NEW | only pI and hydrophobicity (2 features)        | 68.8         | 0.41 | 0.71 |
|          | only composition and collocation (86 features) | 61.9         | 0.24 | 0.66 |
|          | CRYSTALP2 (88 features)                        | 69.3         | 0.39 | 0.74 |

  1. Results are based on training the classification model on the FEAT dataset and testing on the TEST-RL, TEST, and TEST-NEW datasets, respectively.
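
For readers unfamiliar with the three quality measures in the table, the following is a minimal sketch of how accuracy, the Matthews correlation coefficient (MCC), and the area under the ROC curve (AROC) are commonly computed for a binary classifier. It uses the standard textbook definitions and is not taken from the CRYSTALP2 implementation; the function names, the 0.5 decision threshold, and the toy labels and scores are all illustrative assumptions, not data from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Percentage of correct predictions."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, 1]; 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def aroc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive receives a
    higher score than a randomly chosen negative (ties count as 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not from the paper).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

counts = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
print("AROC:", aroc(y_true, scores))
```

Accuracy and MCC depend on the chosen decision threshold, whereas AROC is computed from the raw scores and summarizes performance across all thresholds. This is why a method can rank differently on AROC than on accuracy or MCC, as the two-feature and 86-feature variants do in several rows of the table.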