Table 3 Discriminatory power of structure and sequence derived quantities

From: Discrimination of thermophilic and mesophilic proteins

Numerical index Thermophile (127 pairs) Hyperthermophile (122 pairs)
Contact Network Derived Quantities
coordination number (no cutoff) 0.559 0.689
clustering coefficient (no cutoff) 0.551 0.672
characteristic path (no cutoff) 0.520 0.631
Combined Sequence and Structure Including Threading Potentials
total count 400 over-rep quads/residue 0.850 0.943
4-body potential/residue (20Å cutoff) 0.858 0.844
4-body potential/residue (no cutoff) 0.843 0.852
4-body potential/residue (hyper only, no cutoff) ----- 0.820
4-body potential/residue (meso only, no cutoff) 0.732 0.803
4-body potential/residue (thermo only, no cutoff) 0.866 -----
ProsaII combined score 0.554 0.693
Delaunay Simplex Geometry
median circumsphere radius (no cutoff) 0.701 0.639
mean tetrahedrality (no cutoff) 0.598 0.574
number simplices/residue (10Å cutoff) 0.528 0.557
number simplices/residue (no cutoff) 0.567 0.697
Volume/Surface Area/Compactness
Naccess solvent accessible area 0.567 0.598
Delaunay surface area (no cutoff) 0.606 0.669
van der Waals area 0.559 0.549
Delaunay volume (no cutoff) 0.598 0.701
van der Waals volume 0.528 0.598
Delaunay area/volume (10Å cutoff) 0.583 0.549
Delaunay area/volume (no cutoff) 0.669 0.803
van der Waals area/volume 0.512 0.557
packing density 0.543 0.549
van der Waals volume/Delaunay volume 0.685 0.779
mean B-factor 0.661 0.533
Secondary Structure
secondary structure content (H+E 3 state DSSP) 0.614 0.689
Sequence Length
number of residues 0.528 0.672
Sequence Composition
total Kyte-Doolittle hydrophobicity 0.575 0.549
sd Kyte-Doolittle hydrophobicity 0.677 0.836
CvP bias 0.803 0.918
(E+K)/(Q+H) 0.591 0.861
IVYWREL 0.827 0.926
Discriminatory power of sequence- and structure-based indices, given as the fraction of thermophile/mesophile pairs for which the quantity was systematically higher or lower, by any amount. The contact network quantities are described in the introduction. The four-body threading contact potentials are described in [1]. A cutoff indicates that simplices with at least one edge longer than the cutoff were omitted when frequencies were tallied during the calculation of the potential. "Hyper only" indicates that the potential was trained only on chains from hyperthermophilic organisms. The Delaunay simplex geometry indices are discussed in the introduction. The volume and surface area criteria are fairly self-explanatory except, perhaps, for packing density, which is defined here as the ratio of the van der Waals volume of the protein to the all-atom Voronoi volume. The sequence composition-based indices CvP, (E+K)/(Q+H), and IVYWREL are described in the introduction.
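The pairwise fraction described in the caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `discriminatory_power`, the handling of ties (counted in neither direction), the choice to report the larger of the two directional fractions, and the example values are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def discriminatory_power(pairs: Sequence[Tuple[float, float]]) -> float:
    """Fraction of pairs in which the index differs in the dominant direction.

    Each pair is (thermophile_value, mesophile_value). Assumptions: ties
    count toward neither direction, and the reported value is the larger
    of the "higher" and "lower" fractions.
    """
    if not pairs:
        raise ValueError("need at least one pair")
    # Count pairs where the thermophile value is strictly higher or lower.
    higher = sum(1 for thermo, meso in pairs if thermo > meso)
    lower = sum(1 for thermo, meso in pairs if thermo < meso)
    return max(higher, lower) / len(pairs)


# Hypothetical index values for four thermophile/mesophile pairs.
example_pairs = [(0.42, 0.35), (0.51, 0.48), (0.30, 0.33), (0.47, 0.40)]
print(discriminatory_power(example_pairs))  # 3 of 4 pairs are higher: 0.75
```

Under these assumptions a value near 0.5 means the index does not separate the two members of a pair better than chance, while a value near 1.0 means it differs in the same direction for almost every pair.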