Comparative analysis of PDB and PhosphoSitePlus datasets. A) rSASA values from the PhosphoSitePlus and PDB datasets were binned at regular intervals of width 0.1. Data from PhosphoSitePlus are plotted on the Y1 axis and data from PDB on the Y2 axis. The majority of phosphorylation sites in the PDB dataset lie in well-accessible regions of the protein, whereas those in PhosphoSitePlus lie in moderately accessible regions. Representative structures in which phosphosites fall in three different regions of accessibility are shown in B to D: B) actin (PDB ID: 1T44), where the site lies in an inaccessible region (rSASA: 0.11); C) carbonic anhydrase II (PDB ID: 1XEV), where the site lies in a moderately accessible region (rSASA: 0.3); and D) recombining binding protein suppressor of hairless (PDB ID: 3NBN), where the site lies in a well-accessible region (rSASA: 0.73). All protein structures were fetched from PDB by matching the UniProt ID of the protein from the phosphosite data. Distribution of octapeptide secondary structure and accessibility: E) octapeptides from the PhosphoSitePlus dataset and F) octapeptides from the PDB dataset.
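The binning used for panel A can be illustrated with a minimal sketch. It is not the authors' code: the function name `bin_rsasa`, the restriction to the range 0 to 1, the left-closed bin edges, and the clamping of out-of-range values are all assumptions made for illustration. The three input values are the rSASA values quoted for panels B to D.

```python
def bin_rsasa(values, width=0.1):
    """Count rSASA values in left-closed bins [k*width, (k+1)*width) over [0, 1].

    Hypothetical helper; the bin convention is an assumption, not taken
    from the original analysis.
    """
    n_bins = round(1.0 / width)
    counts = [0] * n_bins
    for v in values:
        # Small epsilon guards against floating-point error at bin edges.
        idx = int(v * n_bins + 1e-9)
        # Assumption: values outside [0, 1) are clamped into the first or last bin.
        idx = max(0, min(idx, n_bins - 1))
        counts[idx] += 1
    return counts


# rSASA values quoted in the caption for 1T44, 1XEV and 3NBN.
example = [0.11, 0.3, 0.73]
counts = bin_rsasa(example)
for k, c in enumerate(counts):
    print(f"{k / 10:.1f}-{(k + 1) / 10:.1f}: {c}")
```

Under these assumptions the three example sites fall into the 0.1-0.2, 0.3-0.4 and 0.7-0.8 bins. Running the same function separately on each dataset would give the two count series plotted against the Y1 and Y2 axes.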