Abstract:
Protein-protein interactions are critical to life and play crucial roles in a variety of cellular processes. Predicting these interactions would therefore provide insight into cellular processes and may ultimately allow us to manipulate and control them. In this study, we developed knowledge-based pairwise statistical potentials, derived from experimentally determined structures, for the prediction of protein-protein complexes. Structures of protein dimers in the Protein Data Bank (PDB) were used to construct the statistical potentials. A total of 96 different pairwise potentials were constructed by varying five parameters: distance threshold for interactions, interacting atom types, weight type, weighting scheme and reference state. The performance of these potentials was benchmarked using Receiver Operating Characteristic (ROC) curves and rank-ordering. With all other parameters held constant, the side chain-side chain pairwise potentials performed best. The best-performing pairwise potential could discriminate native structures from a sequence-randomized background in a benchmark set of 296 structures with a false positive rate of 1.4% and a true positive rate of 98.6%. This is an improvement over the MODTIE potential, which had a false positive rate of 28.5% and a true positive rate of 71.5%. The pairwise potentials are also complementary to each other, in the sense that they perform well on different subsets of the benchmark set. Hence, a combination of the different potentials could yield better prediction accuracy. We also began developing a 5-body potential based on the pairwise potential. Two versions, one unweighted and one weighted, were developed; the weighted version performed better than the unweighted one. Refinement of these multi-body potentials is in progress.
This prediction system will be made available as a web server in the near future.
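
As an illustration of the ROC benchmark described above, the following minimal sketch shows how a true positive rate and a false positive rate can be computed for a scoring function that separates native structures from a sequence-randomized background. It is not the implementation used in this study: the function names `tpr_fpr` and `roc_points`, the toy scores, and the convention that lower scores are more favorable are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def tpr_fpr(native_scores: Sequence[float],
            decoy_scores: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    """Return (TPR, FPR) when every structure scoring below `threshold` is called native.

    Assumption: lower (more negative) scores indicate more favorable interactions.
    """
    true_pos = sum(1 for s in native_scores if s < threshold)
    false_pos = sum(1 for s in decoy_scores if s < threshold)
    return true_pos / len(native_scores), false_pos / len(decoy_scores)


def roc_points(native_scores: Sequence[float],
               decoy_scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Sweep the threshold over all observed scores and return (FPR, TPR) pairs."""
    thresholds = sorted(set(native_scores) | set(decoy_scores))
    # One extra threshold above every score, so the curve ends at (1, 1).
    thresholds.append(thresholds[-1] + 1.0)
    points = []
    for t in thresholds:
        tpr, fpr = tpr_fpr(native_scores, decoy_scores, t)
        points.append((fpr, tpr))
    return points


if __name__ == "__main__":
    # Hypothetical scores, for illustration only (not data from this study).
    native = [-3.0, -2.5, -2.0, -0.5]
    decoys = [-1.0, 0.0, 0.5, 1.0]
    print(tpr_fpr(native, decoys, threshold=-0.75))
    print(roc_points(native, decoys))
```

A single (false positive rate, true positive rate) pair such as those quoted above corresponds to one threshold on such a curve; sweeping the threshold traces the full ROC curve.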