Please use this identifier to cite or link to this item: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/7807
Title: Predicting residue-residue contacts at protein-protein interfaces using surface features - a machine learning approach
Authors: Sorzano Sánchez, Carlos Óscar
UNNI, ADITHYAN
Dept. of Biology
20181002
Keywords: Machine learning
Protein interactions
Issue Date: May-2023
Citation: 94
Abstract: Proteins interact with other molecules, such as small molecules, nucleic acids, and other proteins, via their surfaces. Protein-protein interactions are likely to be influenced by the geometrical and physicochemical properties of the surfaces of the interacting proteins. To exploit this for protein-protein interface prediction, we modify BIPSPI, an XGBoost-based partner-specific protein interface predictor, using geometrical and chemical features extracted from protein surfaces as patches with MaSIF, a framework for extracting meaningful features from protein surfaces. We construct a map from the surface-patch-level representation produced by MaSIF to the residue-pair representation used by BIPSPI. We show that adding internally sorted protein surface patches to BIPSPI’s existing residue-pair representation increases the mean ROC-AUC of the existing predictor from 0.9153 to 0.9222 when evaluated with 10-fold cross-validation on a subset of Docking Benchmark v5.5. We also evaluate the relative impact of the features used in training on the performance of the combined model, measured as loss reduction over tree splits. We observe that sorting protein surface patches internally along the feature axes increases model performance and alters the relative impacts of the features. Furthermore, to reduce memory consumption while training with protein surface patches, we develop both principal component analysis-based and autoencoder-based approaches to patch compression. We observe that both methods exhibit competitive performance when trained with sorted patches but not with unsorted patches.
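The abstract does not say exactly how patches are "sorted internally along the feature axes". One possible reading, sketched below under that assumption, is that each feature column of a patch is sorted independently, so the resulting representation no longer depends on the order in which the surface points are listed. The function name, the patch layout (points by features), and the toy values are all hypothetical and are not taken from the thesis, MaSIF, or BIPSPI.

```python
import random


def sort_patch_along_features(patch):
    """Sort each feature column of a patch independently.

    `patch` is a list of points, each a list of per-feature values
    (assumed layout: n_points x n_features; hypothetical).
    """
    n_features = len(patch[0])
    # Sort every feature column on its own, then reassemble the rows.
    columns = [sorted(point[j] for point in patch) for j in range(n_features)]
    return [list(row) for row in zip(*columns)]


# Toy patch: 4 surface points with 3 features each (made-up values).
patch = [
    [0.2, -1.0, 0.5],
    [0.9, 0.3, -0.4],
    [-0.1, 0.7, 0.1],
    [0.4, -0.2, 0.8],
]

# Reorder the points; the sorted representation should not change.
shuffled = patch[:]
random.Random(0).shuffle(shuffled)

assert sort_patch_along_features(patch) == sort_patch_along_features(shuffled)
```

Under this reading, sorting trades the pairing of feature values within a point for invariance to point order, which may be one reason a downstream tree-based model or a compression step could behave differently on sorted and unsorted patches.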
URI: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/7807
Appears in Collections: MS THESES

Files in This Item:
File                                    Description    Size       Format
20181002_Adithyan_Unni_MS_Thesis.pdf    MS Thesis      3.92 MB    Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.