Digital Repository

Predicting residue-residue contacts at protein-protein interfaces using surface features - a machine learning approach

dc.contributor.advisor Sorzano Sánchez, Carlos Óscar
dc.contributor.author UNNI, ADITHYAN
dc.date.accessioned 2023-05-10T04:03:41Z
dc.date.available 2023-05-10T04:03:41Z
dc.date.issued 2023-05
dc.identifier.citation 94 en_US
dc.identifier.uri http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/7807
dc.description.abstract Proteins interact with their molecular targets, such as small molecules, nucleic acids, and other proteins, via their surfaces. Protein-protein interactions are likely to be influenced by the geometrical and physicochemical properties of the surfaces of the interacting proteins. To exploit this for protein-protein interface prediction, we modify BIPSPI, an XGBoost-based partner-specific protein interface predictor, using geometrical and chemical features extracted from protein surfaces as patches by MaSIF, a framework for extracting meaningful features from protein surfaces. We construct a map from the surface-patch representation produced by MaSIF to the residue-pair representation used by BIPSPI. We show that adding internally sorted protein surface patches to BIPSPI’s existing residue-pair representation increases the mean ROC-AUC of the predictor from 0.9153 to 0.9222 when evaluated with 10-fold cross-validation on a subset of Docking Benchmark v5.5. We also evaluate the relative impact of the features used in training on the performance of the combined model, measured as loss reduction over tree splits. We observe that sorting protein surface patches internally along the feature axes increases model performance and alters the relative impacts of the features. Furthermore, to reduce memory consumption when training with protein surface patches, we develop principal component analysis-based and autoencoder-based approaches to patch compression. Both methods exhibit competitive performance when trained with sorted patches but not with unsorted patches. en_US
dc.language.iso en_US en_US
dc.subject Machine learning en_US
dc.subject Protein interactions en_US
dc.title Predicting residue-residue contacts at protein-protein interfaces using surface features - a machine learning approach en_US
dc.type Thesis en_US
dc.description.embargo One Year en_US
dc.type.degree BS-MS en_US
dc.contributor.department Dept. of Biology en_US
dc.contributor.registration 20181002 en_US
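
The patch sorting and PCA-based compression mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a patch is an array of shape (points, features), reads "sorting internally along the feature axes" as sorting each feature column independently across the points of a patch, and uses random data with hypothetical sizes. The function names `sort_patch`, `fit_pca`, and `compress` are illustrative only and do not come from MaSIF or BIPSPI.

```python
import numpy as np


def sort_patch(patch: np.ndarray) -> np.ndarray:
    """Sort each feature column independently across the points of one patch.

    Assumed interpretation of "internal sorting": the result no longer
    depends on the order in which the surface points were listed.
    """
    return np.sort(patch, axis=0)


def fit_pca(flat: np.ndarray, n_components: int):
    """Fit PCA on flattened patches (one row per patch) via SVD."""
    mean = flat.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:n_components]


def compress(flat: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Project flattened patches onto the retained principal directions."""
    return (flat - mean) @ components.T


# Hypothetical sizes: 50 patches, 20 surface points each, 5 features per point.
rng = np.random.default_rng(0)
patches = rng.normal(size=(50, 20, 5))

sorted_patches = np.stack([sort_patch(p) for p in patches])
flat = sorted_patches.reshape(len(sorted_patches), -1)  # shape (50, 100)

mean, components = fit_pca(flat, n_components=8)
codes = compress(flat, mean, components)  # shape (50, 8)
```

Under these assumptions, each patch shrinks from 100 values to 8, and the sorted representation is identical for any reordering of the points within a patch, which is one plausible reason sorted patches would compress more consistently than unsorted ones.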


This item appears in the following Collection(s)

  • MS THESES [1705]
    Thesis submitted to IISER Pune in partial fulfilment of the requirements for the BS-MS Dual Degree Programme/MSc. Programme/MS-Exit Programme
