Graph Representation Learning for Binding Pocket Prediction in Proteins

KARAMPURI, YASH

dc.contributor.advisor	Laha, Arnab K.
dc.contributor.author	KARAMPURI, YASH
dc.date.accessioned	2025-05-14T04:15:45Z
dc.date.available	2025-05-14T04:15:45Z
dc.date.issued	2025-05
dc.identifier.citation	125	en_US
dc.identifier.uri	http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/9837
dc.description	All Python scripts used throughout this research, including those related to the supplementary studies, are available in the following GitHub repository: https://github.com/YashKarampuri/MSThesis-Supplementary.	en_US
dc.description.abstract	This study explores several key concepts in graph-based learning and applies them to the problem of predicting and clustering ligand-binding pockets on protein surfaces. First, we investigate graph embedding techniques, scalable feature learning with algorithms such as Node2vec, and graph representation learning methods. We then explore neighborhood reconstruction methods and the use of multi-relational data and knowledge graphs, building a foundation for applying graph-based techniques to biological data. Next, we turn to the pocket prediction problem itself. Using a graph-based approach, we first generate a set of evenly spaced points on the protein’s Solvent Accessible Surface (SAS) with a fast algorithm from the CDK library. For each point, we calculate feature descriptors based on the local chemical environment, including properties of solvent-exposed atoms, distance-weighted properties of nearby atoms (within 6 Å), and other neighborhood features. These descriptors are used to predict ligandability scores with Graph Neural Networks (GNNs) and Graph Convolutional Networks (GCNs). Points with high ligandability scores are then clustered using single-linkage clustering with a 3 Å cut-off to form pocket predictions. The predicted pockets are ranked by their cumulative ligandability scores. This method provides an efficient framework for identifying potential ligand-binding pockets, contributing to drug discovery and protein-ligand interaction studies.	en_US
dc.language.iso	en_US	en_US
dc.subject	Graph Neural Networks (GNNs)	en_US
dc.subject	Graph Representation Learning	en_US
dc.subject	Protein-Ligand Interaction	en_US
dc.subject	Structure-Based Drug Design	en_US
dc.subject	Node and Graph Embeddings	en_US
dc.title	Graph Representation Learning for Binding Pocket Prediction in Proteins	en_US
dc.title.alternative	Geometric Pre-Training of GNNs for Structure-Based Drug Design	en_US
dc.type	Thesis	en_US
dc.description.embargo	No Embargo	en_US
dc.type.degree	BS-MS	en_US
dc.contributor.department	Dept. of Mathematics	en_US
dc.contributor.registration	20201105	en_US
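
The clustering and ranking steps described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis or its GitHub repository: the function name cluster_pockets, the 0.5 score threshold, and the toy coordinates are assumptions made for illustration. Only the 3 Å single-linkage cut-off and the ranking by cumulative ligandability score come from the abstract. Single-linkage clustering at a fixed distance cut-off is equivalent to finding connected components of the graph that links every pair of points within the cut-off, so the sketch uses a small union-find structure.

```python
import math
from collections import defaultdict


def cluster_pockets(points, scores, threshold=0.5, cutoff=3.0):
    """Group high-scoring surface points into ranked pockets.

    points    -- list of (x, y, z) coordinates in angstroms
    scores    -- predicted ligandability score for each point
    threshold -- hypothetical minimum score to keep a point (assumption)
    cutoff    -- single-linkage distance cut-off in angstroms (3 per the abstract)
    """
    # Keep only the points whose score passes the (assumed) threshold.
    keep = [i for i, s in enumerate(scores) if s >= threshold]
    parent = {i: i for i in keep}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of kept points that lie within the cut-off.
    for pos, a in enumerate(keep):
        for b in keep[pos + 1:]:
            if math.dist(points[a], points[b]) <= cutoff:
                parent[find(a)] = find(b)

    # Collect the members of each connected component (one pocket each).
    groups = defaultdict(list)
    for i in keep:
        groups[find(i)].append(i)

    # Rank pockets by cumulative ligandability score, highest first.
    pockets = [(sum(scores[i] for i in members), members)
               for members in groups.values()]
    pockets.sort(key=lambda pocket: pocket[0], reverse=True)
    return pockets


# Toy data: two groups of nearby points plus one low-scoring outlier.
points = [(0, 0, 0), (2, 0, 0), (4, 0, 0), (20, 0, 0), (21, 0, 0), (50, 0, 0)]
scores = [0.9, 0.8, 0.7, 0.95, 0.6, 0.1]
pockets = cluster_pockets(points, scores)
for total, members in pockets:
    print(f"pocket score {total:.2f}, point indices {members}")
```

In the toy data, points 0 and 2 are 4 Å apart but still end up in the same pocket because each is within 3 Å of point 1; this chaining is the defining behaviour of single linkage. The pairwise loop is quadratic in the number of kept points, so a spatial index would likely be preferable for large surfaces.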

Files in this item

Name: 20201105_Karampuri ...

Size: 2.021Mb

Format: PDF

Description: MS Thesis

This item appears in the following Collection(s)

MS THESES [1970]
Thesis submitted to IISER Pune in partial fulfilment of the requirements for the BS-MS Dual Degree Programme/MSc. Programme/MS-Exit Programme
