Design and application of scalable machine learning algorithms in molecular recognition, structure prediction and drug discovery

GUPTA, ABHIJIT

Please use this identifier to cite or link to this item: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/6626

Title:	Design and application of scalable machine learning algorithms in molecular recognition, structure prediction and drug discovery
Authors:	MUKHERJEE, ARNAB; GUPTA, ABHIJIT; Dept. of Chemistry; 20152021
Keywords:	Machine Learning algorithm; HPC; drug discovery; structure prediction; deep learning; self-supervised learning; molecular recognition
Issue Date:	Aug-2021
Citation:	159
Abstract:	Starting with the problem of structure prediction, we leveraged machine learning to accurately predict DNA conformation from sequence. We developed an end-to-end, data-driven approach that combines machine learning with free energy calculations to offer a fresh perspective on this long-standing problem. Besides accurately predicting DNA conformation, our model also explains why certain sequences adopt a particular conformation. Moving from DNA to proteins, we employed unsupervised learning (hierarchical clustering) together with our algebraic fitting algorithm to study the curvature of protein surfaces. We then used surface curvature to assess shape complementarity between interacting biomolecules, with the aim of devising a scoring algorithm for the fast selection of binders whose curvature is complementary to that of a particular active site. To understand the binding mechanism at the molecular level, one needs to identify an appropriate reaction coordinate. Our next endeavour was therefore to devise a novel approach based on regularized sparse autoencoders, an energy-based model, to predict a useful and physically intuitive set of reaction coordinates. Although finding strong binders is the first step towards finding a drug, it is not the most crucial one, since not every binder to a receptor qualifies as a drug: a drug must also satisfy a set of conditions known as ADME (absorption, distribution, metabolism, and excretion) criteria. Finally, we therefore addressed the question “what makes a molecule a putative drug?”. We used representation learning in conjunction with modern graph neural network architectures to learn and predict the attributes crucial to prospective drug-like activity. Overall, the goal of the studies in this thesis is to enable the fast selection of putative drugs.
URI:	http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/6626
Appears in Collections:	PhD THESES

Files in This Item:

File	Description	Size	Format
20152021_Abhijit_Gupta.pdf	Ph.D Thesis	6.6 MB	Adobe PDF
