Please use this identifier to cite or link to this item: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/6020
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Priyakumar, U Deva | en_US
dc.contributor.author | BAGAL, VIRAJ | en_US
dc.date.accessioned | 2021-07-06T05:42:51Z |
dc.date.available | 2021-07-06T05:42:51Z |
dc.date.issued | 2021-07 |
dc.identifier.citation | 51 | en_US
dc.identifier.uri | http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/6020 |
dc.description | In this thesis, we propose LigGPT, a transformer decoder model for conditional molecular generation. The main component of the model is the masked self-attention mechanism, which allows it to learn the SMILES grammar and long-range dependencies very well. LigGPT has validity and uniqueness scores comparable to those of other models on the MOSES dataset. It outperforms other models in terms of novelty and internal diversity on the MOSES dataset. The model performs better than other models on the GuacaMol dataset. Using saliency maps, we show that the generative process of the model is interpretable. LigGPT is more efficient than the famous character-based recurrent neural network, as is evident from training on only ten percent of the data. Apart from unconditional generation, we show the ability of LigGPT to generate molecules based on properties. Moreover, it can also be trained to retain the scaffold structure while generating molecules having desired values of certain properties. This can have tremendous applications in any sector which involves the creation of novel molecules. We even demonstrate LigGPT’s usage in one-shot lead optimization. Consequently, LigGPT is a strong model and has the capability of making a positive impact on real-world applications of molecular generation. | en_US
dc.description.abstract | Deep learning is being widely used for de novo generation of molecules. Molecules can be represented as a string of characters, the SMILES representation, which allows the implementation of transformer architectures. In this work, we propose a transformer-decoder-based network for the generation of molecules with high validity, uniqueness and novelty. The proposed model is capable of conditional generation, where the condition can be based on a scaffold and/or multiple physicochemical properties. Moreover, we show that saliency maps can be used to make the generative process interpretable. | en_US
dc.language.iso | en_US | en_US
dc.subject | deep learning | en_US
dc.subject | molecule generation | en_US
dc.subject | natural language generation | en_US
dc.subject | interpretability | en_US
dc.subject | conditional generation | en_US
dc.subject | lead optimization | en_US
dc.subject | self supervised learning | en_US
dc.title | Conditional Molecule Generation Using Transformer Decoder | en_US
dc.type | Thesis | en_US
dc.type.degree | BS-MS | en_US
dc.contributor.department | Dept. of Chemistry | en_US
dc.contributor.registration | 20161150 | en_US
Appears in Collections: MS THESES

Files in This Item:
File | Description | Size | Format
viraj_master_thesis_final.pdf | | 3.41 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.