Digital Repository

Developing a Biophysically Grounded Deep Learning Model for Gene Expression Prediction

dc.contributor.advisor Martinez-Corral, Rosa
dc.contributor.author SHIVHARE, PRARABDH
dc.date.accessioned 2025-05-27T10:38:28Z
dc.date.available 2025-05-27T10:38:28Z
dc.date.issued 2025-05
dc.identifier.citation 58 en_US
dc.identifier.uri http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/10101
dc.description.abstract In the past two decades of literature on modeling complex systems, there has been a balance, or rather a tension, between the predictive power and the interpretability of machine learning models trained on vast amounts of data. Biological complex systems are no different. The past decade has seen an astonishing increase in the amount of publicly available functional genomics data. While the adoption of deep learning techniques to determine the sequence patterns, syntax and grammar in DNA sequence elements that govern gene regulatory activity has been a natural consequence, most of these investigations have adopted a ‘black box’ approach, with model predictions that are hard to interpret mechanistically. Multiple attribution strategies, which seek to extract meaningful post-hoc interpretations from neural networks, have been proposed to address this problem. However, there remains a substantial gap in the literature between the outputs of such post-hoc methods and fully mechanistic models, specifically in the context of gene regulation. This problem can be at least partially overcome by including some level of mechanistic detail in the internal structure of deep learning algorithms. Doing so can help us better understand the model's predictions and obtain mechanistic insight. Here, we use a cell-state-specific Massively Parallel Reporter Assay dataset from hematopoietic stem cells to model gene regulation. The model uses deep learning to predict transcription factor (TF) binding on DNA sequence, employing cell-state-specific ChIP-seq data, and graph-based representations of Markov processes to model the effects of bound TFs on different rate-limiting steps in the transcriptional cycle. Our model assumptions are grounded in recent biophysical findings in the literature. en_US
dc.description.sponsorship Dr. Rosa Martinez-Corral, Dr. Lars Velten en_US
dc.language.iso en en_US
dc.subject Deep Learning en_US
dc.subject Biophysics en_US
dc.title Developing a Biophysically Grounded Deep Learning Model for Gene Expression Prediction en_US
dc.type Thesis en_US
dc.description.embargo One Year en_US
dc.type.degree BS-MS en_US
dc.contributor.department Dept. of Data Science en_US
dc.contributor.registration 20201147 en_US


This item appears in the following Collection(s)

  • MS THESES [1969]
    Thesis submitted to IISER Pune in partial fulfilment of the requirements for the BS-MS Dual Degree Programme/MSc. Programme/MS-Exit Programme
