Digital Repository

Leveraging Deep Learning Models to Study Enhancer Evolution


dc.contributor.advisor Zeitlinger, Julia
dc.contributor.author KOUNDINYA, AMRUTHAMSHU
dc.date.accessioned 2026-05-22T06:21:17Z
dc.date.available 2026-05-22T06:21:17Z
dc.date.issued 2026-05
dc.identifier.citation 88 en_US
dc.identifier.uri http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/11137
dc.description.abstract Gene regulation is fundamental to shaping morphological diversity and is driven by cis-regulatory regions containing transcription factor motifs that orchestrate precise gene expression. While protein-coding genes are very well conserved in sequence, cis-regulatory regions are subject to strong sequence divergence. How enhancer sequences change while maintaining their function is not clear. Recent research has shown that chromatin accessibility depends on motif cooperativity that follows a flexible motif syntax and includes low-affinity motifs, providing a possible avenue by which motifs arise de novo and diverge over time. To test if evolutionary selection occurs at the level of chromatin accessibility, we used Drosophila trichome development as a model system. We comprehensively mapped the chromatin accessibility landscape across several Drosophila species by performing ATAC-seq on D. melanogaster, D. erecta, D. ananassae, and D. mojavensis embryos at the appropriate stage. This revealed that, despite considerable sequence divergence, the amount of chromatin accessibility in regulatory regions is highly conserved across species, consistent with evolutionary selection at this level. To identify precisely and in an unbiased way which motifs and motif cooperativity rules drive the levels of chromatin accessibility in each species, we trained BPReveal deep learning models to predict bias-free accessibility profiles from DNA sequence. Interpreting these models revealed that strong sequence divergence between species is associated with a high turnover of individual motif instances across orthologous regions, as well as changes in motif affinity. Nevertheless, the types of motifs and their syntax rules are largely conserved across species, suggesting that the trans-environment of transcription factors is conserved.
Consistent with this, models trained on one species perform well in predicting the ATAC-seq data from another species, with only small losses in performance at larger evolutionary distances. This suggests that cis-regulatory regions are not only subject to strong sequence divergence, but also change in the way they encode chromatin accessibility over evolutionary time. Since chromatin accessibility levels are under strong evolutionary selection, these results suggest that cis-regulatory regions diverge rapidly because sequence changes have a relatively high probability of producing similar amounts of chromatin accessibility through an alternative sequence encoding. Taken together, our data support the hypothesis that the highly flexible sequence rules of chromatin accessibility are a facilitator of cis-regulatory sequence evolution. en_US
dc.description.sponsorship Stowers Institute for Medical Research en_US
dc.language.iso en en_US
dc.subject Genomics en_US
dc.subject Evolution en_US
dc.subject Cis regulatory rules en_US
dc.subject Deep learning en_US
dc.subject chromatin accessibility en_US
dc.subject Gene regulation en_US
dc.subject Transcription factor en_US
dc.title Leveraging Deep Learning Models to Study Enhancer Evolution en_US
dc.type Thesis en_US
dc.description.embargo No Embargo en_US
dc.type.degree BS-MS en_US
dc.contributor.department Dept. of Biology en_US
dc.contributor.registration 20211224 en_US


This item appears in the following Collection(s)

  • MS THESES [2219]
    Thesis submitted to IISER Pune in partial fulfilment of the requirements for the BS-MS Dual Degree Programme/MSc. Programme/MS-Exit Programme
