Abstract:
Chromatin is packaged into basic repeating units called nucleosomes, but exactly how nucleosomes influence gene regulation remains unclear. S. cerevisiae has well-positioned nucleosomes throughout its genome, providing an opportunity to study what positions them and how this positioning regulates gene expression. The relationship between genomic sequence and nucleosome positioning has been hard to interpret, given the complex ways in which sequence features and chromatin remodelers regulate nucleosomes. Sequence-to-function deep learning models have recently been used to identify complex non-linear patterns, making them a promising approach for learning the sequence rules that position nucleosomes. This project uses one such model, BPReveal, to learn the sequence rules underlying genome-wide MNase-seq data. We show that BPReveal correctly learned important nucleosome-positioning sequences without prior knowledge. Because BPReveal accurately predicts genome-wide MNase-seq data, we also show that it can be used as a tool to design synthetic sequences that alter nucleosome positioning at a specific locus in a desired fashion. We validated some of these designs experimentally and began to characterise their effect on gene expression, using MS2-MCP-based live imaging to detect single mRNAs across many cells. Overall, this work is a proof of principle that deep learning models can be used to better understand how DNA sequences position nucleosomes and thereby influence gene regulation.