Please use this identifier to cite or link to this item: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/9232
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Basu, Tanmay | -
dc.contributor.author | SADHU, PRANAV | -
dc.date.accessioned | 2024-12-16T07:17:21Z | -
dc.date.available | 2024-12-16T07:17:21Z | -
dc.date.issued | 2024-12 | -
dc.identifier.citation | 39 | en_US
dc.identifier.uri | http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/9232 | -
dc.description.abstract | Precise inference of gene interaction networks is crucial for understanding complex biological processes and disease mechanisms. Traditional methods often rely on curated databases, which may overlook important but undiscovered interactions. Various deep learning-based approaches have been developed over the last two years to address this issue. However, most of these methods are either database-specific or consider only specific kinds of genes. Therefore, a deep learning-based transformer model is introduced in this thesis to predict missing edges in gene interaction networks using the existing databases. The proposed method integrates heterogeneous gene interaction data with microarray expression data, leveraging the attention mechanisms in transformer models to uncover intricate relationships. In the first stage, the model takes a candidate gene's one-hot encoding and its microarray expression values and passes them through a fully connected network to generate individual embeddings, each of size d. These embeddings, concatenated into a vector of size 2d, are passed through a standard transformer encoder, which reduces them to a d-dimensional embedding that captures significant information from both gene identity and expression. In the second stage, these transformer-generated embeddings for gene pairs are used to train an SVM classifier. The input to the classifier is the element-wise product of a gene pair's embeddings, along with the pair's known interaction label. The performance of the proposed model is compared with state-of-the-art methods in terms of AUC-ROC and AUPR using seven standard datasets, each corresponding to cell-type-specific ChIP-seq, non-specific ChIP-seq, or STRING ground-truth networks. The empirical analysis shows that the proposed model outperforms the state-of-the-art methods, which indicates its potential for predicting new or undocumented interactions in biological networks. In the future, the performance and scalability of the model need to be tested on various other types of reasonably large networks. | en_US
dc.language.iso | en | en_US
dc.subject | Gene Regulatory Network | en_US
dc.subject | Gene Embeddings | en_US
dc.subject | Transformer encoder | en_US
dc.subject | Link Prediction | en_US
dc.title | GeneNet Transformer : A Novel Transformer-Based Architecture for Gene Network Inference | en_US
dc.type | Thesis | en_US
dc.description.embargo | One Year | en_US
dc.type.degree | BS-MS | en_US
dc.contributor.department | Dept. of Biology | en_US
dc.contributor.registration | 20191152 | en_US
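
The abstract in this record describes two data-shaping steps that are easy to misread: joining two size-d embeddings into one size-2d vector before the transformer encoder, and forming the classifier input as the element-wise product of a gene pair's embeddings. The following is a minimal sketch of those two steps only, not the thesis implementation. The function names, gene names, and toy embedding values are all hypothetical, and the fully connected network, transformer encoder, and SVM training are not reproduced here.

```python
from typing import Dict, List, Sequence, Tuple

Embedding = List[float]


def concat_embeddings(identity_emb: Sequence[float],
                      expression_emb: Sequence[float]) -> Embedding:
    """Stage 1 input: join two size-d embeddings into one size-2d vector."""
    if len(identity_emb) != len(expression_emb):
        raise ValueError("both embeddings must have the same size d")
    return list(identity_emb) + list(expression_emb)


def pair_feature(emb_a: Sequence[float], emb_b: Sequence[float]) -> Embedding:
    """Stage 2 input: element-wise product of two gene embeddings."""
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same size d")
    return [a * b for a, b in zip(emb_a, emb_b)]


def build_training_set(
    gene_embeddings: Dict[str, Embedding],
    labelled_pairs: Sequence[Tuple[str, str, int]],
) -> Tuple[List[Embedding], List[int]]:
    """Turn (gene_a, gene_b, label) triples into feature rows and labels."""
    features: List[Embedding] = []
    labels: List[int] = []
    for gene_a, gene_b, label in labelled_pairs:
        features.append(pair_feature(gene_embeddings[gene_a],
                                     gene_embeddings[gene_b]))
        labels.append(label)
    return features, labels


# Toy values invented for illustration only (d = 3); real embeddings would
# come from the transformer encoder described in the abstract.
gene_embeddings = {
    "geneA": [1.0, 2.0, 0.5],
    "geneB": [0.5, -1.0, 4.0],
    "geneC": [0.0, 3.0, 2.0],
}
# Label 1 = known interaction, 0 = no known interaction (assumed encoding).
labelled_pairs = [("geneA", "geneB", 1), ("geneA", "geneC", 0)]

X, y = build_training_set(gene_embeddings, labelled_pairs)
```

The resulting `X` and `y` have the shape an SVM implementation would expect, for example scikit-learn's `SVC().fit(X, y)`. The abstract does not state which SVM library or kernel was used, so that choice is left open here.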
Appears in Collections: MS THESES

Files in This Item:
File | Description | Size | Format
20191152_Pranav_Sadhu_MS_Thesis | MS Thesis | 934.08 kB | Adobe PDF

