dc.description.abstract |
Precise inference of gene interaction networks is crucial for understanding complex biological processes and disease mechanisms. Traditional methods often rely on curated databases, which may overlook important but undiscovered interactions. Various deep learning-based approaches have been developed over the last two years to address this issue. However, most of these methods are either database-specific or restricted to specific kinds of genes. Therefore, this thesis introduces a deep learning-based transformer model that predicts missing edges in gene interaction networks using existing databases. The proposed method integrates heterogeneous gene interaction data with microarray expression data, leveraging the attention mechanism of transformer models to uncover intricate relationships. In the first stage, the model takes a candidate gene’s one-hot encoding and its microarray expression values and passes them through a fully connected network to generate individual embeddings, each of size d. These embeddings, concatenated into a vector of size 2d, are passed through a standard transformer encoder, which reduces them to a d-dimensional embedding that captures significant information from both gene identity and expression. In the second stage, the transformer-generated embeddings of gene pairs are used to train an SVM classifier. The input to the classifier is the element-wise product of a gene pair’s embeddings, along with the pair’s known interaction label.
The performance of the proposed model is compared with state-of-the-art methods in terms of AUC-ROC and AUPR on seven standard datasets, each corresponding to cell-type-specific ChIP-seq, non-specific ChIP-seq, and STRING ground-truth networks. The empirical analysis shows that the proposed model outperforms the state-of-the-art methods, which indicates its potential for predicting new or undocumented interactions in biological networks. In future work, the performance and scalability of the model need to be tested on other types of reasonably large networks. |
en_US |