Abstract:
Breast cancer remains a global health challenge, with over 1.5 million patients and half a million deaths reported annually. The Indian subcontinent has a high prevalence, underscoring the urgent need for research. Drug resistance is a major hurdle in chemotherapy, with over 90% of cancer-related deaths attributed to it. Understanding the genetic underpinnings of drug resistance is crucial for improving treatment efficacy and reducing tumor relapse. This study used RNA-Seq data to identify genes differentially expressed between drug-resistant and drug-sensitive cancer cell lines. The analysis was carried out on the Galaxy platform and covered data collection, alignment, quantification, and differential expression analysis. Unsupervised machine learning techniques, including K-means clustering, Gaussian mixture models, and hierarchical clustering, were then applied to uncover structure and patterns in the data. The clustering methods were evaluated with the Davies-Bouldin Index (DBI), a metric of cluster compactness and separation for which lower values indicate better clustering. The results were used to benchmark the clustering techniques across cancer types and drug treatments. The study highlights the potential of computational models and omics data for predicting drug response, thereby advancing personalized cancer therapy. This research contributes to the growing field of personalized medicine by providing insight into the genomic basis of drug resistance. By integrating bioinformatics tools, machine learning, and comprehensive data analysis, it supports the development of more effective cancer treatment strategies and, ultimately, better patient outcomes.
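To make the evaluation metric concrete, the following is a minimal sketch of how a Davies-Bouldin Index could be computed for a given set of cluster labels. It is not the study's pipeline: the function names (`centroid`, `davies_bouldin`) and the toy two-dimensional points are hypothetical, chosen only to show that a compact, well-separated labeling scores lower than a mixed one.

```python
import math


def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def davies_bouldin(points, labels):
    """Davies-Bouldin Index: average, over clusters, of the worst-case
    ratio (scatter_i + scatter_j) / distance(centroid_i, centroid_j).
    Lower values mean more compact, better separated clusters.
    Assumes no two clusters share the same centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(tuple(point))
    if len(clusters) < 2:
        raise ValueError("DBI needs at least two clusters")

    ids = sorted(clusters)
    cents = {k: centroid(clusters[k]) for k in ids}
    # Scatter: mean Euclidean distance of a cluster's points to its centroid.
    scatter = {
        k: sum(math.dist(p, cents[k]) for p in clusters[k]) / len(clusters[k])
        for k in ids
    }

    total = 0.0
    for i in ids:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in ids
            if j != i
        )
    return total / len(ids)


# Hypothetical toy data: two tight groups of four points each.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11)]
good_labels = [0, 0, 0, 0, 1, 1, 1, 1]   # matches the two groups
bad_labels = [0, 1, 0, 1, 0, 1, 0, 1]    # mixes the two groups

dbi_good = davies_bouldin(toy_points, good_labels)
dbi_bad = davies_bouldin(toy_points, bad_labels)
print(f"DBI (well separated): {dbi_good:.3f}")
print(f"DBI (mixed labels):   {dbi_bad:.3f}")
```

In a benchmark of the kind described here, the labels produced by each clustering method on the same expression matrix would be scored this way, and the method with the lowest index would be preferred for that dataset.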