Accurate prediction of B-form/A-form DNA conformation propensity from primary sequence: A machine learning and free energy handshake

Gupta, Abhijit; Kulkarni, Mandar; Mukherjee, Arnab

URI: https://doi.org/10.1016/j.patter.2021.100329
http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/7036
Date: 2021-09

Abstract:

DNA carries the genetic code of life, and its different conformations are associated with different biological functions. Predicting the conformation of DNA from its primary sequence, although desirable, is challenging owing to the polymorphic nature of DNA. We deployed a host of machine learning algorithms, including the popular state-of-the-art LightGBM (a gradient boosting model), to build prediction models. We used a nested cross-validation strategy to address overfitting and selection bias. This strategy simultaneously provides an unbiased estimate of the generalization performance of a machine learning algorithm and allows the hyperparameters to be tuned optimally. Furthermore, we built a secondary model based on SHAP (SHapley Additive exPlanations) that offers crucial insight into model interpretability. Our detailed model-building strategy and robust statistical validation protocols tackle the formidable challenge of working with small datasets, which are common in biological and medical research.
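The nested cross-validation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: a toy k-nearest-neighbour classifier stands in for LightGBM so that the example needs only the Python standard library, and the dataset, the function names (`nested_cv`, `toy_dataset`, and so on) and the candidate hyperparameter values are all hypothetical. The point is the structure: the inner loop tunes a hyperparameter using only the outer-training data, and the outer loop scores the tuned model on data that played no part in the tuning.

```python
import random
from collections import Counter


def make_folds(indices, k, rng):
    """Shuffle the indices and split them into k roughly equal folds."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to the query."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], query))
    nearest = sorted(train, key=sq_dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test points whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def split(data, held_out):
    """Return (train, test) given the indices held out for testing."""
    held = set(held_out)
    train = [row for i, row in enumerate(data) if i not in held]
    test = [data[i] for i in held_out]
    return train, test


def cross_val_accuracy(data, k_neighbors, n_folds, rng):
    """Plain (non-nested) cross-validation, used here as the inner loop."""
    folds = make_folds(list(range(len(data))), n_folds, rng)
    scores = [accuracy(*split(data, fold), k_neighbors) for fold in folds]
    return sum(scores) / len(scores)


def nested_cv(data, candidate_ks, outer_folds=5, inner_folds=3, seed=0):
    """Return the mean outer-fold accuracy and the k chosen in each fold."""
    rng = random.Random(seed)
    outer = make_folds(list(range(len(data))), outer_folds, rng)
    outer_scores, chosen = [], []
    for held_out in outer:
        train, test = split(data, held_out)
        # Inner loop: tune the hyperparameter on the outer-training data only.
        best_k = max(
            candidate_ks,
            key=lambda k: cross_val_accuracy(train, k, inner_folds, rng),
        )
        # Outer loop: evaluate on data never seen during tuning.
        outer_scores.append(accuracy(train, test, best_k))
        chosen.append(best_k)
    return sum(outer_scores) / len(outer_scores), chosen


def toy_dataset(n=60, seed=1):
    """Hypothetical two-class data: two Gaussian clusters in two dimensions."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 3.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


if __name__ == "__main__":
    score, ks = nested_cv(toy_dataset(), candidate_ks=[1, 3, 5])
    print(f"nested CV accuracy: {score:.2f}; k chosen per outer fold: {ks}")
```

Because the outer test folds never influence the choice of hyperparameter, the averaged outer score is not inflated by the tuning step, which matters most when the dataset is small.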

This item appears in the following Collection(s)

JOURNAL ARTICLES [6722]
Journal articles published by IISER Pune community
