dc.contributor.author |
GUPTA, ABHIJIT |
en_US |
dc.contributor.author |
Kulkarni, Mandar |
en_US |
dc.contributor.author |
MUKHERJEE, ARNAB |
en_US |
dc.date.accessioned |
2022-06-13T04:29:20Z |
|
dc.date.available |
2022-06-13T04:29:20Z |
|
dc.date.issued |
2021-09 |
en_US |
dc.identifier.citation |
Patterns, 2(9), 100329. |
en_US |
dc.identifier.issn |
2666-3899 |
en_US |
dc.identifier.uri |
https://doi.org/10.1016/j.patter.2021.100329 |
en_US |
dc.identifier.uri |
http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/7036 |
|
dc.description.abstract |
DNA carries the genetic code of life, with different conformations associated with different biological functions. Predicting the conformation of DNA from its primary sequence, although desirable, is a challenging problem owing to the polymorphic nature of DNA. We have deployed a host of machine learning algorithms, including the popular state-of-the-art LightGBM (a gradient boosting model), for building prediction models. We used a nested cross-validation strategy to address the issues of “overfitting” and selection bias. This strategy simultaneously provides an unbiased estimate of the generalization performance of a machine learning algorithm and allows us to tune the hyperparameters optimally. Furthermore, we built a secondary model based on SHAP (SHapley Additive exPlanations) that offers crucial insight into model interpretability. Our detailed model-building strategy and robust statistical validation protocols tackle the formidable challenge of working with small datasets, which are common in biological and medical research. |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
Elsevier B.V. |
en_US |
dc.subject |
Machine learning |
en_US |
dc.subject |
LightGBM |
en_US |
dc.subject |
DNA sequence |
en_US |
dc.subject |
DNA conformation |
en_US |
dc.subject |
Nested cross-validation |
en_US |
dc.subject |
Genome |
en_US |
dc.subject |
2021 |
|
dc.title |
Accurate prediction of B-form/A-form DNA conformation propensity from primary sequence: A machine learning and free energy handshake |
en_US |
dc.type |
Article |
en_US |
dc.contributor.department |
Dept. of Chemistry |
en_US |
dc.identifier.sourcetitle |
Patterns |
en_US |
dc.publication.originofpublisher |
Foreign |
en_US |