Please use this identifier to cite or link to this item: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/7036
Full metadata record
DC Field | Value | Language
dc.contributor.author | GUPTA, ABHIJIT | en_US
dc.contributor.author | Kulkarni, Mandar | en_US
dc.contributor.author | MUKHERJEE, ARNAB | en_US
dc.date.accessioned | 2022-06-13T04:29:20Z |
dc.date.available | 2022-06-13T04:29:20Z |
dc.date.issued | 2021-09 | en_US
dc.identifier.citation | Patterns, 2(9), 100329. | en_US
dc.identifier.issn | 2666-3899 | en_US
dc.identifier.uri | https://doi.org/10.1016/j.patter.2021.100329 | en_US
dc.identifier.uri | http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/7036 |
dc.description.abstract | DNA carries the genetic code of life, with different conformations associated with different biological functions. Predicting the conformation of DNA from its primary sequence, although desirable, is a challenging problem owing to the polymorphic nature of DNA. We have deployed a host of machine learning algorithms, including the popular state-of-the-art LightGBM (a gradient boosting model), for building prediction models. We used the nested cross-validation strategy to address the issues of “overfitting” and selection bias. This simultaneously provides an unbiased estimate of the generalization performance of a machine learning algorithm and allows us to tune the hyperparameters optimally. Furthermore, we built a secondary model based on SHAP (SHapley Additive exPlanations) that offers crucial insight into model interpretability. Our detailed model-building strategy and robust statistical validation protocols tackle the formidable challenge of working on small datasets, which is often the case in biological and medical data. | en_US
dc.language.iso | en | en_US
dc.publisher | Elsevier B.V. | en_US
dc.subject | Machine learning | en_US
dc.subject | Light GBM | en_US
dc.subject | DNA sequence | en_US
dc.subject | DNA conformation | en_US
dc.subject | Nested cross-validation | en_US
dc.subject | Genome | en_US
dc.subject | 2021 |
dc.title | Accurate prediction of B-form/A-form DNA conformation propensity from primary sequence: A machine learning and free energy handshake | en_US
dc.type | Article | en_US
dc.contributor.department | Dept. of Chemistry | en_US
dc.identifier.sourcetitle | Patterns | en_US
dc.publication.originofpublisher | Foreign | en_US
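
The abstract describes nested cross-validation as a way to tune hyperparameters and estimate generalization performance at the same time. The following is a minimal sketch of that idea, not the authors' code. It assumes a toy one-dimensional k-nearest-neighbour classifier in place of LightGBM, and synthetic data in place of DNA sequence features. All function names (`k_folds`, `knn_predict`, `accuracy`, `nested_cv`) are hypothetical and exist only for this illustration.

```python
import random

def k_folds(indices, n_folds):
    """Split indices into n_folds disjoint (train, test) pairs."""
    folds = [indices[i::n_folds] for i in range(n_folds)]
    return [([j for f in folds if f is not fold for j in f], fold)
            for fold in folds]

def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to query (toy model)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(data, train_idx, test_idx, k):
    """Fraction of test points that the toy model labels correctly."""
    train = [data[i] for i in train_idx]
    hits = sum(knn_predict(train, data[i][0], k) == data[i][1]
               for i in test_idx)
    return hits / len(test_idx)

def nested_cv(data, k_grid=(1, 3, 5), outer=5, inner=3, seed=0):
    """Return (mean outer score, per-fold outer scores)."""
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)
    outer_scores = []
    for train_idx, test_idx in k_folds(indices, outer):
        # Inner loop: pick the hyperparameter using outer-training data only,
        # so the outer test fold never influences model selection.
        def inner_score(k):
            splits = k_folds(train_idx, inner)
            return sum(accuracy(data, tr, va, k) for tr, va in splits) / inner
        best_k = max(k_grid, key=inner_score)
        # Outer loop: score the selected setting on the held-out fold.
        outer_scores.append(accuracy(data, train_idx, test_idx, best_k))
    return sum(outer_scores) / len(outer_scores), outer_scores

# Synthetic two-class data (an assumption for illustration only).
rng = random.Random(42)
data = [(rng.gauss(label * 2.0, 1.0), label)
        for label in (0, 1) for _ in range(30)]
mean_score, fold_scores = nested_cv(data)
print("outer-fold accuracies:", fold_scores)
print("mean accuracy:", mean_score)
```

Because the hyperparameter is chosen inside each outer training split, the outer scores are not biased by the selection step, which is the property the abstract attributes to the nested strategy.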
Appears in Collections: JOURNAL ARTICLES
