Please use this identifier to cite or link to this item:
http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/11097
| Title: | Benchmarking Machine Learning Force Fields via Energy Landscape Exploration |
| Authors: | Poltavsky, Igor; Tkatchenko, Alexandre; SHARMA, ANAND; Dept. of Physics; 20211055 |
| Keywords: | Machine Learning Force Fields; Atomistic Simulations; Computational Chemistry; Interatomic Potential; Benchmarking |
| Issue Date: | May-2026 |
| Citation: | 81 |
| Abstract: | General-purpose machine learning force fields (GP-MLFFs) have emerged as a transformative approach in computational chemistry and materials science, combining near-quantum-mechanical accuracy with the computational efficiency of classical force fields for molecular dynamics simulation. However, ensuring the reliability of GP-MLFF predictions beyond their training regime remains a central and largely unresolved challenge. Traditional benchmarking approaches evaluate models on fixed test datasets, which are fundamentally limited in their ability to probe model behavior under genuine extrapolation, as no fixed dataset can adequately sample the vast configurational space a model may encounter during molecular dynamics simulations or when applied to novel chemical systems. In this work, we introduce a general, system- and model-agnostic benchmarking framework that directly addresses this limitation. Rather than relying on predefined test sets, the framework evaluates a GP-MLFF's ability to represent the chemical space of local bonding motifs by using the model itself to generate molecular structures through relaxing randomly initialized atomic configurations. The resulting structures are evaluated through comparison with reference ab initio calculations, the model's training data, and cross-model validation, providing both quantitative accuracy metrics and a model-agnostic measure of chemical plausibility. The framework is demonstrated on two state-of-the-art SO3-equivariant GP-MLFFs applied to the chemical space of H, C, N, and O atoms. The results reveal pronounced differences in generative behavior, chemical diversity, and force prediction accuracy between the two models, potentially traceable to their distinct training data compositions. The framework successfully probes extrapolative regimes, identifying model biases and failure modes that traditional fixed-dataset benchmarks cannot detect.
The presented framework offers a practical and extensible approach for evaluating GP-MLFF reliability beyond interpolative accuracy, with direct applications to active learning, training data augmentation, chemical space exploration, and systematic identification of model failure modes across chemical space. |
| URI: | http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/11097 |
| Appears in Collections: | MS THESES |
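
The generation step described in the abstract (relaxing randomly initialized atomic configurations with the model itself) can be illustrated with a minimal sketch. This is not the thesis implementation: a Lennard-Jones toy potential in reduced units stands in for a GP-MLFF, the optimizer is a simple backtracking steepest descent, and all function names and parameter values (`toy_energy_and_forces`, `random_configuration`, `relax`, box size, thresholds) are hypothetical choices made for illustration.

```python
import numpy as np

def toy_energy_and_forces(pos):
    """Lennard-Jones energy and forces in reduced units.
    Assumption: this toy potential is only a stand-in for a GP-MLFF."""
    diff = pos[:, None, :] - pos[None, :, :]        # r_i - r_j, shape (n, n, 3)
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)                  # exclude self-interaction
    inv6 = dist ** -6
    energy = 0.5 * np.sum(4.0 * (inv6**2 - inv6))   # 0.5 corrects double counting
    # F_i = sum_j 24 * (2 r^-12 - r^-6) / r^2 * (r_i - r_j)
    coef = 24.0 * (2.0 * inv6**2 - inv6) / dist**2
    forces = np.sum(coef[:, :, None] * diff, axis=1)
    return energy, forces

def random_configuration(n_atoms, box, min_dist, rng):
    """Place atoms uniformly in a cubic box, rejecting candidates
    that fall closer than min_dist to an already placed atom."""
    placed = []
    while len(placed) < n_atoms:
        cand = rng.uniform(0.0, box, size=3)
        if all(np.linalg.norm(cand - p) >= min_dist for p in placed):
            placed.append(cand)
    return np.array(placed)

def relax(pos, energy_and_forces, step=0.01, max_disp=0.1,
          fmax=1e-2, max_steps=2000):
    """Steepest descent with backtracking: a trial move is accepted
    only if it lowers the energy, so the energy never increases."""
    pos = pos.copy()
    energy, forces = energy_and_forces(pos)
    for _ in range(max_steps):
        if np.abs(forces).max() < fmax:             # converged on max force component
            break
        disp = step * forces
        norms = np.linalg.norm(disp, axis=1, keepdims=True)
        disp *= np.minimum(1.0, max_disp / np.maximum(norms, 1e-12))  # cap per-atom move
        trial = pos + disp
        e_trial, f_trial = energy_and_forces(trial)
        if e_trial < energy:                        # accept and cautiously grow the step
            pos, energy, forces = trial, e_trial, f_trial
            step *= 1.1
        else:                                       # reject and shrink the step
            step *= 0.5
    return pos, energy, forces

rng = np.random.default_rng(0)
start = random_configuration(6, box=2.5, min_dist=0.9, rng=rng)
e_start, _ = toy_energy_and_forces(start)
final, e_final, f_final = relax(start, toy_energy_and_forces)
print(f"energy before: {e_start:.3f}, after: {e_final:.3f}")
print(f"max force component after relaxation: {np.abs(f_final).max():.3e}")
```

In a benchmarking setting like the one the abstract describes, the relaxed structures produced this way would then be compared against reference calculations, the training data, and other models; that evaluation stage is not sketched here.
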
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| 20211055_ANAND_SHARMA_MS_Thesis.pdf | MS Thesis | 2.51 MB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.