Digital Repository

Forecasting Research Success through Learned Comparison of Scientific Ideas

dc.contributor.advisor Patwardhan, Manasi
dc.contributor.author MULE, SRUJAN PRAKASH
dc.date.accessioned 2026-05-21T07:19:11Z
dc.date.available 2026-05-21T07:19:11Z
dc.date.issued 2026-05
dc.identifier.citation 126 en_US
dc.identifier.uri http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/11108
dc.description.abstract As generative language models (LMs) accelerate scientific research by automating hypothesis generation, a new bottleneck emerges: evaluating and filtering hundreds of LM-generated ideas without exhaustive experimentation. This thesis asks whether LMs can learn to judge the empirical success of research ideas before any experiments are run. It studies comparative empirical forecasting: given a benchmark-specific research goal and two candidate ideas, predict which will achieve better leaderboard performance. A dataset of 11,488 idea pairs grounded in objective outcomes from PapersWithCode is created for this task. While untrained 8B-parameter models struggle (≈30% accuracy), Supervised Fine-Tuning raises accuracy to 77.1%, well above frontier models such as GPT-5 (61.1%). Framing evaluation as a reasoning task via Reinforcement Learning with Verifiable Rewards trains models to discover latent reasoning paths, achieving 71.35% accuracy with interpretable justifications. Crucially, these RL-trained variants show superior cross-domain generalization, achieving 67.49% on an independent test set and surpassing a zero-shot retrieval-augmented GPT-4.1 system by 16 percentage points. These results suggest that compute-efficient small language models have potential as effective, objective verifiers, offering a scalable path for autonomous scientific discovery. en_US
dc.description.sponsorship TCS Research en_US
dc.language.iso en en_US
dc.subject AI for Scientific Discovery en_US
dc.subject Automated Scientific Discovery en_US
dc.subject Comparative Empirical Forecasting en_US
dc.subject Research Idea Evaluation en_US
dc.subject Large Language Models (LLMs) en_US
dc.subject Small Language Models (SLMs) en_US
dc.subject Reinforcement Learning en_US
dc.subject Interpretable Reasoning en_US
dc.subject Scientific Benchmarking en_US
dc.title Forecasting Research Success through Learned Comparison of Scientific Ideas en_US
dc.type Thesis en_US
dc.description.embargo No Embargo en_US
dc.type.degree BS-MS en_US
dc.contributor.department Dept. of Data Science en_US
dc.contributor.registration 20211245 en_US


This item appears in the following Collection(s)

  • MS THESES [2219]
    Thesis submitted to IISER Pune in partial fulfilment of the requirements for the BS-MS Dual Degree Programme/MSc. Programme/MS-Exit Programme
