Please use this identifier to cite or link to this item: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/11108
Title: Forecasting Research Success through Learned Comparison of Scientific Ideas
Authors: Patwardhan, Manasi
MULE, SRUJAN PRAKASH
Dept. of Data Science
20211245
Keywords: AI for Scientific Discovery
Automated Scientific Discovery
Comparative Empirical Forecasting
Research Idea Evaluation
Large Language Models (LLMs)
Small Language Models (SLMs)
Reinforcement Learning
Interpretable Reasoning
Scientific Benchmarking
Issue Date: May-2026
Citation: 126
Abstract: As generative language models (LMs) accelerate scientific research by automating hypothesis generation, a new bottleneck emerges: evaluating and filtering hundreds of LM-generated ideas without exhaustive experimentation. This thesis asks whether LMs can learn to judge the empirical success of research ideas before any experiments are run. It studies comparative empirical forecasting: given a benchmark-specific research goal and two candidate ideas, predict which will achieve better leaderboard performance. A dataset of 11,488 idea pairs grounded in objective outcomes from PapersWithCode is created for this task. While untrained 8B-parameter models struggle (≈30% accuracy), Supervised Fine-Tuning (SFT) raises accuracy to 77.1%, outperforming frontier models such as GPT-5 (61.1%). When evaluation is framed as a reasoning task via Reinforcement Learning with Verifiable Rewards (RLVR), models are trained to discover latent reasoning paths, achieving 71.35% accuracy with interpretable justifications. Crucially, these RL-trained variants demonstrate superior cross-domain generalization, achieving 67.49% accuracy on an independent test set and surpassing a zero-shot retrieval-augmented GPT-4.1 system by 16 percentage points. These results suggest that compute-efficient small language models have potential as effective, objective verifiers, offering a scalable path for autonomous scientific discovery.
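Illustrative sketch (not taken from the thesis): the pairwise task described in the abstract can be pictured as a record holding a research goal, two candidate ideas, and the label of the idea that performed better, scored by a binary reward that can be checked against the recorded outcome. The names `IdeaPair`, `verifiable_reward`, and `accuracy`, the "A"/"B" label encoding, and the rule that any unparseable answer scores 0 are all assumptions made for illustration; the thesis may define these differently.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class IdeaPair:
    """One comparison example (hypothetical schema, assumed for illustration)."""
    goal: str    # benchmark-specific research goal
    idea_a: str  # first candidate idea
    idea_b: str  # second candidate idea
    winner: str  # "A" or "B": the idea with the better leaderboard result


def verifiable_reward(prediction: str, pair: IdeaPair) -> float:
    """Return 1.0 if the predicted label matches the recorded winner, else 0.0.

    Assumption: anything other than an exact "A"/"B" match (after trimming
    whitespace and upper-casing) counts as incorrect.
    """
    return 1.0 if prediction.strip().upper() == pair.winner else 0.0


def accuracy(predict: Callable[[IdeaPair], str], pairs: Sequence[IdeaPair]) -> float:
    """Fraction of pairs for which `predict` names the better idea."""
    if not pairs:
        raise ValueError("accuracy is undefined for an empty set of pairs")
    return sum(verifiable_reward(predict(p), p) for p in pairs) / len(pairs)
```

Under these assumptions, a predictor that always answers "A" scores exactly the share of pairs whose winner is "A", which is a useful sanity baseline when reading the accuracy figures above.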
URI: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/11108
Appears in Collections: MS THESES

Files in This Item:
File: 20211245_Srujan_Prakash_Mule_MS_Thesis.pdf
Description: MS Thesis
Size: 5.89 MB
Format: Adobe PDF

