Forecasting Research Success through Learned Comparison of Scientific Ideas

MULE, SRUJAN PRAKASH

dc.contributor.advisor	Patwardhan, Manasi
dc.contributor.author	MULE, SRUJAN PRAKASH
dc.date.accessioned	2026-05-21T07:19:11Z
dc.date.available	2026-05-21T07:19:11Z
dc.date.issued	2026-05
dc.identifier.citation	126	en_US
dc.identifier.uri	http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/11108
dc.description.abstract	As generative language models (LMs) accelerate scientific research by automating hypothesis generation, a new bottleneck emerges: evaluating and filtering hundreds of LM-generated ideas without exhaustive experimentation. This thesis asks whether LMs can learn to judge the empirical success of research ideas before any experiments are run. It studies comparative empirical forecasting: given a benchmark-specific research goal and two candidate ideas, predict which will achieve better leaderboard performance. A dataset of 11,488 idea pairs grounded in objective outcomes from PapersWithCode is created for this task. While untrained 8B-parameter models struggle (≈30% accuracy), Supervised Fine-Tuning raises accuracy to 77.1%, significantly outperforming frontier models such as GPT-5 (61.1%). When evaluation is framed as a reasoning task via Reinforcement Learning with Verifiable Rewards, models are trained to discover latent reasoning paths and reach 71.35% accuracy with interpretable justifications. Crucially, these RL-trained variants demonstrate superior cross-domain generalization, achieving 67.49% on an independent test set and surpassing a zero-shot retrieval-augmented GPT-4.1 system by 16 percentage points. These results indicate that compute-efficient small language models show potential as effective, objective verifiers, offering a scalable path for autonomous scientific discovery.	en_US
dc.description.sponsorship	TCS Research	en_US
dc.language.iso	en	en_US
dc.subject	AI for Scientific Discovery	en_US
dc.subject	Automated Scientific Discovery	en_US
dc.subject	Comparative Empirical Forecasting	en_US
dc.subject	Research Idea Evaluation	en_US
dc.subject	Large Language Models (LLMs)	en_US
dc.subject	Small Language Models (SLMs)	en_US
dc.subject	Reinforcement Learning	en_US
dc.subject	Interpretable Reasoning	en_US
dc.subject	Scientific Benchmarking	en_US
dc.title	Forecasting Research Success through Learned Comparison of Scientific Ideas	en_US
dc.type	Thesis	en_US
dc.description.embargo	No Embargo	en_US
dc.type.degree	BS-MS	en_US
dc.contributor.department	Dept. of Data Science	en_US
dc.contributor.registration	20211245	en_US
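
Illustrative sketch (not taken from the thesis): the abstract describes a pairwise task in which a model picks the idea that later scores better on a leaderboard, and it reports results as accuracy. The Python below shows one plausible way such a label, a binary verifiable reward, and pairwise accuracy could be computed. The names IdeaPair, gold_label, verifiable_reward and pairwise_accuracy, the "A"/"B" answer format, and the handling of ties are all assumptions made for illustration; the thesis may define these differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class IdeaPair:
    """Hypothetical record for one comparison; field names are assumptions."""
    goal: str                      # benchmark-specific research goal
    idea_a: str                    # text of the first candidate idea
    idea_b: str                    # text of the second candidate idea
    score_a: float                 # leaderboard result reported for idea A
    score_b: float                 # leaderboard result reported for idea B
    higher_is_better: bool = True  # False for metrics such as error rate


def gold_label(pair: IdeaPair) -> str:
    """Return "A" or "B" depending on which idea has the better score."""
    if pair.score_a == pair.score_b:
        # Assumption: tied pairs carry no preference and are excluded.
        raise ValueError("tied pair has no gold label")
    a_wins = pair.score_a > pair.score_b
    if not pair.higher_is_better:
        a_wins = not a_wins
    return "A" if a_wins else "B"


def verifiable_reward(prediction: str, pair: IdeaPair) -> float:
    """Binary reward: 1.0 if the predicted winner matches the outcome."""
    return 1.0 if prediction.strip().upper() == gold_label(pair) else 0.0


def pairwise_accuracy(predictions: Sequence[str],
                      pairs: Sequence[IdeaPair]) -> float:
    """Mean reward over a set of pairs, i.e. the fraction predicted correctly."""
    if len(predictions) != len(pairs) or not pairs:
        raise ValueError("need equally many predictions and pairs (at least one)")
    total = sum(verifiable_reward(p, q) for p, q in zip(predictions, pairs))
    return total / len(pairs)
```

Under these assumptions the reward can be checked mechanically against recorded leaderboard outcomes, which is the sense in which it is "verifiable"; any reward for the quality of the written justification would need a separate design.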

Files in this item

Name: 20211245_Srujan_P ...

Size: 5.756Mb

Format: PDF

Description: MS Thesis

This item appears in the following Collection(s)

MS THESES [2219]
Thesis submitted to IISER Pune in partial fulfilment of the requirements for the BS-MS Dual Degree Programme/MSc. Programme/MS-Exit Programme
