Abstract:
Recent advances in Large Language Models (LLMs) have significantly improved the ability to translate natural language questions (NLQs) into executable SQL queries, a task commonly referred to as Text-to-SQL or NL-to-SQL. While many studies focus on general-purpose benchmarks, enterprise environments present a distinct setting in which databases are accompanied by historical SQL query logs (collections of SQL queries previously executed against the database), commonly referred to as SQL workloads. These workloads capture domain-specific query patterns and represent a valuable yet often underutilized source of supervision for improving Text-to-SQL systems. However, in practical enterprise deployments, labeled workloads are typically scarce, often consisting of only a few dozen NLQ–SQL pairs. Existing approaches predominantly rely on in-context learning (ICL) with proprietary LLMs to exploit such limited data. Although effective, these approaches incur high operational cost due to repeated API calls and, often, high inference latency. In this thesis, we investigate whether open-source, Text-to-SQL task-aware Small Language Models (SLMs) can be adapted more efficiently using the available workload data. We propose a workload-adaptive strategy that systematically selects the most suitable combination of model class (LLM vs. SLM) and learning paradigm from three alternatives: ICL (through few-shot prompting) with an LLM, supervised fine-tuning (SFT) of an SLM on the workload, and SFT of an SLM on the workload augmented with synthetic data generated by an LLM. The strategy relies on workload-based diagnostic evaluation to guide this selection. Experiments conducted on fourteen databases under simulated scarce-workload conditions demonstrate that the proposed strategy consistently identifies effective adaptation pathways. In the majority of cases, fine-tuned task-aware SLMs outperform ICL-based LLM approaches while achieving significantly lower inference latency.
When the available workload is insufficient for effective fine-tuning, augmenting the workload with carefully generated synthetic data further improves performance. These results highlight the potential of workload-driven adaptation as a practical and efficient approach for deploying Text-to-SQL systems in enterprise environments.
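
To make the selection step concrete, the following is a minimal sketch of how a diagnostic evaluation might drive the choice among the three pathways. It is an illustration under stated assumptions, not the procedure used in the thesis: the abstract does not specify the selection rule, so the rule here (highest diagnostic accuracy, with lower latency preferred among near-ties) is assumed, and `Candidate`, `select_pathway`, `tolerance`, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One adaptation pathway and its (hypothetical) diagnostic measurements."""
    name: str          # pathway label, e.g. "icl_llm"
    accuracy: float    # diagnostic accuracy on held-out workload pairs, in [0, 1]
    latency_ms: float  # mean inference latency per query, in milliseconds


def select_pathway(candidates, tolerance=0.0):
    """Assumed rule: keep candidates within `tolerance` of the best
    diagnostic accuracy, then return the one with the lowest latency."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    best = max(c.accuracy for c in candidates)
    near_best = [c for c in candidates if best - c.accuracy <= tolerance]
    return min(near_best, key=lambda c: c.latency_ms)


# Illustrative numbers only; these are NOT results from the thesis.
candidates = [
    Candidate("icl_llm", accuracy=0.70, latency_ms=2500.0),
    Candidate("sft_slm", accuracy=0.74, latency_ms=300.0),
    Candidate("sft_slm_synthetic", accuracy=0.73, latency_ms=300.0),
]
chosen = select_pathway(candidates, tolerance=0.02)
print(chosen.name)
```

The tolerance parameter reflects one possible design choice: when two pathways are nearly indistinguishable on a small diagnostic set, the cheaper one at inference time is preferred.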