Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search
Authors: Boyan Li, Jiayi Zhang, Ju Fan, Yanwei Xu, Chong Chen, Nan Tang, Yuyu Luo
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that Alpha-SQL achieves 69.7% execution accuracy on the BIRD development set, using a 32B open-source LLM without fine-tuning. |
| Researcher Affiliation | Collaboration | ¹The Hong Kong University of Science and Technology (Guangzhou), ²Renmin University of China, ³Huawei Technologies Ltd. |
| Pseudocode | Yes | The Alpha-SQL algorithm, as outlined in Algorithm 1, operates in multiple phases: Selection, Expansion, Simulation, and Backpropagation. Given a user query q and the corresponding database schema D, the algorithm starts by initializing an empty search tree Ψ = (V, E) with a root node v0 representing the initial state (lines 3-4). |
| Open Source Code | Yes | The code is available at https://github.com/HKUSTDial/Alpha-SQL. |
| Open Datasets | Yes | We utilize the Spider (Yu et al., 2018) and BIRD (Li et al., 2023c) development sets for evaluation. |
| Dataset Splits | Yes | To enable more comparison experiments while reducing computational costs (Sections 5.3 to 5.5), we follow CHESS (Talaei et al., 2024) and utilize the same Subsampled Development Set (SDS), which comprises 10% of each database from the BIRD development set. The SDS contains 147 samples, consisting of 81 simple, 54 moderate, and 12 challenging questions. |
| Hardware Specification | Yes | All experiments are run on an Ubuntu 22.04.3 LTS server with 512GB of RAM and dual 40-core Intel(R) Xeon(R) Platinum 8383C CPUs (@ 2.70GHz). Open-source LLMs are deployed locally using 8 GPUs, each with 80GB of memory and 312 TFLOPS with BF16 precision. |
| Software Dependencies | No | The paper names no software libraries or solvers with version numbers. The operating system (Ubuntu 22.04.3 LTS) is mentioned, but an OS version does not count as an ancillary library or solver dependency. |
| Experiment Setup | Yes | The related hyper-parameters were set as follows: For offline database value retrieval, we set the editing similarity ϵedit as 0.3 and semantic similarity ϵsemantic as 0.6. For the MCTS rollout process, we set the number of rollouts to Nrollout = 24. During node expansion, each action was sampled Nexpansion = 3 times with a sampling temperature of Texpansion = 0.8. In the computation of self-supervised rewards, we set the SQL sampling parameters with Nreward = 5 repetitions and a temperature of Treward = 1.0. For the SQL Revision action (A6), we set a maximum iteration limit of Nrevision = 10 for the multi-round correction process. |
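The Selection/Expansion/Simulation/Backpropagation loop described in the pseudocode row, together with the reported hyper-parameters, can be sketched as a minimal, generic MCTS skeleton. This is an illustrative reconstruction, not the authors' implementation: in Alpha-SQL the `expand_fn` and `reward_fn` stand-ins are LLM calls (action sampling at temperature 0.8 and a self-consistency SQL reward), and the exploration constant `C_UCT` is an assumption not stated in the report.

```python
import math
import random

# Hyper-parameters as reported in the experiment setup row.
N_ROLLOUT = 24      # number of MCTS rollouts
N_EXPANSION = 3     # action samples per node expansion (LLM, T=0.8, in the paper)
N_REWARD = 5        # SQL samples for the self-consistency reward (not simulated here)
C_UCT = 1.4         # exploration constant -- an assumption, not from the paper

class Node:
    def __init__(self, state, parent=None):
        self.state = state      # partial reasoning state (question, schema, actions so far)
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

    def uct(self):
        # Upper Confidence bound applied to Trees; unvisited nodes are explored first.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + C_UCT * math.sqrt(
            math.log(self.parent.visits) / self.visits)

def mcts(root, expand_fn, reward_fn, is_terminal):
    """One Selection -> Expansion -> Simulation -> Backpropagation cycle per rollout."""
    for _ in range(N_ROLLOUT):
        # Selection: descend via UCT until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.uct)
        # Expansion: sample candidate next actions (stand-in for LLM action sampling).
        if not is_terminal(node.state):
            for _ in range(N_EXPANSION):
                node.children.append(Node(expand_fn(node.state), parent=node))
            node = random.choice(node.children)
        # Simulation: score the state (stand-in for the self-supervised reward).
        reward = reward_fn(node.state)
        # Backpropagation: update statistics along the path back to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda c: c.visits) if root.children else root
```

With toy `expand_fn`/`reward_fn` callables (e.g. appending a token and rewarding longer states), the loop runs exactly `N_ROLLOUT` rollouts and returns the most-visited child of the root, mirroring the search budget reported in the paper.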