SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent Collaboration

Authors: Jipeng Cen, Jiaxin Liu, Zhixu Li, Jingjing Wang

Venue: AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated our proposed framework on five Text-to-SQL benchmarks. The experimental results show that our method consistently enhances the performance of the baseline model, specifically achieving an execution accuracy improvement of over 3% on the Bird benchmark.
Researcher Affiliation | Collaboration | Jipeng Cen1, Jiaxin Liu4, Zhixu Li2,3, Jingjing Wang1*; 1School of Computer Science & Technology, Soochow University, Suzhou, China; 2School of Information, Renmin University of China, Beijing, China; 3International College (Suzhou Research Institute), Renmin University of China, Suzhou, China; 4iFLYTEK Research (Suzhou), China
Pseudocode | Yes | Algorithm 1: The algorithm of SQLFixAgent
Open Source Code | Yes | To facilitate the related research, all codes will be released via Github.
Open Datasets | Yes | We evaluated our framework on two primary Text-to-SQL benchmarks: Spider (Yu et al. 2018) and Bird (Li et al. 2024b).
Dataset Splits | Yes | Spider offers a training set comprising 8,659 samples, a development set with 1,034 samples, and a test set with 2,147 samples, encompassing 200 distinct databases and 138 domains.
Hardware Specification | Yes | All experiments were conducted on a server equipped with 1 AMD EPYC 7352 CPU and 8 NVIDIA RTX 3090 GPUs.
Software Dependencies | No | In our experiments, we use the fine-tuned CodeS (Li et al. 2024a) as the SQLTool used by agents for Text-to-SQL parsing and employ GPT-3.5-turbo as the backbone LLM for three agents. The paper mentions software names but does not provide specific version numbers for them.
Experiment Setup | Yes | For SQLTool inference, a beam search produces 4 SQL candidates, and we select the first executable one for further checking by SQLFixAgent. If an error is detected, SQLFixAgent attempts to repair it up to 3 times. These hyperparameters are tuned on the validation set.
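
The experiment setup above describes a two-stage control flow: pick the first executable SQL among the beam candidates, then check it and repair it at most 3 times. The sketch below illustrates only that flow and is not the authors' implementation. It assumes a SQLite database, and the names `first_executable`, `check_and_repair`, `has_error`, and `repair` are hypothetical; the two callables stand in for the LLM-backed agents, whose actual interfaces the summary does not describe.

```python
import sqlite3
from typing import Callable, Optional, Sequence


def first_executable(conn: sqlite3.Connection,
                     candidates: Sequence[str]) -> Optional[str]:
    """Return the first candidate that runs without a database error.

    Candidates are assumed to be read-only SELECT statements, ordered
    as produced by beam search (4 candidates in the reported setup).
    """
    for sql in candidates:
        try:
            conn.execute(sql).fetchall()
            return sql
        except sqlite3.Error:
            continue  # not executable, try the next beam candidate
    return None


def check_and_repair(sql: str,
                     has_error: Callable[[str], bool],
                     repair: Callable[[str], str],
                     max_repairs: int = 3) -> str:
    """Repair `sql` while `has_error` flags it, at most `max_repairs` times.

    `has_error` and `repair` are hypothetical placeholders for the
    checking and fixing agents. If the error persists after the last
    attempt, the most recent repaired query is returned unchanged.
    """
    for _ in range(max_repairs):
        if not has_error(sql):
            return sql
        sql = repair(sql)
    return sql


# Toy usage on an in-memory database (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.execute("INSERT INTO singer VALUES ('Ann', 41), ('Bo', 25)")

beam = [
    "SELECT nam FROM singer",   # fails: no such column
    "SELECT name FROM singer",  # first executable candidate
    "SELECT name FROM singer WHERE age > 30",
    "SELECT * FROM singer",
]
chosen = first_executable(conn, beam)

# Toy "agents": flag a query with no WHERE clause and append one.
fixed = check_and_repair(
    chosen,
    has_error=lambda q: "WHERE" not in q,
    repair=lambda q: q + " WHERE age > 30",
)
```

Capping the loop at a fixed number of attempts bounds the number of LLM calls per question; the value 3 follows the reported setup, which states that it was tuned on the validation set.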