Accurate and Regret-Aware Numerical Problem Solver for Tabular Question Answering

Authors: Yuxiang Wang, Jianzhong Qi, Junhao Gan

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on two benchmark datasets show that TabLaP is substantially more accurate than the state-of-the-art models, improving the answer accuracy by 5.7% and 5.8% on the two datasets, respectively.
Researcher Affiliation | Academia | Yuxiang Wang, Jianzhong Qi*, Junhao Gan, School of Computing and Information Systems, The University of Melbourne, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: Table Question Answering with TabLaP
Open Source Code | Yes | Code: https://github.com/yxw-11/TabLaP
Open Datasets | Yes | We conduct experiments to test the effectiveness of TabLaP on WikiTableQuestions (Pasupat and Liang 2015) and FTQ. WikiTableQuestions is a public dataset, while FTQ is adapted by us from the FeTaQA dataset (Nan et al. 2022) by removing answer tokens not directly relevant to the questions.
Dataset Splits | Yes | WTQ: 11,321 training / 4,344 testing QA pairs (5,461 / 2,148 numerical questions); FTQ: 2,000 / 1,245 QA pairs (417 / 182 numerical questions); TabFact small: 92,283 / 2,024 QA pairs (16,956 / 368 numerical questions)
Hardware Specification | Yes | All experiments are run with two NVIDIA A100 80 GB GPUs on a cloud GPU server.
Software Dependencies | No | The paper mentions using a "Python interpreter", "GPT-3.5 Turbo as the backbone model of NumSolver", and "Llama3-8B-Instruct as AnsSelector", but does not specify version numbers for Python or any libraries used in the implementation of TabLaP.
Experiment Setup | Yes | We fine-tune the AnsSelector and TwEvaluator LLMs with the AdamW optimizer (Loshchilov and Hutter 2019) using a learning rate of 0.0002 and a weight decay of 0.001. The maximum number of input tokens is 5,000, and the maximum number of epochs is 20.
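The key property of AdamW cited above is its decoupled weight decay. The sketch below illustrates a single AdamW update step in pure Python using the hyperparameters reported in the paper (learning rate 0.0002, weight decay 0.001); the beta/epsilon values are assumed common defaults, not taken from the paper, and the actual fine-tuning would of course use a framework implementation rather than this toy scalar version.

```python
import math

LR = 2e-4             # learning rate reported in the paper
WEIGHT_DECAY = 1e-3   # weight decay reported in the paper
BETA1, BETA2, EPS = 0.9, 0.999, 1e-8  # common AdamW defaults (assumed)

def adamw_step(theta, grad, m, v, t):
    """One AdamW update for a scalar parameter.

    Unlike Adam with L2 regularisation folded into the gradient, AdamW
    applies weight decay directly to the parameter (decoupled decay).
    """
    m = BETA1 * m + (1 - BETA1) * grad          # first-moment estimate
    v = BETA2 * v + (1 - BETA2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - BETA1 ** t)                # bias correction
    v_hat = v / (1 - BETA2 ** t)
    theta = theta - LR * (m_hat / (math.sqrt(v_hat) + EPS)
                          + WEIGHT_DECAY * theta)
    return theta, m, v

# Toy run: three steps with a constant positive gradient nudge the
# parameter downward from its initial value of 1.0.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    theta, m, v = adamw_step(theta, grad=0.5, m=m, v=v, t=t)
print(round(theta, 6))
```

With a positive gradient and positive parameter, both the gradient term and the decay term shrink the parameter, so after a few steps theta sits slightly below its starting value.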