Accurate and Regret-Aware Numerical Problem Solver for Tabular Question Answering
Authors: Yuxiang Wang, Jianzhong Qi, Junhao Gan
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on two benchmark datasets show that TabLaP is substantially more accurate than the state-of-the-art models, improving the answer accuracy by 5.7% and 5.8% on the two datasets, respectively. |
| Researcher Affiliation | Academia | Yuxiang Wang, Jianzhong Qi*, Junhao Gan School of Computing and Information Systems, The University of Melbourne EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Table Question Answering with TabLaP |
| Open Source Code | Yes | Code https://github.com/yxw-11/TabLaP |
| Open Datasets | Yes | We conduct experiments to test the effectiveness of TabLaP on WikiTableQuestions (Pasupat and Liang 2015) and FTQ. WikiTableQuestions is a public dataset, while FTQ is adapted by us from the FeTaQA dataset (Nan et al. 2022) by removing answer tokens not directly relevant to the questions. |
| Dataset Splits | Yes | QA pairs (training / testing), with numerical questions in parentheses: WTQ 11,321 / 4,344 (5,461 / 2,148); FTQ 2,000 / 1,245 (417 / 182); TabFact small 92,283 / 2,024 (16,956 / 368). |
| Hardware Specification | Yes | All experiments are run with two NVIDIA A100 80 GB GPUs on a cloud GPU server. |
| Software Dependencies | No | The paper mentions using a "Python interpreter", "GPT-3.5 Turbo as the backbone model of NumSolver", and "Llama3-8B-Instruct as AnsSelector", but does not specify version numbers for Python or any libraries used in the implementation of TabLaP. |
| Experiment Setup | Yes | We fine-tune the AnsSelector and TwEvaluator LLMs with the AdamW optimizer (Loshchilov and Hutter 2019) using a learning rate of 0.0002 and a weight decay of 0.001. The maximum number of input tokens is 5,000, and the maximum number of epochs is 20. |
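For reference, the fine-tuning hyperparameters quoted above can be collected into a small configuration sketch. This is not the authors' code; the class and function names (`FinetuneConfig`, `optimizer_kwargs`) are illustrative, and only the numeric values come from the paper.

```python
from dataclasses import dataclass


@dataclass
class FinetuneConfig:
    """Hyperparameters the paper reports for fine-tuning the
    AnsSelector and TwEvaluator LLMs (optimizer: AdamW)."""
    learning_rate: float = 2e-4   # reported as 0.0002
    weight_decay: float = 1e-3    # reported as 0.001
    max_input_tokens: int = 5000
    max_epochs: int = 20


def optimizer_kwargs(cfg: FinetuneConfig) -> dict:
    """Keyword arguments one would pass to e.g. torch.optim.AdamW."""
    return {"lr": cfg.learning_rate, "weight_decay": cfg.weight_decay}
```

These are the only training details the paper states; batch size, scheduler, and library versions are unspecified, which is consistent with the "Software Dependencies: No" finding above.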