Template-Driven LLM-Paraphrased Framework for Tabular Math Word Problem Generation
Authors: Xiaoqiang Kang, Zimu Wang, Xiaobo Jin, Wei Wang, Kaizhu Huang, Qiufeng Wang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through the proposed framework, we construct a high-quality dataset TabMWP-TELL by adhering to the question types in the TabMWP dataset, and we conduct extensive experiments on a variety of LLMs to demonstrate the effectiveness of TabMWP-TELL in improving TMWP-solving performance. |
| Researcher Affiliation | Academia | 1School of Advanced Technology, Xi'an Jiaotong-Liverpool University 2University of Liverpool 3Duke Kunshan University EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods and processes through figures (Figure 2, Figure 3, Figure 4) and textual descriptions, but it does not contain explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/Jason8Kang/TELL |
| Open Datasets | Yes | We conduct evaluations on TabMWP (Lu et al. 2023b), a recent large-scale dataset containing 38,431 grade-level MWPs with tabular context, whose statistics are presented in Table 1. |
| Dataset Splits | Yes | We conduct evaluations on TabMWP (Lu et al. 2023b), a recent large-scale dataset containing 38,431 grade-level MWPs with tabular context, whose statistics are presented in Table 1. Table 1: Statistics of the TabMWP dataset. #Question: Train 23,059, Valid 7,686, Test 7,686, Total 38,431. |
| Hardware Specification | Yes | All experiments are conducted on 8 NVIDIA GeForce RTX 3090 graphics cards. |
| Software Dependencies | No | The paper mentions using XTuner for QLoRA and specific LLMs (Yi, Mistral, Qwen2, Llama 3), but it does not provide version numbers for these or for other key software such as Python or PyTorch, which would be necessary for reproducibility. |
| Experiment Setup | Yes | During the fine-tuning process, we set the number of epochs as 2, the batch size per device as 12, the gradient accumulation steps as 4, and the learning rate as 2e-4. |
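Combining the reported fine-tuning hyperparameters with the 8-GPU hardware setup, a minimal sketch of the implied effective batch size (assuming data-parallel training across all 8 cards, which the paper does not state explicitly):

```python
# Reported fine-tuning setup: per-device batch size 12,
# gradient accumulation 4; hardware: 8x RTX 3090.
per_device_batch = 12
grad_accum_steps = 4
num_gpus = 8  # assumption: all 8 GPUs used in data parallel

# Effective global batch size per optimizer step
effective_batch = per_device_batch * grad_accum_steps * num_gpus
print(effective_batch)  # 384
```

Under this assumption, one optimizer step at learning rate 2e-4 covers 384 training examples.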