TrustUQA: A Trustful Framework for Unified Structured Data Question Answering
Authors: Wen Zhang, Long Jin, Yushan Zhu, Jiaoyan Chen, Zhiwei Huang, Junjie Wang, Yin Hua, Lei Liang, Huajun Chen
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have evaluated TrustUQA with 5 benchmarks covering 3 types of structured data. It outperforms 2 existing unified structured data QA methods. In comparison with the baselines that are specific to one data type, it achieves state-of-the-art on 2 of the datasets. Furthermore, we have demonstrated the potential of our method for more general QA tasks, QA over mixed structured data and QA across structured data. |
| Researcher Affiliation | Collaboration | 1 Zhejiang University; 2 University of Manchester; 3 Ant Group; 4 ZJU-Ant Group Joint Lab of Knowledge Graph; 5 Zhejiang Key Laboratory of Big Data Intelligent Computing |
| Pseudocode | No | The paper describes query functions like 'get_information', 'search_node', and 'search_condition' and their translation rules in Table 1, but these are presented as textual descriptions and a mapping table, not as a structured pseudocode block or algorithm. |
| Open Source Code | Yes | Code: https://github.com/zjukg/TrustUQA |
| Open Datasets | Yes | We adopt 5 datasets covering 3 data types: WikiSQL (2017) and WTQ (2015) for table, WebQuestionsSP (WebQSP) (2016) and MetaQA (2018) for KG, and CronQuestions (2021) for temporal KG. |
| Dataset Splits | No | The paper uses well-known datasets (WikiSQL, WTQ, WebQSP, MetaQA, and CronQuestions) and mentions constructing demonstrations. However, it does not explicitly specify the training, validation, or test splits for these datasets, referring only to 'official' or 'processed versions' without detailing the split methodology or sizes for the main evaluation. |
| Hardware Specification | Yes | Our system is equipped with 2× NVIDIA A100 PCIe 40GB GPUs and 40 physical cores across 2 sockets (20 cores per socket). The Intel Xeon Gold 6148 processors operate at a base speed of 2.40 GHz, with a maximum of 3.70 GHz. |
| Software Dependencies | Yes | We use GPT-3.5 (gpt-3.5-turbo-0613) as the LLM with a self-consistency strategy of 5 samples, and Sentence-BERT (2019) as the dense text encoder. |
| Experiment Setup | Yes | We use GPT-3.5 (gpt-3.5-turbo-0613) as the LLM with a self-consistency strategy of 5 samples, and Sentence-BERT (2019) as the dense text encoder. If the answer is None due to mismatched entity-relation pairs, key-value inconsistencies, etc., we implement a retry mechanism with up to 3 trials. We set the number of retrieves m = 15 and the number of demonstrations k = 8. |
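The Pseudocode row notes that the paper describes query functions like `get_information`, `search_node`, and `search_condition` only textually (with translation rules in its Table 1). As a rough illustration of what such an interface over a triple-based condition graph might look like, here is a minimal sketch; the `ConditionGraph` class, the triple representation, and all function signatures are assumptions for illustration, not the paper's actual implementation:

```python
class ConditionGraph:
    """Hypothetical triple store standing in for the paper's condition graph."""

    def __init__(self):
        # Each entry is a (head, relation, tail) triple.
        self.triples = []

    def add(self, head, relation, tail):
        self.triples.append((head, relation, tail))

    def get_information(self, head=None, relation=None, tail=None):
        """Return all triples matching the fixed slots (None acts as a wildcard)."""
        return [t for t in self.triples
                if (head is None or t[0] == head)
                and (relation is None or t[1] == relation)
                and (tail is None or t[2] == tail)]

    def search_node(self, relation, tail):
        """Return nodes whose `relation` edge points to the given tail value."""
        return {h for h, r, t in self.triples if r == relation and t == tail}

    def search_condition(self, relation, predicate):
        """Return nodes whose `relation` tail satisfies a condition predicate."""
        return {h for h, r, t in self.triples if r == relation and predicate(t)}
```

For example, `search_condition("population", lambda v: v > 1_000_000)` would retrieve nodes whose population value exceeds one million, which is the flavor of conditional lookup the translation rules in the paper's Table 1 appear to target.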
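The Experiment Setup row mentions two reliability mechanisms: self-consistency with 5 samples and a retry mechanism with up to 3 trials when execution yields None. A minimal sketch of how these could compose is below; `generate_queries` and `execute` are hypothetical stand-ins for the LLM call and the query executor, not functions from the TrustUQA codebase:

```python
from collections import Counter

def answer_with_self_consistency(question, generate_queries, execute,
                                 n_samples=5, max_retries=3):
    """Sample n_samples candidate query programs, execute each, and
    majority-vote over the non-None answers. If every execution in a
    round fails (e.g. mismatched entity-relation pairs), retry the
    whole round, up to max_retries rounds."""
    for _ in range(max_retries):
        answers = []
        for query in generate_queries(question, n_samples):
            result = execute(query)
            if result is not None:
                answers.append(result)
        if answers:
            # Majority vote across the successful executions.
            return Counter(answers).most_common(1)[0][0]
    return None
```

This is only one plausible composition of the two mechanisms; the paper does not specify whether retries re-sample all queries or only the failed ones.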