BiDeV: Bilateral Defusing Verification for Complex Claim Fact-Checking
Authors: Yuxuan Liu, Hongda Sun, Wenya Guo, Xinyan Xiao, Cunli Mao, Zhengtao Yu, Rui Yan
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on two widely used challenging fact-checking benchmarks (Hover and Feverous-s) demonstrate that our BiDeV can achieve the best performance under both gold and open settings. |
| Researcher Affiliation | Collaboration | 1 Gaoling School of Artificial Intelligence, Renmin University of China; 2 Nankai University; 3 Baidu Inc.; 4 Kunming University of Science and Technology |
| Pseudocode | No | The overview of our BiDeV is shown in Figure 2. In the subsequent sections, we will introduce how to integrate LLMs to eliminate the vagueness in the claim and the redundancy in the evidence. (Figure 2 is a diagram, not pseudocode.) Figure 7: Case Study of selected baselines (FOLK and ProgramFC) and our BiDeV. (The pseudocode-like structures in Figure 7 are for the baselines, not BiDeV's core algorithm.) |
| Open Source Code | Yes | Code: https://github.com/EthanLeo-LYX/BiDeV |
| Open Datasets | Yes | Datasets. There are two widely used and challenging datasets to evaluate the fact-checking performance of baselines and our BiDeV: (i) Hover (Jiang et al. 2020) and (ii) Feverous-s (Pan et al. 2023). |
| Dataset Splits | Yes | Datasets. There are two widely used and challenging datasets to evaluate the fact-checking performance of baselines and our BiDeV: (i) Hover (Jiang et al. 2020) and (ii) Feverous-s (Pan et al. 2023). |
| Hardware Specification | No | In our proposed method, we use gpt-3.5-turbo as the base model of Perceptor, Rewriter, Decomposer, and Filter by accessing to OpenAI API with few-shot demonstrations. For a fair comparison, we leverage Flan-T5-XL (3B) as the Querier and Checker without additional fine-tuning. The paper does not provide specific hardware details such as GPU/CPU models. |
| Software Dependencies | No | In our proposed method, we use gpt-3.5-turbo as the base model of Perceptor, Rewriter, Decomposer, and Filter by accessing to OpenAI API with few-shot demonstrations. For a fair comparison, we leverage Flan-T5-XL (3B) as the Querier and Checker without additional fine-tuning. The paper does not specify software versions for reproducibility. |
| Experiment Setup | Yes | In the vagueness defusing, we iteratively perceive-then-rewrite for 3 rounds. |
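
The experiment setup row describes an iterative perceive-then-rewrite loop run for 3 rounds. The control flow of such a loop might be sketched as follows. This is a minimal illustration and not the authors' implementation: `defuse_vagueness`, `toy_perceive`, `toy_rewrite`, and the glossary are hypothetical names and data, the toy functions stand in for the LLM-backed Perceptor and Rewriter, and stopping early when nothing vague is found is an assumption the excerpt does not confirm.

```python
from typing import Callable, List


def defuse_vagueness(
    claim: str,
    perceive: Callable[[str], List[str]],
    rewrite: Callable[[str, List[str]], str],
    rounds: int = 3,
) -> str:
    """Alternate perceive and rewrite steps for at most `rounds` rounds."""
    current = claim
    for _ in range(rounds):
        vague_spans = perceive(current)
        if not vague_spans:
            # Assumption: stop early once nothing vague is detected.
            break
        current = rewrite(current, vague_spans)
    return current


# Hypothetical stand-ins for the LLM-backed Perceptor and Rewriter.
GLOSSARY = {"the film": "Film X", "its director": "Director Y"}


def toy_perceive(claim: str) -> List[str]:
    # Report every glossary phrase that still appears in the claim.
    return [phrase for phrase in GLOSSARY if phrase in claim]


def toy_rewrite(claim: str, vague_spans: List[str]) -> str:
    # Resolve only the first reported span per round.
    span = vague_spans[0]
    return claim.replace(span, GLOSSARY[span])


result = defuse_vagueness(
    "the film was shot by its director", toy_perceive, toy_rewrite
)
print(result)
```

Passing the two steps in as callables keeps the round limit separate from whichever model backs each step, so the same loop could wrap an API-based model or a local one.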