Adapting to Non-Stationary Environments: Multi-Armed Bandit Enhanced Retrieval-Augmented Generation on Knowledge Graphs
Authors: Xiaqiang Tang, Jian Li, Nan Du, Sihong Xie
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on two benchmark KGQA datasets demonstrate that our method significantly outperforms baseline methods in non-stationary settings while achieving state-of-the-art performance in stationary environments. |
| Researcher Affiliation | Collaboration | 1The Hong Kong University of Science and Technology (Guangzhou) 2Tencent Hunyuan |
| Pseudocode | Yes | Algorithm 1: Deep GGI-MO bandit enhanced RAG learning algorithm |
| Open Source Code | Yes | Code https://github.com/FUTUREEEEEE/Dynamic-RAG |
| Open Datasets | Yes | We evaluate our systems on two KGQA datasets WebQSP (Yih et al. 2016) and ComplexWebQuestions (CWQ) (Talmor and Berant 2018) |
| Dataset Splits | No | The paper does not explicitly state train/validation/test splits for either dataset in the main text. It mentions training on WebQSP and testing on CWQ for one transfer scenario, but does not specify the splits within WebQSP or CWQ used for the general experiments. |
| Hardware Specification | Yes | All experiments are conducted on the Nvidia Tesla V100 graphical card with the Intel Xeon Platinum 8255C CPU. |
| Software Dependencies | No | The paper mentions software components such as Llama-2-7b-chat-hf and LlamaIndex but does not provide specific version numbers for them (e.g., 'Llama Index 2024' refers to the documentation year, not a software release). |
| Experiment Setup | No | The main text defers the detailed setup to an appendix ('See subsection 3 in the appendix (Tang et al. 2024) for detail set up.') and does not report specific hyperparameters such as learning rate, batch size, or optimizer settings in the main content. |