RuAG: Learned-rule-augmented Generation for Large Language Models
Authors: Yudi Zhang, Pei Xiao, Lu Wang, Chaoyun Zhang, Meng Fang, Yali Du, Yevgeniy Puzyrev, Randolph Yao, Si Qin, Qingwei Lin, Mykola Pechenizkiy, Dongmei Zhang, Saravanakumar Rajmohan, Qi Zhang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTS |
| Researcher Affiliation | Collaboration | 1 Eindhoven University of Technology, 2 Peking University, 3 Microsoft, 4 University of Liverpool, 5 King's College London |
| Pseudocode | No | The paper describes the MCTS process in Section 3.2 by outlining its phases (selection, expansion, simulation, backpropagation) and providing the UCT formula, but it does not present this information in a structured pseudocode or algorithm block format. |
| Open Source Code | Yes | Project link: https://github.com/microsoft/RuAG. |
| Open Datasets | Yes | We evaluate our framework across diverse scenarios, including public tasks in NLP (relation extraction on DWIE), time-series (log anomaly detection on HDFS), decision-making (the cooperative game Alice and Bob), and an industrial task in abuse detection, demonstrating its effectiveness in enhancing LLMs' capability over diverse tasks. Project link: https://github.com/microsoft/RuAG. |
| Dataset Splits | Yes | We conduct experiments on the DWIE dataset (Zaporojets et al., 2021), which contains 802 documents and 23,130 entities. After excluding irrelevant articles, 700 documents are used for training and 97 for testing. Also: The dataset is split chronologically into training, validation, and test sets with a ratio of 8:1:1. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for running the experiments, such as exact GPU or CPU models, memory, or detailed computer specifications. |
| Software Dependencies | No | The paper mentions using GPT-3.5 (gpt-35-turbo-16k-20230613) and GPT-4 (gpt-4-20230613) as LLM backbones, but does not provide specific version numbers for ancillary software dependencies like programming languages (e.g., Python), libraries (e.g., PyTorch), or other solvers. |
| Experiment Setup | Yes | We provide detailed implementation for the three public tasks and the hyperparameters in Table A5. |
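The Pseudocode row above notes that the paper describes the MCTS phases (selection, expansion, simulation, backpropagation) and the UCT formula in prose rather than as an algorithm block. As a point of reference, a minimal sketch of the two phases that the UCT formula governs is shown below; this is an illustrative reconstruction of generic UCT-based MCTS, not the authors' implementation, and all class and function names are hypothetical.

```python
import math

class Node:
    """Hypothetical MCTS search-tree node (illustrative only)."""
    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

def uct_score(node, c=1.414):
    # UCT = exploitation (mean value) + exploration bonus.
    # Unvisited nodes score infinity so they are always tried first.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits
    )

def select(node):
    # Selection phase: descend the tree by maximal UCT until a leaf.
    while node.children:
        node = max(node.children, key=uct_score)
    return node

def backpropagate(node, reward):
    # Backpropagation phase: update visit counts and values up to the root.
    while node is not None:
        node.visits += 1
        node.value += reward
        node = node.parent
```

With expansion (adding children to a leaf) and simulation (a rollout returning a reward) plugged in between `select` and `backpropagate`, these pieces form the standard MCTS loop the paper's Section 3.2 describes.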