Re2LLM: Reflective Reinforcement Large Language Model for Session-based Recommendation
Authors: Ziyan Wang, Yingpeng Du, Zhu Sun, Haoyan Chua, Kaidong Feng, Wenya Wang, Jie Zhang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Lastly, we conduct experiments on real-world datasets and demonstrate the superiority of our Re2LLM over state-of-the-art methods. Re2LLM outperforms state-of-the-art methods, including deep learning-based and LLM-based models, in both few-shot and full-data settings across two real-world datasets. |
| Researcher Affiliation | Academia | 1Nanyang Technological University 2Singapore University of Technology and Design 3Yanshan University |
| Pseudocode | Yes | The overall pseudocode of Re2LLM is in Appendix B. |
| Open Source Code | Yes | Our code and data are available in the Supplementary Material. |
| Open Datasets | Yes | We evaluate Re2LLM and baselines on two real-world datasets. Movie (Hetrec2011-Movielens) contains user ratings of movies and side information such as title, production year, and genre. Game (Video Games of the Amazon Review Dataset) contains users' reviews on various types of games and peripherals, and metadata such as title, brand, and tag. The statistics are in Table 1. |
| Dataset Splits | Yes | For each dataset, we apply the split-by-ratio strategy following (Sun et al. 2020) to obtain training, validation, and test sets by 7:1:2. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or detailed machine specifications. It mentions using the gpt-4 API and discusses computational requirements in an appendix that is not provided. |
| Software Dependencies | No | The paper mentions using Optuna for hyperparameter optimization and BERT as a text encoder, but does not provide specific version numbers for these or any other key software libraries or frameworks. It also uses the 'gpt-4 API', which is a service rather than a software dependency with a version. |
| Experiment Setup | Yes | We conduct 20 trials to search for learning rate, weight decay, and batch size. For our method Re2LLM, we set the knowledge base size to 20, and the few-shot training size to 500. For the full dataset setting, we use the entire training set. For the few-shot setting, we sample 500 training samples from the entire training set for all methods. We run 5 experiments with different random seeds and show the average performance. All methods are optimized by the Adam optimizer. |