Coherency Improved Explainable Recommendation via Large Language Model

Authors: Shijie Liu, Ruixin Ding, Weihai Lu, Jun Wang, Mo Yu, Xiaoming Shi, Wei Zhang

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experimental results on three datasets of explainable recommendation show that the proposed framework is effective, outperforming state-of-the-art baselines with improvements of 7.3% in explainability and 4.4% in text quality. ... We conduct extensive experiments to demonstrate the effectiveness of the proposed framework against strong baselines, and experimental results show that training techniques can further improve the results. ... Experimental Setting Dataset To validate the effectiveness of our method, we conducted experiments on three publicly available datasets and their splits (Li, Zhang, and Chen 2020).
Researcher Affiliation Collaboration Shijie Liu1*, Ruixin Ding1*, Weihai Lu2*, Jun Wang1, Mo Yu3, Xiaoming Shi1, Wei Zhang1 1East China Normal University, 2Peking University, 3WeChat AI, Tencent
Pseudocode No The paper describes the methodology using textual explanations, mathematical formulas, and diagrams (Figure 2, Figure 3), but does not include any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code Yes Code https://github.com/karrich/CIER
Open Datasets Yes Dataset To validate the effectiveness of our method, we conducted experiments on three publicly available datasets and their splits (Li, Zhang, and Chen 2020). ... The three datasets are from TripAdvisor (hotel), Amazon (movies & TV), and Yelp (restaurant). ... The available datasets and keyword extraction tools are provided by Sentires (Zhang et al. 2014; Li et al. 2020).
Dataset Splits Yes Each dataset is randomly divided into training, validation, and test sets in an 8:1:1 ratio five times.
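The split protocol quoted above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: each dataset is shuffled and divided 8:1:1 into train/validation/test, and the procedure is repeated five times with different seeds so that reported metrics can be averaged over the five splits.

```python
# Hedged sketch of the 8:1:1 split repeated five times (assumed protocol,
# not taken from the CIER repository).
import random

def split_811(records, seed):
    """Shuffle `records` with the given seed and split 8:1:1."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = int(0.8 * len(shuffled))
    n_val = int(0.1 * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Five independent splits of a toy dataset of 1000 example ids.
splits = [split_811(range(1000), seed) for seed in range(5)]
```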
Hardware Specification Yes All the experiments are conducted on an NVIDIA H800 GPU.
Software Dependencies No The paper mentions models like LLaMA2-7B, GPT-4, gpt-4o, and bert-base-multilingual-uncased-sentiment, and optimizers like AdamW. However, it does not provide specific version numbers for general ancillary software libraries or programming languages (e.g., Python, PyTorch/TensorFlow versions) that are typically required for reproduction.
Experiment Setup Yes For CIER, λ is set to 0.1 and γ to 0.2, selected through grid search over the ranges [0.01, 0.1, 1.0, 10.0] and [0.0, 0.2, 0.5, 0.8, 1.0], respectively. The model is optimized using the AdamW (Loshchilov and Hutter 2017) optimizer with hierarchical learning rates: 10^-4 for the LoRA module and 10^-3 for the other components. The number of training epochs is set to 3 and the embedding size d is set to 1024.
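The hyperparameter selection described above can be sketched as a plain grid search. This is a minimal illustration of the reported search (λ over [0.01, 0.1, 1.0, 10.0], γ over [0.0, 0.2, 0.5, 0.8, 1.0]); the `validate` callable is a hypothetical stand-in for training the model and scoring it on the validation set, not part of the paper.

```python
# Hedged sketch of the grid search over lambda and gamma; `validate` is
# a hypothetical placeholder for train-then-evaluate on validation data.
from itertools import product

LAMBDAS = [0.01, 0.1, 1.0, 10.0]
GAMMAS = [0.0, 0.2, 0.5, 0.8, 1.0]

def grid_search(validate):
    """Return the (lambda, gamma) pair with the highest validation score."""
    best_pair, best_score = None, float("-inf")
    for lam, gamma in product(LAMBDAS, GAMMAS):
        score = validate(lam, gamma)
        if score > best_score:
            best_pair, best_score = (lam, gamma), score
    return best_pair, best_score

# Toy scorer that peaks at the paper's reported setting (0.1, 0.2).
best, _ = grid_search(lambda lam, g: -abs(lam - 0.1) - abs(g - 0.2))
```

With the toy scorer, the search recovers the pair the paper reports, λ = 0.1 and γ = 0.2.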