Active Large Language Model-Based Knowledge Distillation for Session-Based Recommendation

Authors: Yingpeng Du, Zhu Sun, Ziyan Wang, Haoyan Chua, Jie Zhang, Yew-Soon Ong

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on real-world datasets show that our method significantly outperforms state-of-the-art methods for SBR. We conduct extensive experiments to evaluate the performance of ALKDRec and answer four research questions.
Researcher Affiliation | Academia | 1. College of Computing and Data Science, Nanyang Technological University, Singapore; 2. Information Systems Technology and Design, Singapore University of Technology and Design, Singapore; 3. A*STAR Center for Frontier AI Research, Singapore
Pseudocode | No | The paper describes the proposed method conceptually and mathematically, including definitions, theorems, and equations, but does not contain a structured pseudocode or algorithm block.
Open Source Code | Yes | Due to space limitations, we only provide the main sketch of proofs, while the detailed proofs can be found in Appendix B of our GitHub repository at https://github.com/kk97111/ALKDRec.
Open Datasets | Yes | Datasets. We evaluate ALKDRec and baselines on two real-world datasets, namely Hetrec2011-ML and Amazon Games.
Dataset Splits | Yes | We randomly split sessions into training, validation, and test sets by 6:2:2. For the evaluation phase, we adopt the widely used leave-one-out evaluation protocol.
Hardware Specification | No | No specific hardware details for running experiments are provided. The paper mentions using the GPT-4-turbo API and its associated costs (e.g., around 44 minutes and 8.6 USD for the ChatGPT API on Amazon Games), but not the underlying hardware specifications for the experimental environment.
Software Dependencies | No | The paper mentions using the Adam optimizer and GPT-4-turbo-2024-04-09 as the LLM teacher model, but does not provide version numbers for general software dependencies, such as the programming language or libraries used for implementation.
Experiment Setup | Yes | For an effective instance... we set µ = 10 empirically. For similar and incorrect instances, we assign them lower gain values than effective instances, i.e., g_s^{si} = g_s^{in} = g_s^{ef}/2. We adopt GPT-4-turbo-2024-04-09 as the LLM teacher to distill knowledge from 500 instances... We set the ratio of effective/similar/incorrect instances to 1:5:4... For α_v in Equation (1), we assign 3/2/1... We set the latent dimensions for the teacher and student recommenders at 100 and 10... We set the learning rate to 1e-3 with the Adam optimizer and the batch size to 1024 for all methods.
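The 6:2:2 session split quoted in the Dataset Splits row can be sketched as below; the function name, random seed, and session representation are illustrative assumptions, not the authors' released code.

```python
import random

def split_sessions(sessions, ratios=(0.6, 0.2, 0.2), seed=42):
    """Randomly split sessions into train/valid/test sets by the given
    ratios (the paper reports a 6:2:2 split; the seed is an assumption)."""
    sessions = list(sessions)
    random.Random(seed).shuffle(sessions)
    n = len(sessions)
    n_train = int(n * ratios[0])
    n_valid = int(n * ratios[1])
    return (sessions[:n_train],
            sessions[n_train:n_train + n_valid],
            sessions[n_train + n_valid:])

train, valid, test = split_sessions(range(1000))
print(len(train), len(valid), len(test))  # 600 200 200
```

Under leave-one-out evaluation, the last item of each held-out session would then serve as the prediction target.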
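The gain assignment quoted in the Experiment Setup row (effective instances receive µ = 10; similar and incorrect instances receive half of that, g_s^{si} = g_s^{in} = g_s^{ef}/2) can be illustrated with a minimal sketch; the function and constant names here are assumptions, not identifiers from the paper.

```python
# Empirical gain for an effective instance, per the paper's setup (mu = 10).
MU = 10

def instance_gain(kind):
    """Return the distillation gain for an instance labeled as
    'effective', 'similar', or 'incorrect'."""
    g_effective = MU
    gains = {
        "effective": g_effective,       # g^ef = mu
        "similar": g_effective / 2,     # g^si = g^ef / 2
        "incorrect": g_effective / 2,   # g^in = g^ef / 2
    }
    return gains[kind]

print(instance_gain("effective"), instance_gain("similar"))  # 10 5.0
```

With the reported 1:5:4 sampling ratio over 500 distilled instances, this would correspond to roughly 50 effective, 250 similar, and 200 incorrect instances.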