reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

The Adaptive Q-Network for Recommendation Tasks with Dynamic Item Space

Authors: Jianxiang Zhu, Dandan Lai, Zhongcui Ma, Yaxin Peng

AAAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that our approach has achieved state-of-the-art performance in the dynamic recommendation task.
Researcher Affiliation	Academia	Jianxiang Zhu¹, Dandan Lai¹, Zhongcui Ma¹, Yaxin Peng¹,²,* ¹ the Department of Mathematics, College of Sciences, Shanghai University, Shanghai 200444, China ² the School of Future Technology, Shanghai University, Shanghai 200444, China
Pseudocode	Yes	Algorithm 1: Adaptive Q-Network. Training process: Require: Item set in the training environment Itrain, the weights of pre-trained embedding layer. Output: the weights of CEQN. 1: Initialize all trainable parameters in the state-characteristic value function CEQN, load and freeze the weights of the pre-trained embedding layer as function f. 2: Initialize replay buffer B. 3: Project item set Itrain on the characteristic set Vtrain through f. 4: for each iteration do 5: Apply current behaviour policy πb train through Eq. (2), collect and store samples to B. 6: Sample mini-batch (st, vt, rt, st+1) from B. 7: Update CEQN according to Eq. (3). 8: end for. Testing process: Require: Item set in the test environment Itest, the weights of pre-trained embedding layer and weights of CEQN. 1: Project item set Itest on the characteristic space Vtest through f. 2: Apply current policy πtest based on Eq. (1).
Open Source Code	No	The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository.
Open Datasets	Yes	We conduct experiments on two representative datasets MovieLens¹ and KuaiRec² (Gao et al. 2022). ¹ https://grouplens.org/datasets/movielens/1m/ ² https://kuairec.com/
Dataset Splits	Yes	We primarily assess the dynamic recommendation task, and the ratio of items in the training environment to the testing environment is 0.5. A ratio of 0.5 ensures that the number of items in the training set is similar to that in the test set. Additionally, we provide settings for other ratios in ablation studies on task setting.
Hardware Specification	No	The paper mentions running experiments but does not specify any particular hardware details such as GPU models, CPU types, or memory.
Software Dependencies	No	The paper mentions using 'OpenAI Gymnasium (Brockman et al. 2016)' but does not specify any version numbers for this or other software components.
Experiment Setup	Yes	For the training stage, all policies are trained for 100 epochs. The policy is evaluated using 100 interaction trajectories after each epoch, and the maximum recommended sequence length is limited to 30. Following (Yu et al. 2024), we report the mean values of all metrics during the last 25% of training epochs to achieve a fair comparison.
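
The Dataset Splits and Experiment Setup rows describe two small procedures: dividing the item set between the training and test environments, and reporting each metric as its mean over the last 25% of training epochs. The following is a minimal Python sketch of both, not the authors' code. The function names, the seeded random shuffle, and the reading of the 0.5 ratio as the fraction of items assigned to the training environment (with disjoint train and test item sets) are all assumptions made for illustration.

```python
import random
from statistics import mean


def split_items(item_ids, train_ratio=0.5, seed=0):
    """Partition items into training- and test-environment sets.

    Assumption: the sets are disjoint and train_ratio is the fraction of
    items placed in the training environment (0.5 gives equal halves).
    """
    items = sorted(item_ids)           # fixed order so the seed is reproducible
    random.Random(seed).shuffle(items)
    cut = round(len(items) * train_ratio)
    return set(items[:cut]), set(items[cut:])


def mean_over_last_fraction(per_epoch_values, fraction=0.25):
    """Average a per-epoch metric over the final `fraction` of epochs."""
    n_last = max(1, int(len(per_epoch_values) * fraction))
    return mean(per_epoch_values[-n_last:])


# Hypothetical usage: 100 epochs of a metric, reported over the last 25 epochs.
train_items, test_items = split_items(range(10))
reported = mean_over_last_fraction(list(range(100)))
```

With 100 epochs and a fraction of 0.25, the reported value averages epochs 76 through 100, which smooths out epoch-to-epoch noise in the evaluation trajectories.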