Towards Automatic Sampling of User Behaviors for Sequential Recommender Systems
Authors: Hao Zhang, Mingyue Cheng, Zhiding Liu, Junzhe Jiang
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on benchmark recommendation models and four real-world datasets. The experimental results demonstrate the effectiveness of the proposed AutoSAM. |
| Researcher Affiliation | Academia | Hao Zhang, Mingyue Cheng, Zhiding Liu, Junzhe Jiang; State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China; {mycheng}@ustc.edu.cn |
| Pseudocode | Yes | Algorithm 1: Learning the AutoSAM framework |
| Open Source Code | Yes | https://github.com/zh-ustc/AutoSAM |
| Open Datasets | Yes | We conduct experiments on four real-world datasets from different online platforms: Tmall (https://tianchi.aliyun.com/dataset/dataDetail?dataId=42), Alipay (https://tianchi.aliyun.com/dataset/dataDetail?dataId=53), Yelp (https://www.yelp.com/dataset), and Amazon Book (abbreviated as Amazon), selected from Amazon review data (http://deepyeti.ucsd.edu/jianmo/amazon/index.html). |
| Dataset Splits | Yes | We split the dataset into training, validation and testing sets following the leave-one-out strategy [Cheng et al., 2022; Zhao et al., 2022]. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide software dependency details with version numbers. It only mentions using Adam as the default optimizer and SGD for the sampler, without version information. |
| Experiment Setup | Yes | We set the batch size to 128 with 10,000 random negative items per batch, and the embedding size is set to 128 for all methods. All sampling-based methods employ a two-layer SASRec backbone with 4 multi-head self-attention heads and 2 FFN layers, while the hidden size is set to 256. The sample rates of RanSAM, LastSAM and PopSAM are searched from {0.5, 0.6, 0.7, 0.8, 0.9}. We consistently employ Adam as the default optimizer for all recommenders, with a learning rate α1 of 1e-3. For our sampler, we use SGD with a learning rate α2 of 1e-1. We use grid search to find the best group of AutoSAM's hyper-parameters, as shown in Table 2 (AutoSAM's hyper-parameter exploration; selected values per dataset): t ∈ {1.0, 3.0, 5.0, 7.0, 9.0}, selected 5.0 / 5.0 / 5.0 / 3.0; b ∈ {−0.5, 0.0, 0.5, 1.0, 1.5, 2.0}, selected 1.0 / 2.0 / 1.0 / 0.5; k ∈ {2e-1, 2e-2, 2e-3, 2e-4, 2e-5}, selected 2e-3 / 2e-3 / 2e-2 / 2e-3; λ ∈ {0.25, 0.5, 0.75}, selected 0.5 / 0.5 / 0.5 / 0.5; ψ0 ∈ {0.2, 0.5, 0.8}, selected 0.8 / 0.8 / 0.8 / 0.5. |
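The leave-one-out split referenced in the Dataset Splits row can be sketched as follows. This is a minimal illustration of the strategy, not the authors' code; the function name and toy item IDs are my own:

```python
def leave_one_out_split(sequence):
    """Split one user's interaction sequence leave-one-out style:
    the last item is held out for testing, the second-to-last for
    validation, and the remaining prefix is used for training."""
    if len(sequence) < 3:
        raise ValueError("need at least 3 interactions per user")
    return sequence[:-2], sequence[-2], sequence[-1]

# Example: a toy user history of item IDs.
train, valid, test = leave_one_out_split([10, 42, 7, 99, 3])
# train == [10, 42, 7], valid == 99, test == 3
```

The split is applied per user, so every user contributes exactly one validation and one test interaction.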
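The grid search over AutoSAM's hyper-parameters described in the Experiment Setup row can be sketched as an exhaustive sweep. This is an illustrative stand-in, not the authors' code: the `evaluate` callback is a placeholder for validation performance, and the leading value of the `b` grid is assumed to be −0.5 (the minus sign appears lost in extraction):

```python
import itertools

# Search grids as reported in the experiment setup (Table 2).
GRID = {
    "t": [1.0, 3.0, 5.0, 7.0, 9.0],
    "b": [-0.5, 0.0, 0.5, 1.0, 1.5, 2.0],  # leading -0.5 is an assumption
    "k": [2e-1, 2e-2, 2e-3, 2e-4, 2e-5],
    "lambda": [0.25, 0.5, 0.75],
    "psi0": [0.2, 0.5, 0.8],
}

def grid_search(evaluate, grid=GRID):
    """Return the hyper-parameter combination maximizing
    `evaluate(config)`, trying every point in the grid."""
    keys = list(grid)
    best_cfg, best_score = None, float("-inf")
    for values in itertools.product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy stand-in for a validation metric, peaked at t=5.0, lambda=0.5.
best, _ = grid_search(lambda c: -(c["t"] - 5.0) ** 2 - (c["lambda"] - 0.5) ** 2)
```

In practice `evaluate` would train the sampler and recommender with the given configuration and return a validation metric such as NDCG.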