Mind Individual Information! Principal Graph Learning for Multimedia Recommendation

Authors: Penghang Yu, Zhiyi Tan, Guanming Lu, Bing-Kun Bao

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type: Experimental. "Extensive experiments on three public real-world datasets demonstrate that the proposed framework outperforms state-of-the-art methods in terms of recommendation accuracy. Our main contributions can be summarized as follows: ... We conduct a comprehensive experimental study on three benchmark datasets, showing that PGL has distinct advantages in terms of recommendation accuracy."
Researcher Affiliation: Academia. 1) School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, China; 2) Jiangsu Key Laboratory of Intelligent Information Processing and Communication Technology, Nanjing, China; 3) School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China.
Pseudocode: No. The paper provides mathematical equations and descriptions of its methods, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code: No. "The proposed framework and all compared methods are implemented using the MMRec framework (Zhou 2023), which is a unified open-source platform for developing and reproducing recommendation algorithms. ... https://github.com/enoche/MMRec". This statement refers to a third-party framework used by the authors, not to a release of their specific PGL implementation.
Open Datasets: Yes. "We conduct experiments on three categories of the widely used Amazon dataset: (a) Baby, (b) Sports and Outdoors, and (c) Clothing, Shoes, and Jewelry, which we refer to as Baby, Sports, and Clothing in brief. The statistics of these datasets are presented in Table 3. Datasets are available at http://jmcauley.ucsd.edu/data/amazon/links.html".
Dataset Splits: No. The paper mentions a "standardized all-ranking protocol" and uses "Recall@20 as the training-stopping indicator," but does not explicitly state the train/validation/test splits (e.g., percentages or sample counts) needed for reproduction.
Hardware Specification: Yes. "All experiments are performed using PyTorch on NVIDIA Tesla V100 GPUs."
Software Dependencies: No. The paper mentions PyTorch and the MMRec framework but does not specify version numbers for these or any other software components used in the implementation, which are required for reproducibility.
Experiment Setup: Yes. "For general settings, the ID embeddings are initialized using the Xavier initialization method and set to a dimension of 64. The batch size is set to 2048. For the self-supervised task, the temperature parameter is set to 0.2, as this value is commonly considered a good choice. Regarding the parameters used in our method, the truncation ratio γ is set to 0.25, the sparsification threshold ϵ is set to 1e-3, and the sampling ratio p is set to 0.3. For the hyperparameters of the self-supervised learning task, a hyperparameter search is conducted. The feature masking ratio ρ is searched over [0.05, 0.1, 0.2, 0.3, 0.4], and the weight of the self-supervised task λSSL is searched over [0.005, 0.01, 0.05, 0.1, 0.5]. Furthermore, to avoid the overfitting issue, we employ an early stopping strategy. Following (Zhang et al. 2022), we use Recall@20 as the training-stopping indicator. If there is no improvement in Recall@20 for ten consecutive training epochs, the model training is halted."
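The reported setup above can be collected into code for anyone attempting a reproduction. The sketch below is a minimal, hedged illustration: the config dict gathers the hyperparameters the paper states, and the `EarlyStopping` class implements the stated rule (halt after ten consecutive epochs without Recall@20 improvement). The names and structure are assumptions for illustration, not the authors' actual implementation.

```python
# Hyperparameters as reported in the paper's experiment setup.
# (Dict keys and comments are illustrative; the values come from the paper.)
PGL_CONFIG = {
    "embedding_dim": 64,                # ID embeddings, Xavier-initialized
    "batch_size": 2048,
    "temperature": 0.2,                 # self-supervised task
    "truncation_ratio": 0.25,           # gamma
    "sparsification_threshold": 1e-3,   # epsilon
    "sampling_ratio": 0.3,              # p
    "mask_ratio_grid": [0.05, 0.1, 0.2, 0.3, 0.4],     # rho search space
    "lambda_ssl_grid": [0.005, 0.01, 0.05, 0.1, 0.5],  # lambda_SSL search space
    "patience": 10,                     # epochs without Recall@20 improvement
}

class EarlyStopping:
    """Stop training when Recall@20 fails to improve for `patience` epochs."""

    def __init__(self, patience: int = 10):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, recall_at_20: float) -> bool:
        """Record one epoch's validation metric; return True to stop training."""
        if recall_at_20 > self.best:
            self.best = recall_at_20
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop, `EarlyStopping(PGL_CONFIG["patience"]).step(metric)` would be called once per epoch after evaluation, breaking out of the loop when it returns `True`.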