MENTOR: Multi-level Self-supervised Learning for Multimodal Recommendation
Authors: Jinfeng Xu, Zheyu Chen, Shuo Yang, Jinze Li, Hewei Wang, Edith C. H. Ngai
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform comprehensive experiments on three public datasets in Amazon to validate the effectiveness of our MENTOR on both overall and component levels. |
| Researcher Affiliation | Academia | 1The University of Hong Kong, 2The Hong Kong Polytechnic University, 3Carnegie Mellon University |
| Pseudocode | No | The paper describes methods using narrative text and mathematical equations, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code https://github.com/Jinfeng-Xu/MENTOR |
| Open Datasets | Yes | To evaluate our proposed MENTOR in the top N item recommendation task, we conduct extensive experiments on three widely used Amazon datasets (McAuley et al. 2015): Baby, Sports, and Clothing. |
| Dataset Splits | Yes | We follow the popular setting (Zhou and Shen 2023) with a random data splitting 8:1:1 for training, validation, and testing. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU, CPU models, or memory) used for running the experiments. |
| Software Dependencies | No | We implement MENTOR and all baselines with MMRec (Zhou 2023). For the general settings, we initialized the embedding with Xavier initialization (Glorot and Bengio 2010) of dimension 64. Besides, we optimize all models with Adam optimizer (Kingma and Ba 2014). |
| Experiment Setup | Yes | For our MENTOR, we perform a grid search on the dropout ratio in {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7}, balancing hyper-parameter λf in {0.5, 1, 1.5, 2, 2.5}, balancing hyper-parameter λg in {1e-2, 1e-3, 1e-4}, temperature hyper-parameter τ in {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8}, and balancing hyper-parameter λalign in {0.1, 0.2, 0.3}. We fix the learning rate with 1e-4, and the number of layers in the heterogeneous graph with L = 2. The k of top-k in the item-item graph is set as 40. For convergence consideration, the early stopping is fixed at 20. |
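The reported experiment setup amounts to a grid search over five hyper-parameters with several fixed settings. The sketch below enumerates that search space; the dictionary key names are illustrative placeholders, not MMRec's actual configuration schema.

```python
from itertools import product

# Search ranges reported in the paper; key names here are
# illustrative, not MMRec's actual config keys.
search_space = {
    "dropout": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
    "lambda_f": [0.5, 1, 1.5, 2, 2.5],
    "lambda_g": [1e-2, 1e-3, 1e-4],
    "tau": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    "lambda_align": [0.1, 0.2, 0.3],
}

# Fixed settings reported: learning rate 1e-4, L = 2 graph layers,
# k = 40 for the item-item graph, early stopping patience 20.
fixed = {"learning_rate": 1e-4, "n_layers": 2, "knn_k": 40, "early_stop": 20}

def grid(space):
    """Yield every hyper-parameter combination in the grid."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

# 7 * 5 * 3 * 8 * 3 = 2520 candidate configurations in total
n_configs = sum(1 for _ in grid(search_space))
```

Each yielded combination would be merged with the fixed settings to form one training run, making the reported search a 2,520-configuration sweep per dataset.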