Interpretable Deep Generative Recommendation Models
Authors: Huafeng Liu, Liping Jing, Jingxuan Wen, Pengyu Xu, Jiaqi Wang, Jian Yu, Michael K. Ng
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | A series of experimental results on four widely-used benchmark datasets demonstrates the superiority of InDGRM in recommendation performance and interpretability. In this section, we evaluate the proposed deep generative model on four datasets by comparing it with state-of-the-art recommendation methods. |
| Researcher Affiliation | Academia | Huafeng Liu (EMAIL), School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China, and Department of Mathematics, The University of Hong Kong, Hong Kong SAR, China. Liping Jing (EMAIL), Jingxuan Wen (EMAIL), Pengyu Xu (EMAIL), Jiaqi Wang (EMAIL), Jian Yu (EMAIL), School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China. Michael K. Ng (EMAIL), Department of Mathematics, The University of Hong Kong, Hong Kong SAR, China. |
| Pseudocode | Yes | Algorithm 1: InDGRM generative process. Algorithm 2: Learning with local variational optimization for InDGRM. Algorithm 3: The training procedure for InDGRM with the local variational optimization strategy. |
| Open Source Code | No | No explicit statement or link for the source code of the described methodology is provided in the paper. |
| Open Datasets | Yes | In experiments, four widely-used recommendation datasets (MovieLens 20M, Netflix, AliShop-7C, and Yelp) are used to validate the recommendation performance. Footnotes provide the links: https://grouplens.org/datasets/movielens/, https://www.netflixprize.com, https://jianxinma.github.io/disentangle-recsys.html, https://www.yelp.com/dataset/challenge. |
| Dataset Splits | Yes | The held-out-users strategy and five-fold cross-validation are used to evaluate recommendation performance. 20% of the users are taken as held-out users and evenly split between validation and test. For each held-out user, his/her feedback data is randomly split into five equal-sized subsets; in each round, four subsets are used to obtain the latent representation and the remaining subset is used for evaluation. The averaged results over the five rounds are reported. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using Adam for training but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Parameter setting: For a fair comparison, the number of learnable parameters is set to around 2M·dmax for each method, which is equivalent to using dmax-dimensional representations for the M items. The initial dimensionality dmax is set to 150. Dropout (Srivastava et al., 2014) is applied at the input layer with probability 0.5. The model is trained using Adam (Kingma and Ba, 2014) with a batch size of 128 users for 200 epochs on all datasets. The regularization coefficient λ is set to 1.2 for ML 20M and Netflix, and to 1.5 for AliShop-7C and Yelp; λo is set to 1 for better disentanglement. For autoencoder-based deep methods (Mult-VAE, MacridVAE, DGLGM, and our method), the hyperparameters are tuned automatically via TPE (Bergstra et al., 2011), which searches for the optimal hyperparameter configuration over 200 trials on the validation set. |
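The held-out-users evaluation protocol described above (20% of users held out, split evenly between validation and test; each held-out user's feedback partitioned into five folds, four for inferring the latent representation and one for evaluation) can be sketched as follows. This is a minimal illustration of the split logic only, not code from the paper; the function names `split_users` and `five_fold_feedback` are hypothetical.

```python
import random


def split_users(user_ids, held_out_frac=0.2, seed=0):
    """Hold out a fraction of users and divide them evenly into
    validation and test sets (hypothetical helper mirroring the
    protocol described in the paper)."""
    rng = random.Random(seed)
    users = list(user_ids)
    rng.shuffle(users)
    n_held = int(len(users) * held_out_frac)
    held_out = users[:n_held]
    train = users[n_held:]
    val = held_out[: n_held // 2]
    test = held_out[n_held // 2:]
    return train, val, test


def five_fold_feedback(feedback_items, seed=0):
    """Partition one held-out user's feedback into five nearly
    equal-sized folds; each evaluation round uses four folds to
    obtain the latent representation and the remaining fold for
    evaluation, with results averaged over the five rounds."""
    rng = random.Random(seed)
    items = list(feedback_items)
    rng.shuffle(items)
    return [items[i::5] for i in range(5)]
```

With 1,000 users, `split_users` yields 800 training users and 100 users each for validation and test; `five_fold_feedback` on a user with 23 interactions yields folds of sizes 5, 5, 5, 4, 4.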