Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Fast Slate Policy Optimization: Going Beyond Plackett-Luce

Authors: Otmane Sakhi, David Rohde, Nicolas Chopin

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For our experiments, we focus on learning slate decision functions for the particular case of recommendation as collaborative filtering datasets are easily accessible, facilitating the reproducibility of our results. We choose three collaborative filtering datasets with varying action space size: MovieLens-25M (Harper & Konstan, 2015), Twitch (Rappaz et al., 2021) and GoodReads (Wan & McAuley, 2018; Wan et al., 2019).
Researcher Affiliation | Collaboration | Otmane Sakhi, CREST-ENSAE & Criteo AI Lab (EMAIL); David Rohde, Criteo AI Lab (EMAIL); Nicolas Chopin, CREST-ENSAE (EMAIL)
Pseudocode | Yes | Algorithm 1: Learning with Latent Gaussian Perturbation
Open Source Code | No | The paper contains no explicit statement about releasing source code for the methodology, nor a link to a code repository.
Open Datasets | Yes | We choose three collaborative filtering datasets with varying action space size: MovieLens-25M (Harper & Konstan, 2015), Twitch (Rappaz et al., 2021) and GoodReads (Wan & McAuley, 2018; Wan et al., 2019).
Dataset Splits | Yes | Given a dataset, we split randomly the user-item interaction session [X, Y] into two parts: the observed interactions X and the hidden interactions Y. ... We split each dataset by users and keep 10% to create a validation set, on which the reward of the decision function is reported.
Hardware Specification | No | The training is conducted on a CPU machine, using the Adam optimizer (Kingma & Ba, 2014), with a batch size of 32. This description of "a CPU machine" is too general and lacks specifics such as model, clock speed, or memory.
Software Dependencies | No | The paper mentions the Adam optimizer and the FAISS library but does not provide version numbers for these or any other software components used in the experiments.
Experiment Setup | Yes | The training is conducted on a CPU machine, using the Adam optimizer (Kingma & Ba, 2014), with a batch size of 32. We tune the learning rate on a validation set for all algorithms. ... We fix the latent space dimension L = 100 and use a slate size of K = 5 for these experiments. ... For all datasets and training routines, we allow a runtime of 60 minutes, and evaluate our policies on the validation set for 10 equally spaced intervals. ... Even if σ can be treated as an additional parameter, we fix it to σ = 1/L in all our experiments for a fair comparison.
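The setup quoted above (latent dimension L = 100, slate size K = 5, noise scale σ = 1/L) can be illustrated with a minimal sketch of the core idea behind "Learning with Latent Gaussian Perturbation": perturb a latent user vector with Gaussian noise, score the catalogue, and take the top-K items as the slate. This is an assumed, simplified reading of Algorithm 1, not the paper's implementation; all names (`sample_slate`, `n_items`, the random embeddings) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

L = 100          # latent space dimension, as in the paper's setup
K = 5            # slate size, as in the paper's setup
sigma = 1.0 / L  # noise scale, fixed to 1/L in the paper's experiments
n_items = 1000   # hypothetical catalogue size

# Hypothetical item and user factors (would come from a trained model).
item_embeddings = rng.normal(size=(n_items, L))
user_embedding = rng.normal(size=L)

def sample_slate(user_vec, items, k, sigma, rng):
    """Draw one slate by perturbing the latent user vector with Gaussian noise."""
    eps = rng.normal(scale=sigma, size=user_vec.shape)
    scores = items @ (user_vec + eps)
    # Indices of the k highest-scoring items, best first.
    return np.argsort(-scores)[:k]

slate = sample_slate(user_embedding, item_embeddings, K, sigma, rng)
```

Sampling many slates this way yields a stochastic slate policy whose randomness lives in the low-dimensional latent space rather than over the combinatorial space of slates, which is the motivation the title's "going beyond Plackett-Luce" alludes to.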