Neural Combinatorial Clustered Bandits for Recommendation Systems

Authors: Baran Atalar, Carlee Joe-Wong

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on real recommendation datasets show that NeUClust achieves better regret and reward than other contextual combinatorial and neural bandit algorithms. ... We show through experimental results on real world recommendation datasets (MovieLens, Yelp) that our algorithm outperforms state-of-the-art neural contextual/combinatorial and CC-MAB algorithms (Experiments).
Researcher Affiliation | Academia | Baran Atalar, Carlee Joe-Wong. Carnegie Mellon University. EMAIL, EMAIL
Pseudocode | Yes | NeUClust description. Algorithm 1 shows NeUClust's pseudocode.
Open Source Code | Yes | Code: https://github.com/BaranAtalar/NeUClust
Open Datasets | Yes | In this section, we measure and evaluate the performance of NeUClust on two recommendation datasets: MovieLens 25M Dataset [1] and the Yelp Open Dataset [2]. [1] https://grouplens.org/datasets/movielens/ [2] https://www.yelp.com/dataset
Dataset Splits | No | We assume that every round we need to recommend an incoming movie to K users, hence each base arm corresponds to a user. ... For the restaurant recommendation setting for Yelp, in each round, there is an incoming user, and we need to select K restaurants to recommend to the user from a total of approximately 50,000 restaurants.
Hardware Specification | No | Table 1: Cumulative regret and runtime (RT) (rounded in seconds) comparison of the algorithms for the MovieLens (ML) and Yelp (Y) datasets.
Software Dependencies | No | We make use of deep neural networks to estimate and learn the unknown reward functions... We first initialize the parameters of the first neural network... Run k-means clustering on the observed contexts...
Experiment Setup | Yes | Algorithm 1: NeUClust (Online). 1: Input: Number of rounds T, regularization parameters λ1 and λ2, step sizes η1 and η2, number of gradient descent steps J, network widths m and n, network depths L and Lm, number of clusters M, maximum number of iterations of clustering ic, size of super arm K, norm parameter S. 2: Randomly initialize θ0 and Θ0 as described previously and Z0 = λ1 I
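
The inputs on line 1 of Algorithm 1 and the two operations quoted above (setting Z0 = λ1 I, and running k-means on the observed contexts for at most ic iterations) could be sketched roughly as follows. This is an illustration only and is not taken from the authors' repository: the class and function names, the matrix dimension `dim`, the plain Lloyd's k-means, and every numeric value are assumptions chosen for the example. The random initialization of θ0 and Θ0 is omitted because the excerpt does not describe it.

```python
import random
from dataclasses import dataclass


@dataclass
class NeUClustConfig:
    """Inputs from line 1 of Algorithm 1 (field names are hypothetical)."""
    T: int          # number of rounds
    lambda1: float  # regularization parameter λ1
    lambda2: float  # regularization parameter λ2
    eta1: float     # step size η1
    eta2: float     # step size η2
    J: int          # number of gradient descent steps
    m: int          # network width m
    n: int          # network width n
    L: int          # network depth L
    Lm: int         # network depth Lm
    M: int          # number of clusters
    ic: int         # maximum number of clustering iterations
    K: int          # size of the super arm
    S: float        # norm parameter


def init_Z(lambda1, dim):
    """Return Z0 = λ1 * I as a dim x dim nested list (dim is an assumption)."""
    return [[lambda1 if r == c else 0.0 for c in range(dim)] for r in range(dim)]


def kmeans(contexts, M, max_iter, seed=0):
    """Plain Lloyd's k-means on context vectors, capped at max_iter iterations."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(contexts, M)]
    labels = [0] * len(contexts)
    for _ in range(max_iter):
        # Assignment step: each context goes to its nearest center.
        new_labels = [
            min(range(M),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centers[j])))
            for x in contexts
        ]
        # Update step: each center becomes the mean of its members.
        for j in range(M):
            members = [x for x, lab in zip(contexts, new_labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centers


# Placeholder values for illustration; these are not the paper's settings.
cfg = NeUClustConfig(T=1000, lambda1=1.0, lambda2=1.0, eta1=0.01, eta2=0.01,
                     J=10, m=32, n=32, L=2, Lm=2, M=2, ic=20, K=5, S=1.0)
Z0 = init_Z(cfg.lambda1, dim=3)
contexts = [[0.0], [0.1], [5.0], [5.1]]  # toy one-dimensional contexts
labels, centers = kmeans(contexts, M=cfg.M, max_iter=cfg.ic)
```

Keeping the inputs in one dataclass makes it easier to record the exact experiment setup alongside results, which is the kind of detail this reproducibility table is checking for.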