Profile-Based Bandit with Unknown Profiles

Authors: Sylvain Lamprier, Thibault Gisselbrecht, Patrick Gallinari

JMLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, experiments are conducted on both artificial data and a task of focused data capture from online social networks. Obtained results demonstrate the relevance of the approach in various settings.
Researcher Affiliation | Collaboration | Sylvain Lamprier EMAIL Sorbonne Universités, UPMC Paris 06, LIP6, CNRS UMR 7606; Thibault Gisselbrecht EMAIL SNIPS, 18 rue Saint Marc, 75002 Paris; Patrick Gallinari EMAIL Sorbonne Universités, UPMC Paris 06, LIP6, CNRS UMR 7606
Pseudocode | Yes | Algorithm 1: SampLinUCB
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | No | The paper describes the creation of the USElections, Olympic Games, and Brexit datasets from Twitter data based on specific criteria. However, it does not provide any concrete access information (links, DOIs, repositories, or citations to public versions) for these specific processed datasets.
Dataset Splits | No | The paper describes an online learning setting where actions are selected at each time step. It refers to a 'time period divided in T steps' and a number of 'listened users at each time step', but it does not specify traditional training, validation, or testing dataset splits for reproduction.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions techniques like 'SVM topic classifier', 'TF bag of words representations', 'Porter Stemmer algorithm', and 'Latent Dirichlet Allocation method' but does not specify any software libraries or frameworks with their version numbers.
Experiment Setup | Yes | For that purpose, we set the horizon T to 30000 iterations, the number of available actions K to 100 and the size of the profile space to d = 5 dimensions. [...] we tested different values for σ ∈ {0.5, 1.0, 2.0}. Moreover, in order to guarantee that ‖x_{i,t}‖ ≤ L = 1, [...] we sampled a reward r_{i,t} from a Gaussian with mean µ_i^⊤ β and variance R² = 1. [...] in every instance, we set δ = 0.05 for these experiments. Also, to avoid a too large exploration on profiles in the early steps, we multiplied each ρ_{i,t,δ} by a 0.01 coefficient.
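
The artificial-data setup quoted in the Experiment Setup row can be sketched roughly as follows, assuming NumPy. The constants T, K, d, σ, R and L come from the quoted text. How the profiles µ_i and the parameter β are drawn, and how the norm bound is enforced, are elided in the quote, so `make_instance` and `clip_to_ball` are hypothetical stand-ins for illustration and not the authors' procedure. Adding Gaussian noise of scale σ around µ_i to obtain a profile sample is likewise an assumption.

```python
import numpy as np

# Constants taken from the quoted experiment setup.
T = 30_000                      # horizon (number of iterations)
K = 100                         # number of available actions
d = 5                           # dimension of the profile space
SIGMA_VALUES = (0.5, 1.0, 2.0)  # tested values of sigma
R = 1.0                         # reward noise std (variance R^2 = 1)
L = 1.0                         # bound on the norm of profile samples
DELTA = 0.05                    # confidence parameter delta


def clip_to_ball(v, radius):
    """Rescale v so its Euclidean norm is at most radius.

    Assumption: the quote elides how the norm bound is guaranteed;
    rescaling is just one simple way to do it.
    """
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    scale = np.minimum(1.0, radius / np.maximum(norm, 1e-12))
    return v * scale


def make_instance(rng, n_actions=K, dim=d):
    """Draw hypothetical true profiles mu (K x d) and a parameter beta (d)."""
    # Assumption: Gaussian draws clipped to the unit ball.
    mu = clip_to_ball(rng.normal(size=(n_actions, dim)), L)
    beta = clip_to_ball(rng.normal(size=dim), 1.0)
    return mu, beta


def sample_profile(rng, mu_i, sigma):
    """Noisy observation x_{i,t} of profile mu_i with norm at most L."""
    # Assumption: isotropic Gaussian noise of scale sigma around mu_i.
    return clip_to_ball(mu_i + sigma * rng.normal(size=mu_i.shape), L)


def sample_reward(rng, mu_i, beta):
    """Reward r_{i,t} ~ N(mu_i^T beta, R^2), as stated in the quote."""
    return rng.normal(loc=mu_i @ beta, scale=R)
```

For example, `mu, beta = make_instance(np.random.default_rng(0))` followed by `sample_profile(rng, mu[0], 0.5)` and `sample_reward(rng, mu[0], beta)` yields one noisy profile observation and one reward for action 0. The 0.01 scaling of ρ_{i,t,δ} belongs to the algorithm itself and is not modelled here.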