Adaptation to the Range in K-Armed Bandits
Authors: Hédi Hadiji, Gilles Stoltz
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide some numerical experiments on synthetic data to illustrate the qualitative behavior of some popular algorithms like UCB strategies when they are incorrectly tuned, as opposed to strategies that are less sensitive to ignoring the range, or to the AHB strategy, which adapts to it. These experiments are only of an illustrative nature. On Figures 2 and 3 we plot the estimates R̂_T(α)/α of the rescaled regret as solid lines. The shaded areas correspond to 2 standard errors of the sequences R̂_T(α, n)/α |
| Researcher Affiliation | Academia | Hédi Hadiji EMAIL Gilles Stoltz EMAIL Université Paris-Saclay, CNRS, Laboratoire de mathématiques d'Orsay, 91405, Orsay, France |
| Pseudocode | Yes | Algorithm 1 AHB: AdaHedge for K-armed Bandits, with extra-exploration |
| Open Source Code | No | The paper does not provide an explicit statement of code release or a link to a code repository. It refers to an "extended version of this article [arXiv:2006.03378]", which is a preprint server, not a code repository. |
| Open Datasets | No | We provide some numerical experiments on synthetic data to illustrate the qualitative behavior of some popular algorithms like UCB strategies when they are incorrectly tuned, as opposed to strategies that are less sensitive to ignoring the range, or to the AHB strategy, which adapts to it. These experiments are only of an illustrative nature. (Only synthetic data is used; no public dataset is released or referenced.) |
| Dataset Splits | No | The paper describes a simulation setup for bandit problems rather than dataset splits like training/validation/test sets. |
| Hardware Specification | No | The paper discusses numerical illustrations but does not provide any specific hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., Python 3.x, PyTorch 1.x, scikit-learn 0.x). |
| Experiment Setup | Yes | The main algorithm of interest is, of course, the AHB strategy with extra-exploration (Algorithm 1), which we tune as stated in Theorem 7 with parameter 1/2. Following Auer et al. (2002a), we used the tuning ε_t = min{1, 5K/(d²t)} with d = 1/12. We consider instances of UCB (Auer et al., 2002a) using indices of the form μ̂_a(t) + s·√(2 ln(T) / N_a(t)), where s ∈ {0.01, 1, 100}. |
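The UCB variants described in the Experiment Setup row can be sketched as follows. This is a minimal illustration of an index policy μ̂_a(t) + s·√(2 ln(T) / N_a(t)) with a scale parameter s, not the authors' code; the Bernoulli environment, function names, and seed are assumptions for the sake of a runnable example.

```python
import math
import random

def ucb_index(mean_est, n_pulls, horizon_T, s):
    """Index of the form mu_hat + s * sqrt(2 ln T / N_a), as in the quoted setup."""
    if n_pulls == 0:
        return float("inf")  # force each arm to be pulled at least once
    return mean_est + s * math.sqrt(2.0 * math.log(horizon_T) / n_pulls)

def run_ucb(arm_means, T, s, rng):
    """Play the scaled-UCB policy on Bernoulli arms; return the total reward."""
    K = len(arm_means)
    counts = [0] * K      # N_a(t): number of pulls of each arm
    means = [0.0] * K     # mu_hat_a(t): empirical mean of each arm
    total = 0.0
    for _ in range(T):
        a = max(range(K), key=lambda i: ucb_index(means[i], counts[i], T, s))
        r = 1.0 if rng.random() < arm_means[a] else 0.0
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]  # incremental mean update
        total += r
    return total

rng = random.Random(0)
reward = run_ucb([0.3, 0.6], T=2000, s=1.0, rng=rng)
print(reward)
```

Varying s over {0.01, 1, 100}, as the paper does, illustrates the sensitivity discussed in the Research Type row: a too-small scale under-explores and a too-large scale over-explores when the reward range is mis-specified.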