reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Finite-time Analysis of Globally Nonstationary Multi-Armed Bandits

Authors: Junpei Komiyama, Edouard Fouché, Junya Honda

JMLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments show that the proposed algorithms outperform the existing approaches in synthetic and real-world environments. Keywords: multi-armed bandits, adaptive windows, nonstationary bandits, changepoint detection, sequential learning. ... 6. Experiments This section reports the empirical performance of our proposed algorithms in simulations. We release the source code to reproduce our experiments on GitHub (footnote 13).
Researcher Affiliation	Academia	Junpei Komiyama: Stern School of Business, New York University, USA. Edouard Fouché: Institute for Program Structures and Data Organization, Karlsruhe Institute of Technology, Germany. Junya Honda: Department of Systems Science, Graduate School of Informatics, Kyoto University, Japan; Center for Advanced Intelligence Project, RIKEN, Japan.
Pseudocode	Yes	Algorithm 1 ADWIN. Require: A univariate stream of values S : (x_1, x_2, ...) ∈ [0, 1], confidence level δ ∈ (0, 1). ... Algorithm 2 Thompson sampling (TS). Require: Set of arms [K]. ... Algorithm 3 Kullback-Leibler UCB (KL-UCB). Require: Set of arms [K]. ... Algorithm 4 ADS-bandit. Require: Set of arms [K], confidence level δ, base-bandit algorithm ... Algorithm 5 ADR-bandit. Require: Set of arms [K], confidence level δ, monitoring parameter N ∈ ℕ, base-bandit algorithm
Open Source Code	Yes	This section reports the empirical performance of our proposed algorithms in simulations. We release the source code to reproduce our experiments on GitHub (footnote 13). ... Footnote 13: https://github.com/edouardfouche/G-NS-MAB
Open Datasets	Yes	Bioliq: This dataset was released by Fouché et al. (2019). ... Zozo (Saito et al., 2021) is a real-world environment where the rewards are generated from an ad recommender on an e-commerce website.
Dataset Splits	No	The paper describes how synthetic and real-world environments generate rewards over time for bandit problems, but it does not specify explicit training/test/validation dataset splits typically used in supervised learning contexts.
Hardware Specification	No	The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models.
Software Dependencies	No	The paper mentions various algorithms and methods like ADWIN2, TS, KL-UCB, RExp3, etc., but it does not list any specific software libraries or programming languages with their version numbers.
Experiment Setup	Yes	Algorithmic hyperparameters are described in Section A. ... Table 2: List of tested hyperparameters of the algorithms. Bold letters indicate the ones reported in this paper. We set the confidence level of UCBL-CPD and ImpCPD to be the same as ADR-TS. Algorithm(s): Hyperparameters. ADS-TS, ADR-TS, and ADR-KL-UCB: δ = 10^-1, 10^-2, 10^-3, 10^-4, 10^-5, 10^-6, 10^-8, 10^-12, 10^-15. RExp3: T = 100, 500, 1000, 5000. SW-TS: W = 100, 500, 1000, 5000.
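
For readers who want to rerun the sweep, the hyperparameter values quoted in the Experiment Setup row can be written as a small Python grid. This is a minimal sketch, not code from the authors' repository: the names DELTA_GRID, HYPERPARAMETER_GRID, and enumerate_configs are hypothetical, the exponents are read from the flattened "10-1" notation as powers of ten, and the RExp3 parameter name "T" is copied as printed in the excerpt. The excerpt does not preserve which values were bold (reported), so the grid lists all tested values.

```python
from itertools import product

# Confidence levels transcribed from the Table 2 excerpt (assumed to be powers of ten).
DELTA_GRID = [10.0 ** -k for k in (1, 2, 3, 4, 5, 6, 8, 12, 15)]

# Hypothetical layout: algorithm name -> {parameter name: tested values}.
HYPERPARAMETER_GRID = {
    "ADS-TS": {"delta": DELTA_GRID},
    "ADR-TS": {"delta": DELTA_GRID},
    "ADR-KL-UCB": {"delta": DELTA_GRID},
    "RExp3": {"T": [100, 500, 1000, 5000]},   # parameter name as printed in the excerpt
    "SW-TS": {"W": [100, 500, 1000, 5000]},   # sliding-window size
}


def enumerate_configs(grid):
    """Yield (algorithm, params) pairs, one per tested setting."""
    for algorithm, params in grid.items():
        names = list(params)
        for values in product(*(params[name] for name in names)):
            yield algorithm, dict(zip(names, values))


configs = list(enumerate_configs(HYPERPARAMETER_GRID))
print(len(configs), "settings in total")
```

Each yielded pair could be passed to whatever experiment runner is in use; the released code at the GitHub URL above remains the authoritative reference for how the authors configured their runs.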