Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Finite-time Analysis of Globally Nonstationary Multi-Armed Bandits
Authors: Junpei Komiyama, Edouard Fouché, Junya Honda
JMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the proposed algorithms outperform the existing approaches in synthetic and real-world environments. Keywords: multi-armed bandits, adaptive windows, nonstationary bandits, changepoint detection, sequential learning. ... 6. Experiments: This section reports the empirical performance of our proposed algorithms in simulations. We release the source code to reproduce our experiments on GitHub. |
| Researcher Affiliation | Academia | Junpei Komiyama, EMAIL, Stern School of Business, New York University, USA; Edouard Fouché, EMAIL, Institute for Program Structures and Data Organization, Karlsruhe Institute of Technology, Germany; Junya Honda, EMAIL, Department of Systems Science, Graduate School of Informatics, Kyoto University, Japan, and Center for Advanced Intelligence Project, RIKEN, Japan |
| Pseudocode | Yes | Algorithm 1: ADWIN. Require: A univariate stream of values S : (x₁, x₂, ...) ∈ [0, 1], confidence level δ ∈ (0, 1). ... Algorithm 2: Thompson sampling (TS). Require: Set of arms [K]. ... Algorithm 3: Kullback-Leibler UCB (KL-UCB). Require: Set of arms [K]. ... Algorithm 4: ADS-bandit. Require: Set of arms [K], confidence level δ, base-bandit algorithm ... Algorithm 5: ADR-bandit. Require: Set of arms [K], confidence level δ, monitoring parameter N ∈ ℕ, base-bandit algorithm |
| Open Source Code | Yes | This section reports the empirical performance of our proposed algorithms in simulations. We release the source code to reproduce our experiments on GitHub (footnote 13). ... Footnote 13: https://github.com/edouardfouche/G-NS-MAB |
| Open Datasets | Yes | Bioliq: This dataset was released by Fouché et al. (2019). ... Zozo (Saito et al., 2021) is a real-world environment where the rewards are generated from an ad recommender on an e-commerce website. |
| Dataset Splits | No | The paper describes how synthetic and real-world environments generate rewards over time for bandit problems, but it does not specify explicit training/test/validation dataset splits typically used in supervised learning contexts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models. |
| Software Dependencies | No | The paper mentions various algorithms and methods like ADWIN2, TS, KL-UCB, RExp3, etc., but it does not list any specific software libraries or programming languages with their version numbers. |
| Experiment Setup | Yes | Algorithmic hyperparameters are described in Section A. ... Table 2: List of tested hyperparameters of the algorithms. Bold letters indicate the ones reported in this paper. We set the confidence level of UCBL-CPD and ImpCPD to be the same as ADR-TS. Algorithm(s): Hyperparameters. ADS-TS, ADR-TS, and ADR-KL-UCB: δ = 10⁻¹, 10⁻², 10⁻³, 10⁻⁴, 10⁻⁵, 10⁻⁶, 10⁻⁸, 10⁻¹², 10⁻¹⁵; RExp3: T = 100, 500, 1000, 5000; SW-TS: W = 100, 500, 1000, 5000 |
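
For readers unfamiliar with the base-bandit algorithms named in the Pseudocode row, the following is a minimal sketch of textbook Beta-Bernoulli Thompson sampling on a stationary environment. It is not taken from the paper or its repository: the function name `thompson_sampling`, its parameters, and the uniform Beta(1, 1) prior are assumptions made for illustration, and the sketch omits the adaptive-window resetting that ADS-bandit and ADR-bandit add on top of a base algorithm.

```python
import random


def thompson_sampling(arm_means, horizon, seed=0):
    """Illustrative Beta-Bernoulli Thompson sampling (hypothetical helper).

    arm_means: true Bernoulli reward probabilities, one per arm (assumed known
               only to the simulator, not to the algorithm).
    horizon:   number of rounds to play.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(arm_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k

    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (Beta(1, 1) prior assumed).
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(k), key=samples.__getitem__)
        # Simulate a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < arm_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    # Two arms with clearly separated means; the better arm should dominate.
    print(thompson_sampling([0.1, 0.9], horizon=2000))
```

In a nonstationary setting such as the one the paper studies, the posterior counts above would go stale after a change in the arm means, which is the motivation for combining a base algorithm with a change detector such as ADWIN.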