Adversarial Bandits Against Arbitrary Strategies
Authors: Jung-hun Kim, Se-Young Yun
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we study adversarial bandit problems against arbitrarily switching arms. Crucially, we allow the number of switches $S$ to be unknown to the agent, thereby targeting arbitrary strategies without prior knowledge of their complexity. To address this setting, we adopt the master-base framework combined with the online mirror descent (OMD) method... We begin by analyzing a master-base algorithm that employs OMD with a negative entropy regularizer, and show that it achieves a regret bound of $O(S^{1/2}K^{1/3}T^{2/3})$... This refinement leads to an improved regret bound of $O(\min\{\sqrt{SKT\rho}, S\sqrt{KT}\})$ with respect to $T$, where $\rho$ captures the variance associated with a comparator strategy. |
| Researcher Affiliation | Academia | Jung-hun Kim (CREST, ENSAE, IP Paris; Fair Play joint team, France); Se-Young Yun (KAIST AI, South Korea) |
| Pseudocode | Yes | Algorithm 1: Master-base OMD... Algorithm 2: Master-base OMD with adaptive learning rates |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It only discusses implementation details in Remark 3.6. |
| Open Datasets | No | This is a theoretical paper. No specific datasets are used or referenced for empirical evaluation. |
| Dataset Splits | No | This is a theoretical paper and does not involve empirical evaluation on datasets, thus no dataset splits are provided. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup that would require hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not provide specific software dependencies with version numbers for experimental reproducibility. Remark 3.6 mentions general implementation aspects without specific versioned components used in experiments. |
| Experiment Setup | No | The paper is theoretical and does not present experimental results, therefore no experimental setup details or hyperparameters are provided. |
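
Since the paper ships no code, the following is a minimal sketch of the building block named in the Research Type row: a single OMD step with a negative entropy regularizer over the probability simplex, which has the well-known closed form of a multiplicative-weights update. It is not the paper's Algorithm 1 or Algorithm 2: there is no master-base structure and no adaptive learning rate here. The function names, the fixed learning rate `lr`, and the importance-weighted loss estimator are assumptions made for illustration.

```python
import math
import random


def omd_negentropy_step(probs, arm, loss, lr):
    """One OMD step with a negative entropy regularizer (illustrative only).

    probs: current distribution over arms
    arm:   index of the arm that was pulled
    loss:  observed loss of that arm, assumed to lie in [0, 1]
    lr:    fixed learning rate (an assumption; the paper also studies adaptive rates)
    """
    # Importance-weighted estimate: only the pulled arm gets a nonzero loss.
    est = [0.0] * len(probs)
    est[arm] = loss / probs[arm]
    # Closed form of the mirror step: multiplicative update, then normalize.
    weights = [p * math.exp(-lr * e) for p, e in zip(probs, est)]
    total = sum(weights)
    return [w / total for w in weights]


def run(losses, lr, seed=0):
    """Play the update above on a fixed loss table (rows = rounds, columns = arms)."""
    rng = random.Random(seed)
    k = len(losses[0])
    probs = [1.0 / k] * k
    for row in losses:
        arm = rng.choices(range(k), weights=probs)[0]
        probs = omd_negentropy_step(probs, arm, row[arm], lr)
    return probs


if __name__ == "__main__":
    # Hypothetical two-arm instance: arm 0 always incurs loss 1, arm 1 loss 0.
    table = [[1.0, 0.0] for _ in range(200)]
    print(run(table, lr=0.1))
```

The division by `probs[arm]` in the estimator is where the variance of the loss estimates enters, which is the quantity the improved bound is stated in terms of.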