Fixing the Loose Brake: Exponential-Tailed Stopping Time in Best Arm Identification
Authors: Kapilan Balagopalan, Ngo Tuan Nguyen, Yao Zhao, Kwang-Sung Jun
ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | D. Empirical studies In this section we present the study of stopping time distribution of Successive Elimination (SE) (Even-Dar et al., 2006) algorithm. Experimental setup: We implemented SE in two different configurations... In Figure 5, 6 we have plotted the histograms of observed stopping times for all the trials. |
| Researcher Affiliation | Academia | Kapilan Balagopalan * 1 Tuan Ngo Nguyen * 1 Yao Zhao * 1 Kwang-Sung Jun 1... 1University of Arizona. Correspondence to: Kwang-Sung Jun <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 FC-DSH Input: A set of K arms, δ... Algorithm 2 Brake Booster Input: base trial count L1, base budget T1, algorithm A, base failure rate δ0... Algorithm 3 Budgeted Identification Input: algorithm A, the number of trials L, sampling budget per trial T, base failure rate δ0 |
| Open Source Code | No | No explicit statement or link for open-source code for the methodology described in this paper is found. |
| Open Datasets | No | Experimental setup: We implemented SE in two different configurations, 1) Original version 2) Version with ε-slack added to the stopping condition. We set the number of arms 3, with mean rewards {1.0, 0.9, 0.9}. Noise follows N(0, 1). |
| Dataset Splits | No | Experimental setup: We implemented SE in two different configurations... We set the number of arms 3, with mean rewards {1.0, 0.9, 0.9}. Noise follows N(0, 1). We conduct 1000 trials and observe the stopping times of those trials. This describes simulation parameters and number of runs, not dataset splits. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory) are provided for running the experiments. |
| Software Dependencies | No | No specific software dependencies with version numbers are mentioned. The paper only states 'We implemented SE'. |
| Experiment Setup | Yes | Experimental setup: We implemented SE in two different configurations... We set the number of arms 3, with mean rewards {1.0, 0.9, 0.9}. Noise follows N(0, 1). We set δ = 0.01. We conduct 1000 trials and observe the stopping times of those trials. We forcefully terminated the trials that do not stop until 30,000 time steps or 1,000,000 time steps. ... We consider a bandit problem with four arms, each associated with mean rewards of {1.0, 0.6, 0.6, 0.6}. The reward noise is drawn from a normal distribution N(0, 1). The confidence parameter is set to δ = 0.01. We perform 1000 independent trials and record the stopping times for each. Trials that do not terminate within 1,000,000 time steps are forcefully stopped at that point. ... In these experiments, we employ a 1.2 growth factor for both the per-trial budget and the number of trials, in contrast to the conventional doubling scheme. ... For the implementation of FC-DSH algorithm, we deviate from the theoretical version presented in the paper by reusing samples across rounds. ... we generalize the growth schedule by introducing a scaling parameter b, such that Tm = bm 1T1, for phase m 2. While our analysis focuses on the case b = 2, we speculate that the exponential guarantee may hold for any value of b. In our experiments, we set b = 1.01 to allow for finer budget increment. Additionally, we adopt the stopping rule proposed by (Jourdan et al., 2022) across all algorithms to ensure fair comparison. |