The Survival Bandit Problem

Authors: Charles Riou, Junya Honda, Masashi Sugiyama

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we evaluate the empirical performance of the algorithms EXPLOIT-UCB and EXPLOIT-UCB-DOUBLE (for various parameters n) that we introduced in this paper. The survival regret of the algorithms considered is given in Figure 4. All the curves are averages over 200 simulations. The corresponding average survival time min(T, τ(B, π)) of the algorithms is given in Table 2. The average proportion of ruins of the algorithms is given in Table 3.
Researcher Affiliation | Academia | Charles Riou (The University of Tokyo & RIKEN Center for AIP, Tokyo, Japan); Junya Honda (Kyoto University & RIKEN Center for AIP, Kyoto, Japan); Masashi Sugiyama (RIKEN Center for AIP & The University of Tokyo, Tokyo, Japan)
Pseudocode | Yes | Algorithm 1 (EXPLOIT-UCB(B)) and Algorithm 2 (EXPLOIT-UCB-DOUBLE) are provided, detailing the procedures of the proposed algorithms.
Open Source Code | No | The paper contains no explicit statement about releasing open-source code for the described methodology, and it includes no link to a code repository.
Open Datasets | No | The paper defines specific multinomial arm distributions (F(1) to F(12)) for its simulated bandit problem rather than using or providing a publicly available dataset. It mentions no existing benchmark datasets and gives no links or citations for data access.
Dataset Splits | No | The paper describes a multi-armed bandit problem with custom-defined arm distributions for simulations. This setup does not involve the training, validation, or test splits typical of supervised learning tasks.
Hardware Specification | No | The paper does not specify the hardware used to run the experiments, such as GPU or CPU models, memory, or details of a computing cluster.
Software Dependencies | No | The paper does not name any software dependencies or libraries with version numbers (e.g., Python, PyTorch, specific solvers) that would be required to reproduce the experiments.
Experiment Setup | Yes | The paper details the experimental setting in Section 8.1, including the horizon T = 20000, K = 3 multinomial arms, and the specific probability distributions F(1) through F(12) used. It also specifies the parameters n ∈ {1, log T, 100} for EXPLOIT-UCB-DOUBLE.
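
Since no code is released, the evaluation protocol summarized in the table (average survival time min(T, τ(B, π)) and proportion of ruins over repeated simulations) can be illustrated with a minimal Python sketch. Several things here are assumptions and not taken from the paper: the arm distributions and the budget are hypothetical placeholders (F(1) to F(12) are not reproduced in this summary), rewards are assumed to take values in {-1, 0, +1}, ruin is assumed to occur when the budget first reaches 0 or below, and the policy is standard UCB1 used as a stand-in, not EXPLOIT-UCB or EXPLOIT-UCB-DOUBLE.

```python
import math
import random


def ucb1_choose(counts, sums, t):
    """Standard UCB1 index policy (a stand-in, NOT the paper's EXPLOIT-UCB)."""
    for k, c in enumerate(counts):
        if c == 0:
            return k  # pull every arm once before using the index
    return max(
        range(len(counts)),
        key=lambda k: sums[k] / counts[k]
        + math.sqrt(2.0 * math.log(t) / counts[k]),
    )


def run_once(arms, budget, horizon, rng):
    """Play one episode and return (survival_time, ruined).

    arms: list of (p_minus, p_zero, p_plus), the probabilities of the
    rewards -1, 0, +1 for each arm (hypothetical multinomial arms).
    Assumption: ruin happens when the budget first drops to 0 or below.
    """
    counts = [0] * len(arms)
    sums = [0.0] * len(arms)
    for t in range(1, horizon + 1):
        k = ucb1_choose(counts, sums, t)
        reward = rng.choices((-1, 0, 1), weights=arms[k])[0]
        counts[k] += 1
        sums[k] += reward
        budget += reward
        if budget <= 0:
            return t, True  # ruined at round t
    return horizon, False  # survived the whole horizon


def evaluate(arms, budget, horizon, n_sims, seed=0):
    """Average survival time and ruin proportion over n_sims episodes."""
    rng = random.Random(seed)
    results = [run_once(arms, budget, horizon, rng) for _ in range(n_sims)]
    avg_survival = sum(t for t, _ in results) / n_sims
    ruin_rate = sum(ruined for _, ruined in results) / n_sims
    return avg_survival, ruin_rate


if __name__ == "__main__":
    # Hypothetical K = 3 arms; the paper's actual distributions differ.
    demo_arms = [(0.4, 0.1, 0.5), (0.3, 0.4, 0.3), (0.45, 0.0, 0.55)]
    # Reduced horizon and simulation count to keep the demo fast;
    # the paper reports T = 20000 and 200 simulations.
    avg_t, ruins = evaluate(demo_arms, budget=10, horizon=2000, n_sims=20)
    print(f"average survival time: {avg_t:.1f}, ruin proportion: {ruins:.2f}")
```

Calling `evaluate` with `horizon=20000` and `n_sims=200` would match the scale reported in the paper, at the cost of a longer run time in pure Python. Replacing `ucb1_choose` with an implementation of Algorithm 1 or Algorithm 2 is what an actual reproduction would require.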