Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Finite Continuum-Armed Bandits

Authors: Solenne Gaucher

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | we propose an optimal strategy for this problem. Under natural assumptions on the reward function, we prove that the optimal regret scales as O(T^{1/3}) up to poly-logarithmic factors when the budget T is proportional to the number of actions N. When T becomes small compared to N, a smooth transition occurs. When the ratio T/N decreases from a constant to N^{-1/3}, the regret increases progressively up to the O(T^{1/2}) rate encountered in continuum-armed bandits.
Researcher Affiliation | Academia | Solenne Gaucher, Laboratoire de Mathématiques d'Orsay, Université Paris-Saclay, 91405, Orsay, France, EMAIL
Pseudocode | Yes | Algorithm 1 Upper Confidence Bound for Finite continuum-armed bandits (UCBF)
Open Source Code | No | Not found. The paper is theoretical and describes an algorithm but does not mention providing access to its source code.
Open Datasets | No | Not found. The paper is theoretical and does not involve the use of datasets for training or evaluation.
Dataset Splits | No | Not found. The paper is theoretical and does not involve experimental validation on datasets, thus no dataset splits are mentioned.
Hardware Specification | No | Not found. The paper is purely theoretical and does not describe any computational experiments that would require hardware specifications.
Software Dependencies | No | Not found. The paper is purely theoretical and does not describe any computational experiments that would require software dependencies with version numbers.
Experiment Setup | No | Not found. The paper is purely theoretical and does not describe any experimental setup or hyperparameters.