Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Trading Off Resource Budgets For Improved Regret Bounds
Authors: Thomas Orton, Damon Falck
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results: We benchmark both FPML and OGhybrid on an online black-box hyperparameter optimization problem based on the 2020 NeurIPS BBO challenge [Turner et al., 2021]. We find that both these new algorithms outperform OG for various compute budgets. |
| Researcher Affiliation | Academia | Damon Falck, University of Oxford; Thomas Orton, University of Oxford. |
| Pseudocode | Yes | Algorithm 1 FPML(B, ε). Require: N ≥ B ≥ 1, ε > 0. Initialize the cumulative cost C_0(a) ← 0 for each arm a ∈ A. For round t = 1, ..., T do: 1. For each arm a ∈ A, draw a noise perturbation p_t(a) ∼ (1/ε) Exp. 2. Calculate the perturbed cumulative costs for round t−1: Ĉ_{t−1}(a) ← C_{t−1}(a) − p_t(a). 3. Pull the B arms with the lowest perturbed cumulative costs according to Ĉ_{t−1}, breaking ties arbitrarily. 4. Update the cumulative costs for each arm: C_t(a) ← C_{t−1}(a) + c_t(a). End for. (See the executable sketch after this table.) |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | The optimization problem at each round t ∈ [T] is to choose the hyperparameters of either a multi-layer perceptron (MLP) or a lasso classifier for one of 184 classification tasks from the Penn Machine Learning Benchmark [Olson et al., 2017] (so T = 368). |
| Dataset Splits | No | No explicit train/validation/test dataset splits are provided for the tasks from the Penn Machine Learning Benchmark used in the experiments. The paper mentions running each algorithm setting 100 times, but this relates to statistical evaluation rather than data partitioning for model training/validation. |
| Hardware Specification | No | The paper mentions 'B available CPU cores' in the context of an application scenario but does not provide specific hardware details like GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'geometric sampling' and the 'Bayesmark package', but it does not specify version numbers for any software components or libraries. |
| Experiment Setup | No | The paper states that the ε parameter for the bandit subroutines FPML-partial and Exp3 was set to its theoretically optimal value given B, N, and T without any fine-tuning. However, it does not provide the specific numerical values of these parameters or other detailed hyperparameters for the experimental setup. |
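For readers who want to see the quoted FPML pseudocode in executable form, the following is a minimal NumPy sketch of one reading of Algorithm 1. The function name `fpml`, the full-information cost-matrix interface, the use of an Exp(1) draw scaled by 1/ε, and the sign of the perturbation are assumptions layered on top of the quoted (PDF-extracted) pseudocode, not the authors' reference implementation.

```python
import numpy as np

def fpml(costs, B, eps, rng=None):
    """Sketch of FPML(B, eps) under the assumptions stated above.

    costs : (T, N) array of per-round costs c_t(a) for N arms over T rounds.
    B     : number of arms pulled per round (1 <= B <= N).
    eps   : perturbation scale; the pseudocode requires eps > 0.
    Returns a list of the B arm indices pulled at each round.
    """
    rng = np.random.default_rng() if rng is None else rng
    T, N = costs.shape
    C = np.zeros(N)              # cumulative costs C_{t-1}(a)
    pulls = []
    for t in range(T):
        # 1. Draw a noise perturbation p_t(a) ~ (1/eps) * Exp(1) for each arm.
        p = rng.exponential(scale=1.0 / eps, size=N)
        # 2. Perturb the cumulative costs of round t-1 (sign assumed: subtract).
        C_tilde = C - p
        # 3. Pull the B arms with the lowest perturbed cumulative costs.
        chosen = np.argsort(C_tilde)[:B]
        pulls.append(chosen)
        # 4. Update cumulative costs with this round's observed costs.
        C += costs[t]
    return pulls
```

As a usage sketch, `fpml(np.random.rand(100, 20), B=4, eps=0.5)` would simulate 100 rounds over 20 arms with 4 pulls per round; the ε value here is arbitrary, since the paper's theoretically optimal setting is not reproduced in this report.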