SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization

Authors: Marius Lindauer, Katharina Eggensperger, Matthias Feurer, André Biedenkapp, Difan Deng, Carolin Benjamins, Tim Ruhkopf, René Sass, Frank Hutter

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To provide an impression on the sequential performance of SMAC3, we compared it against random search (Bergstra and Bengio, 2012), although we consider it a weak baseline (Turner et al., 2020), Hyperband (Li et al., 2018), Dragonfly (Kandasamy et al., 2020) and BOHB (Falkner et al., 2018) on the surrogate benchmark for HPO on DNNs on the letter dataset (Falkner et al., 2018), a joint HPO+NAS benchmark (Klein and Hutter, 2019) on the Naval Propulsion dataset and a pure NAS benchmark (Zela et al., 2020). As shown in Figure 2, SMAC3's multi-fidelity approach (see Sec. 2.3) performs as well as Hyperband in the beginning, performs best in the middle, until SMAC3's pure BO with RFs catches up in the end. For the whole time, SMAC3 consistently outperforms Dragonfly and in the later phases, also BOHB."
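The multi-fidelity methods compared here (Hyperband, and SMAC3's multi-fidelity mode) build on successive halving: evaluate many configurations at a small budget and promote only the best fraction to an eta-times larger budget. A minimal, library-independent sketch of that idea — the toy objective, the eta=3 schedule, and the candidate count are illustrative choices, not the paper's actual experimental setup:

```python
import random

def successive_halving(objective, configs, min_budget=1, eta=3, rounds=3):
    """Repeatedly keep the best 1/eta of configurations at an eta-times larger budget."""
    budget = min_budget
    survivors = list(configs)
    for _ in range(rounds):
        # Evaluate every surviving configuration at the current budget.
        scored = sorted(survivors, key=lambda c: objective(c, budget))  # lower loss is better
        # Keep the top 1/eta, then grow the budget by a factor of eta.
        keep = max(1, len(survivors) // eta)
        survivors = scored[:keep]
        budget *= eta
    return survivors[0]

# Toy objective: loss approaches |x - 0.6| as the budget grows.
def toy_loss(x, budget):
    return abs(x - 0.6) + 1.0 / budget

random.seed(0)
candidates = [random.random() for _ in range(27)]  # 27 -> 9 -> 3 -> 1 with eta=3
best = successive_halving(toy_loss, candidates)
print(best)  # the surviving configuration, closest sampled value to 0.6
```

Hyperband additionally runs several such brackets with different trade-offs between the number of initial configurations and the starting budget; SMAC3 replaces the random sampling of new configurations with model-based proposals.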
Researcher Affiliation | Collaboration | "Marius Lindauer (1) ... Frank Hutter (2,3); (1) Leibniz University Hannover, (2) University of Freiburg, (3) Bosch Center for Artificial Intelligence." The authors are affiliated with Leibniz University Hannover and the University of Freiburg (academic), and the Bosch Center for Artificial Intelligence (industry).
Pseudocode | No | The paper describes algorithms and methods but does not include any clearly labeled pseudocode blocks or algorithms formatted like code.
Open Source Code | Yes | "The SMAC3 package is available under a permissive BSD license at https://github.com/automl/SMAC3."
Open Datasets | Yes | "We compared it against ... on the surrogate benchmark for HPO on DNNs on the letter dataset (Falkner et al., 2018), a joint HPO+NAS benchmark (Klein and Hutter, 2019) on the Naval Propulsion dataset and a pure NAS benchmark (Zela et al., 2020) [1]. As shown in Figure 2 ... [1] See https://github.com/automl/HPOBench/ for details regarding the experimental setup."
Dataset Splits | No | The paper mentions using specific benchmark datasets (the letter dataset, the Naval Propulsion dataset) and refers to HPOBench for experimental setup details. However, it does not explicitly provide the training/test/validation split percentages, sample counts, or specific splitting methodologies within the provided text.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instances) used for running the experiments.
Software Dependencies | No | The paper mentions DASK (Rocklin, 2015), Gaussian processes, and random forests, but it does not specify version numbers for any software libraries or dependencies used in the implementation or experiments.
Experiment Setup | No | The paper describes different pre-set configurations for SMAC3's internal components (e.g., a "Sobol sequence as initial design, a GP with Matérn 5/2 kernel and EI as acquisition function" for SMAC4BB, or a "random forest as surrogate model and log EI as acquisition function" for SMAC4HPO). However, it does not provide specific hyperparameters or system-level training settings for the machine learning models being optimized or for the models used in the empirical comparisons (e.g., learning rates, batch sizes, number of epochs for DNNs).
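The expected improvement (EI) acquisition function named in this row has a standard closed form under a Gaussian posterior, and the "log EI" variant is typically EI applied after log-transforming the objective. A hedged numpy/scipy sketch of the generic textbook formula for minimization — this is not SMAC3's internal implementation, and the `xi` exploration offset is an illustrative extra parameter:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best, xi=0.0):
    """Closed-form EI for minimization under a Gaussian posterior N(mu, sigma^2).

    EI(x) = (f_best - mu - xi) * Phi(z) + sigma * phi(z),
    with z = (f_best - mu - xi) / sigma, and EI = max(f_best - mu - xi, 0) when sigma = 0.
    """
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    improve = f_best - mu - xi
    z = np.where(sigma > 0, improve / np.maximum(sigma, 1e-12), 0.0)
    ei = improve * norm.cdf(z) + sigma * norm.pdf(z)
    # Zero-variance points reduce to the deterministic improvement.
    return np.where(sigma > 0, np.maximum(ei, 0.0), np.maximum(improve, 0.0))

# A point predicted well below the incumbent f_best scores much higher
# than a point predicted above it, even at identical uncertainty.
ei_good = expected_improvement(mu=0.2, sigma=0.1, f_best=0.5)
ei_bad = expected_improvement(mu=0.8, sigma=0.1, f_best=0.5)
print(ei_good, ei_bad)
```

The surrogate model (GP or random forest) supplies `mu` and `sigma` per candidate configuration; the optimizer then proposes the configuration maximizing EI.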