ADMMBO: Bayesian Optimization with Unknown Constraints using ADMM

Authors: Setareh Ariafar, Jaume Coll-Font, Dana Brooks, Jennifer Dy

JMLR 2019

Reproducibility variables, results, and supporting LLM responses:
Research Type: Experimental
LLM Response: "Extensive experiments on a number of challenging BO benchmark problems show that our proposed approach outperforms the state-of-the-art methods in terms of the speed of obtaining a feasible solution and convergence to the global optimum, as well as minimizing the total number of evaluations of the unknown objective and constraint functions."
Researcher Affiliation: Academia
LLM Response: "Setareh Ariafar, Electrical and Computer Engineering Department, Northeastern University, Boston, MA 02115, USA; Jaume Coll-Font, Computational Radiology Laboratory, Boston Children's Hospital, Boston, MA 02115, USA; Dana Brooks, Electrical and Computer Engineering Department, Northeastern University, Boston, MA 02115, USA; Jennifer Dy, Electrical and Computer Engineering Department, Northeastern University, Boston, MA 02115, USA"
Pseudocode: Yes
LLM Response: "Algorithm 3.1 (ADMMBO), Algorithm 3.2 (OPT), Algorithm 3.3 (FEAS)"
Open Source Code: Yes
LLM Response: "Please see our open-source code available at https://github.com/SetarehAr/ADMMBO for more details on each experiment."
Open Datasets: Yes
LLM Response: "We compare ADMMBO with four state-of-the-art constrained Bayesian optimization methods: EIC (Gelbart et al., 2014; Gardner et al., 2014), ALBO (Gramacy et al., 2016), Slack-AL (Picheny et al., 2016), and PESC (Hernández-Lobato et al., 2015). In our last experiment, we tune the hyperparameters of a three-hidden-layer fully connected neural network for a multiclass classification task using the MNIST dataset (LeCun, 1998; Hernández-Lobato et al., 2015)."
Dataset Splits: No
LLM Response: The paper mentions using the MNIST dataset and minimizing validation error, but it does not specify explicit training/validation/test split percentages or sample counts.
Hardware Specification: Yes
LLM Response: "We consider the optimization problem of finding a set of hyperparameters that minimize the validation error subject to the prediction time being smaller than or equal to 0.045 seconds on an NVIDIA Tesla K80 GPU."
Software Dependencies: No
LLM Response: "We build our network using Keras with a TensorFlow backend (Chollet et al., 2015; Abadi et al., 2016)." While Keras and TensorFlow are mentioned as software used, no version numbers for these components are provided.
Experiment Setup: Yes
LLM Response: "In all the synthetic problems discussed below, similar to (Hernández-Lobato et al., 2015; Picheny et al., 2016; Gramacy et al., 2016), we assume that f and c_i follow independent GP priors with zero mean and squared exponential kernels. For the problem of hyperparameter tuning in neural networks on the MNIST dataset, we assume that f and c_i follow independent GP priors with zero mean and Matérn 5/2 kernels (Hernández-Lobato et al., 2015). For ADMMBO, in all the experiments we set M ∈ {20, 50}, ρ = 0.1, ε = 0.01, δ = 0.05, and initialize y_i^1 and z_i^1 with the bounds of B. Further, in all the experiments, we set the total BO iteration budget to 100(N + 1), where N is the number of constraints of the optimization. We empirically observed that ADMMBO performed best when we assigned a higher BO budget to the first iteration of the algorithm. Thus, we set α^1 = β_i^1 ∈ {10, 20, 50} for the first iteration and α^k = β_i^k ∈ {2, 5} for the rest. Considering the total BO budget and the budgets for the optimality and feasibility subproblems, we set K accordingly. We initialize the datasets F and C_i with n = m_i = 2 points. We set μ = 10 and τ^incr = τ^decr = 2, similar to (Boyd et al., 2011; Hong and Luo, 2017). We consider the optimization problem of finding a set of hyperparameters that minimize the validation error subject to the prediction time being smaller than or equal to 0.045 seconds on an NVIDIA Tesla K80 GPU. Here, we focus on eleven hyperparameters: the learning rate, the decay rate, the momentum parameter, two dropout probabilities (for the input layer and the hidden layers), two regularization parameters (for the weight decay and the maximum weight value), the number of hidden units in each of the three hidden layers, and the choice of activation function (ReLU or sigmoid)."
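The quoted setup can be sketched in plain Python. The two kernel functions below follow the standard squared exponential and Matérn 5/2 definitions that the paper says it uses as GP prior covariances, and the budget rule implements the stated 100(N + 1) total. Function names, the unit lengthscale default, and the settings dictionary layout are illustrative assumptions, not taken from the authors' code.

```python
import math

def squared_exponential(x, y, lengthscale=1.0):
    # SE kernel used as the GP prior covariance in the synthetic problems:
    # k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2))
    r2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-r2 / (2.0 * lengthscale ** 2))

def matern52(x, y, lengthscale=1.0):
    # Matern 5/2 kernel used for the MNIST hyperparameter-tuning problem:
    # k(r) = (1 + s + s^2/3) * exp(-s), with s = sqrt(5) * r / lengthscale
    r = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    s = math.sqrt(5.0) * r / lengthscale
    return (1.0 + s + s ** 2 / 3.0) * math.exp(-s)

def total_bo_budget(num_constraints):
    # Total BO iteration budget quoted in the paper: 100 * (N + 1)
    # for an optimization problem with N unknown constraints.
    return 100 * (num_constraints + 1)

# ADMMBO settings quoted in the paper; M and the per-iteration BO budgets
# were chosen per experiment from the listed sets.
ADMMBO_SETTINGS = {
    "M": (20, 50),
    "rho": 0.1,
    "epsilon": 0.01,
    "delta": 0.05,
    "first_iter_budget": (10, 20, 50),  # alpha^1 = beta_i^1
    "later_iter_budget": (2, 5),        # alpha^k = beta_i^k, k > 1
    "mu": 10,
    "tau_incr": 2,
    "tau_decr": 2,
    "init_points": 2,                   # n = m_i = 2
}
```

Both kernels equal 1 at zero distance and decay with separation, so either can serve as a stationary GP prior covariance; the budget helper makes the 100(N + 1) rule explicit (e.g. a single-constraint problem gets 200 total BO iterations).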