Multiobjective Tree-Structured Parzen Estimator

Authors: Yoshihiko Ozaki, Yuki Tanigaki, Shuhei Watanabe, Masahiro Nomura, Masaki Onishi

JAIR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that MOTPE approximates the Pareto fronts of a variety of benchmark problems and a convolutional neural network design problem better than existing methods through the numerical results. We also investigate how the configuration of MOTPE affects the behavior and the performance of the method and the effectiveness of asynchronous parallelization of the method based on the empirical results."
Researcher Affiliation | Collaboration | Yoshihiko Ozaki (EMAIL): Artificial Intelligence Research Center, AIST, Tokyo, Japan, and GREE, Inc., Tokyo, Japan; Yuki Tanigaki (EMAIL): Artificial Intelligence Research Center, AIST, Tokyo, Japan; Shuhei Watanabe (EMAIL): University of Freiburg, Freiburg, Germany; Masahiro Nomura (EMAIL): CyberAgent, Inc., Tokyo, Japan; Masaki Onishi (EMAIL): Artificial Intelligence Research Center, AIST, Tokyo, Japan
Pseudocode | Yes | Algorithm 1: Tree-structured Parzen Estimator; Algorithm 2: Split Observations; Algorithm 3: Greedy Hypervolume Subset Selection; Algorithm 4: Multiobjective Tree-structured Parzen Estimator; Algorithm 5: Asynchronous Parallel MOTPE; Algorithm 6: Worker for Asynchronous Parallel MOTPE
Open Source Code | Yes | "The code is available at https://doi.org/10.5281/zenodo.6258358. The singularity container image which we used to run our code and the experimental data of Section 5 are available upon request."
Open Datasets | Yes | "The WFG benchmark suite (Huband, Barone, While, & Hingston, 2005; Huband, Hingston, Barone, & While, 2006) that consists of nine problems was used to analyze the fundamental performance of MOTPE. ... the classification error rate for the CIFAR-10 dataset (Krizhevsky, 2009)"
Dataset Splits | Yes | "The number of initial observations ni was set to 11n - 1 (i.e., 32 for n = 3 and 98 for n = 9), and the initial solutions were sampled using the Latin hypercube sampling (McKay et al., 1979). ... The error rate was measured on a set of 10,000 images extracted from the training set. The rest of the training data, 40,000 images, were used for training."
Hardware Specification | No | "Computational resource of AI Bridging Cloud Infrastructure (ABCI) provided by National Institute of Advanced Industrial Science and Technology (AIST) was used. ... although np = 4 is relatively small, this is a realistic setting as it is not easy to prepare dozens of graphics processing units."
Software Dependencies | Yes | "We implemented MOTPE by modifying the TPE implementation of Optuna (version 2.0.0) (Akiba et al., 2019) whereas we used Spearmint (Snoek et al., 2012) for ParEGO, SMS-EGO and PESMO, and HyperMapper 2.0 (version 2.2.3) for Bayesian optimization with a random forests-based surrogate because Optuna does not provide these algorithms. ... The CNNs were implemented in the tf.keras (TensorFlow version 2.2.0) library and trained using the SGD optimizer with a batch size of 32 during 50 epochs."
Experiment Setup | Yes | "We set γ = 0.10, nc = 24, and the scales for all parameters to uniform for MOTPE. ... The settings for Spearmint are shown in Table 1. ... The evaluation budget (including initial evaluations) was set to 1,000, the number of initial observations was set to 100, γ was set to 0.10, and nc was set to 24 for MOTPE. On the other hand, the evaluation budget for NSGA-II was set to 10,000. The remaining settings for NSGA-II were set to population size = 100, mutation prob = 1/n, crossover prob = 0.9, and swapping prob = 0.5. ... For MOTPE, we set γ = 0.10, nc = 24, the parameter scales for Number of units and SGD learning rate to log-uniform, and those for the rest of the numerical parameters to uniform. For Spearmint, we set likelihood = GAUSSIAN because the problem is noisy. Additionally, we marked Dropout rate, SGD learning rate, and SGD momentum as to ignore for the second objective because our second objective does not depend on these parameters. ... The CNNs were implemented in the tf.keras (TensorFlow version 2.2.0) library and trained using the SGD optimizer with a batch size of 32 during 50 epochs."
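Several of the listed components revolve around hypervolume-based selection (Algorithm 3, Greedy Hypervolume Subset Selection). As a minimal illustration of the quantity that selection is based on, and not a reconstruction of the paper's implementation, the two-objective hypervolume indicator for a minimization problem can be sketched in pure Python; the function name and reference-point values below are illustrative assumptions:

```python
def hypervolume_2d(points, ref):
    """Hypervolume (area) dominated by a set of 2-objective points,
    under minimization, with respect to a reference point `ref`.

    Illustrative sketch only; names and inputs are assumptions, not
    taken from the MOTPE paper.
    """
    # Keep only points that strictly dominate the reference point.
    pts = [p for p in points if p[0] < ref[0] and p[1] < ref[1]]
    # Sort by the first objective, then keep the non-dominated front:
    # along increasing f1, a front point must strictly improve f2.
    pts.sort()
    front = []
    best_f2 = float("inf")
    for f1, f2 in pts:
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    # Sum axis-aligned slabs, sweeping from largest f1 to smallest:
    # each front point contributes a rectangle of width (prev_f1 - f1)
    # and height (ref_f2 - f2).
    hv = 0.0
    prev_f1 = ref[0]
    for f1, f2 in reversed(front):
        hv += (prev_f1 - f1) * (ref[1] - f2)
        prev_f1 = f1
    return hv
```

For example, `hypervolume_2d([(0.0, 1.0), (1.0, 0.0)], (2.0, 2.0))` returns 3.0: each point dominates an area of 2 and their overlap is 1. A greedy subset selection in the spirit of Algorithm 3 would repeatedly add the candidate point that increases this value the most.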