Tunability: Importance of Hyperparameters of Machine Learning Algorithms

Authors: Philipp Probst, Anne-Laure Boulesteix, Bernd Bischl

JMLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Secondly, we conduct a large-scale benchmarking study based on 38 datasets from the OpenML platform and six common machine learning algorithms. We apply our measures to assess the tunability of their parameters. Our results yield default values for hyperparameters and enable users to decide whether it is worth conducting a possibly time-consuming tuning strategy, to focus on the most important hyperparameters and to choose adequate hyperparameter spaces for tuning.
Researcher Affiliation | Academia | Philipp Probst, EMAIL, Institute for Medical Information Processing, Biometry and Epidemiology, LMU Munich, Marchioninistr. 15, 81377 München, Germany; Anne-Laure Boulesteix, EMAIL, Institute for Medical Information Processing, Biometry and Epidemiology, LMU Munich, Marchioninistr. 15, 81377 München, Germany; Bernd Bischl, EMAIL, Department of Statistics, LMU Munich, Ludwigstraße 33, 80539 München, Germany
Pseudocode | No | The paper describes methods and procedures in prose, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The fully reproducible R code for all computations and analyses of our paper can be found on the github page: https://github.com/PhilippPro/tunability.
Open Datasets | Yes | We use a specific subset of carefully curated classification datasets from the OpenML platform called OpenML100 (Bischl et al., 2017a). For our study we only use the 38 binary classification tasks that do not contain any missing values.
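The selection rule quoted above can be sketched as a simple filter over task metadata. The records below are made up for illustration; the actual study queries the OpenML platform (and its OpenML100 benchmark suite) for this information.

```python
# Illustrative sketch of the dataset-selection rule: from an OpenML100-style
# metadata listing, keep only binary classification tasks without missing
# values. Task IDs and counts here are hypothetical placeholders.
meta = [
    {"task_id": 3,  "n_classes": 2, "n_missing_values": 0},
    {"task_id": 31, "n_classes": 2, "n_missing_values": 0},
    {"task_id": 37, "n_classes": 3, "n_missing_values": 0},    # multiclass: excluded
    {"task_id": 44, "n_classes": 2, "n_missing_values": 120},  # missing values: excluded
]

selected = [t["task_id"] for t in meta
            if t["n_classes"] == 2 and t["n_missing_values"] == 0]
print(selected)  # -> [3, 31]
```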
Dataset Splits | Yes | The performance estimation for the different hyperparameter experiments is computed through 10-fold cross-validation. For the comparison of surrogate models 10 times repeated 10-fold cross-validation is used.
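The two resampling protocols quoted above can be sketched as follows, using scikit-learn as a stand-in for the authors' R/mlr pipeline and a synthetic dataset in place of the OpenML tasks (both are assumptions for illustration).

```python
# Sketch of the performance-estimation protocols: plain 10-fold CV and
# 10-times-repeated 10-fold CV. Dataset and learner are placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, RepeatedKFold, cross_val_score

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
clf = RandomForestClassifier(n_estimators=20, random_state=0)

# 10-fold CV, as used for the hyperparameter experiments
scores = cross_val_score(
    clf, X, y, cv=KFold(n_splits=10, shuffle=True, random_state=0))
print(f"10-fold CV accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")

# 10 times repeated 10-fold CV, as used for comparing surrogate models
rep_scores = cross_val_score(
    clf, X, y, cv=RepeatedKFold(n_splits=10, n_repeats=10, random_state=0))
print(f"10x10 repeated CV accuracy: {rep_scores.mean():.3f}")
```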
Hardware Specification | No | The paper discusses software tools and parallelization but does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | All our experiments are executed in R and are run through a combination of custom code from our random bot (Kühn et al., 2018b), the OpenML R package (Casalicchio et al., 2017), mlr (Bischl et al., 2016) and batchtools (Lang et al., 2017) for parallelization. All results are uploaded to the OpenML platform and are publicly available there for further analysis. mlr is also used to compare and fit all surrogate regression models.
Experiment Setup | Yes | The algorithms considered in this paper are common methods for supervised learning. We examine elastic net (glmnet R package), decision tree (rpart), k-nearest neighbors (kknn), support vector machine (svm), random forest (ranger) and gradient boosting (xgboost). For more details about the used software packages see Kühn et al. (2018b). An overview of their considered hyperparameters is displayed in Table 1, including respective data types, box-constraints and a potential transformation function. [...] We sample these points from independent uniform distributions where the respective support for each parameter is displayed in Table 1. [...] For the estimation of the defaults for each algorithm we randomly sample 100000 points in the hyperparameter space as defined in Table 1 and determine the configuration with the minimal average risk. The same strategy with 100000 random points is used to obtain the best hyperparameter setting on each dataset that is needed for the estimation of the tunability of an algorithm. For the estimation of the tunability of single hyperparameters we also use 100000 random points for each parameter, while for the tunability of combinations of hyperparameters we only use 10000 random points to reduce runtime as this should be enough to cover 2-dimensional hyperparameter spaces.
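The defaults-estimation strategy described above (sample uniformly from a transformed hyperparameter space, then pick the configuration with minimal risk averaged over all datasets) can be sketched as below. The parameter names, ranges, and per-dataset risk functions are illustrative placeholders, not the paper's Table 1 values or its fitted surrogate models, and the sample size is reduced from 100,000 for speed.

```python
# Sketch of random-sampling-based default estimation. All ranges and risk
# surfaces are hypothetical; the paper fits one surrogate model per dataset.
import numpy as np

rng = np.random.default_rng(0)
n_points = 10_000  # paper: 100,000; reduced here to keep the sketch fast

# (lower, upper, transformation) per hyperparameter, mimicking the Table 1
# format with log-scale parameters transformed via 2**x (assumed ranges)
space = {
    "eta":       (-10.0, 0.0,  lambda x: 2.0 ** x),
    "max_depth": (1.0,   15.0, lambda x: int(round(x))),
    "lambda":    (-10.0, 10.0, lambda x: 2.0 ** x),
}

def sample_config():
    """Draw one point uniformly per parameter, then apply its transformation."""
    return {name: tf(rng.uniform(lo, hi)) for name, (lo, hi, tf) in space.items()}

configs = [sample_config() for _ in range(n_points)]

# Placeholder risk surfaces: each of the 38 "datasets" prefers a different
# learning rate (stand-in for the paper's surrogate-model predictions).
opt_eta = {d: 2.0 ** np.random.default_rng(d).uniform(-6.0, -2.0)
           for d in range(38)}

def risk(cfg, eta_star):
    return (np.log2(cfg["eta"]) - np.log2(eta_star)) ** 2 + 0.01 * cfg["max_depth"]

# Default = configuration with minimal risk averaged over all datasets
avg_risk = [np.mean([risk(c, e) for e in opt_eta.values()]) for c in configs]
best = configs[int(np.argmin(avg_risk))]
print("estimated default configuration:", best)
```

The same loop, run separately per dataset, yields the per-dataset best configurations used in the paper's tunability estimates.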