AutoFolio: An Automatically Configured Algorithm Selector

Authors: Marius Lindauer, Holger H. Hoos, Frank Hutter, Torsten Schaub

JAIR 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that AutoFolio can significantly improve the performance of claspfolio 2 on 8 out of the 13 scenarios from the Algorithm Selection Library, leads to new state-of-the-art algorithm selectors for 7 of these scenarios, and matches state-of-the-art performance (statistically) on all other scenarios. Compared to the best single algorithm for each AS scenario, AutoFolio achieves average speedup factors between 1.3 and 15.4.
Researcher Affiliation | Academia | Marius Lindauer (EMAIL), University of Freiburg; Holger H. Hoos (EMAIL), University of British Columbia; Frank Hutter (EMAIL), University of Freiburg; Torsten Schaub (EMAIL), University of Potsdam and INRIA Rennes
Pseudocode | Yes | Algorithm 1 (AutoFolio: automated configuration of an algorithm selector):
  Input: algorithm configurator AC, algorithm selector S, configuration space C of S, training data of algorithm scenario D (with performance and feature matrix), number of folds k
  1. Randomly split D into D(1), ..., D(k).
  2. Start AC with D(1), ..., D(k) as meta-instances, using average loss across meta-instances as performance metric m, and using S as target algorithm with configuration space C.
  3. While configuration budget remains:
  4.   AC selects configuration c ∈ C and meta-instance n ∈ {1, ..., k}.
  5.   Train S_c on D \ D(n), assess its loss on D(n), and return that loss to AC.
  6. Return the best configuration c* of S found by AC.
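The outer loop of Algorithm 1 can be sketched as follows. This is a minimal illustration only: it substitutes random search for the actual configurator AC (the paper uses SMAC), and the selector, configuration sampler, and loss function are hypothetical placeholders, not AutoFolio's code.

```python
import random

def autofolio_loop(sample_config, train_and_eval, folds, budget, seed=0):
    """Sketch of Algorithm 1's loop: repeatedly pick a configuration and a
    meta-instance (fold), train the selector on the remaining folds, evaluate
    its loss on the held-out fold, and keep the best configuration seen."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(budget):
        config = sample_config(rng)        # stands in for: AC selects c in C
        n = rng.randrange(len(folds))      # ... and meta-instance n in {1..k}
        held_out = folds[n]
        train = [x for i, fold in enumerate(folds) if i != n for x in fold]
        loss = train_and_eval(config, train, held_out)  # train S_c, get loss
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config
```

A toy usage: with `sample_config = lambda rng: rng.uniform(0, 10)` and `train_and_eval = lambda c, tr, te: abs(c - 3)`, the loop returns the sampled configuration closest to 3 within the budget. SMAC replaces the random sampling with model-based search, but the train/evaluate/return-loss contract is the same.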
Open Source Code | Yes | We also provide an open-source implementation of AutoFolio (www.ml4aad.org/autofolio/) based on the algorithm configurator SMAC (Hutter, Hoos, & Leyton-Brown, 2011) and the algorithm selection framework claspfolio 2 (Hoos et al., 2014).
Open Datasets | Yes | We ran AutoFolio on the thirteen algorithm selection scenarios that make up the Algorithm Selection Library 1.0 (Bischl et al., 2015b). As an input, claspfolio 2 reads an algorithm selection scenario, supporting the format of the Algorithm Selection Library, ASlib. For a full specification of the ASlib format, we refer the interested reader to aslib.net.
Dataset Splits | Yes | To judge the performance of an algorithm selection (AS) system on an AS scenario, it is crucial to partition the given set of problem instances into a training and a test set, use the AS system only on the training set to train a selector s, and evaluate s only on the test set instances... Overall, we use the instance set for each selection scenario as illustrated in Figure 5: (i) we split the full set of instances into a training and a test set and (ii) the training data is further partitioned into k folds (in our experiments, we use k = 10), which are used as follows.
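The two-level splitting scheme described above (outer train/test split, then k = 10 inner folds over the training instances) can be sketched like this. The function name, the 25% test fraction, and the interleaved fold assignment are illustrative assumptions, not taken from AutoFolio's code.

```python
import random

def split_instances(instances, test_fraction=0.25, k=10, seed=0):
    """Outer split into train/test, then partition the training
    instances into k roughly equal, disjoint folds."""
    rng = random.Random(seed)
    shuffled = instances[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test_set, train_set = shuffled[:n_test], shuffled[n_test:]
    # Interleaved assignment yields k disjoint folds covering train_set.
    folds = [train_set[i::k] for i in range(k)]
    return train_set, test_set, folds

train_set, test_set, folds = split_instances(list(range(100)), k=10)
print(len(train_set), len(test_set), len(folds))  # 75 25 10
```

The test set is never shown to the configurator; only the k folds over the training instances are used as meta-instances during configuration, which matches the evaluation protocol quoted above.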
Hardware Specification | Yes | We performed these experiments on the bwUniCluster in Karlsruhe, whose machines are equipped with two Octa-Core Intel Xeon E5-2670 (2.6 GHz, 20 MB cache) CPUs and 64 GB RAM each, running Red Hat Enterprise Linux 6.4.
Software Dependencies | Yes | In these experiments, AutoFolio employs claspfolio 2 using the well-known machine learning package scikit-learn (Pedregosa et al., 2011) (version 0.14.1) and the algorithm configurator SMAC (version 2.08.00).
Experiment Setup | Yes | Each configurator run was allocated a total time budget of 2 CPU days. A single run of claspfolio 2 was limited to 1 CPU hour, using the runsolver tool (Roussel, 2011). As a performance metric, we used penalized average runtime with penalty factor 10 (PAR10), which counts each timeout as 10 times the given runtime cutoff (runtime cutoffs differ between the ASlib scenarios). We further study how the optimization of PAR10 influenced other metrics, such as the number of timeouts.
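The PAR10 metric used above is straightforward to compute: runs that hit the cutoff are counted as 10 times the cutoff instead of their censored runtime. A minimal sketch (function and parameter names are illustrative):

```python
def par10(runtimes, cutoff, penalty=10):
    """Penalized average runtime: each run at or over the cutoff is
    counted as penalty * cutoff; solved runs count their true runtime."""
    penalized = [t if t < cutoff else penalty * cutoff for t in runtimes]
    return sum(penalized) / len(penalized)

# Two solved runs and one timeout with a 100-second cutoff:
print(par10([10.0, 50.0, 100.0], cutoff=100))  # (10 + 50 + 1000) / 3 ≈ 353.33
```

The heavy timeout penalty is what makes PAR10 sensitive to the number of timeouts, which is why the paper separately reports how optimizing PAR10 affected that count.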