Auto-WEKA 2.0: Automatic model selection and hyperparameter optimization in WEKA

Authors: Lars Kotthoff, Chris Thornton, Holger H. Hoos, Frank Hutter, Kevin Leyton-Brown

JMLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Auto-WEKA 2.0 is now fully integrated with WEKA. This is important, because the crux of Auto-WEKA lies in its simplicity: providing a push-button interface that requires no knowledge about the available learning algorithms or their hyperparameters, asking the user to provide, in addition to the dataset to be processed, only a memory bound (1 GB by default) and the overall time budget available for the entire learning process... Figure 3: Example Auto-WEKA run on the iris dataset. The resulting best classifier along with its parameter settings is printed first, followed by its performance. While Auto-WEKA runs, it logs to the status bar how many configurations it has evaluated so far."
Researcher Affiliation | Academia | "Department of Computer Science, University of British Columbia, 2366 Main Mall, Vancouver, B.C. V6T 1Z4, Canada; EMAIL"
Pseudocode | No | The paper describes the system and its functionality but does not provide structured pseudocode or algorithm blocks for its own methodology.
Open Source Code | Yes | "Source code for Auto-WEKA is hosted on GitHub (https://github.com/automl/autoweka) and is available under the GPL license (version 3). Releases are published to the WEKA package repository and available both through the WEKA package manager and from the Auto-WEKA project website (http://automl.org/autoweka)."
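As a sketch of the package-manager route mentioned above (the package name `Auto-WEKA` and the `weka.jar` location are assumptions based on WEKA's standard command-line package manager, not commands quoted from the paper):

```shell
# Install the Auto-WEKA package through WEKA's CLI package manager.
# Assumes weka.jar is available locally; the package name follows
# the WEKA package repository listing.
java -cp weka.jar weka.core.WekaPackageManager -install-package Auto-WEKA
```

The same package can also be installed interactively through the WEKA GUI's package manager, as the quoted text notes.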
Open Datasets | Yes | "Figure 3: Example Auto-WEKA run on the iris dataset. The resulting best classifier along with its parameter settings is printed first, followed by its performance."
Dataset Splits | Yes | "Auto-WEKA performs cross-validation internally, so we disable WEKA's cross-validation (-no-cv)."
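For illustration, a minimal command-line invocation consistent with this description (the jar path, dataset path, and the `-timeLimit` flag are assumptions about Auto-WEKA's CLI, not quoted from the paper; `-no-cv` is the flag the paper names):

```shell
# Run Auto-WEKA on iris.arff with a 15-minute overall budget.
# -no-cv disables WEKA's own cross-validation, since Auto-WEKA
# cross-validates internally while searching for configurations.
java -cp autoweka.jar weka.classifiers.meta.AutoWEKAClassifier \
    -t iris.arff -timeLimit 15 -no-cv
```
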
Hardware Specification | No | The paper mentions parallel runs on a "single machine" but does not provide any specific hardware details such as CPU, GPU models, or memory specifications.
Software Dependencies | No | "We describe the new version of Auto-WEKA, a system designed to help such users by automatically searching through the joint space of WEKA's learning algorithms and their respective hyperparameter settings to maximize performance, using a state-of-the-art Bayesian optimization method. Our new package is tightly integrated with WEKA, making it just as accessible to end users as any other learning algorithm... updated the software to work with the latest versions of WEKA and Java."
Experiment Setup | Yes | "asking the user to provide, in addition to the dataset to be processed, only a memory bound (1 GB by default) and the overall time budget available for the entire learning process. The overall budget is set to 15 minutes by default to accommodate impatient users; longer runs allow the Bayesian optimizer to search the space more thoroughly; we recommend at least several hours for production runs. Internally, to avoid using all its budget for executing a single slow learner, Auto-WEKA limits individual runs of any learner to 1/12 of the overall budget; it further limits feature search to 1/60 of the budget."
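The budget split quoted above can be made concrete with a small sketch (this is illustrative arithmetic, not Auto-WEKA source code; the function name is ours):

```python
def budget_caps(total_minutes: float) -> dict:
    """Per-component time caps derived from Auto-WEKA's overall budget.

    The paper states that any single learner run is capped at 1/12 of
    the overall budget, and feature search at 1/60 of it.
    """
    return {
        "single_learner_run": total_minutes / 12,
        "feature_search": total_minutes / 60,
    }

caps = budget_caps(15)  # the 15-minute default budget
print(caps)  # {'single_learner_run': 1.25, 'feature_search': 0.25}
```

So under the 15-minute default, no individual learner may run longer than 1.25 minutes, which is why the paper recommends budgets of several hours for production runs.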