MASIF: Meta-learned Algorithm Selection using Implicit Fidelity Information

Authors: Tim Ruhkopf, Aditya Mohan, Difan Deng, Alexander Tornede, Frank Hutter, Marius Lindauer

TMLR 2023

Reproducibility assessment (variable: result — supporting evidence)
Research Type: Experimental — "In an extensive experimental study on four different benchmarks, we showed that MASIF outperforms existing meta-learning-based approaches in terms of the regret of the selected algorithm and learning curve-based algorithm selectors in terms of regret for the invested budget."
Researcher Affiliation: Collaboration — "Frank Hutter EMAIL Machine Learning Lab, Albert-Ludwigs University Freiburg; Bosch Center for Artificial Intelligence"
Pseudocode: No — The paper describes the MASIF architecture and data augmentation in Sections 3.2 and 3.3, but does not include any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code: Yes — "MASIF's code is published on https://anonymous.4open.science/status/MASIF-824D"
Open Datasets: Yes — "Task-set can be obtained from https://github.com/google-research/google-research/tree/master/task_set. LCBench can be downloaded from https://github.com/automl/LCBench. We added the newly created Scikit-CC18 benchmark as supplementary material. We provide the Synthetic and Scikit-CC18 benchmarks as supplementary."
Dataset Splits: Yes — "We then assess the performance of each approach by performing a 10-fold outer cross-validation of the meta-dataset."
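The 10-fold outer cross-validation of the meta-dataset quoted above can be sketched with a standard-library fold generator. This is a minimal illustration of the splitting scheme, not the authors' implementation (in practice a library helper such as scikit-learn's KFold would be used):

```python
def kfold_indices(n_items: int, n_folds: int = 10):
    """Yield (train, test) index lists for n_folds disjoint test folds.

    Every item appears in exactly one test fold, so each of the 10
    outer folds evaluates on a held-out slice of the meta-dataset.
    """
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_items // n_folds + (1 if i < n_items % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


# Example: 35 meta-datasets split into 10 outer folds.
folds = list(kfold_indices(35, n_folds=10))
```

Shuffling before splitting (omitted here for brevity) is the usual extra step when dataset ordering is not already random.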
Hardware Specification: Yes — "All the experiments are executed on 4 Intel Xeon E5 cores with 8000 MB RAM."
Software Dependencies: No — "The packages & version numbers are available in the setups file of the linked repository." (The specific version numbers are not listed in the paper text itself; the reader is referred to an external file.)
Experiment Setup: Yes — "We encode all the meta-features (i.e., ϕD and ϕA) with a 2-layer MLP (with hidden sizes of 128 and 64) into an embedding of size 64. All transformer encoder layers in MASIF have the same architecture, with hidden size 128 and 4 attention heads. Each of the transformer encoders (Learning Curve Transformer Encoder and Algo Transformer Encoder) has 2 transformer layers, applied with a dropout rate of 0.2. We train MASIF with the Adam optimizer (learning rate 0.001, beta values 0.9 and 0.999) for 500 epochs; neither a learning rate scheduler nor weight decay is employed."
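The quoted hyperparameters can be sketched in PyTorch as follows. This is an illustration of the stated sizes only, not the MASIF implementation: reading the MLP as in → 128 → 64, reading the transformer "hidden size 128" as the feedforward width (with the token dimension kept at the 64-dimensional embedding), and the way the two encoders are wired together are all assumptions here.

```python
import torch
import torch.nn as nn

EMBED_DIM = 64  # meta-feature embedding size quoted in the paper


def make_meta_encoder(in_dim: int) -> nn.Module:
    # 2-layer MLP (hidden sizes 128 and 64) encoding meta-features
    # (phi_D, phi_A) into a 64-dimensional embedding.
    return nn.Sequential(
        nn.Linear(in_dim, 128),
        nn.ReLU(),
        nn.Linear(128, EMBED_DIM),
    )


def make_transformer_encoder() -> nn.Module:
    # One of the two identically-configured encoders (learning-curve /
    # algorithm): 2 layers, 4 attention heads, dropout 0.2.
    layer = nn.TransformerEncoderLayer(
        d_model=EMBED_DIM, nhead=4, dim_feedforward=128,
        dropout=0.2, batch_first=True,
    )
    return nn.TransformerEncoder(layer, num_layers=2)


meta_encoder = make_meta_encoder(in_dim=16)  # 16 is a placeholder width
curve_encoder = make_transformer_encoder()
params = list(meta_encoder.parameters()) + list(curve_encoder.parameters())

# Adam with lr=0.001 and betas=(0.9, 0.999), run for 500 epochs in the
# paper; no learning rate scheduler and no weight decay.
optimizer = torch.optim.Adam(params, lr=1e-3, betas=(0.9, 0.999))
```

A forward pass of a batch of padded learning-curve token sequences through `curve_encoder` preserves the `(batch, sequence, 64)` shape, which is what allows the two encoder outputs to be combined downstream.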