MASIF: Meta-learned Algorithm Selection using Implicit Fidelity Information
Authors: Tim Ruhkopf, Aditya Mohan, Difan Deng, Alexander Tornede, Frank Hutter, Marius Lindauer
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In an extensive experimental study on four different benchmarks, we showed that MASIF outperforms both existing meta-learning-based approaches and learning curve-based algorithm selectors in terms of the regret of the selected algorithm for the invested budget. |
| Researcher Affiliation | Collaboration | Frank Hutter: Machine Learning Lab, Albert-Ludwigs University Freiburg; Bosch Center for Artificial Intelligence |
| Pseudocode | No | The paper describes the MASIF architecture and data augmentation in sections 3.2 and 3.3, but does not include any explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | Yes | MASIF's code is published at https://anonymous.4open.science/status/MASIF-824D |
| Open Datasets | Yes | Task-set can be obtained from https://github.com/google-research/google-research/tree/master/task_set. LCBench can be downloaded from https://github.com/automl/LCBench. The newly created Synthetic and Scikit-CC18 benchmarks are provided as supplementary material. |
| Dataset Splits | Yes | We then assess the performance of each approach by performing a 10-fold outer cross-validation of the meta-dataset. |
| Hardware Specification | Yes | All the experiments are executed on 4 Intel Xeon E5 cores with 8000 MB RAM. |
| Software Dependencies | No | The packages & version numbers are available in the setup file of the linked repository. (Explanation: The specific version numbers are not listed directly in the paper text; readers are referred to an external file.) |
| Experiment Setup | Yes | We encode all the meta-features (i.e., ϕD and ϕA) with a 2-layer MLP (with hidden sizes of 128 and 64) into an embedding of size 64. All transformer encoder layers in MASIF have the same architecture, with hidden size 128 and 4 attention heads. Each of the transformer encoders (Learning Curve Transformer Encoder and Algo Transformer Encoder) has 2 transformer layers with a dropout rate of 0.2. We train MASIF with an ADAM optimizer with a learning rate of 0.001 and beta values of 0.9 and 0.999 for 500 epochs; neither a learning rate scheduler nor weight decay is employed. |
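The experiment-setup cell above can be sketched as a PyTorch configuration. This is a minimal illustration of the quoted hyperparameters only, not the authors' implementation: class names, the placeholder input dimension, and the wiring between components are assumptions.

```python
import torch
import torch.nn as nn


class MetaFeatureEncoder(nn.Module):
    """2-layer MLP with hidden sizes 128 and 64, embedding size 64 (as quoted)."""

    def __init__(self, in_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def make_transformer_encoder() -> nn.TransformerEncoder:
    """2 transformer layers, hidden size 128, 4 attention heads, dropout 0.2."""
    layer = nn.TransformerEncoderLayer(
        d_model=128, nhead=4, dropout=0.2, batch_first=True
    )
    return nn.TransformerEncoder(layer, num_layers=2)


# Two encoders as described: one for learning curves, one for algorithms.
# in_dim=32 is a placeholder; the paper does not state the meta-feature dimension.
model = nn.ModuleDict({
    "meta_encoder": MetaFeatureEncoder(in_dim=32),
    "lc_transformer": make_transformer_encoder(),
    "algo_transformer": make_transformer_encoder(),
})

# Quoted training setup: Adam, lr 0.001, betas (0.9, 0.999), 500 epochs,
# no learning rate scheduler and no weight decay.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
```

A forward pass with `model["meta_encoder"](torch.randn(8, 32))` yields an embedding of shape `(8, 64)`, matching the stated embedding size.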