Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline and validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
MotherNet: Fast Training and Inference via Hyper-Network Transformers
Authors: Andreas Mueller, Carlo Curino, Raghu Ramakrishnan
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate MotherNet on two tabular benchmarks, the small datasets in OpenML CC-18, as used by Hollmann et al. (2022), and a version of the TabZilla benchmark (McElfresh et al., 2024). Quantitative results are shown in Figure 2 and Table 4, where errors are given over the five paired splits of the data. We can see that TabPFN outperforms all other methods, though not statistically significantly so, even at 60 minutes of tuning time for reference methods. |
| Researcher Affiliation | Industry | Andreas C. Müller, Carlo Curino & Raghu Ramakrishnan, Gray Systems Lab, Microsoft |
| Pseudocode | No | The paper describes the methodology in Section 3 and illustrates the architecture in Figure 1. There is no explicit section or figure labeled 'Pseudocode' or 'Algorithm' presenting structured algorithmic steps. |
| Open Source Code | Yes | Training and inference code and pre-trained model weights are made publicly available at https://github.com/microsoft/ticl |
| Open Datasets | Yes | Using a fixed model structure, we are able to produce neural networks that work well on small numeric tabular datasets from the OpenML CC-18 benchmark suite (Bischl et al., 2017), and show that our approach also provides a good trade-off of speed and accuracy on the TabZilla dataset collection (McElfresh et al., 2024). |
| Dataset Splits | Yes | As in Hollmann et al. (2022), we split each dataset 50/50 into training (or in-context learning) and test set, and repeat this split five times. For this evaluation, we follow McElfresh et al. (2024) in their setup for TabPFN, and subsample 3000 data points for MotherNet, as the full datasets are too large for the transformer architectures. |
| Hardware Specification | Yes | We train MotherNet on a single A100 GPU with 80GB of GPU memory, which takes approximately four weeks. We were able to process up to 30,000 data points on an A100 GPU with 80GB of memory, and 100,000 samples on CPU. All our experiments were done on an A100 GPU with 80GB of RAM on cloud infrastructure. |
| Software Dependencies | No | The paper mentions using 'scikit-learn (Pedregosa et al., 2011)' and 'HyperOpt (Bergstra et al., 2011)' for baseline hyperparameter tuning, but does not provide specific version numbers for the software dependencies used in their own experimental setup. |
| Experiment Setup | Yes | We are using increasing batch sizes of 8, 16 and 32 and a learning rate of 0.00003, with cosine annealing (Loshchilov & Hutter, 2016). |
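The split protocol quoted in the Dataset Splits row (five paired 50/50 splits, with MotherNet's training portion subsampled to 3000 points) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name `make_splits`, the seed handling, and the subsampling order are assumptions.

```python
# Sketch of the evaluation split protocol: five repeated 50/50
# train/test splits, subsampling the training side for the transformer.
# Seeds and exact subsampling details are assumptions, not from the paper.
import numpy as np
from sklearn.model_selection import train_test_split

def make_splits(X, y, n_repeats=5, subsample=3000, seed=0):
    """Yield five paired 50/50 train/test splits; the training portion
    is subsampled to at most `subsample` rows."""
    rng = np.random.RandomState(seed)
    for _ in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.5, random_state=rng.randint(2**31 - 1))
        if len(X_tr) > subsample:
            idx = rng.choice(len(X_tr), subsample, replace=False)
            X_tr, y_tr = X_tr[idx], y_tr[idx]
        yield X_tr, y_tr, X_te, y_te
```

Errors over these five paired splits are what Figure 2 and Table 4 of the paper report.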
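The Experiment Setup row quotes a learning rate of 0.00003 with cosine annealing (Loshchilov & Hutter, 2016). A minimal sketch of that schedule is below; the total step count and the floor value `eta_min` are assumptions, since the paper excerpt does not report them.

```python
# Minimal cosine-annealing learning-rate schedule (Loshchilov & Hutter,
# 2016), decaying from the quoted base_lr of 3e-5. total_steps and
# eta_min are illustrative assumptions.
import math

def cosine_lr(step, total_steps, base_lr=3e-5, eta_min=0.0):
    """Anneal the learning rate from base_lr to eta_min over total_steps."""
    return eta_min + 0.5 * (base_lr - eta_min) * (
        1 + math.cos(math.pi * step / total_steps))
```

The quoted setup also increases batch sizes through 8, 16, and 32 during training; the switch points are not given in the excerpt, so they are omitted here.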