Well-tuned Simple Nets Excel on Tabular Datasets

Authors: Arlind Kadra, Marius Lindauer, Frank Hutter, Josif Grabocka

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically assess the impact of these regularization cocktails for MLPs in a large-scale empirical study comprising 40 tabular datasets and demonstrate that (i) well-regularized plain MLPs significantly outperform recent state-of-the-art specialized neural network architectures, and (ii) they even outperform strong traditional ML methods, such as XGBoost.
Researcher Affiliation | Collaboration | Arlind Kadra, Department of Computer Science, University of Freiburg (EMAIL); Marius Lindauer, Institute for Information Processing, Leibniz University Hannover (EMAIL); Frank Hutter, Department of Computer Science, University of Freiburg & Bosch Center for Artificial Intelligence (EMAIL); Josif Grabocka, Department of Computer Science, University of Freiburg (EMAIL)
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | We provide the code for our implementation at the following link: https://github.com/releaunifreiburg/WellTunedSimpleNets.
Open Datasets | Yes | We use a large collection of 40 tabular datasets (listed in Table 9 of Appendix D). This includes 31 datasets from the recent open-source OpenML AutoML Benchmark [16]. In addition, we added 9 popular datasets from UCI [3] and Kaggle that contain roughly 100K+ instances. ... The datasets are retrieved from the OpenML repository [54] using the OpenML-Python connector [14]...
Dataset Splits | Yes | The datasets are retrieved from the OpenML repository [54] using the OpenML-Python connector [14] and split as 60% training, 20% validation, and 20% testing sets. (A minimal loading-and-splitting sketch appears after the table.)
Hardware Specification | Yes | We ran all experiments on a CPU cluster, each node of which contains two Intel Xeon E5-2630v4 CPUs with 20 CPU cores each, running at 2.2GHz and a total memory of 128GB.
Software Dependencies | No | The paper mentions using the "PyTorch library [43]" and the "AutoDL framework Auto-PyTorch [39, 62]" but does not specify their version numbers.
Experiment Setup | Yes | In order to focus exclusively on investigating the effect of regularization, we fix the neural architecture to a simple multilayer perceptron (MLP) and also fix some hyperparameters of the general training procedure. These fixed hyperparameter values, as specified in Table 4 of Appendix B.1... We use a 9-layer feed-forward neural network with 512 units for each layer... We set a low learning rate of 10^-3... We use AdamW [36]... and cosine annealing with restarts [35] as a learning rate scheduler. For the restarts, we use an initial budget of 15 epochs, with a budget multiplier of 2... (A training-setup sketch appears after the table.)
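
Data loading and splits: below is a minimal sketch of how a dataset could be fetched with the OpenML-Python connector and split 60/20/20 as the paper describes. The dataset ID, the random seed, and the use of stratification are illustrative assumptions, not details taken from the paper.

```python
import openml
from sklearn.model_selection import train_test_split

# Hypothetical dataset ID (31 = "credit-g"), chosen only for illustration;
# the paper uses 40 tabular datasets listed in its Appendix D.
dataset = openml.datasets.get_dataset(31)
X, y, _, _ = dataset.get_data(target=dataset.default_target_attribute)

# 60% train / 20% validation / 20% test, as reported in the paper.
# Stratification and random_state=0 are assumptions for this sketch.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=0)
```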
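
Training setup: a minimal PyTorch sketch of the fixed configuration quoted in the Experiment Setup row (9-layer MLP with 512 units per layer, AdamW at a 10^-3 learning rate, cosine annealing with warm restarts using an initial budget of 15 epochs and a budget multiplier of 2). The ReLU activation, the input/output dimensions, and the placement of the output layer are assumptions of this sketch; the authoritative values are in Table 4 of the paper's Appendix B.1.

```python
import torch
import torch.nn as nn

def make_mlp(num_features: int, num_classes: int) -> nn.Sequential:
    # 9 hidden layers with 512 units each, per the paper's fixed
    # architecture; ReLU is an assumed activation for this sketch.
    layers, in_dim = [], num_features
    for _ in range(9):
        layers += [nn.Linear(in_dim, 512), nn.ReLU()]
        in_dim = 512
    layers.append(nn.Linear(in_dim, num_classes))
    return nn.Sequential(*layers)

# Hypothetical dimensions, for illustration only.
model = make_mlp(num_features=20, num_classes=2)

# AdamW with a learning rate of 10^-3, as stated in the paper.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Cosine annealing with warm restarts: initial budget of 15 epochs,
# budget multiplier of 2 (T_0=15, T_mult=2).
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=15, T_mult=2)
```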