NEAR: A Training-Free Pre-Estimator of Machine Learning Model Performance

Authors: Raphael Husistein, Markus Reiher, Marco Eckhoff

ICLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Assessment (Variable, Result, LLM Response)
Research Type: Experimental
LLM Response: To examine the performance of our zero-cost proxy NEAR, we evaluated it on the three standard cell-based NAS benchmarks NAS-Bench-101 (Ying et al., 2019), NATS-Bench-SSS, and NATS-Bench-TSS (Dong et al., 2021). We note that cell-based refers to the construction of the individual networks, which were obtained by placing stacks of repeated cells in a common skeleton (for details see Ying et al. (2019) and Dong et al. (2021)). For comparison, we also report the two commonly employed rank correlation measures Kendall's τ (Kendall, 1938) and Spearman's ρ (Spearman, 1904) for twelve other zero-cost proxies.
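The two rank correlation measures named above can be sketched in a few lines of pure Python; the proxy scores and accuracies below are hypothetical illustration data, not values from the paper.

```python
from itertools import combinations

def ranks(values):
    """1-based ranks of the values; assumes no ties for simplicity."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    conc = disc = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            conc += 1
        elif s < 0:
            disc += 1
    n = len(x)
    return (conc - disc) / (n * (n - 1) / 2)

def spearman_rho(x, y):
    """Spearman's rho via the no-ties formula on the ranks."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical zero-cost proxy scores vs. final test accuracies.
proxy = [0.1, 0.4, 0.2, 0.8, 0.5]
accuracy = [0.60, 0.72, 0.65, 0.91, 0.70]

print(kendall_tau(proxy, accuracy))   # 0.8
print(spearman_rho(proxy, accuracy))  # 0.9
```

A high τ or ρ means the proxy orders architectures nearly the same way their trained accuracies do, which is exactly what a zero-cost proxy is evaluated on.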
Researcher Affiliation: Academia
LLM Response: Raphael T. Husistein, Markus Reiher, and Marco Eckhoff; Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland. EMAIL
Pseudocode: No
LLM Response: The paper describes the proposed method, NEAR, through mathematical definitions (e.g., Definitions 3.1 and 3.2) and prose, and includes figures illustrating neural network architectures (Figure A.1) and processes (Figure A.3). However, it does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured, code-like steps for its procedures in the main text or appendices.
Open Source Code: Yes
LLM Response: In order to make our calculations reproducible, we provide the code including all hyperparameter settings as well as the raw data on Zenodo (Husistein et al., 2024). The NEAR software is available on GitHub (https://github.com/ReiherGroup/NEAR).
Open Datasets: Yes
LLM Response: To examine the performance of our zero-cost proxy NEAR, we evaluated it on the three standard cell-based NAS benchmarks NAS-Bench-101 (Ying et al., 2019), NATS-Bench-SSS, and NATS-Bench-TSS (Dong et al., 2021). ... NATS-Bench-TSS search space consists of 15 625 neural network architectures trained on the datasets CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009), and ImageNet16-120 (Chrabaszcz et al., 2017). ... One is the balanced version of the extended MNIST (EMNIST) dataset (Cohen et al., 2017).
Dataset Splits: Yes
LLM Response: We applied early stopping with a patience of 10 epochs, whereby 10% of the training set served as validation set.
Hardware Specification: No
LLM Response: The paper mentions 'substantial computing power' as a challenge in neural network development and discusses 'computational demand' and 'computational burden' in the context of efficiency. However, it does not specify any particular hardware components, such as CPU or GPU models, memory configurations, or cloud computing resources used for their experiments.
Software Dependencies: No
LLM Response: The reproducibility statement indicates that 'the code including all hyperparameter settings as well as the raw data' is provided on Zenodo and that 'The NEAR software is available on GitHub'. While this implies software was used and is provided, the paper itself does not explicitly list specific software dependencies with their version numbers within the textual content.
Experiment Setup: Yes
LLM Response: We employed a multi-layer perceptron with two hidden layers of size 200 and the Adam optimizer (Kingma & Ba, 2015). We applied early stopping with a patience of 10 epochs, whereby 10% of the training set served as validation set.
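The early-stopping rule quoted above (patience of 10 epochs on a held-out 10% validation split) can be sketched as a plain training-loop skeleton. This is an illustrative sketch, not the authors' code: `validation_loss` and the simulated loss curve are hypothetical stand-ins for training the MLP for one epoch and evaluating it on the validation split.

```python
# Sketch of early stopping with a patience of 10 epochs.
# `validation_loss(epoch)` is a hypothetical callback that trains for
# one epoch and returns the loss on the held-out 10% validation split.
def train_with_early_stopping(validation_loss, max_epochs=200, patience=10):
    best_loss, best_epoch = float("inf"), -1
    for epoch in range(max_epochs):
        loss = validation_loss(epoch)
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch  # improvement: reset patience
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch, best_loss

# Simulated validation-loss curve: improves until epoch 3, then plateaus,
# so training should stop 10 epochs after the last improvement.
curve = [1.0, 0.8, 0.6, 0.55] + [0.55] * 100
stopped_at, best = train_with_early_stopping(lambda e: curve[e])
print(stopped_at, best)  # 3 0.55
```

The patience counter is implicit here: the loop compares the current epoch against the epoch of the last improvement, which is equivalent to counting epochs without progress.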