Variation Matters: from Mitigating to Embracing Zero-Shot NAS Ranking Function Variation

Authors: Pavel Rumiantsev, Mark Coates

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that the proposed stochastic ordering can effectively boost performance of a search on standard benchmark search spaces. ... We present results for the NAS-Bench search spaces in Fig. 1a and results for the TransNAS search spaces in Fig. 1b. ... Results are presented in Table 1.
Researcher Affiliation | Academia | Pavel Rumiantsev (EMAIL), Department of Electrical and Computer Engineering, McGill University; Mark Coates (EMAIL), Department of Electrical and Computer Engineering, McGill University
Pseudocode | Yes | Algorithm 1: Statistical MAX and TOP-K pseudocode ... Algorithm 2: Regularised evolutionary algorithm (REA) ... Algorithm 3: Greedy evolutionary search algorithm, difference with Algorithm 2 ... Algorithm 4: Free regularised evolutionary algorithm (FreeREA), difference with Algorithm 2
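The regularised evolutionary algorithm (REA) listed as Algorithm 2 follows the well-known aging-evolution loop of Real et al. (2019): tournament selection, mutate the winner, retire the oldest population member. A minimal Python sketch under that assumption; `score_fn`, `random_arch`, and `mutate` are hypothetical placeholders for the ranking function, the random architecture sampler, and the mutation operator, not the paper's actual implementation:

```python
import collections
import random

def regularized_evolution(score_fn, random_arch, mutate,
                          population_size=64, cycles=1000, sample_size=10):
    """Sketch of regularised evolution (aging evolution).

    Keeps a FIFO population: each cycle samples a tournament, mutates the
    tournament winner, appends the child, and removes the oldest member.
    Returns the best architecture seen over the whole run.
    """
    population = collections.deque()
    history = []
    # Initialise the population with randomly sampled architectures.
    while len(population) < population_size:
        arch = random_arch()
        entry = (arch, score_fn(arch))
        population.append(entry)
        history.append(entry)
    # Evolution loop: tournament -> mutate winner -> age out the oldest.
    for _ in range(cycles):
        tournament = random.sample(list(population), sample_size)
        parent = max(tournament, key=lambda e: e[1])
        child = mutate(parent[0])
        entry = (child, score_fn(child))
        population.append(entry)
        history.append(entry)
        population.popleft()  # "regularisation": oldest member dies
    return max(history, key=lambda e: e[1])[0]
```

The greedy and FreeREA variants in Algorithms 3 and 4 are described in the paper only as differences from this loop (e.g., how the parent is chosen), so they are not reproduced here.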
Open Source Code | No | The paper lists third-party software (automl/NASLib, PyTorch, NumPy, SciPy) and datasets with their licences and citations, but does not provide specific access (a link or explicit statement) to the authors' own implementation of the methodology described in the paper.
Open Datasets | Yes | The following datasets were used: CIFAR-10/100 (Krizhevsky, 2009) under the CC BY 4.0 licence; ImageNet-16-120 (Chrabaszcz et al., 2017) under the CC BY 4.0 licence; NinaPro (Atzori et al., 2012) under the CC BY-ND licence; five datasets from the Taskonomy collection (Zamir et al., 2018) under the CC BY 4.0 licence.
Dataset Splits | Yes | For this work, we view a search space as a combination of a feasible architecture set and a dataset of labelled training, validation, and test samples. ... standard architectural search spaces including NAS-Bench-101, NAS-Bench-201, and TransNAS-Bench-101. ... NAS-Bench-101 (Ying et al., 2019) includes the performance and training statistics of the 423k architectures on CIFAR-10. ... NAS-Bench-201 (Dong & Yang, 2019) has 15625 architectures in the search space and provides performance for CIFAR-10, CIFAR-100, and ImageNet-16-120.
Hardware Specification | No | The paper does not explicitly mention the specific hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | In this work, we used the following software: automl/NASLib (Mehta et al., 2022), available on GitHub under the Apache 2.0 licence; PyTorch (Paszke et al., 2019), available via PyPI under a custom BSD licence; NumPy (Harris et al., 2020), available via PyPI under a custom BSD licence; SciPy (Virtanen et al., 2020), available via PyPI under a custom BSD licence. While software names are listed, specific version numbers are not provided for PyTorch, NumPy, or SciPy.
Experiment Setup | Yes | We set V = 10 and B = 64. ... The 5% significance level is used for rejecting the null hypothesis when conducting the Mann-Whitney U-test. ... We repeat each experiment 100 times, computing the mean value of the accuracy for the selected architecture and its variation. ... In the case of averaging, we cache the average of the ranking function output over 10 evaluations. ... We use an evaluation budget and cap it at 1000 evolution cycles. ... The optimal threshold lies within the range from 0.025 to 0.075.
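The quoted setup pairs V = 10 ranking-function evaluations per architecture with a Mann-Whitney U-test at the 5% significance level, which is the core of the paper's "statistical MAX" selection (Algorithm 1). A minimal sketch of how such a selection could be wired up with SciPy; the pairwise keep-or-replace acceptance logic below is an assumption for illustration, not the paper's exact algorithm:

```python
import numpy as np
from scipy.stats import mannwhitneyu

def statistical_max(score_samples, alpha=0.05):
    """Pick the architecture whose noisy scores are statistically largest.

    score_samples: dict mapping architecture id -> array of V evaluations
    of a stochastic zero-shot ranking function (V = 10 in the paper).
    A candidate replaces the current best only when a one-sided
    Mann-Whitney U-test rejects the null at level `alpha`.
    """
    arch_ids = list(score_samples)
    best = arch_ids[0]
    for cand in arch_ids[1:]:
        # H0: the candidate's scores are not stochastically greater
        # than the current best's scores.
        _, p_value = mannwhitneyu(score_samples[cand], score_samples[best],
                                  alternative="greater")
        if p_value < alpha:
            best = cand
    return best
```

Compared with simply caching the mean of 10 evaluations (the averaging baseline quoted above), this kind of test only switches architectures when the score difference is unlikely to be evaluation noise.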