Dextr: Zero-Shot Neural Architecture Search with Singular Value Decomposition and Extrinsic Curvature

Authors: Rohan Asthana, Joschua Conrad, Maurits Ortmanns, Vasileios Belagiannis

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive evaluation includes a total of six experiments, covering the Convolutional Neural Network (CNN) search space, i.e. DARTS, and the Transformer search space, i.e. AutoFormer. The proposed proxy demonstrates superior performance on multiple correlation benchmarks, including NAS-Bench-101, NAS-Bench-201, and TransNAS-Bench-101-micro, as well as on the NAS task within the DARTS and AutoFormer search spaces, all while being notably efficient.
Researcher Affiliation | Academia | Rohan Asthana (EMAIL), Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany; Joschua Conrad (EMAIL), Universität Ulm, Ulm, Germany; Maurits Ortmanns (EMAIL), Universität Ulm, Ulm, Germany; Vasileios Belagiannis (EMAIL), Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Pseudocode | No | The paper describes the methodology using mathematical formulations and textual explanations, especially in Sections 3.2 and 3.3, with further implementation details in Appendix A.6, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/rohanasthana/Dextr.
Open Datasets | Yes | Our extensive evaluation includes a total of six experiments, covering the Convolutional Neural Network (CNN) search space, i.e. DARTS, and the Transformer search space, i.e. AutoFormer. The proposed proxy demonstrates superior performance on multiple correlation benchmarks, including NAS-Bench-101, NAS-Bench-201, and TransNAS-Bench-101-micro, as well as on the NAS task within the DARTS and AutoFormer search spaces, all while being notably efficient. The architecture discovered through our proxy in these search spaces is then evaluated by training it on ImageNet (Deng et al., 2009), and its test performance is compared with baseline approaches.
Dataset Splits | Yes | Our DARTS evaluation follows the standard protocol (Liu et al., 2019; Chen et al., 2021c; Peng et al., 2022) of searching for the optimal architecture on CIFAR-10 and training the found architecture on ImageNet (Deng et al., 2009).
Hardware Specification | Yes | We run all the search procedures on a single NVIDIA RTX A6000 GPU with 48GB memory.
Software Dependencies | No | The paper describes experimental settings and parameters but does not explicitly list software dependencies (such as PyTorch, TensorFlow, or Python versions) with specific version numbers, which are crucial for reproducibility.
Experiment Setup | Yes | The experimental configuration of the search on the DARTS search space (Liu et al., 2019) using the Zero-Cost-PT (Xiang et al., 2023) algorithm is detailed in Table 9. The training on ImageNet for the DARTS search space follows the same settings and protocol as (Lukasik et al., 2022; Asthana et al., 2024; Chen et al., 2021c). Table 9: Experimental settings for search in the DARTS search space through the Zero-Cost-PT algorithm. Settings: Batch Size: 1, Cutout: False, Learning Rate: 0.025, Learning Rate Min: 0.001, Momentum: 0.9, Weight Decay: 3e-4, Grad Clip: 5, Init Channels: 16, Layers: 8, Drop Path Prob: -
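For readers reproducing the search, the Table 9 settings can be sketched as a plain Python configuration dictionary with basic sanity checks. This is an illustrative sketch only: the key names are hypothetical and may not match those used in the paper's repository; the values are taken verbatim from Table 9.

```python
# Sketch of the Table 9 search settings for the DARTS search space
# (Zero-Cost-PT algorithm). Key names are hypothetical; the Dextr
# repository may name these differently.
darts_search_config = {
    "batch_size": 1,
    "cutout": False,
    "learning_rate": 0.025,
    "learning_rate_min": 0.001,
    "momentum": 0.9,
    "weight_decay": 3e-4,
    "grad_clip": 5,
    "init_channels": 16,
    "layers": 8,
    "drop_path_prob": None,  # listed as "-" in Table 9, i.e. not set
}

def validate(cfg):
    """Minimal sanity checks on the hyperparameter ranges."""
    assert cfg["batch_size"] >= 1
    assert 0.0 < cfg["learning_rate_min"] <= cfg["learning_rate"]
    assert 0.0 <= cfg["momentum"] < 1.0
    assert cfg["weight_decay"] > 0.0
    return cfg

validate(darts_search_config)
```

In a typical PyTorch-based DARTS setup, values like `learning_rate`, `momentum`, and `weight_decay` would feed an SGD optimizer with cosine annealing down to `learning_rate_min`, but the paper does not spell out that wiring here.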