Prior Knowledge Guided Neural Architecture Generation

Authors: Jingrong Xie, Han Ji, Yanan Sun

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on new search spaces demonstrate that our method achieves superior accuracy over state-of-the-art methods. For example, we need only 0.004 GPU Days to generate an architecture with 76.1% top-1 accuracy on ImageNet and 97.56% on CIFAR-10. Furthermore, we can find competitive architectures for more unseen search spaces, such as TransNAS-Bench-101 and NATS-Bench, which demonstrates the broad applicability of the proposed method.
Researcher Affiliation | Academia | Department of Computer Science, Sichuan University, Chengdu, China. Correspondence to: Yanan Sun <EMAIL>.
Pseudocode | Yes | Algorithm 1: Overall Framework of PG-NAG
Open Source Code | No | The paper does not provide an explicit statement about releasing its source code, nor does it include a link to a code repository. It refers to various benchmarks and related works but not to its own implementation code.
Open Datasets | Yes | Specifically, we select the top-20 high-performance architectures from each of three popular benchmarks: NAS-Bench-101 (Ying et al., 2019), NAS-Bench-201 (Dong & Yang, 2020b), and NAS-Bench-301 (Zela et al., 2020). For computer vision, we use three widely used validation datasets: CIFAR-10, CIFAR-100, and ImageNet... For the automatic speech recognition task, we validate architectures on the TIMIT dataset. ... For the natural language processing task, we validate architectures on the PTB dataset,
Dataset Splits | Yes | ImageNet-16-120 contains 151,700 training images, 3,000 validation images, and 3,000 test images across 120 classes.
Hardware Specification | Yes | All experiments were done on Linux Ubuntu 18.04, using Nvidia 3090 GPUs.
Software Dependencies | No | The paper notes that experiments were run on "Linux Ubuntu 18.04" (an operating system) but does not give version numbers for any of the software libraries or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA) that would be needed for reproducibility.
Experiment Setup | Yes | In the DARTS search space, we generate a cell and set the number of initial convolutional channels to 36. We optimize the architecture weights using stochastic gradient descent with an initial learning rate of 0.025 and a cosine annealing learning rate schedule.
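The quoted setup (SGD with an initial learning rate of 0.025 decayed by cosine annealing) can be sketched as a schedule computation. This is a minimal illustration, not the paper's implementation: the helper name, the total epoch count, and the assumption that the rate anneals to zero are all illustrative.

```python
import math

def cosine_annealing_lr(initial_lr: float, epoch: int, total_epochs: int) -> float:
    """Cosine-annealed learning rate: decays from initial_lr at epoch 0
    down to 0 at total_epochs, following 0.5 * lr0 * (1 + cos(pi * t / T))."""
    return 0.5 * initial_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))

# With the paper's initial rate of 0.025 (600 epochs is an assumed value):
for epoch in (0, 300, 600):
    print(f"epoch {epoch}: lr = {cosine_annealing_lr(0.025, epoch, 600):.6f}")
```

The schedule starts at the full 0.025, passes through half that value at the midpoint, and reaches zero at the final epoch.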