Prior Knowledge Guided Neural Architecture Generation
Authors: Jingrong Xie, Han Ji, Yanan Sun
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on new search spaces demonstrate that our method achieves superior accuracy over state-of-the-art methods. For example, we only need 0.004 GPU Days to generate an architecture with 76.1% top-1 accuracy on ImageNet and 97.56% on CIFAR-10. Furthermore, we can find competitive architectures for more unseen search spaces, such as TransNAS-Bench-101 and NATS-Bench, which demonstrates the broad applicability of the proposed method. |
| Researcher Affiliation | Academia | Department of Computer Science, Sichuan University, Chengdu, China. Correspondence to: Yanan Sun <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Overall Framework of PG-NAG |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its source code, nor does it include a link to a code repository. It refers to various benchmarks and related works but not its own implementation code. |
| Open Datasets | Yes | Specifically, we select the top-20 high-performance architectures from each of the three popular benchmarks, which are NAS-Bench-101 (Ying et al., 2019), NAS-Bench-201 (Dong & Yang, 2020b), and NAS-Bench-301 (Zela et al., 2020). For computer vision validation datasets, we utilize three widely used datasets: CIFAR-10, CIFAR-100, and ImageNet... For the automatic speech recognition task, we validate architectures on the TIMIT dataset. ... For the natural language processing task, we validate architectures on the PTB dataset. |
| Dataset Splits | Yes | ImageNet-16-120 contains 151,700 training images, 3,000 validation images, and 3,000 test images with 120 classes. |
| Hardware Specification | Yes | All experiments were done on Linux Ubuntu 18.04, using Nvidia 3090 GPUs. |
| Software Dependencies | No | The paper mentions that experiments were run on "Linux Ubuntu 18.04" (an operating system) but does not provide specific version numbers for any ancillary software libraries or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA versions) that would be needed for reproducibility. |
| Experiment Setup | Yes | In the DARTS search space, we generate a cell and set the number of initial convolutional channels to 36. We optimize the architecture weights using stochastic gradient descent with an initial learning rate of 0.025 and a single cosine annealing learning rate schedule. |
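The quoted training setup (SGD, initial learning rate 0.025, a single cosine annealing cycle) can be sketched as below. This is a minimal illustration of the standard single-cycle cosine schedule, not the authors' code; the total epoch count and a minimum learning rate of 0 are assumptions.

```python
import math

def cosine_annealing_lr(epoch: int, total_epochs: int,
                        lr_max: float = 0.025, lr_min: float = 0.0) -> float:
    """Single-cycle cosine-annealed learning rate.

    Starts at lr_max (epoch 0) and decays smoothly to lr_min at
    total_epochs, following lr_min + (lr_max - lr_min)/2 * (1 + cos(pi*t/T)).
    """
    return lr_min + 0.5 * (lr_max - lr_min) * (
        1 + math.cos(math.pi * epoch / total_epochs)
    )

# Schedule endpoints for the quoted initial learning rate of 0.025
# (600-epoch budget is purely illustrative):
lr_start = cosine_annealing_lr(0, 600)    # 0.025
lr_mid = cosine_annealing_lr(300, 600)    # 0.0125
lr_end = cosine_annealing_lr(600, 600)    # 0.0
```

In PyTorch this corresponds to pairing `torch.optim.SGD(lr=0.025)` with `torch.optim.lr_scheduler.CosineAnnealingLR`, which is the common setup for DARTS-style architecture evaluation.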