Design Principle Transfer in Neural Architecture Search via Large Language Models
Authors: Xun Zhou, Xingyu Wu, Liang Feng, Zhichao Lu, Kay Chen Tan
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that LAPT can beat the state-of-the-art TNAS methods on most tasks and achieve comparable performance on others. ... Extensive experiments across various search spaces and tasks demonstrate the effectiveness of LAPT. ... Comparison to NAS methods ... Comparison on NAS201. ... Experiments on NAS201. ... Experiments on Trans101. ... Experiments on DARTs. ... Ablation study |
| Researcher Affiliation | Academia | 1Department of Data Science and Artificial Intelligence, The Hong Kong Polytechnic University; 2College of Computer Science, Chongqing University; 3Department of Computer Science, City University of Hong Kong; EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Framework of the proposed LAPT ... Algorithm 2: Principle Adaption |
| Open Source Code | No | The paper does not provide an explicit statement about releasing code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | This work tests the proposed TNAS method on three search spaces, i.e., NAS-bench-201 (NAS201) (Dong and Yang 2020), Trans NAS-Bench-101 (Trans101) (Duan et al. 2021), and DARTs (Liu, Simonyan, and Yang 2018). ... Experiments on NAS201. ... Experiments on Trans101. ... Experiments on DARTs. |
| Dataset Splits | Yes | We separately search for top-performing architectures (i.e., Top 0.1%) on CIFAR-10 and CIFAR-100 from the refined search space... We search for individual architecture from the refined search space to solve image classification tasks on CIFAR-10 and ImageNet. |
| Hardware Specification | No | The paper mentions 'GPU Days' as a metric for comparison but does not specify the particular GPU models, CPU models, or other hardware specifications used for the experiments. |
| Software Dependencies | No | GPT-4 is used as the pre-trained LLM for design principle learning and adaptation. Related prompts and the effectiveness of different LLMs are shown in https://arxiv.org/abs/2408.11330. (No specific version numbers for GPT-4 or other software packages are provided.) |
| Experiment Setup | Yes | Details of hyperparameters are in Table 1 and implementations are as follows: ... Table 1: Hyper-parameter settings (Learning: # of samples; Adaption: r, # of iterations; population size; # of generations; REA tournament size; crossover probability; mutation probability; Supernet: # of epochs) |