DANCE: Resource-Efficient Neural Architecture Search with Data-Aware and Continuous Adaptation
Authors: Maolin Wang, Tianshuo Wei, Sheng Zhang, Ruocheng Guo, Wanyu Wang, Shanshan Ye, Lixin Zou, Xuetao Wei, Xiangyu Zhao
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across five datasets demonstrate DANCE's effectiveness. Our method consistently outperforms state-of-the-art NAS approaches in terms of accuracy while significantly reducing search costs. Under varying computational constraints, DANCE maintains robust performance while smoothly adapting architectures to different hardware requirements. The code and appendix can be found at https://github.com/Applied-Machine-Learning-Lab/DANCE. |
| Researcher Affiliation | Academia | Maolin Wang1, Tianshuo Wei1, Sheng Zhang1, Ruocheng Guo2, Wanyu Wang1, Shanshan Ye3, Lixin Zou4, Xuetao Wei5, Xiangyu Zhao1. 1City University of Hong Kong, 2Independent Researcher, 3Australian Artificial Intelligence Institute, University of Technology Sydney, 4Wuhan University, 5Southern University of Science and Technology. EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Three-Stage Training Process. Input: Training data D, resource constraints C, initial parameters θ. Output: Optimized SubNet A |
| Open Source Code | Yes | The code and appendix can be found at https://github.com/Applied-Machine-Learning-Lab/DANCE. |
| Open Datasets | Yes | The evaluation framework employs CIFAR-10 [Krizhevsky et al., 2009] and CIFAR-100 [Krizhevsky et al., 2009] as foundational benchmarks to establish baseline architectural distributions. Three challenging fine-grained datasets (Stanford Cars [Kramberger and Potočnik, 2020], CUB-200-2011 [Wah et al., 2011], and Food-101 [Bossard et al., 2014]) are used to rigorously test the precision of dynamic selection mechanisms and architectural sampling under high-resolution visual discrimination requirements. |
| Dataset Splits | No | The paper does not explicitly provide specific percentages, sample counts, or clear references to predefined dataset splits (e.g., train/validation/test) for reproducibility, beyond mentioning the names of the datasets used. |
| Hardware Specification | No | Training is performed on standard GPU hardware with automatic mixed precision (AMP) enabled for efficiency. |
| Software Dependencies | No | All components are implemented using PyTorch and monitored through comprehensive metrics, including accuracy, loss components, and gate statistics. |
| Experiment Setup | Yes | Stage 1 (Pre-training) runs for epochs with frozen SelectGate modules, while Stage 2 (Distribution-Guided Architecture Learning) continues for epochs with all components activated. The learning rates are selected from [0.0001, 0.0005, 0.001] for different components. We employ the AdamW optimizer with OneCycleLR scheduler using a 30% warm-up period and early stopping patience of 15. All components are implemented using PyTorch and monitored through comprehensive metrics, including accuracy, loss components, and gate statistics. The backbone network uses ResNet-18 and VGG-16 architectures with customized SelectGate modules integrated at different layers. |
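The one-cycle schedule with a 30% warm-up described in the setup row can be sketched in plain Python. This is an illustrative approximation, not the paper's implementation: the peak learning rate, floor learning rate, and step count are assumed values, and the annealing phase uses cosine decay as in PyTorch's default OneCycleLR strategy.

```python
import math

def one_cycle_lr(step, total_steps, max_lr=0.001, warmup_frac=0.3, min_lr=1e-6):
    """Return the learning rate at `step` under a one-cycle policy:
    linear warm-up for the first `warmup_frac` of steps, then cosine
    annealing back down for the remainder. All constants are illustrative."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        # Linear warm-up from min_lr to max_lr over the first 30% of training.
        return min_lr + (max_lr - min_lr) * step / warmup_steps
    # Cosine annealing from max_lr back to min_lr over the remaining 70%.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + (max_lr - min_lr) * 0.5 * (1 + math.cos(math.pi * progress))
```

In the paper's setup this schedule would wrap an AdamW optimizer; in PyTorch the equivalent is `torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=..., pct_start=0.3)`.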