Design Principle Transfer in Neural Architecture Search via Large Language Models
Authors: Xun Zhou, Xingyu Wu, Liang Feng, Zhichao Lu, Kay Chen Tan
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that LAPT can beat the state-of-the-art TNAS methods on most tasks and achieve comparable performance on others. ... Extensive experiments across various search spaces and tasks demonstrate the effectiveness of LAPT. ... Comparison to NAS methods ... Comparison on NAS201. ... Experiments on NAS201. ... Experiments on Trans101. ... Experiments on DARTs. ... Ablation study |
| Researcher Affiliation | Academia | 1Department of Data Science and Artificial Intelligence, The Hong Kong Polytechnic University; 2College of Computer Science, Chongqing University; 3Department of Computer Science, City University of Hong Kong; EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Framework of the proposed LAPT ... Algorithm 2: Principle Adaption |
| Open Source Code | No | The paper does not provide an explicit statement about releasing code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | This work tests the proposed TNAS method on three search spaces, i.e., NAS-bench-201 (NAS201) (Dong and Yang 2020), Trans NAS-Bench-101 (Trans101) (Duan et al. 2021), and DARTs (Liu, Simonyan, and Yang 2018). ... Experiments on NAS201. ... Experiments on Trans101. ... Experiments on DARTs. |
| Dataset Splits | Yes | We separately search for top-performing architectures (i.e., Top 0.1%) on CIFAR-10 and CIFAR-100 from the refined search space... We search for individual architecture from the refined search space to solve image classification tasks on CIFAR-10 and ImageNet. |
| Hardware Specification | No | The paper mentions 'GPU Days' as a metric for comparison but does not specify the particular GPU models, CPU models, or other hardware specifications used for the experiments. |
| Software Dependencies | No | GPT-4 is used as the pre-trained LLM for design principle learning and adaptation. Related prompts and the effectiveness of different LLMs are shown in https://arxiv.org/abs/2408.11330. (No specific version numbers for GPT-4 or other software packages are provided.) |
| Experiment Setup | Yes | Details of hyperparameters are in Table 1 and implementations are as follows: ... Table 1: Hyper-parameter settings (Learning: # of samples; Adaption: r, # of iterations; population size; # of generations; REA tournament size; crossover probability; mutation probability; Supernet: # of epochs) |