Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline and validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Multi-Objective Neural Architecture Search by Learning Search Space Partitions
Authors: Yiyang Zhao, Linnan Wang, Tian Guo
JMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our LaMOO algorithm on two types of NAS scenarios. The first type is based on three popular NAS datasets: NAS-Bench-201 (Dong and Yang, 2020), NAS-Bench-301 (Zela et al., 2022), and HW-NAS-Bench (Li et al., 2021). The second type is real-world deep learning domain applications, including image classification, object detection, and language models. |
| Researcher Affiliation | Academia | Yiyang Zhao (Worcester Polytechnic Institute), Linnan Wang (Brown University), Tian Guo (Worcester Polytechnic Institute) |
| Pseudocode | Yes | Algorithm 1: Pseudo-code of LaMOO for the NAS task. 1: Inputs: initial D_0 from uniform sampling, sample budget T. 2: for t = 0, ..., T do 3: L ← {Ω_root} (collection of regions to be split). 4: while L ≠ ∅ do 5: Ω_j ← pop_first_element(L); D_{t,j} ← D_t ∩ Ω_j; n_{t,j} ← |D_{t,j}|. 6: Compute the dominance number o_{t,j} of D_{t,j} using Eqn. 2 and train an SVM model h(·). 7: If (D_{t,j}, o_{t,j}) is splittable by the SVM, then L ← L ∪ Partition(Ω_j, h(·)). 8: end while 9: if Path Selection then 10: for k = root; k is not a leaf node do 11: D_{t,k} ← D_t ∩ Ω_k; v_{t,k} ← HyperVolume(D_{t,k}); n_{t,k} ← |D_{t,k}|. 12: k ← argmax_{c ∈ children(k)} UCB_{t,c}, where UCB_{t,c} := v_{t,c} + 2·C_p·sqrt(2·log(n_{t,k})/n_{t,c}). 13: end for 14: end if 15: if Leaf Selection then 16: for k = root; k is not a leaf node do 17: D_{t,k} ← D_t ∩ Ω_k; n_{t,k} ← |D_{t,k}|. 18: end for 19: end if 20: for each leaf node l do 21: v_{t,l} ← HyperVolume(D_{t,l}). 22: end for 23: k ← argmax_{l ∈ leaf nodes} UCB_{t,l}, where UCB_{t,l} := v_{t,l} + 2·C_p·sqrt(2·log(n_{t,p})/n_{t,l}) and p is the parent of l. 24: D_{t+1} ← D_t ∪ D_new, where D_new is drawn from Ω_k by a sampling algorithm such as qEHVI or CMA-ES. 25: end for |
| Open Source Code | No | The paper does not provide a direct link to a source code repository or an explicit statement about the public release of the code for the methodology described. |
| Open Datasets | Yes | We evaluate our LaMOO algorithm on two types of NAS scenarios. The first type is based on three popular NAS datasets: NAS-Bench-201 (Dong and Yang, 2020), NAS-Bench-301 (Zela et al., 2022), and HW-NAS-Bench (Li et al., 2021). The second type is real-world deep learning domain applications, including image classification, object detection, and language models. |
| Dataset Splits | Yes | NAS-Bench-201 provides information on all architectures in its search space and comprises 15,625 architectures trained to convergence on CIFAR-10 (Krizhevsky, 2009). As such, NAS algorithms can leverage the preexisting information about each architecture's #FLOPs and accuracy as ground truth to avoid time-consuming training during algorithm evaluation. |
| Hardware Specification | Yes | For each architecture in the Pareto frontier, we train it using 8 Tesla V100 GPUs with images of a 224×224 resolution in (accuracy, #FLOPs) two-objective search. |
| Software Dependencies | No | The paper mentions 'TensorRT latency with FP16' but does not specify the version of TensorRT or of any other key software dependencies. |
| Experiment Setup | Yes | Each sampled network is trained for 600 epochs, with a batch size of 128, using a momentum SGD optimizer initiated with a learning rate of 0.025, which is then subject to a cosine learning rate schedule throughout the training period. Weight decay is employed for regularization purposes. |
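The UCB-based leaf selection in the extracted pseudocode above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the field names `hv`, `n`, and `n_parent` and the default `cp` value are hypothetical, and the hypervolume values are taken as precomputed.

```python
import math

def ucb_score(hv, n_node, n_parent, cp=0.1):
    """UCB score of a search-space region: its hypervolume plus an
    exploration bonus that shrinks as the region accumulates samples."""
    return hv + 2 * cp * math.sqrt(2 * math.log(n_parent) / n_node)

def select_leaf(leaves):
    """Pick the leaf region with the highest UCB score.

    Each leaf is a dict with the region's hypervolume `hv`, its sample
    count `n`, and its parent's sample count `n_parent` (hypothetical
    field names for this sketch).
    """
    return max(leaves, key=lambda l: ucb_score(l["hv"], l["n"], l["n_parent"]))

# A sparsely sampled region can win despite a lower hypervolume,
# because its exploration bonus is larger.
leaves = [
    {"hv": 0.5, "n": 10, "n_parent": 30},
    {"hv": 0.4, "n": 2, "n_parent": 30},
]
best = select_leaf(leaves)  # selects the second leaf
```

The `2·C_p·sqrt(2·log(n_parent)/n)` term is the standard MCTS exploration bonus: with a small `C_p` the search exploits high-hypervolume regions, while a larger `C_p` forces more exploration of under-sampled partitions.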