reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Runtime Analysis of Evolutionary NAS for Multiclass Classification

Authors: Zeqiong Lv, Chao Qian, Yun Liu, Jiahao Fan, Yanan Sun

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this work, we step for the runtime analysis, which is an essential theory aspect of EAs, of ENAS upon multiclass classification problems. Specifically, we first propose a benchmark to lay the groundwork for the analysis. Furthermore, we design a two-level search space, making it suitable for multiclass classification problems and consistent with the common settings of ENAS. Based on both designs, we consider (1+1)-ENAS algorithms with one-bit and bit-wise mutations, and analyze their upper and lower bounds on the expected runtime. We prove that the algorithm using both mutations can find the optimum with the expected runtime upper bound of O(r M ln r M) and lower bound of Ω(r M ln M). This suggests that a simple onebit mutation may be greatly considered, given that most state-of-the-art ENAS methods are laboriously designed with the bit-wise mutation. Empirical studies also support our theoretical proof.
Researcher Affiliation	Academia	1College of Computer Science, Sichuan University, China 2National Key Laboratory for Novel Software Technology, and School of Artificial Intelligence, Nanjing University, China. Correspondence to: Yanan Sun <EMAIL>.
Pseudocode	No	The paper describes the steps of the (1+1)-ENAS algorithm in Section 2.2, but they are presented as a numbered list within the paragraph text, not as a formally labeled 'Pseudocode' or 'Algorithm' block with distinct structural formatting.
Open Source Code	No	The paper does not contain any explicit statement about making the source code publicly available, nor does it provide a link to a code repository. Mentions of appendices are for extended experiments and proofs, not for code release.
Open Datasets	No	We propose a multiclass classification benchmark problem MCC that captures essential properties of real-world multiclass classification tasks, including both linearly and nonlinearly divisible decision regions (e.g., half-space region, unbounded/bounded polyhedra region). As defined in Definition 3.1, the input is a 2-dimensional input vector as suggested in the binary classification benchmark problems (Fischer et al., 2023; Lv et al., 2024b); the output label (class) takes on one of M labels. The paper defines its own synthetic benchmark problem (MCC) and how its data points are generated (on a unit circle); it does not use or provide access to any external or pre-existing public datasets.
Dataset Splits	No	The paper does not use traditional datasets with training/test/validation splits. Instead, it defines a synthetic problem (MCC) and varies its parameters (M and r) for experimental evaluation, as described in Section 5: 'We set the problem classes M from 2 to 24, with a step of 2, and the problem parameter r from 2 to 10, with a step of 2.' This is not a conventional dataset split.
Hardware Specification	No	The paper does not explicitly mention any specific hardware used for running the experiments, such as GPU models, CPU types, or cloud computing resources.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers, such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup	Yes	In this section, we investigate the empirical performance of the (1+1)-ENAS algorithm with four different mutation settings for solving MCC. We set the problem classes M from 2 to 24, with a step of 2, and the problem parameter r from 2 to 10, with a step of 2. Figure 3 presents the average number of generations of 1,000 independent runs for finding an optimal solution.