Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment
Authors: Haoyuan Wu, Haisheng Zheng, Yuan Pu, Bei Yu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate MGVGA on various logic synthesis tasks for EDA and show the superior performance of MGVGA compared to previous state-of-the-art methods. |
| Researcher Affiliation | Collaboration | Haoyuan Wu1, Haisheng Zheng2, Yuan Pu1,3, Bei Yu1. 1The Chinese University of Hong Kong; 2Shanghai Artificial Intelligence Laboratory; 3ChatEDA Tech |
| Pseudocode | No | The paper describes methodologies in text and uses figures to illustrate the overall architecture (e.g., Figure 3: Overview of the MGVGA). It does not contain any pseudocode blocks or algorithms. |
| Open Source Code | Yes | Our code is available at https://github.com/wuhy68/MGVGA. |
| Open Datasets | Yes | AIG Collection For MGM. We obtain 27 circuit designs from five circuit benchmarks as our training dataset: MIT LL Labs CEP (Chetwynd et al., 2019), ITC'99 (Davidson, 1999), IWLS'05 (Albrecht, 2005), EPFL (Amarú et al., 2015), and OpenCores (Takeda, 2008). Verilog-AIG Pairs Collection For VGA. In this phase, source Verilog codes (Thakur et al., 2023; Liu et al., 2023) are selected and subjected to logic synthesis using Yosys (Wolf et al., 2013), and then they are converted into AIG format. Evaluation Dataset Collection. As for the evaluation dataset, we select 10 circuit designs external to the training dataset from open-source benchmarks (Chowdhury et al., 2021; Amarú et al., 2015; OpenRISC, 2009; YosysHQ, 2020; Asanovic, 2016) |
| Dataset Splits | Yes | AIG Collection For MGM. We obtain 27 circuit designs from five circuit benchmarks as our training dataset... The resulting AIG dataset comprises 810,000 AIGs and 40,500 synthesis labels... Verilog-AIG Pairs Collection For VGA. This process yields 64,826 Verilog-AIG pairs... Evaluation Dataset Collection. As for the evaluation dataset, we select 10 circuit designs external to the training dataset from open-source benchmarks... Our dataset consists of 10,000 pairs of logic gates to test the identification of logic equivalence. |
| Hardware Specification | Yes | The model is fine-tuned for 3 epochs on 8 A100 GPUs with 80G memory each. |
| Software Dependencies | No | The paper mentions software like Yosys, ABC, DeepGCN, and specific LLM models (gte-Qwen2-7B-instruct, Qwen2-7B) by name and provides citations, but does not specify version numbers for general software dependencies (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The training process employs a linear learning rate schedule with the Adam optimizer set at a learning rate of 1×10^-3, a weight decay of 0.01, and a batch size of 512. The model is fine-tuned for 3 epochs on 8 A100 GPUs with 80G memory each. |
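The reported hyperparameters (linear schedule at a peak learning rate of 1e-3, batch size 512, 3 epochs) can be sketched as a small stand-alone schedule, a minimal sketch assuming the linear schedule decays to zero with no warmup, since the paper does not specify either detail; the function and variable names here are illustrative, not from the paper's code:

```python
import math

PEAK_LR = 1e-3    # reported learning rate
EPOCHS = 3        # reported fine-tuning epochs
BATCH_SIZE = 512  # reported batch size


def linear_lr(step: int, num_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Linearly decay from peak_lr to 0 over num_steps (assumed endpoint)."""
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")
    frac = min(step, num_steps) / num_steps
    return peak_lr * (1.0 - frac)


def total_steps(num_examples: int, batch_size: int = BATCH_SIZE,
                epochs: int = EPOCHS) -> int:
    """Optimizer steps for the reported 3-epoch run at the reported batch size."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs
```

For example, under these assumptions the 64,826 Verilog-AIG pairs would give `total_steps(64826)` = ceil(64826 / 512) × 3 = 381 optimizer steps, with the learning rate falling from 1e-3 at step 0 toward 0 at the final step.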