Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment
Authors: Haoyuan Wu, Haisheng Zheng, Yuan Pu, Bei Yu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate MGVGA on various logic synthesis tasks for EDA and show the superior performance of MGVGA compared to previous state-of-the-art methods. |
| Researcher Affiliation | Collaboration | Haoyuan Wu1, Haisheng Zheng2, Yuan Pu1,3, Bei Yu1. 1The Chinese University of Hong Kong; 2Shanghai Artificial Intelligence Laboratory; 3ChatEDA Tech |
| Pseudocode | No | The paper describes methodologies in text and uses figures to illustrate the overall architecture (e.g., Figure 3: Overview of the MGVGA). It does not contain any pseudocode blocks or algorithms. |
| Open Source Code | Yes | Our code is available at https://github.com/wuhy68/MGVGA. |
| Open Datasets | Yes | AIG Collection For MGM. We obtain 27 circuit designs from five circuit benchmarks as our training dataset: MIT LL Labs CEP (Chetwynd et al., 2019), ITC'99 (Davidson, 1999), IWLS'05 (Albrecht, 2005), EPFL (Amarú et al., 2015), and OpenCores (Takeda, 2008). Verilog-AIG Pairs Collection For VGA. In this phase, source Verilog codes (Thakur et al., 2023; Liu et al., 2023) are selected and subjected to logic synthesis using Yosys (Wolf et al., 2013), and then they are converted into AIG format. Evaluation Dataset Collection. As for the evaluation dataset, we select 10 circuit designs external to the training dataset from open-source benchmarks (Chowdhury et al., 2021; Amarú et al., 2015; OpenRISC, 2009; YosysHQ, 2020; Asanovic, 2016) |
| Dataset Splits | Yes | AIG Collection For MGM. We obtain 27 circuit designs from five circuit benchmarks as our training dataset... The resulting AIG dataset comprises 810,000 AIGs and 40,500 synthesis labels... Verilog-AIG Pairs Collection For VGA. This process yields 64,826 Verilog-AIG pairs... Evaluation Dataset Collection. As for the evaluation dataset, we select 10 circuit designs external to the training dataset from open-source benchmarks... Our dataset consists of 10,000 pairs of logic gates to test the identification of logic equivalence. |
| Hardware Specification | Yes | The model is fine-tuned for 3 epochs on 8 A100 GPUs with 80G memory each. |
| Software Dependencies | No | The paper mentions software like Yosys, ABC, DeepGCN, and specific LLM models (gte-Qwen2-7B-instruct, Qwen2-7B) by name and provides citations, but does not specify version numbers for general software dependencies (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The training process employs a linear learning rate schedule with the Adam optimizer set at a learning rate of 1×10^-3, a weight decay of 0.01, and a batch size of 512. The model is fine-tuned for 3 epochs on 8 A100 GPUs with 80G memory each. |
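The reported hyperparameters (linear schedule at a peak learning rate of 1e-3, batch size 512, 3 epochs) can be sketched as a small stand-alone schedule, a minimal sketch assuming the linear schedule decays to zero with no warmup, since the paper does not specify either detail; the function and variable names here are illustrative, not from the paper's code:

```python
import math

PEAK_LR = 1e-3    # reported learning rate
EPOCHS = 3        # reported fine-tuning epochs
BATCH_SIZE = 512  # reported batch size


def linear_lr(step: int, num_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Linearly decay from peak_lr to 0 over num_steps (assumed endpoint)."""
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")
    frac = min(step, num_steps) / num_steps
    return peak_lr * (1.0 - frac)


def total_steps(num_examples: int, batch_size: int = BATCH_SIZE,
                epochs: int = EPOCHS) -> int:
    """Optimizer steps for the reported 3-epoch run at the reported batch size."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs
```

For example, under these assumptions the 64,826 Verilog-AIG pairs would give `total_steps(64826)` = ceil(64826 / 512) × 3 = 381 optimizer steps, with the learning rate falling from 1e-3 at step 0 toward 0 at the final step.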