Bridging Layout and RTL: Knowledge Distillation based Timing Prediction

Authors: Mingjun Wang, Yihan Wen, Bin Sun, Jianan Mu, Juan Li, Xiaoyi Wang, Jing Justin Ye, Bei Yu, Huawei Li

ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results demonstrate that RTLDistil achieves a significant reduction in RTL-level timing prediction error compared to state-of-the-art prediction models. This framework enables accurate early-stage timing prediction, advancing EDA's left-shift paradigm while maintaining computational efficiency.
Researcher Affiliation Collaboration (1) State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, China; (2) The Chinese University of Hong Kong, HKSAR; (3) University of Chinese Academy of Sciences, China; (4) CASTEST Co., Ltd., China; (5) Beijing University of Technology, China.
Pseudocode No The paper describes the methodology using textual explanations and mathematical formulations (Equations 2a, 2b, 4-7), and diagrams (Figure 3), but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code Yes Our code and dataset will be publicly available at https://github.com/sklp-edalab/RTLDistil.
Open Datasets Yes Our code and dataset will be publicly available at https://github.com/sklp-edalab/RTLDistil. To enable a comprehensive evaluation of scalability and adaptability, we collected 2004 RTL designs with diverse functionalities and complexities sourced from platforms including GitHub, Hugging Face, OpenCores, and RISC-V projects to reflect real-world industrial needs, including small arithmetic blocks, DSP modules, RISC-V subsystems, etc.
Dataset Splits Yes For dataset splits, circuits are split into 80% for training, 10% for validation, and 10% for testing, ensuring a fair evaluation of the model's ability to generalize across different circuits.
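The 80/10/10 circuit-level split described above can be sketched as follows. This is an illustrative sketch only; the function name, seed, and shuffling strategy are assumptions, not the authors' actual pipeline.

```python
import random

def split_circuits(circuit_ids, seed=42):
    """Shuffle circuit IDs and split 80% / 10% / 10% into
    train / validation / test sets (illustrative sketch;
    seed and function name are assumptions)."""
    ids = list(circuit_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)                      # split at circuit granularity
    n = len(ids)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]          # remainder goes to test
    return train, val, test

# Example with the 2004 designs mentioned in the paper
train, val, test = split_circuits(range(2004))
```

Splitting by whole circuit (rather than by node or path) is what supports the claimed generalization test: no part of a test circuit is ever seen during training.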
Hardware Specification Yes Experiments were conducted on 8 NVIDIA A100 GPUs, and models were implemented using PyTorch and PyTorch Geometric (PyG).
Software Dependencies No The paper mentions that "models were implemented using PyTorch and PyTorch Geometric (PyG)" but does not specify version numbers for these software components.
Experiment Setup Yes The optimization employed the Adam optimizer with an initial learning rate of 2×10⁻⁴ and a batch size of 8. Multi-granularity knowledge distillation used grid-searched weights for node-level (α), subgraph-level (β), and global-level (γ) distillation.