Circuit Transformer: A Transformer That Preserves Logical Equivalence

Authors: Xihan Li, Xing Li, Lei Chen, Xing Zhang, Mingxuan Yuan, Jun Wang

ICLR 2025

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experimentally, we trained an 88-million-parameter Circuit Transformer to generate equivalent yet more compact forms of input circuits, outperforming existing neural approaches on both synthetic and real-world benchmarks, without any violation of equivalence constraints." |
| Researcher Affiliation | Collaboration | 1. UCL Centre for Artificial Intelligence; 2. Huawei Noah's Ark Lab |
| Pseudocode | Yes | Algorithm 1: Constrained Sequence Generation with Cutoff Properties... Algorithm 2: The Computation of S_t... Algorithm 3: Circuit Generation with Immediate Equivalent Node Merging... Algorithm 4: Circuit Generation with Circuit Transformer and Monte-Carlo Tree Search... Algorithm 5: Random Generation of a k-Input, l-Output Circuit |
| Open Source Code | Yes | Code: https://github.com/snowkylin/circuit-transformer |
| Open Datasets | Yes | "IWLS FFWs: we transform the IWLS 2023 benchmark (Mishchenko, 2023) into circuits represented by AND and NOT gates by the script suggested in (Mishchenko & Chatterjee, 2022), and extract 1.5 million 8-input, 2-output fanout-free windows (FFWs), a kind of substructure of large circuits (Zhu et al., 2023)." |
| Dataset Splits | Yes | "89% of the data is for training, 1% is for validation and 10% is reserved for testing." |
| Hardware Specification | Yes | "All the Transformer models are trained on the training set sufficiently for 5 epochs on a single NVIDIA GeForce RTX 4090 graphics card for 75 hours. All the experiments are conducted on a workstation with the following specification: CPU: AMD Ryzen 9 7950X Desktop Processor (16 cores, 32 threads); Memory: 192GB (48GB × 4) DDR5 5200MHz; GPU: NVIDIA GeForce RTX 4090 × 2" |
| Software Dependencies | No | The paper states "The implementation is based on (Yu et al., 2020)", which refers to the TensorFlow Model Garden, but does not provide version numbers for TensorFlow or any other libraries or software used. |
| Experiment Setup | Yes | "The embedding width and the size of the feedforward layer are set to 512 and 2048 following (Vaswani et al., 2017), while the number of attention layers is set to 12, leading to 88.2 million total parameters... The vocabulary size is 20... Batch size is set to be 128. The maximum length of the input and output sequence is set to be 200. To evaluate the effectiveness of tree positional encoding (TPE) in Section 4.4, we trained Circuit Transformers with and without TPE. The maximal depth of tree positional embeddings is set to be 32. ... trained ... for 5 epochs." |
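The Pseudocode row lists "Constrained Sequence Generation with Cutoff Properties" (Algorithm 1), whose core idea is that invalid tokens are masked out at every decoding step so the model can never emit a sequence that violates the equivalence constraint. The paper's actual cutoff computation is not reproduced here; the following is a generic, hypothetical sketch of mask-constrained decoding, using balanced parentheses as a toy stand-in for circuit validity (the names `constrained_generate` and `balanced_next` are illustrative, not from the paper):

```python
import random

def constrained_generate(score_fn, valid_next_fn, vocab, max_len, eos, seed=0):
    """Sample a sequence token by token, masking out any token that the
    validity oracle says would make the sequence impossible to complete.
    If the oracle is exact, every finished sequence is guaranteed valid."""
    rng = random.Random(seed)
    seq = []
    while len(seq) < max_len:
        allowed = [t for t in vocab if valid_next_fn(seq, t, max_len)]
        scores = score_fn(seq)
        # Highest score wins; random tiebreak stands in for model sampling.
        tok = max(allowed, key=lambda t: (scores.get(t, 0.0), rng.random()))
        if tok == eos:
            break
        seq.append(tok)
    return seq

# Toy validity oracle: balanced parentheses as a stand-in for the paper's
# circuit-equivalence constraints.
def balanced_next(seq, tok, max_len):
    depth = seq.count("(") - seq.count(")")
    remaining = max_len - len(seq) - 1  # slots left after placing `tok`
    if tok == "(":
        return depth + 1 <= remaining   # must still be closable within budget
    if tok == ")":
        return depth > 0
    return depth == 0                   # eos allowed only when balanced

uninformed_model = lambda seq: {}       # uniform scores: pure random choice
out = constrained_generate(uninformed_model, balanced_next,
                           ["(", ")", "<eos>"], max_len=20, eos="<eos>", seed=1)
```

Because the mask is applied before sampling rather than checked afterwards, validity holds by construction on every run, which mirrors the report's observation that the trained model produced results "without any violation of equivalence constraints."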
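The Dataset Splits row reports an 89%/1%/10% train/validation/test split of the 1.5 million extracted FFWs. The paper does not publish its splitting code; a minimal sketch of such a split (the helper name `split_dataset` and the fixed seed are assumptions, not the authors' procedure) is:

```python
import random

def split_dataset(examples, train=0.89, val=0.01, test=0.10, seed=0):
    """Shuffle and split per the reported 89/1/10 ratio."""
    assert abs(train + val + test - 1.0) < 1e-9
    items = list(examples)
    random.Random(seed).shuffle(items)     # deterministic shuffle for reproducibility
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])       # test set takes the remainder

# 1.5 million FFWs -> roughly 1,335,000 / 15,000 / 150,000 examples
train_set, val_set, test_set = split_dataset(range(1_500_000))
```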
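The Experiment Setup row fixes the main hyperparameters (d_model = 512, d_ff = 2048, 12 attention layers, vocabulary 20, batch size 128, max length 200, TPE depth 32). As a plausibility check, a weight-matrix-only parameter count for a standard encoder-decoder Transformer with 12 encoder and 12 decoder layers lands near the reported 88.2M; note that the "12 encoder + 12 decoder" reading and the config class below are assumptions for illustration, not the authors' code:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CircuitTransformerConfig:
    # Values from the Experiment Setup row of the report.
    d_model: int = 512
    d_ff: int = 2048
    num_layers: int = 12   # assumed: 12 encoder + 12 decoder layers
    vocab_size: int = 20
    max_len: int = 200
    tpe_max_depth: int = 32
    batch_size: int = 128

def approx_params(cfg: CircuitTransformerConfig) -> int:
    """Weight matrices only; biases, layer norms, and embeddings omitted."""
    enc_layer = 4 * cfg.d_model**2 + 2 * cfg.d_model * cfg.d_ff  # self-attn + FFN
    dec_layer = 8 * cfg.d_model**2 + 2 * cfg.d_model * cfg.d_ff  # self- + cross-attn + FFN
    return cfg.num_layers * (enc_layer + dec_layer)

total = approx_params(CircuitTransformerConfig())  # about 88.1M weights
```

The rough count (about 88.1M) is consistent with the paper's 88.2 million total once embeddings and biases are added, which supports the "12 encoder + 12 decoder" interpretation of "12 attention layers."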