reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition

Authors: Aliyah Hsu, Georgia Zhou, Yeshwanth Cherapanamjeri, Yaxuan Huang, Anobel Odisho, Peter Carroll, Bin Yu

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	On three standard circuit evaluation datasets (indirect object identification, greater-than comparisons, and docstring completion), we demonstrate that CD-T outperforms ACDC and EAP by better recovering the manual circuits with an average of 97% ROC AUC under low runtimes. In addition, we provide evidence that faithfulness of CD-T circuits is not due to random chance by showing our circuits are 80% more faithful than random circuits of up to 60% of the original model size. All experiments are conducted on an NVIDIA A100 GPU.
Researcher Affiliation	Academia	Aliyah R. Hsu, Department of EECS, UC Berkeley; Georgia Zhou, Department of EECS, UC Berkeley; Yeshwanth Cherapanamjeri, CSAIL, MIT; Yaxuan Huang, Department of Statistics, UC Berkeley; Anobel Y. Odisho & Peter R. Carroll, Department of Urology, Epidemiology and Biostatistics, UC San Francisco; Bin Yu, Department of Statistics, EECS, Center for Computational Biology, UC Berkeley
Pseudocode	Yes	Our complete algorithm is described in Algorithm 1, presented in the specific case where we have chosen to decompose our source nodes s so that β_s is the activation's deviation from the mean over some distribution.
Open Source Code	Yes	All code for using CD-T and reproducing results is made available on GitHub: https://github.com/adelaidehsu/CD_Circuit
Open Datasets	Yes	Specifically, for evaluation, we use three standard circuit evaluation datasets: indirect object identification (IOI) (Wang et al., 2023), greater-than comparisons (Greater-than) (Hanna et al., 2023), and docstring completion (Docstring) (Heimersheim & Janiak, 2023) (see Appendix A for details).
Dataset Splits	Yes	We identify circuits using 25 IOI samples drawn from mixed templates, and mean ablation is conducted using the corrupted ABC dataset. Another set of 100 IOI samples are used in evaluation. We identify circuits using a random sample of 100 datapoints provided by Hanna et al. (2023)... Another set of 100 samples are used in evaluation. We identify circuits using a 100 datapoints sampling for the dataset provided by Heimersheim & Janiak (2023)... Another set of 100 samples are used in evaluation.
Hardware Specification	Yes	All experiments are conducted on an NVIDIA A100 GPU.
Software Dependencies	No	The paper does not explicitly mention specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup	Yes	For CD-T, we test by varying the percentile of top nodes to extract in every iteration, in the range of [90, 99]. For our mean-ablation, we simply take the mean over the activations over 100 negative datapoints (impossible completions, with the ending year preceding the starting century), and as above, when setting the decomposition at a source node, define the relevant component to be the deviation from the mean activation over this distribution. We identify circuits using 25 IOI samples drawn from mixed templates, and mean ablation is conducted using the corrupted ABC dataset.
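
The two setup choices quoted in the Pseudocode and Experiment Setup rows (taking the relevant component at a source node to be its deviation from a mean activation, and keeping only the top-percentile nodes in each iteration) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which lives in the linked repository. The function names, array shapes, and random scores below are hypothetical, and the sketch does not reproduce how CD-T propagates the decomposition through the transformer.

```python
import numpy as np


def mean_decomposition(activation, reference_activations):
    """Split an activation into (relevant, irrelevant) parts.

    relevant   = deviation from the mean over the reference distribution
    irrelevant = that mean; the two parts sum back to the activation.
    """
    mean = reference_activations.mean(axis=0)
    return activation - mean, mean


def top_percentile_nodes(scores, percentile):
    """Return indices of nodes whose score is at or above the percentile."""
    threshold = np.percentile(scores, percentile)
    return np.flatnonzero(scores >= threshold)


rng = np.random.default_rng(0)

# Hypothetical shapes: 100 reference datapoints, hidden size 8.
reference = rng.normal(size=(100, 8))
activation = rng.normal(size=8)
relevant, irrelevant = mean_decomposition(activation, reference)

# Hypothetical per-node relevance scores; 95 lies in the quoted [90, 99] range.
scores = rng.random(50)
kept = top_percentile_nodes(scores, 95)
```

A higher percentile keeps fewer nodes per iteration, so the quoted range [90, 99] controls how many nodes each iteration adds to the extracted circuit.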