HeadMap: Locating and Enhancing Knowledge Circuits in LLMs
Authors: Xuehao Wang, Liyuan Wang, Binghuai Lin, Yu Zhang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that simply masking a small portion of attention heads in the knowledge circuit can significantly reduce the model's ability to make correct predictions. This suggests that the knowledge flow within the knowledge circuit plays a critical role when the model makes a correct prediction. Extensive experiments conducted on diverse datasets demonstrate the efficiency and efficacy of the proposed method. |
| Researcher Affiliation | Collaboration | Xuehao Wang (1,2), Liyuan Wang (2), Binghuai Lin (2), Yu Zhang (1); 1: Southern University of Science and Technology; 2: Tencent Technology Co., Ltd, China |
| Pseudocode | Yes | Algorithm 1 (Layer-Conditioned Locating Algorithm). Require: number of selected attention heads n per layer, set of selected samples D, model M with L layers and H heads per layer. 1: initialize the selected-head set 𝓗 ← ∅; 2: for l = 1 to L; 3: for h = 1 to H; 4: 𝓗 ← 𝓗 ∪ {(l, h)}; 5: loss(l,h) ← 0; 6: for (X, Y) ∈ D; 7: loss(l,h) ← loss(l,h) + L(M_𝓗, X, Y) (loss when the heads in 𝓗 are masked); 8: end for; 9: 𝓗 ← 𝓗 \ {(l, h)}; 10: end for; 11: for h ∈ arg topK(loss_l, n); 12: 𝓗 ← 𝓗 ∪ {(l, h)} (select the heads with the highest influence on the loss); 13: end for; 14: end for; 15: return 𝓗 |
| Open Source Code | Yes | Our code is available at https://github.com/XuehaoWangFi/HeadMap. |
| Open Datasets | Yes | The commonsense reasoning tasks contain eight datasets, including six question-answering datasets (i.e., BoolQ (Clark et al., 2019), PIQA (Bisk et al., 2020), SIQA (Sap et al., 2019), ARC-e (Clark et al., 2018), ARC-c (Clark et al., 2018), and OBQA (Mihaylov et al., 2018)), a context completion dataset (i.e., HellaSwag (Zellers et al., 2019)), and a fill-in-the-blank dataset (i.e., WinoGrande (Sakaguchi et al., 2021)). |
| Dataset Splits | Yes | Each dataset contains a training and testing set. We use the accuracy metric to measure the performance. ... We select 8 attention heads for each MHA block and use 64 samples in each dataset to locate the knowledge circuit. ... our analysis focuses on samples that LLMs can predict successfully and utilizes such 64 samples with the lowest next token prediction loss on each dataset to conduct experiments. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types) used for running the experiments. It mentions using LLaMA2-7B and LLaMA3-8B models, but not the computational resources for training or inference. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer, but it does not specify version numbers for any key software components or libraries (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | The batch size is set to 16, and the AdamW optimizer is used. We use the linear learning rate scheduler and train all methods for 2 epochs with 100 warmup steps. The learning rate is set to 0.0002 for LoRA and DoRA, while set to 0.001 for HeadMap. We set rank r = 32 for all methods and α = 64 for LoRA and DoRA. We select 8 attention heads for each MHA block and use 64 samples in each dataset to locate the knowledge circuit. |
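The layer-conditioned locating algorithm quoted in the Pseudocode row can be sketched in plain Python. This is a sketch of the greedy selection logic only: `masked_loss` is a hypothetical callback standing in for running the model with the heads in the given set masked out (in the paper, L(M_𝓗, X, Y) is the next-token prediction loss of model M with the heads in 𝓗 zeroed inside their MHA blocks).

```python
def locate_heads(masked_loss, data, num_layers, num_heads, n_per_layer):
    """Greedy, layer-conditioned head selection (sketch of Algorithm 1).

    masked_loss(heads, x, y) -> float is a hypothetical callback that
    returns the model's loss on (x, y) with the attention heads in
    `heads` (a set of (layer, head) pairs) masked out.
    """
    selected = set()  # knowledge-circuit heads found so far
    for layer in range(num_layers):
        # Score every head in this layer, conditioned on heads
        # already selected in earlier layers.
        losses = []
        for head in range(num_heads):
            trial = selected | {(layer, head)}
            losses.append(sum(masked_loss(trial, x, y) for x, y in data))
        # Keep the n heads whose masking raises the loss the most.
        top = sorted(range(num_heads), key=losses.__getitem__,
                     reverse=True)[:n_per_layer]
        selected |= {(layer, h) for h in top}
    return selected
```

With the paper's settings this would be called with `num_layers = L`, `num_heads = H`, `n_per_layer = 8`, and `data` holding the 64 lowest-loss samples per dataset.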
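The learning-rate schedule in the Experiment Setup row (linear scheduler, 100 warmup steps, base rate 0.0002 for LoRA/DoRA or 0.001 for HeadMap) can be written as a pure function. The exact warmup/decay shape is our assumption, following the standard linear-warmup/linear-decay schedule; the paper only names a "linear learning rate scheduler".

```python
def linear_lr(step, total_steps, warmup_steps=100, base_lr=2e-4):
    """Linear warmup for `warmup_steps`, then linear decay to zero.

    base_lr defaults to the LoRA/DoRA setting (2e-4); HeadMap uses 1e-3.
    The warmup-then-decay shape is assumed, not stated in the paper.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0.0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```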