HeadMap: Locating and Enhancing Knowledge Circuits in LLMs
Authors: Xuehao Wang, Liyuan Wang, Binghuai Lin, Yu Zhang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that simply masking a small portion of attention heads in the knowledge circuit can significantly reduce the model's ability to make correct predictions. This suggests that the knowledge flow within the knowledge circuit plays a critical role when the model makes a correct prediction. Extensive experiments conducted on diverse datasets demonstrate the efficiency and efficacy of the proposed method. |
| Researcher Affiliation | Collaboration | Xuehao Wang (1,2), Liyuan Wang (2), Binghuai Lin (2), Yu Zhang (1); 1: Southern University of Science and Technology; 2: Tencent Technology Co., Ltd, China |
| Pseudocode | Yes | Algorithm 1 (Layer-Conditioned Locating Algorithm). Require: number of selected attention heads n per layer, set of selected samples D, model M with L layers and H heads per layer. 1: initialize the selected-head set 𝓗 ← ∅; 2: for l = 1 to L; 3: for h = 1 to H; 4: 𝓗 ← 𝓗 ∪ {(l, h)}; 5: loss(l,h) ← 0; 6: for (X, Y) ∈ D; 7: loss(l,h) ← loss(l,h) + L(M_𝓗, X, Y) (loss when the heads in 𝓗 are masked); 8: end for; 9: 𝓗 ← 𝓗 \ {(l, h)}; 10: end for; 11: for h ∈ arg topK(loss_l, n); 12: 𝓗 ← 𝓗 ∪ {(l, h)} (select the heads with the highest influence on the loss); 13: end for; 14: end for; 15: return 𝓗 |
| Open Source Code | Yes | Our code is available at https://github.com/XuehaoWangFi/HeadMap. |
| Open Datasets | Yes | The commonsense reasoning tasks contain eight datasets, including six question-answering datasets (i.e., BoolQ (Clark et al., 2019), PIQA (Bisk et al., 2020), SIQA (Sap et al., 2019), ARC-e (Clark et al., 2018), ARC-c (Clark et al., 2018), and OBQA (Mihaylov et al., 2018)), a context completion dataset (i.e., HellaSwag (Zellers et al., 2019)), and a fill-in-the-blank dataset (i.e., WinoGrande (Sakaguchi et al., 2021)). |
| Dataset Splits | Yes | Each dataset contains a training and testing set. We use the accuracy metric to measure the performance. ... We select 8 attention heads for each MHA block and use 64 samples in each dataset to locate the knowledge circuit. ... our analysis focuses on samples that LLMs can predict successfully and utilizes such 64 samples with the lowest next token prediction loss on each dataset to conduct experiments. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types) used for running the experiments. It mentions using LLaMA2-7B and LLaMA3-8B models, but not the computational resources for training or inference. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer, but it does not specify version numbers for any key software components or libraries (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | The batch size is set to 16, and the AdamW optimizer is used. We use the linear learning rate scheduler and train all methods for 2 epochs with 100 warmup steps. The learning rate is set to 0.0002 for LoRA and DoRA, while set to 0.001 for HeadMap. We set rank r = 32 for all methods and α = 64 for LoRA and DoRA. We select 8 attention heads for each MHA block and use 64 samples in each dataset to locate the knowledge circuit. |
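The layer-conditioned locating algorithm quoted in the Pseudocode row can be sketched in plain Python. This is a sketch of the greedy selection logic only: `masked_loss` is a hypothetical callback standing in for running the model with the heads in the given set masked out (in the paper, L(M_𝓗, X, Y) is the next-token prediction loss of model M with the heads in 𝓗 zeroed inside their MHA blocks).

```python
def locate_heads(masked_loss, data, num_layers, num_heads, n_per_layer):
    """Greedy, layer-conditioned head selection (sketch of Algorithm 1).

    masked_loss(heads, x, y) -> float is a hypothetical callback that
    returns the model's loss on (x, y) with the attention heads in
    `heads` (a set of (layer, head) pairs) masked out.
    """
    selected = set()  # knowledge-circuit heads found so far
    for layer in range(num_layers):
        # Score every head in this layer, conditioned on heads
        # already selected in earlier layers.
        losses = []
        for head in range(num_heads):
            trial = selected | {(layer, head)}
            losses.append(sum(masked_loss(trial, x, y) for x, y in data))
        # Keep the n heads whose masking raises the loss the most.
        top = sorted(range(num_heads), key=losses.__getitem__,
                     reverse=True)[:n_per_layer]
        selected |= {(layer, h) for h in top}
    return selected
```

With the paper's settings this would be called with `num_layers = L`, `num_heads = H`, `n_per_layer = 8`, and `data` holding the 64 lowest-loss samples per dataset.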
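The learning-rate schedule in the Experiment Setup row (linear scheduler, 100 warmup steps, base rate 0.0002 for LoRA/DoRA or 0.001 for HeadMap) can be written as a pure function. The exact warmup/decay shape is our assumption, following the standard linear-warmup/linear-decay schedule; the paper only names a "linear learning rate scheduler".

```python
def linear_lr(step, total_steps, warmup_steps=100, base_lr=2e-4):
    """Linear warmup for `warmup_steps`, then linear decay to zero.

    base_lr defaults to the LoRA/DoRA setting (2e-4); HeadMap uses 1e-3.
    The warmup-then-decay shape is assumed, not stated in the paper.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0.0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```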