Discovering Influential Neuron Path in Vision Transformers
Authors: Yifan Wang, Yifei Liu, Yingdong Shi, Changming Li, Anqi Pang, Sibei Yang, Jingyi Yu, Kan Ren
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate the superiority of our method in finding the most influential neuron path along which the information flows, over the existing baseline solutions. Additionally, the neuron paths illustrate that vision Transformers exhibit a specific inner working mechanism for processing visual information within the same image category. We further analyze the key effects of these neurons on the image classification task, showcasing that the found neuron paths preserve the model capability on downstream tasks, which may also shed some light on real-world applications like model pruning. The project website including implementation code is available at https://foundation-model-research.github.io/NeuronPath/. ... We have conducted several quantitative and qualitative experiments on the found neuron path, illustrating the significant role it plays and the advantage of our solution in discovering and explaining the critical part of vision Transformer models. |
| Researcher Affiliation | Collaboration | 1ShanghaiTech University, 2Tencent PCG |
| Pseudocode | Yes | Algorithm 1 Layer-progressive Neuron Locating Algorithm ... Algorithm 2 Greedy Search-Based Influence Pattern Algorithm |
| Open Source Code | Yes | The project website including implementation code is available at https://foundation-model-research.github.io/NeuronPath/. |
| Open Datasets | Yes | By applying the method to two types of vision Transformer models using different pretraining paradigms, supervised ViT (ViT-B-16) (Dosovitskiy et al., 2021) and self-supervised Masked Autoencoder (MAE-B-16) (He et al., 2022), we have derived an unexpected discovery, as illustrated in Figure 2. Despite both models being pretrained and finetuned on the same dataset (ImageNet (Deng et al., 2009)) with almost identical model structures |
| Dataset Splits | Yes | For a selected model, we retain the top t ∈ {1, 5, 10, 30, 50} most influential neurons per layer within the neuron paths discovered by our Neuron Path method. Following the statistical procedure outlined in Section 4.3, we identified neuron paths for each category using 80% of the image data, then transfer the results and conduct the pruning experiment on the remaining 20% of the image data, establishing a generalization setting. |
| Hardware Specification | Yes | All the experiments are run on NVIDIA A40 GPUs with batch size equal to 10 and sampling step m equal to 20, using the ImageNet-1k validation set. |
| Software Dependencies | No | The paper does not explicitly mention specific software libraries or frameworks with version numbers for reproducibility. |
| Experiment Setup | Yes | All the experiments are run on NVIDIA A40 GPUs with batch size equal to 10 and sampling step m equal to 20, using the ImageNet-1k validation set. ... For our following experiments, we will mainly utilize 3 ViT settings and 1 MAE setting: ViT-B-16, ViT-B-32, ViT-L-32 and MAE-B-16 as the target models. Details of these models are in Appendix C.1. ... As for the calculation of JAS, we set the sampling step m = 20 in Eq. (3) for the following experiments. |
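The sampling step m = 20 quoted above refers to the paper's Joint Attribution Score (JAS, Eq. (3)); the exact equation is not reproduced in this table, but such path-attribution scores are typically approximated by an m-step Riemann sum along a baseline-to-input interpolation. Below is a minimal, hypothetical sketch of that generic form; the function name `riemann_attribution` and the argument `f` (a gradient oracle for a neuron's activation) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def riemann_attribution(f, x, baseline, m=20):
    """Approximate a path-integrated attribution with m sampling steps.

    Hypothetical sketch: `f(point)` is assumed to return the gradient of a
    neuron's activation w.r.t. the input at `point`; the score is the
    m-step Riemann-sum approximation of the path integral from `baseline`
    to `x`, scaled elementwise by (x - baseline).
    """
    total = np.zeros_like(x, dtype=float)
    for k in range(1, m + 1):
        # Sample the straight-line path between baseline and input at k/m.
        point = baseline + (k / m) * (x - baseline)
        total += f(point)
    return (x - baseline) * total / m
```

With a linear gradient field (e.g. `f = lambda p: 2 * p`, the gradient of a quadratic activation) the m-step sum converges to the exact path integral as m grows, which is why a modest m such as 20 can suffice in practice.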
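The pruning experiment quoted under Dataset Splits retains only the top t most influential neurons per layer. A minimal sketch of that selection step, assuming per-layer influence scores are already available as arrays (the name `retain_top_t` and the dict-of-scores layout are illustrative, not the paper's code):

```python
import numpy as np

def retain_top_t(scores, t):
    """Build a boolean retention mask keeping the t highest-scoring
    neurons in each layer.

    Hypothetical sketch: `scores` maps a layer index to a 1-D array of
    per-neuron influence scores; the returned masks mark which neurons
    survive pruning (True = retained).
    """
    masks = {}
    for layer, s in scores.items():
        keep = np.argsort(s)[::-1][:t]      # indices of the t largest scores
        mask = np.zeros_like(s, dtype=bool)  # default: prune everything
        mask[keep] = True                    # retain only the top-t neurons
        masks[layer] = mask
    return masks
```

In the paper's generalization setting, such masks would be estimated on the 80% split and then applied unchanged when evaluating on the held-out 20%.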