Beyond Intuition: Rethinking Token Attributions inside Transformers

Authors: Jiamin Chen, Xuhong Li, Lei Yu, Dejing Dou, Haoyi Xiong

TMLR 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method is further validated qualitatively and quantitatively through the faithfulness evaluations across different settings: single modality (BERT and ViT) and bi-modality (CLIP), different model sizes (ViT-L) and different pooling strategies (ViT-MAE) to demonstrate the broad applicability and clear improvements over existing methods. 4 Experiments. We validate our proposed explanation method by comparing the results with several strong baselines. The experiment settings are based on two aspects: different modalities and different model versions. The experimental results show the clear advantages and wide applicability of our methods over the others in explaining Transformers. 4.1 Experimental Settings. Faithfulness Evaluation. Following previous works (Abnar & Zuidema, 2020; Chefer et al., 2021a;b; Samek et al., 2017; Vu et al., 2019; DeYoung et al., 2020), we prepare three types of tests for the trustworthiness evaluation: 1) Perturbation Tests... 2) Segmentation Tests... 3) Language Reasoning... 4.4 Ablation Study. We propose two ablation studies... |
| Researcher Affiliation | Collaboration | Jiamin Chen, EMAIL, Beihang University & Baidu Inc.; Xuhong Li, EMAIL, Baidu Inc.; Lei Yu, EMAIL, Beihang University & Beihang Hangzhou Innovation Institute Yuhang; Dejing Dou, EMAIL, Baidu Inc.; Haoyi Xiong, EMAIL, Baidu Inc. |
| Pseudocode | No | The paper describes its methodology using mathematical derivations and textual descriptions in Section 3, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code available at https://github.com/jiaminchen-1031/transformerinterp and InterpretDL (Li et al., 2022) as well. |
| Open Datasets | Yes | Language Reasoning comes from a NLP benchmark ERASER (DeYoung et al., 2020) for rationales extraction... We select randomly 5 images per class (5k in total) from the ImageNet validation set for the perturbation tests, and the dataset of ImageNet-Segmentation (Guillaumin et al., 2014) for the segmentation tests... Movie Reviews Dataset (DeYoung et al., 2020)... 20 Newsgroups Dataset (Lang, 1995). |
| Dataset Splits | Yes | We select randomly 5 images per class (5k in total) from the ImageNet validation set for the perturbation tests, and the dataset of ImageNet-Segmentation (Guillaumin et al., 2014) for the segmentation tests. ... We finetune a BERT-base model on its training data, with the accuracy reaching 93% on testing set. We randomly select 3000 documents from the testing set for the perturbation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'InterpretDL (Li et al., 2022)' as a tool used, but it does not specify version numbers for any software components (e.g., programming languages, libraries, frameworks) crucial for reproducibility. |
| Experiment Setup | No | The paper describes various experimental settings, such as different modalities (BERT, ViT, CLIP) and evaluation tests (Perturbation, Segmentation, Language Reasoning). However, it does not explicitly provide concrete hyperparameter values or detailed training configurations (e.g., learning rate, batch size, number of epochs, optimizer settings) for the models or baselines used in the experiments. |
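
The quoted selection of "5 images per class (5k in total)" from the ImageNet validation set can be illustrated with a small per-class sampling sketch. This is not the authors' code: the function name `sample_per_class`, the fixed seed, and the toy label list are assumptions made for illustration, since the summary above does not state how the random selection was implemented or seeded.

```python
import random
from collections import defaultdict


def sample_per_class(labels, per_class=5, seed=0):
    """Return sorted dataset indices with `per_class` random picks per label.

    Hypothetical helper: the seed and the sampling routine are assumptions,
    not details reported in the paper.
    """
    rng = random.Random(seed)  # local RNG so the draw is repeatable
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        # random.sample raises ValueError if a class has fewer than per_class items
        chosen.extend(rng.sample(by_class[label], per_class))
    return sorted(chosen)


# Toy stand-in for the ImageNet validation labels: 1,000 classes, 50 images each.
labels = [i % 1000 for i in range(50_000)]
subset = sample_per_class(labels)
print(len(subset))  # 5000
```

Fixing the seed in a local `random.Random` instance keeps the selected subset identical across runs, which is one way a reader could pin down an otherwise unspecified random split.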