Unlearning-based Neural Interpretations
Authors: Ching Lam Choi, Alexandre Duplessis, Serge Belongie
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically verify this local smoothing effect by measuring the normal curvature of the model function before and after unlearning; we also demonstrate that unlearning makes attributions resistant to perturbative attacks. Our contributions can be summarised as follows: ... We empirically show that present reliance on static baselines imposes undesirable post-hoc biases... We visually, numerically and formally establish the utility of UNI as a means to compute robust, meaningful and debiased image attributions. ... We experiment on ImageNet-1K (Deng et al., 2009), ImageNet-C (Hendrycks & Dietterich, 2019) and compare against various path-based and gradient-based attribution methods. ... We report MuFidelity scores (Bhatt et al., 2021)... We evaluate with a step size of 10% and average over 10,000 random image samples... We report robustness results using 2 distance measures: Spearman correlation coefficient in Table 5 and top-k pixel intersection score in Table 6, pre and post attack. |
| Researcher Affiliation | Academia | Ching Lam Choi CSAIL, Department of EECS Massachusetts Institute of Technology EMAIL Alexandre Duplessis Department of Computer Science University of Oxford EMAIL Serge Belongie Pioneer Centre for AI University of Copenhagen EMAIL |
| Pseudocode | Yes | Algorithm 1 UNI: unlearning direction, baseline matching and path-attribution |
| Open Source Code | No | The paper does not explicitly state that source code for its methodology is released or provide a link to a repository. It refers to 'open source exemplars (Fel et al., 2022a)' but this refers to related work, not their own implementation. |
| Open Datasets | Yes | We experiment on ImageNet-1K (Deng et al., 2009), ImageNet-C (Hendrycks & Dietterich, 2019) |
| Dataset Splits | No | The paper mentions evaluating with '10,000 random image samples' and describes how pixels are removed or inserted for inference, which is an evaluation sampling strategy. It references ImageNet-1K and ImageNet-C, which are standard datasets, but does not explicitly state which train/test/validation splits were used for the experiments performed in this paper, nor does it cite a specific split methodology used for its own evaluation. |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware (e.g., CPU, GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'pre-trained computer vision backbone models (Paszke et al., 2019)', where Paszke et al., 2019 refers to PyTorch, but it does not specify any version numbers for PyTorch or other software dependencies used in their implementation. |
| Experiment Setup | Yes | Unless otherwise specified, we use the following hyperparameters: unlearning step size η = 1; l2 PGD with T = 10 steps, a budget of ε = 0.25, step size µ = 0.1; Riemann approximation with B = 15 steps. |
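To make the reported hyperparameters concrete, the sketch below shows the two numerical ingredients the setup row names: a Riemann approximation of a straight-line path attribution (B = 15 steps) and an l2-constrained PGD perturbation (T = 10, ε = 0.25, µ = 0.1). This is a minimal NumPy illustration on a toy quadratic model with a known analytic gradient, not the paper's UNI implementation; the function names `riemann_path_attribution` and `l2_pgd` are ours, and the unlearning direction and baseline matching from Algorithm 1 are not modelled here.

```python
import numpy as np

# Toy differentiable "model": f(x) = sum(x**2), with analytic gradient 2*x.
# A real attribution pipeline would backpropagate through a vision backbone.
def f(x):
    return np.sum(x ** 2)

def grad_f(x):
    return 2.0 * x

def riemann_path_attribution(x, baseline, steps=15):
    """Midpoint-rule Riemann approximation of a straight-line path integral
    (integrated-gradients style): (x - x') * mean_k grad f(x' + a_k (x - x'))."""
    delta = x - baseline
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.mean([grad_f(baseline + a * delta) for a in alphas], axis=0)
    return delta * grads

def l2_pgd(x, steps=10, eps=0.25, mu=0.1):
    """Sketch of l2-constrained PGD ascent on f: take a normalised gradient
    step of size mu, then project back onto the eps-ball around x."""
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_f(x_adv)
        x_adv = x_adv + mu * g / (np.linalg.norm(g) + 1e-12)
        diff = x_adv - x
        d_norm = np.linalg.norm(diff)
        if d_norm > eps:
            x_adv = x + diff * (eps / d_norm)  # l2 projection
    return x_adv

x = np.array([0.5, -0.3, 0.8])
baseline = np.zeros_like(x)
attr = riemann_path_attribution(x, baseline, steps=15)
# Completeness check: for this quadratic model the midpoint rule is exact,
# so attributions sum to f(x) - f(baseline).
assert np.isclose(attr.sum(), f(x) - f(baseline))
x_adv = l2_pgd(x)
assert np.linalg.norm(x_adv - x) <= 0.25 + 1e-9  # perturbation stays in budget
```

On the stated budget, the projection step guarantees the final perturbation satisfies ||x_adv − x||₂ ≤ ε regardless of how many gradient steps are taken.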