Model Guidance via Robust Feature Attribution

Authors: Mihnea Ghitu, Vihari Piratla, Matthew Robert Wicker

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Across a comprehensive series of experiments, we show that our approach consistently reduces test-time misclassifications by 20% compared to state-of-the-art methods. We also extend prior experimental settings to include natural language processing tasks. Additionally, we conduct novel ablations that yield practical insights, including the relative importance of annotation quality over quantity."
Researcher Affiliation | Academia | Mihnea Ghitu (EMAIL), Imperial College London; Vihari Piratla (EMAIL), University of Cambridge; Matthew Wicker (EMAIL), Imperial College London
Pseudocode | No | The paper gives mathematical formulations and high-level descriptions of Rand-R4, Adv-R4, and Cert-R4, but it does not include a structured pseudocode block or algorithm environment.
Open Source Code | Yes | "Code for our method and experiments is available at: https://github.com/Mihneaghitu/Model-Guidance-Via-Robust-Feature-Attribution"
Open Datasets | Yes | "We conduct experiments on six datasets... The datasets include three synthetic ones and three real-world datasets: Decoy MNIST (Ross et al., 2017), Decoy DERM (a variant of Derm MNIST (Yang et al., 2023) created similarly to Decoy MNIST), Decoy IMDB (a text dataset (Maas et al., 2011) created by mimicking Decoy MNIST in a discrete space), ISIC (Codella et al., 2019), Plant Phenotyping, and Salient ImageNet (Singla et al., 2022)."
Dataset Splits | Yes | "The IMDB dataset (Maas et al., 2011) is made up of 50000 IMDB movie text reviews, half of which (25000) are in the train set and the other half (25000) in the test set, and induces a binary classification sentiment analysis task."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper implies a Python/PyTorch implementation (model architectures use `torch.nn.ReLU()`), but it does not give version numbers for Python, PyTorch, or any other library or software dependency.
Experiment Setup | No | The paper mentions the Adam optimizer and discusses hyperparameters such as \lambda and \beta in the loss function, as well as varying weight-decay coefficients. However, it does not report concrete values for critical setup details such as learning rates, batch sizes, number of training epochs, or the exact \lambda, \beta, and weight-decay values used for the main results.
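Since the paper reports no pseudocode for Rand-R4, Adv-R4, or Cert-R4, it may help to sketch the general model-guidance idea they build on: penalising feature attributions on input regions annotated as irrelevant, in the style of Ross et al. (2017), which the paper cites for Decoy MNIST. For a linear model the input gradient equals the weight vector, so the attribution penalty reduces to a masked L2 penalty on the weights. Everything below (function names, hyperparameters, toy data) is an illustrative assumption, not taken from the paper.

```python
import numpy as np

def train_guided_logreg(X, y, mask, lam=10.0, lr=0.1, epochs=500):
    """Logistic regression with a 'right for the right reasons'-style
    attribution penalty (after Ross et al., 2017). For a linear model the
    input gradient is the weight vector, so penalising attributions on
    features flagged by `mask` is an L2 penalty on those weights.
    Hyperparameters are illustrative, not the paper's."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))      # sigmoid predictions
        g_w = X.T @ (p - y) / n + lam * (mask * w)  # data grad + guidance penalty
        g_b = np.mean(p - y)
        w -= lr * g_w
        b -= lr * g_b
    return w, b

# Toy data: feature 0 is a spurious 'decoy' perfectly predictive of y;
# feature 1 carries the genuine (noisy) signal.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200).astype(float)
X = np.stack([2 * y - 1, (2 * y - 1) + 0.5 * rng.standard_normal(200)], axis=1)
mask = np.array([1.0, 0.0])  # annotate the decoy feature as irrelevant
w, b = train_guided_logreg(X, y, mask)
```

With the penalty active, the learned weight on the decoy feature is driven toward zero while the genuine feature's weight grows, even though the decoy is the easier predictor; the paper's R4 variants differ in how robustly this attribution constraint is enforced.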
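The decoy datasets listed above follow a common recipe from Ross et al. (2017): inject a small confounding swatch whose appearance tracks the label at training time but is randomised at test time. A minimal sketch of that construction is below; the exact swatch size, placement, and intensity rule are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def add_decoy(images, labels, train=True, swatch=4, rng=None):
    """Sketch of a Decoy MNIST-style confounder (after Ross et al., 2017):
    a small corner swatch whose grey level tracks the label during training
    but is random at test time. Swatch size/placement/intensity here are
    illustrative assumptions."""
    rng = rng or np.random.default_rng(0)
    out = images.copy()
    n = len(images)
    # Grey level: label-dependent in train, randomised in test.
    shade = labels if train else rng.integers(0, 10, n)
    # Place the swatch in one of the four corners per image.
    corners = rng.integers(0, 4, n)
    for i in range(n):
        v = 255 - 25 * shade[i]
        r = 0 if corners[i] < 2 else images.shape[1] - swatch
        c = 0 if corners[i] % 2 == 0 else images.shape[2] - swatch
        out[i, r:r + swatch, c:c + swatch] = v
    return out
```

A model trained naively on such data can latch onto the swatch and fail once its correlation with the label is broken at test time, which is exactly the failure mode that attribution-guidance methods are evaluated against.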