A Versatile Influence Function for Data Attribution with Non-Decomposable Loss

Authors: Junwei Deng, Weijing Tang, Jiaqi W. Ma

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of VIF across three examples: Cox regression for survival analysis, node embedding for network analysis, and listwise learning-to-rank for information retrieval. In all cases, the influence estimated by VIF closely resembles the results obtained by brute-force leave-one-out retraining, while being up to 10^3 times faster to compute.
Researcher Affiliation | Academia | 1 University of Illinois Urbana-Champaign; 2 Carnegie Mellon University. Correspondence to: Jiaqi W. Ma <EMAIL>.
Pseudocode | No | The paper describes methods mathematically and textually but does not contain a clearly labeled pseudocode or algorithm block.
Open Source Code | Yes | The code is publicly available at https://github.com/TRAIS-Lab/Versatile-Influence-Function.
Open Datasets | Yes | For Cox regression, we use the METABRIC and SUPPORT datasets (Katzman et al., 2018). For node embedding, we use Zachary's Karate network (Zachary, 1977) and train a DeepWalk model (Perozzi et al., 2014). For listwise learning-to-rank, we use the Delicious (Tsoumakas et al., 2008) and Mediamill (Snoek et al., 2006) datasets.
Dataset Splits | Yes | For Cox regression, both the METABRIC and SUPPORT datasets are split into training, validation, and test sets with a 6:2:2 ratio. The training objects and test objects are defined as the full training and test sets. For node embedding, the test objects are all valid pairs of nodes, i.e., 34 × 34 = 1156 objects, while the training objects are the 34 individual nodes. For listwise learning-to-rank, we sample 500 test samples from the pre-defined test set as the test objects.
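The 6:2:2 split described above can be sketched as a simple shuffled index partition. This is a minimal illustration, not the paper's code; the function name, seed, and sample count are illustrative assumptions.

```python
import numpy as np

def split_indices(n, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle n sample indices and cut them into train/val/test by ratio.

    Illustrative sketch of a 6:2:2 split; not taken from the paper's code.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Hypothetical dataset size of 1,000 samples for demonstration.
train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 600 200 200
```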
Hardware Specification | Yes | All runtime measurements were recorded using an Intel(R) Xeon(R) Gold 6338 CPU.
Software Dependencies | No | The paper states that the code is publicly available, which implies dependencies would be listed in the repository, but it does not specify software versions in the paper's text.
Experiment Setup | Yes | For Cox regression, we train a Cox PH model... The model is optimized using the Adam optimizer with a learning rate of 0.01. We train the model for 200 epochs on the METABRIC dataset and 100 epochs on the SUPPORT dataset. For node embedding, we sample 1,000 walks per node, each with a length of 6, and set the window size to 3. The dimension of the node embedding is set to 2. For listwise learning-to-rank, the model is optimized using the Adam optimizer with a learning rate of 0.001, weight decay of 5e-4, and a batch size of 128 for 100 epochs on both the Mediamill and Delicious datasets. We also use Truncated SVD to reduce the feature dimension to 8.
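The hyperparameters reported above can be collected into explicit configuration dictionaries, which makes the three setups easy to compare at a glance. The values below come directly from the paper's text; the variable and key names are illustrative assumptions, not from the authors' code.

```python
# Cox regression: Adam, lr 0.01, 200 epochs (METABRIC) / 100 epochs (SUPPORT).
cox_config = {
    "optimizer": "Adam",
    "lr": 0.01,
    "epochs": {"METABRIC": 200, "SUPPORT": 100},
}

# Node embedding (DeepWalk on the Karate network).
deepwalk_config = {
    "walks_per_node": 1000,
    "walk_length": 6,
    "window_size": 3,
    "embedding_dim": 2,
}

# Listwise learning-to-rank on Mediamill and Delicious.
ltr_config = {
    "optimizer": "Adam",
    "lr": 0.001,
    "weight_decay": 5e-4,
    "batch_size": 128,
    "epochs": 100,
    "feature_dim": 8,  # features reduced to 8 dimensions via Truncated SVD
}
```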