Shapley-Guided Utility Learning for Effective Graph Inference Data Valuation
Authors: Hongliang Chi, Qiong Wu, Zhengyi Zhou, Yao Ma
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on diverse graph datasets demonstrate that SGUL consistently outperforms existing baselines in both inductive and transductive settings. SGUL offers an effective, efficient, and interpretable approach for quantifying the value of test-time neighbors. |
| Researcher Affiliation | Collaboration | Hongliang Chi1, Qiong Wu2, Zhengyi Zhou2, Yao Ma1 1Rensselaer Polytechnic Institute, Troy, NY, United States 2AT&T Chief Data Office, Bedminster, NJ, United States |
| Pseudocode | Yes | Algorithm 1 Shapley-Guided Utility Learning (SGUL) Algorithm 2 Test-time Structure Value Estimation Algorithm 3 Node Dropping Evaluation Protocol Algorithm 4 Precedence-Constrained Permutation Sampling |
| Open Source Code | Yes | Code is released at https://github.com/frankhlchi/infer_data_valuation. |
| Open Datasets | Yes | To evaluate the effectiveness of our proposed Shapley-Guided Utility Learning (SGUL) framework, we conducted extensive experiments on seven diverse real-world graph datasets. These datasets include Cora, Citeseer, Pubmed (Sen et al., 2008), Coauthor-CS, Coauthor-Physics (Shchur et al., 2018), Roman-empire, and Amazon-ratings (Platonov et al., 2023). |
| Dataset Splits | Yes | In this setting, we partition each dataset into three distinct graph structures: Training Graph, Validation Graph, and Testing Graph. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. It mentions 'Time (s)' and 'Memory (MB)' in Table 1 for efficiency analysis, but these are metrics, not hardware specifications. |
| Software Dependencies | No | The implementation used PyTorch with a learning rate of 0.001 for both methods. PyTorch is named, but no version number is specified. |
| Experiment Setup | No | The paper describes the general experimental setup in Appendix E.2, including the types of GNN models and settings (inductive/transductive), but it does not provide specific hyperparameter values like learning rate for the GNN models, batch size, or number of epochs. It mentions 'learning rate of 0.001' and '1000 epochs' in the context of the fitting process for SGUL-Shapley and SGUL-Accuracy, not for the GNN model training itself. |
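The pseudocode row above lists a permutation-sampling routine (Algorithm 4) for estimating Shapley values of test-time neighbors. As a point of reference only, here is a minimal generic sketch of Monte Carlo permutation-based Shapley estimation with a toy additive utility — the function names and the utility are hypothetical illustrations, not the paper's SGUL implementation (which learns the utility and imposes precedence constraints):

```python
import random

def shapley_permutation_sampling(players, utility, num_samples=200, seed=0):
    """Estimate Shapley values by averaging each player's marginal
    contribution over randomly sampled permutations."""
    rng = random.Random(seed)
    values = {p: 0.0 for p in players}
    for _ in range(num_samples):
        perm = list(players)
        rng.shuffle(perm)
        coalition = set()
        prev = utility(coalition)
        for p in perm:
            coalition.add(p)
            cur = utility(coalition)
            values[p] += cur - prev  # marginal contribution of p
            prev = cur
    return {p: v / num_samples for p, v in values.items()}

# Toy example: three "neighbors" whose utility is purely additive,
# so each Shapley value equals the neighbor's own weight.
weights = {"A": 0.5, "B": 0.3, "C": 0.2}
vals = shapley_permutation_sampling(
    list(weights), lambda coal: sum(weights[p] for p in coal)
)
```

With an additive utility the estimate is exact for every sampled permutation; in the non-additive case (as with GNN inference accuracy over neighbor subsets), the estimate converges to the true Shapley values as the number of sampled permutations grows.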