DiffIM: Differentiable Influence Minimization with Surrogate Modeling and Continuous Relaxation
Authors: Junghun Lee, Hyunju Kim, Fanchen Bu, Jihoon Ko, Kijung Shin
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments on real-world graphs, we show that each proposed scheme significantly improves speed with little (or even no) IMIN performance degradation. Our method is Pareto-optimal (i.e., no baseline is both faster and more effective than it) and typically several orders of magnitude (specifically, up to 15,160×) faster than the most effective baseline, while being more effective. |
| Researcher Affiliation | Academia | KAIST, Daejeon, South Korea |
| Pseudocode | Yes | Algorithm 1: DIFFIM / DIFFIM+/ DIFFIM++ |
| Open Source Code | Yes | Code, datasets and online appendix https://github.com/junghunl/DiffIM |
| Open Datasets | Yes | We used three real-world social-network datasets from Ko et al. (2020): WC, CL, and ET, consisting of interaction logs, e.g., retweets among users (Sabottke, Suciu, and Dumitras 2015). Such interactions naturally represent people's impact on others, i.e., influence. Their basic statistics are provided in Table 2. Each dataset was split into training and test graphs based on a time threshold t_th: edges before t_th were used for training, and those after were used for testing. |
| Dataset Splits | Yes | Each dataset was split into training and test graphs based on a time threshold t_th: edges before t_th were used for training, and those after were used for testing. For each training graph, we generated 1,000 random seed sets, using 800 for training and 200 for validation. For each test graph, we generated 50 random seed sets and reported average performance, using 10,000 Monte Carlo simulations as the ground-truth influence, following the settings of Kempe, Kleinberg, and Tardos (2003). |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models) are provided in the main text. The paper mentions "See Appendix F for more details of the experimental settings, e.g., hardware information." but Appendix F is not provided in the given text. |
| Software Dependencies | No | The paper mentions using "graph convolutional networks (GCNs)" and references "PyTorch for Deep Learning" in the acknowledgements, but it does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | We consistently used a graph convolutional network (GCN) with six layers and a final fully connected layer. For DIFFIM+, the probabilistic decisions r were optimized (see Alg. 1) in n_ep = 100 epochs for each removal with α = 0.1 and β = 1. ... The probabilistic decisions r were normalized by the sigmoid function σ_sigmoid(x) = 1/(1 + e^(−x)) ∈ [0, 1]. We used initial probabilistic decisions r(e) = b/\|E\| for all e ∈ E, which makes the loss L_budget regarding the budget constraint equal to 0 (see Sec. 5.3). |
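The evaluation protocol in the Dataset Splits row (temporal train/test split, random seed sets, Monte Carlo ground-truth influence) can be sketched as follows. This is a minimal illustration, not the paper's code: the Independent Cascade model with a uniform activation probability `p` is an assumption (the excerpt does not fully specify the diffusion model or its parameters), and all function names are hypothetical.

```python
import random

def temporal_split(edges, t_th):
    """Split timestamped edges (u, v, t) into train/test by threshold t_th."""
    train = [(u, v) for u, v, t in edges if t < t_th]
    test = [(u, v) for u, v, t in edges if t >= t_th]
    return train, test

def ic_spread(adj, seeds, p, rng):
    """One Monte Carlo run of the (assumed) Independent Cascade model."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj.get(u, []):
                # each edge gets one independent activation attempt
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def mc_influence(adj, seeds, p=0.1, n_sims=10000, seed=0):
    """Average spread over n_sims simulations, used as ground-truth influence."""
    rng = random.Random(seed)
    return sum(ic_spread(adj, seeds, p, rng) for _ in range(n_sims)) / n_sims
```

With `n_sims=10000`, as in the paper's setting, the average is a standard Monte Carlo estimate of the expected spread; smaller values of `n_sims` trade accuracy for speed.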
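The initialization described in the Experiment Setup row (sigmoid-normalized probabilistic decisions whose initial value zeroes the budget loss) can be illustrated with a short sketch. It assumes the initial value r(e) = b/|E| and a budget loss that penalizes the gap between sum(r) and the budget b; `init_decisions` and these exact forms are illustrative, not taken verbatim from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_decisions(num_edges, budget):
    """Raw scores x chosen so that sigmoid(x) = budget / num_edges for every edge."""
    p = budget / num_edges
    x = math.log(p / (1.0 - p))  # inverse sigmoid (logit)
    return [x] * num_edges

xs = init_decisions(num_edges=1000, budget=10)
r = [sigmoid(x) for x in xs]
# sum(r) equals the budget, so a budget loss such as (sum(r) - budget)**2
# starts at (numerically) zero, matching the reported property.
```

Starting at zero budget loss means the early gradient steps are driven entirely by the influence objective rather than by constraint repair, which is presumably why this initialization is used.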