Generalizing Weisfeiler-Lehman Kernels to Subgraphs

Authors: Dongkwan Kim, Alice Oh

ICLR 2025

Reproducibility assessment (variable: result, followed by the supporting excerpt):
Research Type: Experimental. "In experiments on eight real-world and synthetic benchmarks, WLKS significantly outperforms leading approaches on five datasets while reducing training time to 0.01x-0.25x of the state-of-the-art."
Researcher Affiliation: Academia. "Dongkwan Kim & Alice Oh, KAIST, Republic of Korea, EMAIL, EMAIL"
Pseudocode: Yes. "Algorithm 1: 1-WL Algorithm; Algorithm 2: WLKS Algorithm: 1-WL for subgraphs with their k-hop neighborhoods"
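For context, the 1-WL color refinement that Algorithm 1 refers to can be sketched as follows. This is a minimal illustrative implementation of the standard 1-WL procedure, not the authors' released code; function and variable names are our own.

```python
# Sketch of 1-WL color refinement: nodes start with a uniform color, and
# each iteration rehashes every node's color together with the multiset
# of its neighbors' colors, then compresses signatures to fresh color ids.
from collections import Counter

def wl_refine(adj, num_iters=3):
    """adj: dict mapping each node to a list of its neighbors."""
    colors = {v: 0 for v in adj}  # uniform initial coloring
    for _ in range(num_iters):
        # Signature = (own color, sorted multiset of neighbor colors).
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Compress distinct signatures into fresh integer color ids.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return colors

def wl_histogram(colors):
    """Color histogram: the feature vector a WL kernel compares."""
    return Counter(colors.values())
```

On a triangle, every node keeps the same color at every iteration, while on a 3-node path the middle node separates from the endpoints after one round; the kernel value between two graphs is then a (weighted) inner product of their histograms.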
Open Source Code: Yes. "We make our code available for future research." (https://github.com/dongkwan-kim/WLKS)
Open Datasets: Yes. "We employ four real-world datasets (PPI-BP, HPO-Neuro, HPO-Metab, and EM-User) and four synthetic datasets (Density, Cut-Ratio, Coreness, and Component) introduced by Alsentzer et al. (2020)."
Dataset Splits: Yes. Table 1 ("Statistics of real-world and synthetic datasets") reports train/val/test splits per dataset, in the order listed above: PPI-BP 80/10/10, HPO-Neuro 80/10/10, HPO-Metab 80/10/10, EM-User 70/15/15, Density 80/10/10, Cut-Ratio 80/10/10, Coreness 80/10/10, Component 80/10/10.
Hardware Specification: Yes. "When measuring the complete training time, we run models with the best hyperparameters from each model's original code, including batch sizes and total epochs, using an Intel(R) Xeon(R) CPU E5-2640 v4 and a single GeForce GTX 1080 Ti (for deep GNNs)."
Software Dependencies: No. "All models are implemented with PyTorch (Paszke et al., 2019) and PyTorch Geometric (Fey & Lenssen, 2019). We use the implementation of Support Vector Machines (SVMs) in Scikit-learn (Pedregosa et al., 2011)." Specific version numbers for these software dependencies are not provided.
Experiment Setup: Yes. "We do a grid search over five hyperparameters: the number of iterations ({1, 2, 3, 4, 5}), whether to combine kernels of all iterations, whether to normalize histograms, the L2 regularization coefficient ({2^3/100, 2^4/100, ..., 2^14/100}), and the coefficient α_0 ({0.999, 0.99, 0.9, 0.5, 0.1, 0.01, 0.001}). When combining with kernels on continuous features (Equation 4), we tune α_feature over {0.0001, 0.001, 0.01, 0.05, 0.1, 0.15, 0.2, 0.25} and set α_structure = 1/(1 + α_feature)."
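The grid described in the excerpt can be enumerated in a few lines. This is an illustrative sketch only: the dictionary keys below mirror the text and are not taken from the authors' released code, and the continuous-feature coefficients are tuned separately from this grid.

```python
# Sketch of the five-hyperparameter grid from the experiment setup.
from itertools import product

grid = {
    "num_iterations": [1, 2, 3, 4, 5],
    "combine_all_iterations": [True, False],
    "normalize_histograms": [True, False],
    "l2_reg": [2**k / 100 for k in range(3, 15)],  # {2^3/100, ..., 2^14/100}
    "alpha_0": [0.999, 0.99, 0.9, 0.5, 0.1, 0.01, 0.001],
}

def configurations(grid):
    """Yield every hyperparameter combination as a dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

# 5 * 2 * 2 * 12 * 7 = 1680 configurations in total.
total = sum(1 for _ in configurations(grid))
```

When kernels on continuous features are added, alpha_feature is then searched over its own eight values, with alpha_structure fixed as 1/(1 + alpha_feature) per the quoted setup.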