Position: Graph Matching Systems Deserve Better Benchmarks

Authors: Indradyumna Roy, Saswat Meher, Eeshaan Jain, Soumen Chakrabarti, Abir De

ICML 2025

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental
LLM response: "In Table 2, we quantify the impact of train-test leakage on baseline models using Intra-test-pairs and Cross-train-test-pairs. We evaluate these models on the default dataset test splits, which include leakage, and report the Mean Squared Error (MSE) and Kendall Tau Correlation (KTau) between predicted and ground-truth GED values under both Intra-test-pairs and Cross-train-test-pairs."
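For context on the two metrics named above, a minimal sketch of computing MSE and Kendall tau between predicted and ground-truth GED values follows. The helper names and the sample values are illustrative assumptions, not taken from the paper (which reports these metrics over its own test pairs):

```python
from itertools import combinations

def mse(pred, true):
    # Mean squared error between predicted and ground-truth GED values.
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)

def kendall_tau(pred, true):
    # Kendall tau-a: (concordant - discordant) / number of pairs.
    # Measures how well the predicted GEDs preserve the true ranking.
    concordant = discordant = 0
    for i, j in combinations(range(len(pred)), 2):
        s = (pred[i] - pred[j]) * (true[i] - true[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(pred)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Illustrative GED values for a handful of graph pairs (not from the paper).
true_ged = [2, 5, 3, 8, 1]
pred_ged = [2.5, 4.0, 3.5, 7.0, 1.5]

print(mse(pred_ged, true_ged))          # 0.55
print(kendall_tau(pred_ged, true_ged))  # 1.0 (ranking fully preserved)
```

A low MSE rewards accurate GED values, while a high KTau rewards a correct ordering of pairs even when absolute values are off; the paper reports both.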
Researcher Affiliation: Academia
LLM response: "1 IIT Bombay, Mumbai, India; 2 EPFL, Lausanne, Switzerland."
Pseudocode: Yes
LLM response: "Algorithm 1: Construct Edit Path from Permutation Matrix; Algorithm 2: Dataset Processing with Cost Variants; Algorithm 3: GENERATEPAIRS; Algorithm 4: COMPUTEOPTIMALPATHS; Algorithm 5: GENERATECOSTVARIANTS"
Open Source Code: Yes
LLM response: "All code and datasets used in this work have been made publicly available at https://anonymous.4open.science/r/better-graph-matching-7146/."
Open Datasets: Yes
LLM response: "All code and datasets used in this work have been made publicly available at https://anonymous.4open.science/r/better-graph-matching-7146/." The paper uses four leakage-free datasets (Mutag, Code2, Molhiv, Molpcba) from GRAPHEDX.
Dataset Splits: Yes
LLM response: "This unique set is split into S_train, S_val, and S_test in a 60:20:20 ratio."
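A 60:20:20 split over a set of unique items can be sketched as follows. The function name, seed, and placeholder graph identifiers are assumptions for illustration; the paper's actual split procedure may differ in detail:

```python
import random

def split_60_20_20(items, seed=0):
    # Shuffle the unique items, then carve out 60/20/20 train/val/test slices.
    # Because the slices are disjoint, no item can appear in two splits,
    # which is what prevents train-test leakage.
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(0.6 * n)
    n_val = int(0.2 * n)
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]

# Illustrative: 100 unique graph identifiers (placeholders for actual graphs).
graph_ids = [f"g{i}" for i in range(100)]
train, val, test = split_60_20_20(graph_ids)
print(len(train), len(val), len(test))  # 60 20 20
```

Splitting the deduplicated set (rather than the raw pair list) is what makes the resulting splits leakage-free.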
Hardware Specification: No
LLM response: The paper does not explicitly describe the hardware used to run its experiments (e.g., GPU models, CPU models, or cloud instance types).
Software Dependencies: No
LLM response: "We explored two libraries, GEDLIB (Blumenthal et al., 2019) and NetworkX (Hagberg & Conway, 2020), for GED calculation using combinatorial approaches." The paper names these libraries with citations but provides no version numbers for them or for any other software dependencies.
Experiment Setup: No
LLM response: The paper does not explicitly provide hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) or system-level training settings for its experiments.