Position: Graph Matching Systems Deserve Better Benchmarks
Authors: Indradyumna Roy, Saswat Meher, Eeshaan Jain, Soumen Chakrabarti, Abir De
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Table 2, we quantify the impact of train-test leakage on baseline models using Intra-test-pairs and Cross-train-test-pairs. We evaluate these models on the default dataset test splits, which include leakage, and report the Mean Squared Error (MSE) and Kendall Tau Correlation (Ktau) between predicted and ground truth GED values under both Intra-test-pairs and Cross-train-test-pairs. |
| Researcher Affiliation | Academia | ¹IIT Bombay, Mumbai, India; ²EPFL, Lausanne, Switzerland. |
| Pseudocode | Yes | Algorithm 1: Construct Edit Path from Permutation Matrix; Algorithm 2: Dataset Processing with Cost Variants; Algorithm 3: GENERATEPAIRS; Algorithm 4: COMPUTEOPTIMALPATHS; Algorithm 5: GENERATECOSTVARIANTS |
| Open Source Code | Yes | All code and datasets used in this work have been made publicly available at https://anonymous.4open.science/r/better-graph-matching-7146/. |
| Open Datasets | Yes | All code and datasets used in this work have been made publicly available at https://anonymous.4open.science/r/better-graph-matching-7146/. Using four leakage-free datasets (Mutag, Code2, Molhiv, Molpcba) from GRAPHEDX |
| Dataset Splits | Yes | This unique set is split into Strain, Sval, and Stest in a 60:20:20 ratio. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU models, or cloud instance types) used for running its experiments. |
| Software Dependencies | No | We explored two libraries, GEDLIB (Blumenthal et al., 2019) and NetworkX (Hagberg & Conway, 2020), for GED calculation using combinatorial approaches. These are library names with citations, but specific version numbers for these or any other software dependencies are not provided. |
| Experiment Setup | No | The paper does not explicitly provide details about specific hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) or system-level training settings for its experiments. |
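The Dataset Splits row reports a 60:20:20 partition of the unique pair set into Strain, Sval, and Stest. A minimal sketch of such a split is shown below; the function name, the fixed seed, and the shuffle-then-slice strategy are illustrative assumptions, not the paper's actual code.

```python
import random

def split_60_20_20(items, seed=0):
    """Partition a collection of unique items into train/val/test
    subsets with a 60:20:20 ratio, after a seeded shuffle."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.6 * n)
    n_val = int(0.2 * n)
    s_train = shuffled[:n_train]
    s_val = shuffled[n_train:n_train + n_val]
    s_test = shuffled[n_train + n_val:]  # remainder goes to test
    return s_train, s_val, s_test

s_train, s_val, s_test = split_60_20_20(range(100))
print(len(s_train), len(s_val), len(s_test))  # 60 20 20
```

Slicing after a single shuffle guarantees the three subsets are disjoint, which is the property the paper relies on to avoid train-test leakage.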