Simple Path Structural Encoding for Graph Transformers

Authors: Louis Airale, Antonio Longa, Mattia Rigon, Andrea Passerini, Roberto Passerone

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate SPSE on extensive benchmarks, including molecular datasets from Benchmarking GNNs (Dwivedi et al., 2023), Long-Range Graph Benchmarks (Dwivedi et al., 2022), and Large-Scale Graph Regression Benchmarks (Hu et al., 2021). SPSE consistently outperforms RRWP in graph-level and node-level tasks, demonstrating significant improvements in molecular and long-range datasets.
Researcher Affiliation | Academia | University of Trento, Trento, Italy. Correspondence to: Louis Airale <EMAIL>, Roberto Passerone <EMAIL>.
Pseudocode | Yes | Algorithm 1: Count paths between all pairs of nodes (simplified); Algorithm 2: DAGDECOMPOSE, decomposition of an input graph into multiple DAGs.
Open Source Code | Yes | The Python implementation of the algorithm is available on the project's GitHub page.
Open Datasets | Yes | We conduct experiments on graph datasets from three distinct benchmarks, covering both node- and graph-level tasks. These include ZINC, CLUSTER, PATTERN, MNIST, and CIFAR10 from Benchmarking GNNs (Dwivedi et al., 2023), Peptides-functional and Peptides-structural from the Long-Range Graph Benchmark (Dwivedi et al., 2022), and the 3.7M-sample PCQM4Mv2 dataset from the Large-Scale Graph Regression Benchmark (Hu et al., 2021).
Dataset Splits | Yes | To validate Proposition 3, we design a synthetic dataset consisting of 12,000 graphs. ... The dataset is split into training (10,000 graphs), validation (1,000 graphs), and test (1,000 graphs) sets.
Hardware Specification | No | The paper mentions 'gigaflops' in Section 5.1 when discussing model complexities but provides no specific hardware details, such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions 'The Python implementation of the algorithm' but does not specify a Python version, nor versions of any other libraries or frameworks used, such as PyTorch or TensorFlow.
Experiment Setup | Yes | We train these models using three hyperparameter configurations adopted from (Menegaux et al., 2023). These correspond to the setups used for ZINC (config #1), PATTERN (config #2), and CIFAR10 (config #3), covering a range of model complexities from 40 to 280 gigaflops.

Table 3. Model configurations used for the synthetic experiments.

Config | Transformer layers | Self-attention heads | Hidden dimension | Learning rate | Epochs
#1     | 3                  | 4                    | 52               | 10^-3         | 100
#2     | 6                  | 4                    | 64               | 5 x 10^-4     | 300
#3     | 10                 | 8                    | 64               | 5 x 10^-4     | 400
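The Pseudocode row cites Algorithm 1 (counting paths between all node pairs) and Algorithm 2 (DAG decomposition). As a rough illustration of what Algorithm 1 computes, the sketch below counts simple paths of each length between all node pairs with a brute-force DFS. The function name and adjacency representation are assumptions, and this version deliberately omits the paper's DAG decomposition, which is what makes the counting tractable on real graphs.

```python
from itertools import product


def count_simple_paths(adj, max_len):
    """Count simple paths of length 1..max_len between all node pairs.

    adj: dict mapping each node to its set of neighbours (undirected graph).
    Returns counts[(u, v)][k] = number of simple paths of length k from u to v.
    Brute-force DFS, exponential in max_len; shown only to illustrate the
    quantity the paper's Algorithms 1-2 compute efficiently.
    """
    nodes = list(adj)
    counts = {(u, v): [0] * (max_len + 1) for u, v in product(nodes, nodes)}

    def dfs(start, node, visited, length):
        if length >= 1:
            counts[(start, node)][length] += 1
        if length == max_len:
            return
        for nxt in adj[node]:
            if nxt not in visited:  # simple paths never revisit a node
                visited.add(nxt)
                dfs(start, nxt, visited, length + 1)
                visited.remove(nxt)

    for s in nodes:
        dfs(s, s, {s}, 0)
    return counts
```

For example, on a triangle there is exactly one length-1 path and one length-2 path between any two distinct nodes.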
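The Dataset Splits row reports a 10,000/1,000/1,000 split of the 12,000-graph synthetic dataset. A minimal sketch of how such an index split could be realised; the shuffling strategy and seed are assumptions, not taken from the paper:

```python
import random


def split_indices(n, n_train, n_val, n_test, seed=0):
    """Return disjoint train/val/test index lists covering range(n)."""
    assert n_train + n_val + n_test == n
    rng = random.Random(seed)  # fixed seed for reproducibility (assumed)
    idx = list(range(n))
    rng.shuffle(idx)
    return (
        idx[:n_train],
        idx[n_train:n_train + n_val],
        idx[n_train + n_val:],
    )


train, val, test = split_indices(12_000, 10_000, 1_000, 1_000)
```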
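The three hyperparameter configurations from Table 3 can be captured as plain dictionaries for reference. The key names are illustrative and not taken from the authors' code; only the values come from the paper.

```python
# Table 3 configurations (Menegaux et al., 2023 setups for ZINC, PATTERN, CIFAR10).
CONFIGS = {
    1: dict(layers=3, heads=4, hidden_dim=52, lr=1e-3, epochs=100),
    2: dict(layers=6, heads=4, hidden_dim=64, lr=5e-4, epochs=300),
    3: dict(layers=10, heads=8, hidden_dim=64, lr=5e-4, epochs=400),
}
```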