ReHub: Linear Complexity Graph Transformers with Adaptive Hub-Spoke Reassignment

Authors: Tomer Borreda, Daniel Freedman, Or Litany

TMLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Our experiments on long-range graph benchmarks indicate a consistent improvement over the base method, Neural Atoms, while maintaining linear complexity instead of O(n^{3/2}). Remarkably, our sparse model achieves performance on par with its non-sparse counterpart. Furthermore, ReHub outperforms competitive baselines and consistently ranks among the top performers across various benchmarks. Code is available on our project webpage. We evaluate ReHub on the Long-Range Graph Benchmark (LRGB), which is widely used to evaluate methods aimed at overcoming issues such as oversmoothing and oversquashing. LRGB comprises five datasets. In Table 1, we present a comparison of ReHub using several common MPNNs. Results are shown both for the sparse case, where the number of hubs connected to each spoke per layer, k, is small, and for a dense variant, ReHub-FC, where each spoke is fully connected to all hubs.
Researcher Affiliation Collaboration Tomer Borreda (EMAIL), Department of Computer Science, Technion - Israel Institute of Technology; Daniel Freedman (EMAIL), Department of Applied Mathematics, Tel Aviv University; Or Litany (EMAIL), Department of Computer Science, Technion - Israel Institute of Technology, and NVIDIA
Pseudocode Yes Algorithm 1 Hub (Re)Assignment
Require: Hub-Spoke cross-attention score matrix Γ^{ℓ+1} and the Hub-Hub distance matrix
for i_h = 1 to N_h do
    H(i_h) = Bottom-k-Indices(row i_h of the Hub-Hub distance matrix)
end for
for i_s = 1 to N_s do
    i*_h = argmax_{i_h} Γ^{ℓ+1}_{i_s, i_h}
    E^{ℓ+1}_{i_s, i_h} = 1 if i_h ∈ H(i*_h), 0 otherwise
end for
return E^{ℓ+1}
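The reassignment step in Algorithm 1 can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: the names `gamma` (cross-attention scores) and `dist` (hub-hub distances) are assumptions, and the toy matrices are hand-made rather than values from a trained model.

```python
def hub_reassignment(gamma, dist, k):
    """Sketch of Algorithm 1: each spoke is connected to the k hubs
    nearest to the hub it attends to most strongly.

    gamma: Ns x Nh cross-attention scores (spoke -> hub)
    dist:  Nh x Nh hub-hub distances
    Returns the Ns x Nh 0/1 assignment matrix E.
    """
    n_s, n_h = len(gamma), len(gamma[0])
    # Bottom-k indices per hub: the k closest hubs (including itself).
    H = [sorted(range(n_h), key=lambda j: dist[i][j])[:k] for i in range(n_h)]
    E = [[0] * n_h for _ in range(n_s)]
    for i_s in range(n_s):
        # Hub with the highest attention score for this spoke.
        i_star = max(range(n_h), key=lambda j: gamma[i_s][j])
        for i_h in H[i_star]:
            E[i_s][i_h] = 1
    return E

# Toy example: 2 spokes, 3 hubs, k = 2.
gamma = [[0.9, 0.1, 0.0],
         [0.2, 0.3, 0.5]]
dist = [[0.0, 1.0, 2.0],
        [1.0, 0.0, 1.0],
        [2.0, 1.0, 0.0]]
E = hub_reassignment(gamma, dist, k=2)
```

Spoke 0 attends most to hub 0, so it is wired to hubs {0, 1} (hub 0's two nearest hubs); spoke 1 attends most to hub 2 and is wired to hubs {2, 1}.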
Open Source Code Yes Code is available on our project webpage1. 1https://tomerborreda.github.io/rehub/
Open Datasets Yes For long-range communication we evaluate ReHub on the Long-Range Graph Benchmark (LRGB), which is widely used to evaluate methods aimed at overcoming issues such as oversmoothing and oversquashing. LRGB comprises five datasets. Two are image-based graph datasets: Pascal VOC-SP and COCO-SP, which contain superpixel graphs of the well-known image segmentation datasets Pascal VOC and COCO. The other three are molecular datasets: Peptides-Func, Peptides-Struct and PCQM-Contact, which require the prediction of molecular interactions and properties that demand global aggregation of information. For evaluation on large graphs we show results on citation-network datasets: OGBN-Arxiv and Coauthor Physics, which include about 170K and 30K nodes respectively, with the task of node class prediction. Additionally, we evaluate peak memory consumption on a custom dataset of large random regular graphs (Steger & Wormald, 1999; Kim & Vu, 2003) of gradually increasing sizes from 10K to 700K nodes. Additional statistics about the datasets are available in Appendix A.1.
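The random regular graphs used for the memory benchmark can be reproduced with off-the-shelf generators; the paper does not give its generation code, so the following is a minimal stdlib sketch of the pairing (configuration) model in the spirit of Steger & Wormald (1999), resampling until the matching is simple. In practice one would use a library generator such as NetworkX's `random_regular_graph`.

```python
import random

def random_regular_graph(d, n, seed=0):
    """Sample a d-regular simple graph on n nodes via the pairing model:
    create d stubs per node, shuffle, pair consecutive stubs, and retry
    if the result contains a self-loop or a multi-edge."""
    assert (d * n) % 2 == 0, "d * n must be even"
    rng = random.Random(seed)
    while True:
        stubs = [v for v in range(n) for _ in range(d)]  # d copies of each node
        rng.shuffle(stubs)
        edges = set()
        simple = True
        for u, v in zip(stubs[::2], stubs[1::2]):
            if u == v or (u, v) in edges or (v, u) in edges:
                simple = False  # self-loop or repeated edge: resample
                break
            edges.add((u, v))
        if simple:
            return edges

# Small example: a 3-regular graph on 10 nodes (d * n / 2 = 15 edges).
edges = random_regular_graph(3, 10)
```

For small d the retry loop terminates quickly; rejection becomes costly as d grows, which is why dedicated algorithms exist for larger degrees.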
Dataset Splits No The paper mentions using a "validation split of the Pascal VOC-SP dataset" (Figure 3 caption) and refers to following hyperparameters of other works for some datasets, implying that splits were used. However, it does not explicitly state the specific percentages, sample counts, or detailed methodology for creating these dataset splits in the provided text.
Hardware Specification Yes All experiments were performed using one NVIDIA L40 GPU with 48GB of memory.
Software Dependencies No The paper mentions using the "GraphGPS" (Rampášek et al., 2022) code, merging parts from "Exphormer" (Shirzad et al., 2023), "PyMetis", "torch.cuda" functions (implying PyTorch), and the "NetworkX" library (Hagberg et al., 2008). However, it does not provide specific version numbers for any of these software dependencies.
Experiment Setup Yes In Tables 8-11 we provide the hyperparameters used in our experiments. Table 8: Hyperparameters for the LRGB datasets used for evaluation. For Peptides-func, Peptides-struct and PCQM-Contact some of the hyperparameters are model-specific and presented in additional tables.

Hyperparameter      PCQM-Contact  Peptides-func  Peptides-struct  Pascal VOC-SP
Dropout             0             0.12           0.2              0.15
Attention dropout   0.2           0.2            0.2              0.2
...
Learning Rate       0.0003        0.0003         0.0003           0.0005
Weight Decay        0             0              0                0
Warmup Epochs       10            10             10               10
Optimizer           AdamW         AdamW          AdamW            AdamW
# Epochs            200           200            200              300

MPNN: GatedGCN; # Layers: 4; Hidden Dim: 96; # Heads: 8; Hubs Ratio: 1; k: 3; # Params: 590,051
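For reproduction, the Table 8 column for Pascal VOC-SP can be collected into a plain config mapping. The key names below are illustrative (the paper does not specify a config schema); the values are those reported in the table.

```python
# Hypothetical config mirroring the Pascal VOC-SP column of Table 8.
# Key names are assumptions; values are taken from the reported table.
pascalvoc_sp_config = {
    "dropout": 0.15,
    "attention_dropout": 0.2,
    "learning_rate": 0.0005,
    "weight_decay": 0.0,
    "warmup_epochs": 10,
    "optimizer": "AdamW",
    "epochs": 300,
}
```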