How does over-squashing affect the power of GNNs?
Authors: Francesco Di Giovanni, T. Konstantin Rusch, Michael Bronstein, Andreea Deac, Marc Lackenby, Siddhartha Mishra, Petar Veličković
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our analysis through extensive controlled experiments and ablation studies. |
| Researcher Affiliation | Collaboration | Francesco Di Giovanni (University of Oxford); T. Konstantin Rusch (Massachusetts Institute of Technology); Michael M. Bronstein (University of Oxford); Andreea Deac (Université de Montréal); Marc Lackenby (University of Oxford); Siddhartha Mishra (ETH Zürich); Petar Veličković (Google DeepMind) |
| Pseudocode | No | The paper presents mathematical formulations, theorems, and experimental results, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Hence, we perform our empirical test in a controlled environment, but at the same time, we base our experiments on the real world ZINC chemical dataset (Irwin et al., 2012) and follow the experimental setup in Dwivedi et al. (2020), constraining the number of molecular graphs to 12K. |
| Dataset Splits | No | The paper mentions using a set of 12K ZINC molecular graphs and setting up synthetic node features and target outputs. It also discusses training and testing, and reports 'Test MAE' and 'training MAE'. However, it does not specify exact percentages or counts for the training, validation, and test splits, nor how the splits were generated (e.g., random seed, stratified split, or k-fold cross-validation). |
| Hardware Specification | No | The paper describes the experimental setup, models, and datasets used, but does not provide specific details about the hardware (e.g., GPU models, CPU types) on which the experiments were run. |
| Software Dependencies | No | The paper mentions various MPNN models (GCN, GIN, Graph SAGE, Gated GCN) and libraries implicitly used (e.g., for computations or deep learning frameworks), but it does not specify any software names with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | In particular, we set the depth to m = max_i diam(G_i)/2, which happens to be m = 11 for the considered ZINC 12K graphs, such that the MPNNs are guaranteed not to underreach. We further vary the value of the α-quantile of the τ-distributions over the graphs G_i between 0 and 1, thus controlling the level of commute times. ... We consider four different MPNN models, namely GCN (Kipf & Welling, 2017), GIN (Xu et al., 2019), GraphSAGE (Hamilton et al., 2017), and GatedGCN (Bresson & Laurent, 2017). Moreover, we choose MAX-pooling as the GNN readout, which is supported by Theorem 3.2 and forces the GNNs to make use of the message-passing in order to learn the mixing. ... we fix the MPNN size to 100K parameters. ... we consider a high commute-time regime by setting α = 0.8. ... we train the MPNNs on both types of mixing and provide the resulting relative MAEs (i.e., MAE divided by the L1-norm of the targets) in Table 1. |
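The quoted setup fixes the depth from the largest graph diameter and reports a normalized error metric. A minimal sketch of both computations, with toy adjacency lists standing in for the ZINC graphs (the `diameter` helper, the ceiling rounding, and the exact normalization are assumptions, not the paper's code):

```python
import math
from collections import deque

def diameter(adj):
    """Longest shortest-path distance in an unweighted, connected graph."""
    def eccentricity(src):
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return max(dist.values())
    return max(eccentricity(v) for v in adj)

# Toy stand-ins for molecular graphs (adjacency lists).
graphs = [
    {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]},  # 4-node path, diameter 3
    {0: [1, 2], 1: [0, 2], 2: [0, 1]},       # triangle, diameter 1
]

# Depth rule from the quoted setup: m = max_i diam(G_i) / 2
# (rounded up here so the MPNN cannot underreach on odd diameters).
m = math.ceil(max(diameter(g) for g in graphs) / 2)

def relative_mae(preds, targets):
    """MAE divided by the L1-norm of the targets; here both are taken
    per-sample (mean-based), which is one plausible reading of the metric."""
    mae = sum(abs(p - t) for p, t in zip(preds, targets)) / len(targets)
    l1 = sum(abs(t) for t in targets) / len(targets)
    return mae / l1
```

On the toy graphs above, the maximum diameter is 3, giving m = 2; for the ZINC 12K graphs the same rule yields the m = 11 quoted in the table.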