Higher-Order Graphon Neural Networks: Approximation and Cut Distance

Authors: Daniel Herbst, Stefanie Jegelka

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also validate our theoretical findings for IWNs with a proof-of-concept experiment on the graphons from Figure 1. For continuity/convergence, we plot the absolute errors of the model outputs on the sampled simple graphs relative to their graphon limits in Figure 2. Due to the δ-continuity of MPNNs, their errors decrease as the graph size grows. For the IWN, however, this does not hold. Yet, the errors for the IWN stabilize with increasing sizes, suggesting that the outputs converge (just not to their graphon limit). For transferability, we further plot prediction interval widths of the output distributions on simple graphs for each of the sizes in Figure 3. Here, the widths contract for both models, and only minor differences are visible between the MPNN and the IWN. This validates Theorem 5.2 and suggests that IWNs can indeed have transferability properties similar to those of MPNNs. For more details, see Appendix H.
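As a hedged illustration of the transferability check described above, prediction interval widths can be computed from a sample of model outputs via empirical percentiles. The helper `interval_width`, the 95% coverage level, and the placeholder synthetic outputs below are our own illustrative choices, not the paper's code:

```python
import numpy as np

def interval_width(outputs, coverage=0.95):
    """Width of the central `coverage` prediction interval of sampled outputs."""
    lo = np.percentile(outputs, 100 * (1 - coverage) / 2)
    hi = np.percentile(outputs, 100 * (1 + coverage) / 2)
    return hi - lo

# Placeholder outputs: if model outputs concentrate as graph size n grows,
# the interval widths contract (as reported for both MPNN and IWN).
rng = np.random.default_rng(0)
for n in [200, 400, 600, 800, 1000]:
    outputs = rng.normal(loc=0.5, scale=1.0 / np.sqrt(n), size=100)
    print(n, interval_width(outputs))
```

In the actual experiment, `outputs` would be the model's predictions on the 100 simple graphs sampled at each size.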
Researcher Affiliation | Academia | ¹TUM, School of CIT; ²TUM, MCML and MDSI; ³MIT, Department of EECS and CSAIL
Pseudocode | Yes | Appendix H.2 ("Pseudocode of Models") provides Algorithm 1 (MPNN forward pass), Algorithm 2 (2-IWN forward pass), and Algorithm 3 (2-IWN linear operators).
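The appendix's Algorithm 1 is not reproduced here; as a rough sketch under our own assumptions (mean neighbour aggregation, separate self/neighbour weight matrices, mean-pooling readout — all illustrative choices, not necessarily the paper's), a generic MPNN forward pass with the stated toy setup (2 layers, hidden dimension 16, sigmoid activation, constant input signal of 1) might look like:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mpnn_forward(A, X, weights):
    """Generic MPNN forward pass: each layer combines node features with
    mean-aggregated neighbour features, then applies a sigmoid.
    A: (n, n) adjacency matrix; X: (n, d) node signal;
    weights: list of (W_self, W_neigh) pairs, one per layer."""
    H = X
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)  # guard against isolated nodes
    for W_self, W_neigh in weights:
        M = (A @ H) / deg                  # mean over neighbours
        H = sigmoid(H @ W_self + M @ W_neigh)
    return H.mean(axis=0)                  # graph-level readout (mean pooling)

# 2 layers, hidden dimension 16, constant signal of 1 (per the stated setup).
rng = np.random.default_rng(0)
n, d, h = 50, 1, 16
A = (rng.random((n, n)) < 0.5).astype(float)
A = np.triu(A, 1); A = A + A.T             # symmetric, no self-loops
X = np.ones((n, d))
weights = [(rng.normal(size=(d, h)), rng.normal(size=(d, h))),
           (rng.normal(size=(h, h)), rng.normal(size=(h, h)))]
out = mpnn_forward(A, X, weights)
print(out.shape)  # (16,)
```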
Open Source Code | Yes | Code. The code used for the toy experiment is provided under https://github.com/dan1elherbst/Higher-Order-WNNs.
Open Datasets | No | We keep the signal fixed at a constant value of 1 and consider the following four graphons. Erdős–Rényi (ER): W := 1/2 constant. Stochastic Block Model (SBM): we take 5 blocks, each with intra-cluster edge probability p = 0.8 and inter-cluster edge probability q = 0.3. Triangular: here, W(x, y) = (x + y)/2 (the superlevel sets {W ≥ z}, z ∈ [0, 1], are triangles). Narrow: W(x, y) = exp(−sin((x − y)²)/γ²), with γ := 0.05. From each of these graphons, we sample 100 simple graphs of each of the sizes {200, 400, 600, 800, 1000}. We use a weighted graph of 1000 nodes sampled from each graphon as an approximation of the respective graphon itself (weighted graphs typically converge much faster than simple graphs).
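Sampling simple graphs from a graphon follows the standard W-random-graph construction: draw latent positions u_i ~ U[0, 1], then flip an independent coin for each edge {i, j} with probability W(u_i, u_j). The sketch below is our own illustration (the helper `sample_simple_graph` is hypothetical, not from the paper's code), shown for two of the four graphons:

```python
import numpy as np

def sample_simple_graph(W, n, rng):
    """Sample an n-node simple graph from graphon W: latent positions
    u_i ~ U[0, 1], edge {i, j} present with probability W(u_i, u_j)."""
    u = rng.random(n)
    P = W(u[:, None], u[None, :])      # pairwise edge probabilities
    U = rng.random((n, n))
    A = np.triu(U < P, 1)              # independent coin flips above the diagonal
    return (A + A.T).astype(float)     # symmetrize; diagonal stays zero

# Two of the graphons from the experiment.
W_er  = lambda x, y: 0.5 * np.ones_like(x)   # Erdős–Rényi: W := 1/2
W_tri = lambda x, y: (x + y) / 2             # Triangular

rng = np.random.default_rng(0)
A = sample_simple_graph(W_tri, 500, rng)
print(A.mean())  # empirical edge density, close to E[(u + v)/2] = 0.5
```

The weighted-graph approximation mentioned above would instead keep the probability matrix P itself (restricted to the sampled positions) rather than thresholding it.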
Dataset Splits | No | The paper describes generating data for evaluation by sampling graphs from specific graphon models across different sizes, but it does not specify traditional training, validation, or test splits for supervised learning tasks. It focuses on evaluating model properties (continuity, transferability) rather than performance on a partitioned dataset.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models. It only mentions that code is provided for the toy experiment.
Software Dependencies | No | The paper mentions models like MPNN and IWN, and uses a sigmoid activation function, but does not specify software names with version numbers for libraries, frameworks, or programming languages used in the experiments.
Experiment Setup | Yes | For both models, we use a simple setup of 2 layers, a hidden dimension of 16, and the sigmoid function as activation.