Parameter Efficient Node Classification on Homophilic Graphs

Authors: Lucas Prieto, Jeroen Den Boef, Paul Groth, Joran Cornelisse

TMLR 2023

Reproducibility Variable Result LLM Response
Research Type: Experimental — We propose Graph Non-Parametric Diffusion (GNPD), a method that outperforms traditional GNNs using only two linear models and non-parametric diffusion. Our method combines ideas from Correct & Smooth (C&S) and the Scalable Inception Graph Network (SIGN) into a simpler model that outperforms both on several datasets. It achieves unmatched parameter efficiency, competing with models that have two orders of magnitude more parameters. Additionally, GNPD can forego spectral embeddings, which are the computational bottleneck of the C&S method.
Researcher Affiliation: Collaboration — Lucas Prieto (EMAIL; Socialdatabase; University of Amsterdam), Jeroen Den Boef (EMAIL; Socialdatabase), Paul Groth (EMAIL; University of Amsterdam), Joran Cornelisse (EMAIL; Socialdatabase)
Pseudocode: No — The paper describes its methods using text, mathematical equations (Equations 1–15), and schematic diagrams (Figures 1–3), but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code: No — The paper does not contain an explicit statement about releasing source code or a direct link to a code repository for the described methodology. It mentions "Reviewed on OpenReview: https://openreview.net/forum?id=XXXX", but this is a review link, not code access.
Open Datasets: Yes — The statistics of the datasets used in this paper are described in Table 1. Open Graph Benchmark: Datasets for machine learning on graphs. arXiv preprint arXiv:2005.00687, 2020.
Dataset Splits: Yes — Table 1: Summary statistics for the datasets used in this paper.

Dataset    Nodes      Edges       Classes  Train/Val/Test
arxiv      169,343    1,166,243   40       54%/18%/28%
Products   2,449,029  61,859,140  47       10%/2%/88%
Pubmed     19,717     44,338      3        92%/3%/5%
Citeseer   3,327      4,732       6        55%/15%/30%
Hardware Specification: No — The paper discusses computational efficiency and runtimes, particularly in Section 5 and Figure 5, noting the benefits of foregoing spectral embeddings. However, it does not specify the GPU, CPU, or other hardware used to execute the experiments.
Software Dependencies: No — The paper mentions LightGBM (Ke et al., 2017) as a component of its aggregation step, but it does not provide a version number for LightGBM or any other software dependency, which is required for reproducibility.
Experiment Setup: Yes — 6.1 Ablation study: In this section we measure the importance of the different components of our method. While most of the hyper-parameters in this method were inherited from C&S, we introduced the λ parameter to regulate class-specific homophily; we show the sensitivity of our method to this hyper-parameter and to the number of diffusion steps k in Table 5. Table 5: Sensitivity analysis with respect to λ and k (the number of diffusion steps). The results are averaged over 10 runs and shown with standard deviation.
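The "non-parametric diffusion" in the abstract follows the Correct & Smooth lineage the paper cites, where predictions are iteratively smoothed over the graph without learned parameters. Below is a minimal sketch of a C&S-style smoothing update E ← (1 − α)·E⁰ + α·S·E, assuming a row-normalized adjacency matrix; the function name `smooth` and its parameters are illustrative, not taken from the paper:

```python
import numpy as np

def smooth(adj_norm, signal, k=10, alpha=0.8):
    """Iteratively diffuse a node signal over the graph, C&S-style.

    adj_norm: row-normalized adjacency matrix, shape (n, n)
    signal:   initial per-node predictions/labels, shape (n, c)
    k:        number of diffusion steps
    alpha:    mixing weight between neighbors and the original signal
    """
    out = signal.copy()
    for _ in range(k):
        # Blend the original signal with the neighborhood average.
        out = (1 - alpha) * signal + alpha * (adj_norm @ out)
    return out
```

With alpha = 0 the signal is returned unchanged; larger alpha and k push each node's prediction toward its neighborhood consensus, which is why the method relies on homophily.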
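The λ/k sensitivity analysis reported in Table 5 amounts to a grid sweep averaged over repeated runs. A minimal sketch of such a sweep, assuming a user-supplied `evaluate(lam, k, seed)` callable that trains and scores the model (the helper names here are hypothetical, not from the paper):

```python
import itertools
import numpy as np

def sensitivity_sweep(evaluate, lambdas, ks, n_runs=10):
    """Grid-search lambda and k, reporting mean and std over repeated runs.

    evaluate: callable (lam, k, seed) -> accuracy, supplied by the user
    returns:  dict mapping (lam, k) -> (mean_accuracy, std_accuracy)
    """
    results = {}
    for lam, k in itertools.product(lambdas, ks):
        # Re-run with different seeds to estimate the standard deviation.
        scores = [evaluate(lam, k, seed) for seed in range(n_runs)]
        results[(lam, k)] = (float(np.mean(scores)), float(np.std(scores)))
    return results
```

Reporting mean ± std over the grid reproduces the shape of a Table-5-style sensitivity table for any chosen ranges of λ and k.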