Joint Graph Rewiring and Feature Denoising via Spectral Resonance
Authors: Jonas Linkerhägner, Cheng Shi, Ivan Dokmanić
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run extensive experiments to show that JDR outperforms existing preprocessing rewiring strategies while being guided solely by denoising. We extensively evaluate JDR on both synthetic data generated from the cSBM and real-world benchmark datasets. We compare our algorithm with the state-of-the-art rewiring methods first-order spectral rewiring (FoSR) (Karhadkar et al., 2023), batch Ollivier-Ricci flow (BORF) (Nguyen et al., 2023) and diffusion improves graph learning (DIGL) (Gasteiger et al., 2019). |
| Researcher Affiliation | Academia | Jonas Linkerhägner, Cheng Shi, Ivan Dokmanić, Department of Mathematics and Computer Science, University of Basel, EMAIL |
| Pseudocode | Yes | A detailed pseudocode is given in Appendix A.1. Algorithm 1 Joint Denoising and Rewiring |
| Open Source Code | Yes | Our method is outlined in Figure 1 and the code repository is available online1. 1https://github.com/jlinki/JDR |
| Open Datasets | Yes | We extensively evaluate JDR on both synthetic data generated from the cSBM and real-world benchmark datasets. We evaluate JDR on five common homophilic benchmark datasets, namely the citation graphs Cora, CiteSeer, PubMed (Sen et al., 2008) and the Amazon co-purchase graphs Computers and Photo (McAuley et al., 2015). For heterophilic datasets, we rely on the Wikipedia graphs Chameleon and Squirrel (Rozemberczki et al., 2021), the WebKB datasets Texas and Cornell used in Pei et al. (2020) and the actor co-occurrence network Actor (Tang et al., 2009). To show the scalability of JDR on larger heterophilic datasets, we further report the results for the Yandex Q user network Questions (Platonov et al., 2023) and the social networks Penn94 and Twitch-Gamers (Lim et al., 2021). |
| Dataset Splits | Yes | We also adopt their data splits, namely the sparse splitting 2.5%/2.5%/95% for training, validation and testing, respectively, or the dense splitting 60%/20%/20%. For the general experiments, we perform 100 runs with different random splits. The remaining larger graphs are evaluated using their original splits. |
| Hardware Specification | Yes | All algorithms are run on Nvidia A100 with 80GB and we time their Python processes. Experiments on cSBM, Cora, CiteSeer and Photo were conducted on an internal cluster with Nvidia Tesla V100 GPUs with 32GB of VRAM. The experiments on the remaining datasets (PubMed, Computers, Chameleon, Squirrel, Actor, Cornell and Texas) were performed using Nvidia A100 GPUs with 40GB or 80GB of VRAM. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer', 'GCN', 'GPRGNN', and 'PyTorch Geometric (Fey and Lenssen, 2019)' but does not provide specific version numbers for these software libraries or frameworks. For example, it does not state 'PyTorch 1.9' or 'Python 3.8'. |
| Experiment Setup | Yes | Unless stated otherwise, we use the hyperparameters from Chien et al. (2021) for the GNNs and optimize the hyperparameters of JDR using a mixture of grid and random search on the validation set. We use the top-64 values of A to enforce sparsity and interpolation to update the features. A detailed list of all hyperparameters can be found in Appendix A.7 or in the code repository. For JDR, we always keep the 64 largest entries of the rewired adjacency matrix A per node. In all experiments we use the Adam optimizer and the standard early stopping after 200 epochs from (Chien et al., 2021). Whenever we use a GCN, it uses two layers, a hidden dimension of 64 and dropout with 0.5. Whenever we use GPRGNN, we use a polynomial filter of order 10 (corresponding to 10 hops) and a hidden dimension of 64. The hyperparameters for JDR on the cSBM are shown in Table 22. |
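The sparsification step quoted above (keeping the 64 largest entries of the rewired adjacency matrix per node) can be sketched as follows. This is a minimal illustrative reimplementation, not the authors' code: the function name `topk_sparsify` and the toy matrix are assumptions, and the example uses k=2 instead of the paper's k=64 for readability.

```python
def topk_sparsify(adj, k):
    """Keep only the k largest entries in each row (node) of a dense
    adjacency matrix, zeroing out the rest.

    Hypothetical sketch of JDR's per-node top-64 sparsification step;
    the released implementation may differ (e.g. tie-breaking, sparse
    tensors, symmetrization)."""
    sparsified = []
    for row in adj:
        if len(row) <= k:
            sparsified.append(list(row))
            continue
        # indices of the k largest values in this row
        keep = set(sorted(range(len(row)),
                          key=lambda j: row[j], reverse=True)[:k])
        sparsified.append([v if j in keep else 0.0
                           for j, v in enumerate(row)])
    return sparsified

# Toy 4-node example with k=2 (the paper uses k=64 per node).
A = [[0.9, 0.1, 0.5, 0.3],
     [0.2, 0.8, 0.1, 0.7],
     [0.4, 0.4, 0.6, 0.1],
     [0.0, 0.9, 0.2, 0.8]]
A_sparse = topk_sparsify(A, 2)
```

Each row of `A_sparse` retains at most two nonzero entries, the analogue of the per-node budget of 64 edges described in the setup.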