Relation-Aware Diffusion for Heterogeneous Graphs with Partially Observed Features
Authors: Daeho Um, Yoonji Lee, Jiwoong Park, Seulki Park, Yuneil Yeo, Seong Jin Ahn
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments, we demonstrate that our virtual feature scheme effectively serves as a bridge between existing diffusion-based methods and heterogeneous graphs, maintaining the advantages of these methods. Furthermore, we confirm that adjusting the importance of each edge type leads to significant performance gains on heterogeneous graphs. Extensive experimental results demonstrate the superiority of our scheme in both semi-supervised node classification and link prediction tasks on heterogeneous graphs with missing rates ranging from low to exceedingly high. |
| Researcher Affiliation | Collaboration | Daeho Um (AI Center, Samsung Electronics), Yoonji Lee (Samsung Electronics), Jiwoong Park (Department of Electrical and Computer Engineering, Texas A&M University), Seulki Park (University of Michigan), Yuneil Yeo (Department of Civil and Environmental Engineering, UC Berkeley), Seong Jin Ahn (KAIST) |
| Pseudocode | No | The paper describes the proposed method using mathematical formulas and descriptive text in sections 4.1, 4.2, 4.3, and 4.4, but does not include a distinct pseudocode or algorithm block. |
| Open Source Code | Yes | The source code is available at https://github.com/daehoum1/hetgfd. |
| Open Datasets | Yes | Data Setting. We conduct experiments on three widely used heterogeneous graph datasets (ACM, DBLP, and IMDB) (Jin et al., 2021) from different domains. Detailed descriptions of these datasets and their sources can be found in Appendix B.2. ... We downloaded all the datasets used in this paper from the GitHub repository for Jin et al. (2021). ... In the protein-protein interaction networks (PPI) dataset (Zitnik & Leskovec, 2017)... |
| Dataset Splits | Yes | We utilize the node split suggested in Jin et al. (2021), which uses 10% nodes for training, 10% nodes for validation, and 80% nodes for testing. ... For the link prediction splits, as described in Kipf & Welling (2016b), we divide target edges into training, validation, and testing sets, comprising 10%, 5%, and 85% of the edges, respectively. ... We use 80% nodes for training, 10% nodes for validation, and 10% nodes for testing. |
| Hardware Specification | Yes | All experiments are conducted with an Intel Core i5-6600 CPU @ 3.30 GHz and a single GPU (NVIDIA GeForce RTX 2080 Ti). |
| Software Dependencies | No | The paper mentions PyTorch and PyTorch Geometric as implementation frameworks and cites the relevant papers for them, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We tune hyperparameters for training downstream GNN models and conduct a grid search based on the validation sets. Specifically, we search for the optimal number of layers from {1, 2, 3} and the learning rate from {0.1, 0.01, 0.001, 0.0001}. We set the hidden dimension to 64 for all the models. ... The maximum number of epochs is set to 1000 and we apply an early stopping strategy with a patience of 200 epochs. ... To find the optimal hyperparameters α and β for HetGFD, we perform a grid search on validation sets. The search range is set to {(α, β) \| α ∈ {0.9, 0.7, 0.5, 0.3, 0.1}, β ∈ {0.99, 0.9, 0.8, 0.5, 0.4, 0.2, 0.1, 0.05}}. We set the value of K to 100. |
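The (α, β) grid search quoted above can be sketched as follows. This is a minimal illustration, not the authors' code: `validate` is a hypothetical stand-in for training HetGFD with a given (α, β) pair and returning a validation score, while the two search ranges are the ones stated in the paper.

```python
from itertools import product

# Hypothetical validation routine. In the real pipeline this would run
# HetGFD feature diffusion with (alpha, beta), train the downstream GNN,
# and return validation accuracy. Here it is a toy score for illustration.
def validate(alpha, beta):
    return -((alpha - 0.5) ** 2 + (beta - 0.5) ** 2)

# Search ranges quoted from the paper's experiment setup.
ALPHAS = [0.9, 0.7, 0.5, 0.3, 0.1]
BETAS = [0.99, 0.9, 0.8, 0.5, 0.4, 0.2, 0.1, 0.05]

def grid_search():
    """Exhaustively evaluate all 5 x 8 = 40 (alpha, beta) pairs and
    keep the configuration with the best validation score."""
    best_score, best_cfg = float("-inf"), None
    for alpha, beta in product(ALPHAS, BETAS):
        score = validate(alpha, beta)
        if score > best_score:
            best_score, best_cfg = score, (alpha, beta)
    return best_cfg

if __name__ == "__main__":
    print(grid_search())  # (0.5, 0.5) under the toy validate above
```

With the toy `validate`, the search selects (0.5, 0.5), the pair closest to the toy optimum; with a real training loop, the same structure selects the pair maximizing validation accuracy.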