Enhancing Graph Representation Learning with Localized Topological Features
Authors: Zuoyu Yan, Qi Zhao, Ze Ye, Tengfei Ma, Liangcai Gao, Zhi Tang, Yusu Wang, Chao Chen
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that the localized topological features greatly enhance the representation learning, and achieve state-of-the-art results on various node classification and link prediction benchmarks. We also explore the option of end-to-end learning of the topological features, i.e., treating topological computation as a differentiable operator during learning. Our theoretical analysis and empirical study provide insights and potential guidelines for employing topological features in graph learning tasks. |
| Researcher Affiliation | Academia | Zuoyu Yan (1,5), Qi Zhao (2), Ze Ye (3), Tengfei Ma (3), Liangcai Gao (1), Zhi Tang (1), Yusu Wang (4), Chao Chen (3). 1 Wangxuan Institute of Computer Technology, Peking University; 2 Computer Science and Engineering Department, University of California, San Diego; 3 Department of Biomedical Informatics, Stony Brook University; 4 Halıcıoğlu Data Science Institute, University of California, San Diego; 5 Weill Cornell Medicine, Cornell University |
| Pseudocode | Yes | Algorithm 1 Computation of 1D EPD corresponding to cycles — 1: Input: filter function f, input graph G = (V, E); 2: V, E = sorted(V, E, f), PD1 = {}; 3: for i ∈ V do; 4: Ci = {Cij \| (i, j) ∈ E, f(j) > f(i)}, Ei = E; 5: for Cij ∈ Ci do; 6: f(Cij) = f(i), Ei = Ei − {(i, j)} + {(Cij, j)}; 7: end for; 8: PD1_i = Union-Find-step(V + Ci − {i}, Ei, f, Ci); 9: PD1 += PD1_i; 10: end for; 11: Output: PD0, PD1. Algorithm 2 Union-Find-step |
| Open Source Code | Yes | Source code is available at https://github.com/pkuyzy/TLC-GNN. |
| Open Datasets | Yes | 1. Cora, Citeseer, and PubMed (Sen et al., 2008) are standard citation networks where nodes represent scientific papers, and edges denote citations between them. 2. Photo and Computers (Shchur et al., 2018) are graphs derived from Amazon shopping records. 3. Physics and CS (Shchur et al., 2018) are co-authorship graph datasets where nodes represent authors. 4. PPI Networks (Zitnik and Leskovec, 2017) are protein-protein interaction networks originally designed for graph classification tasks. |
| Dataset Splits | Yes | We split the training, validation, and test set following (Kipf and Welling, 2017; Veličković et al., 2018). To be specific, the training set consists of 20 nodes from each class, and the validation (test, resp.) set consists of 500 (1000, resp.) nodes. Following (Chami et al., 2019), we use 5% (resp. 10%) of existing links in the input graph as the positive validation set (resp. positive test set). An equal number of non-existent links are sampled as the negative validation set and negative test set. The remaining 85% of existing links are used as the positive training set. |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware used for running its experiments, such as GPU/CPU models or cloud resources. |
| Software Dependencies | Yes | We utilize the Python package Dionysus, version 2.0.7, to compute the EPD. |
| Experiment Setup | Yes | We initialize the model with Glorot initialization and use cross-entropy loss and the Adam optimizer to train our model. In the optimizer, the learning rate is 0.005 and the weight decay is 0.0005. The training epoch is set to 200, and the early stopping patience on the validation set is 100 epochs. For a fair comparison, we set the number of node embeddings of the hidden layer to be the same (64) for all networks. For all the models, the number of GNN layers is set to 2. The activation function after each graph convolution block is ELU. Cross-entropy loss is chosen as the loss function and Adam is adopted as the optimizer with the learning rate set to 0.01 and weight decay set to 0. Dropout is 0.8 for Cora and Citeseer, and 0.5 for the rest of the graphs. ... The training epoch is 2000, and the early stopping patience on the validation set is 200 epochs. |
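The `Union-Find-step` routine referenced in the pseudocode above rests on standard union-find persistence machinery. As a hedged illustration only (not the authors' implementation, which should be consulted at the linked TLC-GNN repository), the sketch below computes 0-dimensional persistence pairs of a graph filtration with the elder rule; all function and variable names are illustrative:

```python
def persistence_0d(vertices, edges, f):
    """Sketch: 0D persistence pairs of a graph filtration via union-find.

    vertices: iterable of vertex ids; edges: list of (u, v) pairs;
    f: dict mapping each vertex to its filter value. A vertex is born
    at f[v]; an edge enters at max(f[u], f[v]). When an edge merges two
    components, the elder rule kills the component with the younger
    (larger-f) root, emitting a (birth, death) pair. The essential
    (infinite) bar of the surviving component is omitted here.
    """
    parent = {v: v for v in vertices}

    def find(v):
        # path-halving union-find lookup
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    pairs = []
    # process edges in increasing order of filtration value
    for u, v in sorted(edges, key=lambda e: max(f[e[0]], f[e[1]])):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # edge closes a cycle; handled by the 1D pass
        # keep the elder root (smaller filter value) as the survivor
        if f[ru] > f[rv]:
            ru, rv = rv, ru
        pairs.append((f[rv], max(f[u], f[v])))  # (birth, death)
        parent[rv] = ru
    return pairs
```

On a toy path graph 0–1–2 with filter values {0: 0.0, 1: 1.0, 2: 0.5}, both edges enter at value 1.0, so the component born at vertex 1 dies immediately (a zero-persistence pair) and the one born at vertex 2 persists from 0.5 to 1.0. The paper's Algorithm 1 additionally duplicates each vertex's up-neighbors (the Cij nodes) so that the same union-find pass yields the 1D extended persistence pairs.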