GPEN: Global Position Encoding Network for Enhanced Subgraph Representation Learning
Authors: Nannan Wu, Yuming Huang, Yiming Zhao, Jie Chen, Wenjun Wang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on eight public datasets demonstrate that GPEN significantly outperforms state-of-the-art methods in subgraph representation learning. (Section 5, Experimental Evaluation) |
| Researcher Affiliation | Academia | (1) College of Intelligence and Computing, Tianjin University, Tianjin, China; (2) College of Management and Economics, Tianjin University, Tianjin, China; (3) Yazhou Bay Innovation Institute, Hainan Tropical Ocean University, China. Correspondence to: Nannan Wu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 (Global Position Encoding). Input: graph G = (V, E). Output: global position encodings {p_v}, v ∈ V. 1: R ← Eq.(3); 2: w_ij ← R[i] + R[j] for all (i, j) ∈ E; 3: r ← argmax_{v ∈ V} R[v]; 4: G' ← (V, E, W), W = {w_ij}; 5: T ← Eq.(6); 6: for all v ∈ V: t_v ← dist_T(v, r), p_v ← one-hot(t_v); 7: return {p_v}, v ∈ V |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We use the same four real-world datasets and four synthetic datasets (Alsentzer et al., 2020b; Wang & Zhang, 2022; Kim & Oh, 2024). Detailed information about these datasets is presented in Table 2 and Appendix A.2.1. |
| Dataset Splits | Yes | The datasets are divided according to the split ratios outlined in the baselines (Alsentzer et al., 2020b; Wang & Zhang, 2022; Jacob et al., 2023; Kim & Oh, 2024). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions calculating vector differences using a COO-formatted sparse adjacency matrix, but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We set the number of PageRank iterations t to 100, with the damping factor α set to 0.85. Similar to GLASS and SubGNN, our model pre-trains nodes to generate node features for real-world datasets. Additionally, global position encoding is added as the initial feature for all datasets. We calculate the vector differences of node representations using a COO-formatted sparse adjacency matrix A, which significantly reduces memory usage. We use classic loss functions for classification tasks: BCE loss for binary classification and cross-entropy loss for multi-class classification. ... The balance factor β controls the integration of local structural features and global position information during boundary-aware convolution. As illustrated in Figure 4(a), we observe a consistent pattern across all datasets: optimal performance emerges when β falls within a moderate range, typically between 0.6 and 0.8. ... The threshold parameter c determines the minimum size of connected components used for generating augmented samples. ... Smaller batch sizes generally lead to better performance... |
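The Algorithm 1 cell above can be sketched in plain Python. This is a hypothetical reconstruction: Eq.(3) is assumed to be standard power-iteration PageRank (t = 100, damping 0.85, as stated in the experiment-setup row), and Eq.(6) is assumed to build a maximum spanning tree over the PageRank-weighted edges; neither equation is reproduced in the table, so both readings are guesses from context.

```python
from collections import deque

def pagerank(adj, t=100, damping=0.85):
    """Power-iteration PageRank (assumed form of Eq.(3))."""
    n = len(adj)
    r = [1.0 / n] * n
    for _ in range(t):
        nxt = [(1.0 - damping) / n] * n
        for u, nbrs in enumerate(adj):
            if nbrs:
                share = damping * r[u] / len(nbrs)
                for v in nbrs:
                    nxt[v] += share
            else:  # dangling node: spread its mass uniformly
                for v in range(n):
                    nxt[v] += damping * r[u] / n
        r = nxt
    return r

def global_position_encoding(n, edges, t=100, damping=0.85):
    """Sketch of Algorithm 1: PageRank-rooted tree distances, one-hot encoded."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    R = pagerank(adj, t, damping)
    # Step 2-3: edge weights w_ij = R[i] + R[j]; root = highest-PageRank node
    by_weight = sorted(edges, key=lambda e: R[e[0]] + R[e[1]], reverse=True)
    root = max(range(n), key=lambda v: R[v])
    # Step 4-5: Kruskal maximum spanning tree (assumed reading of Eq.(6))
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    tree = [[] for _ in range(n)]
    for i, j in by_weight:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree[i].append(j)
            tree[j].append(i)
    # Step 6-8: hop distance to the root within the tree, then one-hot encode
    dist = [-1] * n
    dist[root] = 0
    q = deque([root])
    while q:
        u = q.popleft()
        for v in tree[u]:
            if dist[v] == -1:
                dist[v] = dist[u] + 1
                q.append(v)
    d_max = max(dist)
    return [[1 if d == k else 0 for k in range(d_max + 1)] for d in dist]
```

On a star graph with center 0, the center receives the highest PageRank and becomes the root, so its encoding is the distance-0 one-hot and every leaf gets the distance-1 one-hot.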
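The experiment-setup row mentions computing vector differences of node representations from a COO-formatted sparse adjacency matrix to save memory. A minimal sketch of that idea, assuming the differences are needed only along existing edges: fancy-indexing the feature matrix with the COO row/column index arrays materializes |E| × d values instead of a dense n × n × d tensor. The function name and shapes here are illustrative, not the paper's API.

```python
import numpy as np

def edge_differences(x, row, col):
    """Per-edge feature differences x[i] - x[j] for COO indices (row, col).

    x:   (n, d) node representations
    row: (m,) source indices of the m nonzero entries of A
    col: (m,) target indices of the m nonzero entries of A
    Returns an (m, d) array; no dense pairwise tensor is ever built.
    """
    return x[row] - x[col]

# Toy usage: 4 nodes with 3 features each, 3 edges along a path
x = np.arange(12, dtype=float).reshape(4, 3)
row = np.array([0, 1, 2])
col = np.array([1, 2, 3])
diffs = edge_differences(x, row, col)  # shape (3, 3)
```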