iN2V: Bringing Transductive Node Embeddings to Inductive Graphs
Authors: Nicolas Lell, Ansgar Scherp
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on several benchmark datasets and demonstrate that iN2V is an effective approach to bringing transductive embeddings to an inductive setting. Using iN2V embeddings improves node classification by 1 point on average, with up to 6 points of improvement depending on the dataset and the number of unseen nodes. |
| Researcher Affiliation | Academia | Nicolas Lell¹, Ansgar Scherp¹. ¹Research Group on Data Science and Big Data Analytics, Ulm University, Ulm, Germany. Correspondence to: Nicolas Lell <EMAIL>. |
| Pseudocode | No | The paper describes an "iterative algorithm" in section 3.3 "Generating Inductive Embeddings" with mathematical formulas (1a, 1b, 1c) but does not present it as a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Code to reproduce the results is available at https://github.com/Foisunt/iN2V. |
| Open Datasets | Yes | We use the Cora (Sen et al., 2008), CiteSeer (Sen et al., 2008), and PubMed (Namata et al., 2012) citation graphs, the Computers (Shchur et al., 2018) and Photo (Shchur et al., 2018) co-purchase graphs, and the WikiCS Wikipedia page graph (Mernyei & Cangea, 2020). These graphs are homophilic, i.e., neighboring nodes usually share the same class. The following graphs are more heterophilic, i.e., neighboring nodes usually do not share the same class. Actor (Pei et al., 2020) is a Wikipedia co-occurrence graph, Amazon-ratings (Platonov et al., 2023b) is a co-purchase graph, and Roman-empire (Platonov et al., 2023b) is a text-based graph. |
| Dataset Splits | Yes | For all datasets, we use 5 splits of different sizes that always utilize the full dataset and have a validation and test set of the same size. The training set sizes are 10%, 20%, 40%, 60%, and 80%, with respective validation and test set sizes of 45%, 40%, 30%, 20%, and 10% of all nodes. |
| Hardware Specification | No | This work was performed on the computational resource bwUniCluster funded by the Ministry of Science, Research and the Arts Baden-Württemberg and the Universities of the State of Baden-Württemberg, Germany, within the framework program bwHPC. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used for the experiments. It only mentions using N2V and GNN models like MLP and GraphSAGE. |
| Experiment Setup | Yes | For N2V, we use a context size of 10 for positive samples and 1 negative per positive sample. We use a batch size of 128 and early stopping with a patience of 50 epochs. For every epoch, we sample 10 walks of length 20 per node. We perform a grid search over all combinations of p, q ∈ {0.2, 1, 5}, embedding size d ∈ {64, 256}, and learning rate ∈ {0.1, 0.01, 0.001}. For the sampling-based method, we try r ∈ {0.2, 0.4, 0.6, 0.8}. The loss weights α ∈ {0, 0.1, 1, 10} and β ∈ {0, 0.001, 0.01, 0.1} are tuned separately from r. For Feature Propagation, we search the number of iterations in {10, 20, 40, 60}. For MLP and GraphSAGE, we use a grid search for the full 10 seeds per split. We search over all combinations of the number of layers ∈ {1, ..., 5}, hidden size ∈ {64, 512}, learning rate ∈ {0.01, 0.001}, weight decay ∈ {0, 0.0001, 0.01}, dropout ∈ {0.2, 0.5, 0.8}, and whether to use jumping knowledge (Xu et al., 2018) connections. |
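The N2V grid search quoted above can be sketched as an exhaustive sweep over the reported hyperparameter sets. This is a minimal illustration, not the authors' code: the `run_trial` function is a hypothetical stand-in for training N2V and scoring on the validation split, while the grids themselves are taken verbatim from the reported search space.

```python
from itertools import product

# Hyperparameter grids as reported in the paper's experiment setup.
P_Q = [0.2, 1, 5]                  # Node2Vec return (p) and in-out (q) parameters
EMB_SIZES = [64, 256]              # embedding dimension d
LEARNING_RATES = [0.1, 0.01, 0.001]

def run_trial(p, q, d, lr):
    """Hypothetical placeholder: a real run would train N2V with the
    quoted settings (context size 10, 1 negative per positive, batch
    size 128, patience 50, 10 walks of length 20 per node) and return
    validation accuracy."""
    return 0.0

# Full Cartesian product of the N2V search space: 3 * 3 * 2 * 3 = 54 configs.
grid = list(product(P_Q, P_Q, EMB_SIZES, LEARNING_RATES))
best_p, best_q, best_d, best_lr = max(grid, key=lambda cfg: run_trial(*cfg))
print(len(grid))
```

The same pattern extends to the downstream MLP/GraphSAGE search by swapping in the classifier grids (layers, hidden size, weight decay, dropout, jumping knowledge).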