Universal Link Predictor By In-Context Learning on Graphs

Authors: Kaiwen Dong, Haitao Mao, Zhichun Guo, Nitesh V. Chawla

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through rigorous experimentation, we demonstrate UniLP's effectiveness in adapting to new, unseen graphs at test time, showcasing its ability to perform comparably or even outperform parametric models that have been finetuned for specific datasets. Our findings highlight UniLP's potential to set a new standard in link prediction, combining the strengths of heuristic and parametric methods in a single, versatile framework.
Researcher Affiliation | Academia | Kaiwen Dong (EMAIL, University of Notre Dame); Haitao Mao (EMAIL, Michigan State University); Zhichun Guo (EMAIL, University of Notre Dame); Nitesh V. Chawla (EMAIL, University of Notre Dame)
Pseudocode | No | The paper describes methods and definitions in prose and mathematical equations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not contain any explicit statement or link indicating the release of source code for the described methodology.
Open Datasets | Yes | Benchmark datasets. The foundation for our model's training is a collection of graph datasets spanning a variety of domains. Following (Mao et al., 2023), we have carefully selected graph data from fields such as biology (Von Mering et al., 2002; Zhang et al., 2018; Watts & Strogatz, 1998), transport (Watts & Strogatz, 1998; Batagelj & Mrvar, 2006), web (Ackland & others, 2005; Spring et al., 2002; Adamic & Glance, 2005), academic collaboration (Shchur et al., 2019; Newman, 2006b), citation (Yang et al., 2016), and social networks (Rozemberczki et al., 2021). This diverse selection ensures that we can pretrain and evaluate the LP model on a wide array of connectivity patterns. The details of the curated graph datasets can be found in Table 1.
Dataset Splits | Yes | For evaluation, each test dataset is split into 70%/10%/20% for training/validation/testing. The training set here forms the observed links E_o, while the validation and test sets represent unobserved links E_u.
Hardware Specification | Yes | We conduct our experiments on a Linux system equipped with an NVIDIA A100 GPU with 80GB of memory.
Software Dependencies | No | We implement UniLP in the PyTorch Geometric framework (Fey & Lenssen, 2019). The paper mentions the framework but does not specify its version number, nor versions for underlying libraries such as PyTorch or Python.
Experiment Setup | Yes | During pretraining, we dynamically sample 40 positive and negative links as in-context links S+ and S- for each query link from the corresponding pretrain dataset. For evaluation, each test dataset is split into 70%/10%/20% for training/validation/testing. During inference, we sample k = 200 positive and negative links as in-context links per test dataset. The pretraining phase incorporates an early stopping criterion based on performance across a merged validation set.
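The reported split and in-context sampling protocol can be sketched in plain Python. This is a minimal illustration, not the paper's pipeline (which is built on PyTorch Geometric); the function names, the toy edge lists, and the negative-link construction are all hypothetical:

```python
import random

def split_edges(edges, rng):
    """70%/10%/20% train/valid/test split of an edge list, as described
    in the report: training edges form the observed links E_o, while
    validation/test edges are the unobserved links E_u. (Hypothetical helper.)"""
    edges = edges[:]
    rng.shuffle(edges)
    n = len(edges)
    n_tr, n_va = int(0.7 * n), int(0.1 * n)
    return edges[:n_tr], edges[n_tr:n_tr + n_va], edges[n_tr + n_va:]

def sample_in_context(pos_links, neg_links, k, rng):
    """Sample k positive (S+) and k negative (S-) in-context links:
    per the report, k = 40 resampled per query link during pretraining,
    k = 200 fixed per test dataset at inference. (Hypothetical helper.)"""
    return rng.sample(pos_links, k), rng.sample(neg_links, k)

rng = random.Random(0)
edges = [(i, i + 1) for i in range(1000)]     # toy stand-in graph
train, valid, test = split_edges(edges, rng)  # 700 / 100 / 200 edges
neg = [(i, i + 2) for i in range(1000)]       # toy negative (non-edge) pairs
s_pos, s_neg = sample_in_context(train, neg, k=40, rng=rng)
print(len(train), len(valid), len(test), len(s_pos), len(s_neg))
# 700 100 200 40 40
```

The sketch only fixes the proportions and sample sizes stated in the report; how positives and negatives are actually drawn per query is specific to the paper's implementation.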