Node Duplication Improves Cold-start Link Prediction

Authors: Zhichun Guo, Tong Zhao, Yozen Liu, Kaiwen Dong, William Shiao, Mingxuan Ju, Neil Shah, Nitesh V Chawla

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments show that NodeDup achieves 38.49%, 13.34%, and 6.76% relative improvements on isolated, low-degree, and warm nodes, respectively, on average across all datasets compared to GNNs and the existing cold-start methods."
Researcher Affiliation | Collaboration | "(1) University of Washington, (2) Snap Inc., (3) University of Notre Dame, (4) University of California, Riverside"
Pseudocode | Yes | "Algorithm 1: NodeDup."
Open Source Code | Yes | "Our code can be found at https://github.com/zhichunguo/NodeDup."
Open Datasets | Yes | "We conduct experiments on 7 benchmark datasets: Cora, Citeseer, CS, Physics, Computers, Photos, and IGB-100K, with their details specified in Appendix B."
Dataset Splits | Yes | "We randomly split edges into training, validation, and testing sets: 10% for validation and 40% for testing in Computers and Photos, 5% for validation and 10% for testing in IGB-100K, and 10% for validation and 20% for testing in the other datasets."
Hardware Specification | Yes | "All methods were implemented in Python 3.10.9 with PyTorch 1.13.1 and PyTorch Geometric (Fey & Lenssen, 2019). The experiments were all conducted on an NVIDIA P100 GPU with 16GB memory."
Software Dependencies | Yes | "All methods were implemented in Python 3.10.9 with PyTorch 1.13.1 and PyTorch Geometric (Fey & Lenssen, 2019)."
Experiment Setup | Yes | "We use 2-layer GNN architectures with 256 hidden dimensions for all GNNs and datasets. The dropout rate is set as 0.5. We report the results over 10 random seeds. Hyperparameters were tuned using an early stopping strategy based on performance on the validation set. We manually tune the learning rate for the final results. For the results with the inner product as the decoder, we tune the learning rate over the range lr ∈ {0.001, 0.0005, 0.0001, 0.00005}; for the results with an MLP as the decoder, we tune it over lr ∈ {0.01, 0.005, 0.001, 0.0005}."
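The paper's Algorithm 1 is not reproduced in this summary. As a rough, hedged illustration of the core idea named in the title (duplicate cold nodes and link each copy to its original as extra supervision for link prediction), a minimal sketch might look like the following; the function name, degree threshold, and edge-list representation are all assumptions, not the authors' implementation:

```python
from collections import Counter

def node_dup(edges, num_nodes, degree_threshold=2):
    """Duplicate low-degree (cold) nodes and connect each duplicate
    to its original, adding supervision edges for link prediction.
    A simplified sketch only, not the paper's exact Algorithm 1."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    augmented = list(edges)
    duplicate_of = {}
    next_id = num_nodes  # duplicates get fresh node ids
    for node in range(num_nodes):
        if degree[node] < degree_threshold:
            duplicate_of[node] = next_id
            augmented.append((node, next_id))  # edge: original <-> copy
            next_id += 1
    return augmented, duplicate_of

# Nodes 1 and 2 have degree 1 and node 3 is isolated,
# so all three fall under the threshold and get duplicates.
edges = [(0, 1), (0, 2)]
augmented, duplicate_of = node_dup(edges, num_nodes=4)
```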
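The random edge-split protocol quoted above can be mimicked with a short helper; `split_edges` and its defaults are illustrative stand-ins, not the authors' code:

```python
import random

def split_edges(edges, val_frac=0.10, test_frac=0.20, seed=0):
    """Randomly split an edge list into train/validation/test sets,
    e.g. 10% validation / 20% test as used for most datasets."""
    edges = list(edges)
    random.Random(seed).shuffle(edges)
    n_val = int(len(edges) * val_frac)
    n_test = int(len(edges) * test_frac)
    val = edges[:n_val]
    test = edges[n_val:n_val + n_test]
    train = edges[n_val + n_test:]
    return train, val, test

# 100 edges -> 70 training, 10 validation, 20 test.
edges = [(i, i + 1) for i in range(100)]
train, val, test = split_edges(edges)
```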
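The learning-rate grid search described in the experiment setup reduces to a simple loop over the two reported grids; `train_and_eval` below is a hypothetical stand-in for training one model and returning its validation score:

```python
# Learning-rate grids quoted in the paper, per decoder type.
INNER_PRODUCT_LRS = [0.001, 0.0005, 0.0001, 0.00005]
MLP_LRS = [0.01, 0.005, 0.001, 0.0005]

def tune_lr(train_and_eval, lrs, seeds=range(10)):
    """Return the learning rate with the best mean validation score
    across random seeds (the paper reports results over 10 seeds)."""
    best_lr, best_score = None, float("-inf")
    for lr in lrs:
        mean_score = sum(train_and_eval(lr, seed) for seed in seeds) / len(seeds)
        if mean_score > best_score:
            best_lr, best_score = lr, mean_score
    return best_lr, best_score

# Toy objective whose validation score peaks at lr = 0.001.
best_lr, _ = tune_lr(lambda lr, seed: -abs(lr - 0.001), INNER_PRODUCT_LRS)
```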