Node Duplication Improves Cold-start Link Prediction
Authors: Zhichun Guo, Tong Zhao, Yozen Liu, Kaiwen Dong, William Shiao, Mingxuan Ju, Neil Shah, Nitesh V Chawla
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that Node Dup achieves 38.49%, 13.34%, and 6.76% relative improvements on isolated, low-degree, and warm nodes, respectively, on average across all datasets compared to GNNs and the existing cold-start methods. |
| Researcher Affiliation | Collaboration | 1University of Washington 2Snap Inc. 3University of Notre Dame 4University of California, Riverside |
| Pseudocode | Yes | Algorithm 1: Node Dup. |
| Open Source Code | Yes | Our code can be found at https://github.com/zhichunguo/NodeDup. |
| Open Datasets | Yes | We conduct experiments on 7 benchmark datasets: Cora, Citeseer, CS, Physics, Computers, Photos and IGB-100K, with their details specified in Appendix B. |
| Dataset Splits | Yes | We randomly split edges into training, validation, and testing sets. We allocated 10% for validation and 40% for testing in Computers and Photos, 5%/10% for testing in IGB-100K, and 10%/20% in other datasets. |
| Hardware Specification | Yes | All methods were implemented in Python 3.10.9 with PyTorch 1.13.1 and PyTorch Geometric (Fey & Lenssen, 2019). The experiments were all conducted on an NVIDIA P100 GPU with 16GB memory. |
| Software Dependencies | Yes | All methods were implemented in Python 3.10.9 with PyTorch 1.13.1 and PyTorch Geometric (Fey & Lenssen, 2019). |
| Experiment Setup | Yes | We use 2-layer GNN architectures with 256 hidden dimensions for all GNNs and datasets. The dropout rate is set as 0.5. We report the results over 10 random seeds. Hyperparameters were tuned using an early stopping strategy based on performance on the validation set. We manually tune the learning rate for the final results. For the results with the inner product as the decoder, we tune the learning rate over the range lr ∈ {0.001, 0.0005, 0.0001, 0.00005}. For the results with MLP as the decoder, we tune the learning rate over the range lr ∈ {0.01, 0.005, 0.001, 0.0005}. |
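The experiment setup above (a 2-layer GNN encoder with 256 hidden dimensions, dropout 0.5, and an inner-product link decoder) can be sketched in plain PyTorch. This is a minimal illustration, not the authors' implementation: the paper uses PyTorch Geometric, whereas the `GCNLayer`, `Encoder`, and `inner_product_decoder` names here are hypothetical, and the dense normalized adjacency is used only to keep the example self-contained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLayer(nn.Module):
    """Minimal GCN-style layer: normalized aggregation followed by a linear map."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj_norm):
        # adj_norm: symmetrically normalized adjacency (dense, for this sketch)
        return self.lin(adj_norm @ x)

class Encoder(nn.Module):
    """2-layer GNN encoder, 256 hidden dims, dropout 0.5 (per the stated setup)."""
    def __init__(self, in_dim, hid_dim=256):
        super().__init__()
        self.conv1 = GCNLayer(in_dim, hid_dim)
        self.conv2 = GCNLayer(hid_dim, hid_dim)

    def forward(self, x, adj_norm):
        h = F.relu(self.conv1(x, adj_norm))
        h = F.dropout(h, p=0.5, training=self.training)
        return self.conv2(h, adj_norm)

def inner_product_decoder(z, edge_index):
    """Score each candidate edge (u, v) as the inner product <z_u, z_v>."""
    src, dst = edge_index
    return (z[src] * z[dst]).sum(dim=-1)

# Toy usage on a 4-node path graph with 8-dimensional features.
x = torch.randn(4, 8)
adj = torch.tensor([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=torch.float)
adj_hat = adj + torch.eye(4)                      # add self-loops
deg_inv_sqrt = adj_hat.sum(dim=1).pow(-0.5)
adj_norm = deg_inv_sqrt[:, None] * adj_hat * deg_inv_sqrt[None, :]

enc = Encoder(in_dim=8)
enc.eval()                                        # disable dropout for inference
z = enc(x, adj_norm)                              # node embeddings, shape (4, 256)
scores = inner_product_decoder(z, torch.tensor([[0, 1], [2, 3]]))  # shape (2,)
```

Training would add a binary cross-entropy loss over positive and negative edges with one of the quoted learning-rate grids, but that loop is omitted here for brevity.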