Harnessing Language Model for Cross-Heterogeneity Graph Knowledge Transfer
Authors: Jinyu Yang, Ruijia Wang, Cheng Yang, Bo Yan, Qimin Zhou, Yang Juan, Chuan Shi
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on four real-world datasets have demonstrated the superior performance of LMCH over state-of-the-art methods. |
| Researcher Affiliation | Collaboration | 1Beijing University of Posts and Telecommunications, Beijing, China 2China Telecom Cloud Computing Research Institute, Beijing, China |
| Pseudocode | Yes | Detailed algorithm is presented in Appendix A.11. |
| Open Source Code | Yes | Code https://github.com/BUPT-GAMMA/LMCH |
| Open Datasets | Yes | We conduct extensive experiments on four benchmark datasets: IMDB, DBLP (Wang et al. 2019), YELP (Lu et al. 2019) and PubMed (Zhang et al. 2024a). |
| Dataset Splits | Yes | Following prior research (Ding, Wang, and Liu 2023), we also employ a leave-one-out strategy, wherein one dataset is designated as the target HG while the remaining datasets function as source HGs. For a fair comparison, we leverage a pretrained LM to encode the nodes' attribute information as initial features. We train all models using supervised node classification tasks on both the source and target HGs. Parameter Settings. In our experiments, few-shot learning follows an N-way K-shot setting, with N in {2, 3} and K in {1, 3, 5}. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not list software dependencies or library versions; it only notes architectural choices, e.g., that the GNN used is RGCN (Schlichtkrull et al. 2018) and that all MLPs are 2-layer networks with a hidden dimension of 128. |
| Experiment Setup | Yes | All MLPs are 2-layer networks with a hidden dimension of 128. For fairness, we set the node embedding dimension to 128 for both LMCH and baselines. We apply early stopping to control iterations in the GNN-supervised LM fine-tuning, with a maximum of 10 iterations. LMCH hyper-parameters are optimized via grid search for best performance, while baseline parameters are initially set according to the original papers and then optimized. Full hyper-parameter settings are provided in Appendix A.5 (Table 5), with a detailed study in Appendix A.2. |
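The N-way K-shot protocol quoted in the dataset-splits row (N in {2, 3}, K in {1, 3, 5}) can be sketched as follows. This is a minimal illustration of episode sampling under standard few-shot conventions, not the authors' code; the function name, the query-set size, and the `labels` mapping are assumptions.

```python
import random

def sample_episode(labels, n_way=3, k_shot=5, q_query=10, seed=0):
    """Sample one N-way K-shot episode from a {node_id: class} mapping.

    Hypothetical helper: picks N classes, then K labeled support nodes
    and q_query disjoint query nodes per class.
    """
    rng = random.Random(seed)
    by_class = {}
    for node, cls in labels.items():
        by_class.setdefault(cls, []).append(node)
    # Only classes with enough labeled nodes for support + query qualify.
    eligible = [c for c, nodes in by_class.items()
                if len(nodes) >= k_shot + q_query]
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for cls in classes:
        nodes = rng.sample(by_class[cls], k_shot + q_query)
        support += [(n, cls) for n in nodes[:k_shot]]
        query += [(n, cls) for n in nodes[k_shot:]]
    return support, query
```

For a 2-way 3-shot episode, the support set contains 2 × 3 = 6 labeled nodes and the query set 2 × `q_query` nodes drawn from the same two classes.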