Harnessing Language Model for Cross-Heterogeneity Graph Knowledge Transfer

Authors: Jinyu Yang, Ruijia Wang, Cheng Yang, Bo Yan, Qimin Zhou, Yang Juan, Chuan Shi

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on four real-world datasets have demonstrated the superior performance of LMCH over state-of-the-art methods.
Researcher Affiliation | Collaboration | (1) Beijing University of Posts and Telecommunications, Beijing, China; (2) China Telecom Cloud Computing Research Institute, Beijing, China
Pseudocode | Yes | A detailed algorithm is presented in Appendix A.11.
Open Source Code | Yes | Code: https://github.com/BUPT-GAMMA/LMCH
Open Datasets | Yes | We conduct extensive experiments on four benchmark datasets: IMDB, DBLP (Wang et al. 2019), YELP (Lu et al. 2019) and PubMed (Zhang et al. 2024a).
Dataset Splits | Yes | Following prior research (Ding, Wang, and Liu 2023), we also employ a leave-one-out strategy, wherein one dataset is designated as the target HG while the remaining datasets function as source HGs. For a fair comparison, we leverage a pretrained LM to encode the node attribute information as initial features. We train all models on supervised node classification tasks on both the source and target HGs. Parameter settings: few-shot learning follows an N-way K-shot setting, with N ∈ {2, 3} and K ∈ {1, 3, 5}.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or memory amounts used for running its experiments.
Software Dependencies | No | The paper does not list software dependencies or versions; it only notes that the GNN used is RGCN (Schlichtkrull et al. 2018) and that all MLPs are 2-layer networks with a hidden dimension of 128.
Experiment Setup | Yes | All MLPs are 2-layer networks with a hidden dimension of 128. For fairness, we set the node embedding dimension to 128 for both LMCH and baselines. We apply early stopping to control iterations in the GNN-supervised LM fine-tuning, with a maximum of 10 iterations. LMCH hyper-parameters are optimized via grid search for best performance, while baseline parameters are initially set according to the original papers and then optimized. Full hyper-parameter settings are provided in Appendix A.5 (Table 5), with a detailed study in Appendix A.2.
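To make the evaluation protocol above concrete, the sketch below shows what an N-way K-shot support set (N ∈ {2, 3}, K ∈ {1, 3, 5}) looks like for a node classification task. This is a minimal illustration, not code from the LMCH repository; the function and variable names are hypothetical.

```python
import random

def sample_episode(labels, n_way=3, k_shot=5, seed=0):
    """Sample an N-way K-shot support set from a {node_id: class} mapping.

    Illustrative only: LMCH's actual sampling code may differ.
    """
    rng = random.Random(seed)
    # Group node ids by their class label.
    by_class = {}
    for node, cls in labels.items():
        by_class.setdefault(cls, []).append(node)
    # Pick N classes, then K labeled support nodes for each chosen class.
    classes = rng.sample(sorted(by_class), n_way)
    return {cls: rng.sample(sorted(by_class[cls]), k_shot) for cls in classes}

# Toy example: 30 nodes spread over 4 classes.
toy_labels = {i: i % 4 for i in range(30)}
episode = sample_episode(toy_labels, n_way=3, k_shot=5)
assert len(episode) == 3
assert all(len(nodes) == 5 for nodes in episode.values())
```

Under the leave-one-out protocol described in the table, episodes like this would be drawn from the held-out target HG, while the remaining datasets serve as source HGs for training.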