Bootstrapping Heterogeneous Graph Representation Learning via Large Language Models: A Generalized Approach

Authors: Hang Gao, Chenhao Zhang, Fengge Wu, Changwen Zheng, Junsuo Zhao, Huaping Liu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Theoretical analysis and experimental validation have demonstrated the effectiveness of our method." "Experiments. Comparison with State of the Art methods. Datasets. We utilized existing commonly used heterogeneous and homogeneous graph representation learning datasets, as well as more challenging heterogeneous graph datasets that we newly constructed." "Table 2: Comparative experiment results for IMDB and DBLP datasets."
Researcher Affiliation | Academia | "1 National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences. 2 University of Chinese Academy of Sciences. 3 Tsinghua University. Corresponding author, EMAIL."
Pseudocode | No | No explicit pseudocode or algorithm blocks appear in the main body of the paper; the methodology is described in prose and mathematical equations.
Open Source Code | Yes | "Code, Datasets and Appendix: https://github.com/zch65458525/GHGRL/tree/main"
Open Datasets | Yes | "Code, Datasets and Appendix: https://github.com/zch65458525/GHGRL/tree/main" "Specifically, we employed the IMDB, DBLP, ACM (Zhang et al. 2019) and Wiki-CS (Mernyei and Cangea 2020) datasets."
Dataset Splits | Yes | "Datasets: IMDB (10% Training), IMDB (40% Training), DBLP (10% Training), DBLP (40% Training)." "Additionally, we adjusted the proportion of training data in the datasets to compare test results under different conditions."
Hardware Specification | No | No specific hardware details (GPU models, CPU types, or cloud instance specifications) are provided in the main text. The paper states: "The specific experimental setup, including hyperparameters and the environment used, is detailed in Appendix D.", but Appendix D is not provided.
Software Dependencies | No | The paper names "Llama 3 (Dubey et al. 2024)" as the backbone LLM but does not give version numbers for the software libraries or dependencies used in the implementation; it refers only to the LLM model itself.
Experiment Setup | No | "The specific experimental setup, including hyperparameters and the environment used, is detailed in Appendix D." "The parameters m_fmt and m_cont are hyperparameters controlling the size of Φ_fmt and Φ_cont, respectively." "where α is a hyperparameter to control the proportion of original node features." However, specific values for these hyperparameters are not provided in the main text, and Appendix D is not available.
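The quoted α controls how much of the original node features is retained when they are combined with LLM-derived features. A minimal sketch of one plausible reading of this, a convex combination (the function name, the feature vectors, and the α value below are illustrative assumptions, not the paper's actual implementation; the real hyperparameter values are deferred to the unavailable Appendix D):

```python
import numpy as np

def blend_features(x_orig, x_llm, alpha=0.5):
    """Blend original node features with LLM-derived features.

    Assumes a convex combination, a common way to control the
    proportion of original features; alpha here is a hypothetical
    value, not one reported by the paper.
    """
    return alpha * x_orig + (1.0 - alpha) * x_llm

# Illustrative 3-dimensional feature vectors.
x_orig = np.array([1.0, 0.0, 1.0])
x_llm = np.array([0.0, 1.0, 1.0])
print(blend_features(x_orig, x_llm, alpha=0.25))  # mostly LLM-derived features
```

With alpha=1.0 the original features pass through unchanged; with alpha=0.0 only the LLM-derived features remain, which is one way the paper's "proportion of original node features" phrasing can be interpreted.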