Aggregation Mechanism Based Graph Heterogeneous Networks Distillation

Authors: Xiaobin Hong, Mingkai Lin, Xiangkai Ma, Wenzhong Li, Sanglu Lu

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results on 8 standard and 4 large-scale datasets demonstrate that AMEND consistently outperforms state-of-the-art distillation methods." "To fully evaluate the proposed method, we conduct extensive experiments on 8 regular graph datasets and 4 large-scale graph datasets to compare with state-of-the-art methods."
Researcher Affiliation | Academia | Xiaobin Hong, Mingkai Lin, Xiangkai Ma, Wenzhong Li, Sanglu Lu (State Key Laboratory for Novel Software Technology, Nanjing University), EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: AMEND Algorithm
Input: graph G = {V, E}, node feature matrix X, and precomputed position encoding X_pe.
Output: optimized parameters of the student MLP S; predicted node labels Ŷ.
1: Model initialization and dataset partitioning.
2: Pretrain the teacher model T with cross-entropy loss.
3: # Student MLP training
4: for each epoch do
5:     # Aggregation context preservation
6:     Z_T = T(X, E, X_pe);
7:     Z_S = S(X, X_pe);
8:     # Aggregation-enhanced CKA
9:     L_ACKA = ACKA(Z_T, Z_S) in Eq. 7;
10:    # Shared manifold mixup
11:    Z_T^mix = λ·Z_T + (1 − λ)·Z′_T;
12:    Z_S^mix = λ·Z_S + (1 − λ)·Z′_S;
13:    Ŷ_T, Ŷ_S = g_T(Z_T), g_S(Z_S);
14:    Ŷ_T^mix, Ŷ_S^mix = g_T(Z_T^mix), g_S(Z_S^mix);
15:    # Logit distillation
16:    L_logit = L_mix + L_pred in Eq. 11;
17:    # Overall loss computation
18:    L_S = L_task + β·L_ACKA + γ·L_logit in Eq. 12;
19:    Backpropagate gradients and optimize the model.
20: end for
21: return S, Ŷ
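As a rough, self-contained sketch (not the authors' implementation), the per-epoch loss computation in Algorithm 1 can be illustrated in NumPy. Plain linear CKA stands in for the aggregation-enhanced CKA (ACKA, Eq. 7), random matrices stand in for the teacher and student embeddings, and mean-squared error stands in for the logit-distillation terms of Eq. 11; all names and values here are illustrative assumptions.

```python
import numpy as np

def linear_cka(Z1, Z2):
    # Linear CKA similarity between two embedding matrices (rows = nodes);
    # a simplified stand-in for the aggregation-enhanced CKA (ACKA) of Eq. 7.
    Z1 = Z1 - Z1.mean(axis=0)
    Z2 = Z2 - Z2.mean(axis=0)
    num = np.linalg.norm(Z1.T @ Z2, "fro") ** 2
    den = np.linalg.norm(Z1.T @ Z1, "fro") * np.linalg.norm(Z2.T @ Z2, "fro")
    return num / den

def manifold_mixup(Z, lam, perm):
    # Shared manifold mixup: Z_mix = lam * Z + (1 - lam) * Z', where Z' is a
    # permuted copy of Z; mirrors lines 11-12 of Algorithm 1.
    return lam * Z + (1.0 - lam) * Z[perm]

rng = np.random.default_rng(0)
n, d = 64, 16
Z_T = rng.standard_normal((n, d))   # teacher embeddings, Z_T = T(X, E, X_pe)
Z_S = rng.standard_normal((n, d))   # student embeddings, Z_S = S(X, X_pe)

L_acka = 1.0 - linear_cka(Z_T, Z_S)   # similarity loss, value in [0, 1]

lam = float(rng.beta(2.0, 2.0))       # mixup coefficient (assumed Beta prior)
perm = rng.permutation(n)             # shared permutation for both models
Z_T_mix = manifold_mixup(Z_T, lam, perm)
Z_S_mix = manifold_mixup(Z_S, lam, perm)

# MSE stand-ins for the prediction and mixup distillation terms of Eq. 11.
L_pred = float(np.mean((Z_S - Z_T) ** 2))
L_mix = float(np.mean((Z_S_mix - Z_T_mix) ** 2))
L_logit = L_mix + L_pred

L_task = 1.0                          # placeholder cross-entropy value
beta, gamma = 10.0, 0.1               # weights reported in the paper
L_S = L_task + beta * L_acka + gamma * L_logit  # overall loss, Eq. 12
```

The shared permutation for teacher and student is the point of the "shared" manifold mixup: both models mix the same pairs of nodes, so their mixed logits remain comparable.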
Open Source Code | No | The paper does not explicitly state that source code is provided, nor does it give a link to a repository. Phrases such as "we release our code" are not present.
Open Datasets | Yes | "Datasets. To fully evaluate our proposed method, we use 8 public regular graph benchmarks [Yang et al., 2021], i.e., Cora, Citeseer, Pubmed, Computer, Photo, Corafull, Coauthor-CS, Coauthor-Physics, and 4 large-scale graphs [Hu et al., 2020], i.e., Ogbn-Arxiv, Aminer, Reddit, and Ogbn-Products."
Dataset Splits | Yes | "For each dataset, we follow the dataset protocol in [Chen et al., 2023], where 6/2/2 of the nodes are used as training/validation/test sets, respectively. For the first two datasets, we randomly selected two non-overlapping 10% nodes as the validation and test sets, respectively, and doubled 1% for the last two datasets."
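The 6/2/2 protocol quoted above can be sketched as a simple random index split. This is illustrative only (not the authors' code); the Cora node count of 2708 is used purely as an example.

```python
import numpy as np

def split_nodes(num_nodes, seed=0):
    # Random 6/2/2 split of node indices into train/validation/test,
    # following the protocol attributed to [Chen et al., 2023].
    rng = np.random.default_rng(seed)
    idx = rng.permutation(num_nodes)
    n_train = int(0.6 * num_nodes)
    n_val = int(0.2 * num_nodes)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train_idx, val_idx, test_idx = split_nodes(2708)  # Cora has 2708 nodes
```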
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models or memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | Yes | "In Figure 5, we explore the sensitivity of hyperparameters β and γ in the overall objective function (Eq. 12) on three citation graphs. β and γ represent the contributions of the ACKA and manifold-mixup logit distillation, respectively. The results indicate that the optimal performance is achieved with β = 10 and γ = 0.1. According to the definition of L_ACKA, its value range is [0, 1]. We monitored the values of each component of the loss function during training and found that, with β = 10 and γ = 0.1, the scales of L_ACKA and L_logit were comparable to the task loss component L_task, leading to optimal model convergence."
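The weighting described above (Eq. 12) can be written as a one-line helper. The component loss magnitudes below are hypothetical, chosen only to illustrate how β = 10 (on an L_ACKA bounded in [0, 1]) and γ = 0.1 can bring the three weighted terms to a comparable scale:

```python
def overall_loss(l_task, l_acka, l_logit, beta=10.0, gamma=0.1):
    # Eq. 12: L_S = L_task + beta * L_ACKA + gamma * L_logit.
    return l_task + beta * l_acka + gamma * l_logit

# Hypothetical component magnitudes (not values from the paper).
l_task, l_acka, l_logit = 0.9, 0.08, 7.5
weighted = {"task": l_task, "ACKA": 10.0 * l_acka, "logit": 0.1 * l_logit}
l_s = overall_loss(l_task, l_acka, l_logit)  # 0.9 + 0.8 + 0.75 = 2.45
```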