Noisy Node Classification by Bi-level Optimization Based Multi-Teacher Distillation

Authors: Yujing Liu, Zongqian Wu, Zhengyu Lu, Ci Nie, Guoqiu Wen, Yonghua Zhu, Xiaofeng Zhu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on real datasets show that our method achieves the best results compared to state-of-the-art methods. We conduct extensive experiments on five real datasets. Experimental results demonstrate that our method outperforms existing SOTA methods. Result Analysis: We compare our proposed BO-NNC with all comparison methods on five datasets in terms of node classification tasks with different noise rates and report the results in Table 1. Ablation Study: Our method contains three important components, i.e., using the student model to integrate knowledge from multiple teacher models (C1 for short), multi-teacher knowledge distillation based on bi-level optimization (C2 for short), and label improvement (C3 for short). To demonstrate the effectiveness of each component, we report the node classification performance of different component combinations on all datasets at the highest noise rates in Table 2.
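The multi-teacher integration (C1) and distillation objective (C2) summarized above can be illustrated as a weighted combination of teacher soft labels plus a standard knowledge-distillation loss. This is a simplified sketch, not the paper's implementation: in BO-NNC the per-teacher weights are learned in the outer level of a bi-level optimization, whereas here they are fixed inputs; all function names and the temperature value are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def multi_teacher_targets(teacher_logits, weights, temperature=2.0):
    """Combine soft labels from several teachers into one target
    distribution for the student, weighted per teacher.

    teacher_logits: one logit list per teacher (one entry per class).
    weights: per-teacher importance weights (non-negative; here fixed,
             whereas BO-NNC learns them via bi-level optimization).
    """
    total_w = sum(weights)
    probs = [softmax(z, temperature) for z in teacher_logits]
    num_classes = len(teacher_logits[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probs)) / total_w
        for c in range(num_classes)
    ]

def kd_loss(student_logits, target, temperature=2.0):
    """KL divergence between the combined teacher target and the
    student's temperature-softened prediction (standard KD loss)."""
    s = softmax(student_logits, temperature)
    return sum(t * math.log(t / q) for t, q in zip(target, s) if t > 0)
```

In this sketch the student simply minimizes `kd_loss` against the fixed-weight target; the bi-level part of BO-NNC (optimizing the weights themselves in an outer loop) is omitted.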
Researcher Affiliation | Academia | 1) School of Computer Science and Engineering, Guangxi Normal University, Guilin, China; 2) Guangxi Key Lab of Multisource Information Mining and Security, Guangxi Normal University, Guilin, China; 3) School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China; 4) School of Computer Science and Information Engineering, Anyang Institute of Technology, Anyang, China; 5) Information Systems Technology Design Pillar, Singapore University of Technology and Design, Singapore
Pseudocode | No | The paper describes the methodology using mathematical equations and descriptive text, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide a statement about the availability of its own source code or a link to a repository. It mentions obtaining source code for comparison methods: "In addition, we obtain the source code of all comparison methods from the authors and set the parameters of all comparison methods according to the original literature so that they output the best performance on all datasets."
Open Datasets | Yes | The benchmark datasets in our experiments include three citation datasets (i.e., DBLP (Bojchevski and Günnemann 2017), Cora and Citeseer (Yang, Cohen, and Salakhudinov 2016)), and two business datasets (i.e., Computers and Photo (Shchur et al. 2018)).
Dataset Splits | No | We partition each dataset into three non-overlapping subsets, i.e., training set, validation set, and test set. Furthermore, we follow the literature (Han et al. 2018; Jiang et al. 2018) to add noise into the training datasets.
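The paper cites Han et al. (2018) and Jiang et al. (2018) for the noise-injection protocol but does not restate it in this excerpt; a common choice in that line of work is symmetric (uniform) label noise, sketched below. The function name, signature, and seeding are illustrative assumptions, not details from the paper.

```python
import random

def add_symmetric_label_noise(labels, num_classes, noise_rate, seed=0):
    """Flip each training label to a different, uniformly chosen class
    with probability `noise_rate` (symmetric/uniform label noise)."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # pick uniformly among the other num_classes - 1 classes
            candidates = [c for c in range(num_classes) if c != y]
            noisy.append(rng.choice(candidates))
        else:
            noisy.append(y)
    return noisy
```

Under this scheme, a noise rate of 0.3 corrupts roughly 30% of the training labels while validation and test labels are left clean, matching the usual noisy-label evaluation protocol.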
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions employing methods like GCA (Zhu et al. 2021), DGI (Veličković et al. 2018), SUGRL (Mo et al. 2022), JoCoR (Wei et al. 2020), NRGNN (Dai, Aggarwal, and Wang 2021), and MTS-GNN (Liu et al. 2023a), and using a two-layer GCN as a backbone. However, it does not specify any software versions (e.g., Python, PyTorch, TensorFlow versions) or specific solver versions.
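The two-layer GCN backbone mentioned above follows the standard propagation rule of Kipf and Welling (2017): Z = Â · ReLU(Â X W1) · W2, with Â = D^(-1/2)(A + I)D^(-1/2). Below is a dependency-free sketch of that forward pass, assuming dense list-of-lists matrices; function names are illustrative, and the softmax output layer and training loop are omitted.

```python
def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def normalized_adjacency(adj):
    """Symmetrically normalized adjacency with self-loops,
    A_hat = D^(-1/2) (A + I) D^(-1/2), as used by GCN."""
    n = len(adj)
    a_loop = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
              for i in range(n)]
    deg = [sum(row) for row in a_loop]          # degrees incl. self-loop
    inv_sqrt = [d ** -0.5 for d in deg]
    return [[inv_sqrt[i] * a_loop[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]

def gcn_two_layer(adj, features, w1, w2):
    """Forward pass of a two-layer GCN:
    Z = A_hat * ReLU(A_hat * X * W1) * W2 (softmax omitted)."""
    a_hat = normalized_adjacency(adj)
    h = matmul(a_hat, matmul(features, w1))
    h = [[max(0.0, v) for v in row] for row in h]  # ReLU
    return matmul(a_hat, matmul(h, w2))
```

A dense implementation like this is only for illustration; practical GCN code uses sparse matrix operations for large graphs.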
Experiment Setup | No | The paper describes general experimental procedures such as partitioning datasets, adding noise, conducting five experiments with different random seeds, and using a two-layer GCN as the backbone. It also mentions two key hyperparameters, ρ and r, for which sensitivity analysis is performed. However, it does not provide specific numerical values for hyperparameters (e.g., learning rate, batch size, number of epochs) or other detailed training configurations used for the main results.