reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Scalable Attribute-Missing Graph Clustering via Neighborhood Differentiation

Authors: Yaowen Hu, Wenxuan Tu, Yue Liu, Xinhang Wan, Junyi Yan, Taichun Zhou, Xinwang Liu

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	"Experimental results on six widely used graph datasets demonstrate that CMV-ND significantly improves the performance of various methods." and "We validate CMV-ND through experiments on six widely used graph datasets, evaluating its superiority, sensitivity, efficiency, robustness, and effectiveness."
Researcher Affiliation	Academia	1College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China 2School of Computer Science and Technology, Hainan University, Haikou 570228, China. Correspondence to: Xinwang Liu <EMAIL>.
Pseudocode	Yes	"As shown in Algorithm 1, the full procedure of CMV-ND is presented." and "We provide the PyTorch-style pseudocode for our CMV-ND in Algorithm 2."
Open Source Code	No	The paper does not provide an explicit statement or link for the open-source code of the described methodology.
Open Datasets	Yes	Cora: https://docs.dgl.ai/#CoraGraphDataset; CiteSeer: https://docs.dgl.ai/#dgl.data.CiteseerGraphDataset; Amazon-Photo: https://docs.dgl.ai/#dgl.data.AmazonCoBuyPhotoDataset; ogbn-arxiv: https://ogb.stanford.edu/docs/nodeprop/#ogbn-arxiv; Reddit: https://docs.dgl.ai/#dgl.data.RedditDataset; ogbn-products: https://ogb.stanford.edu/docs/nodeprop/#ogbn-products
Dataset Splits	No	The paper mentions evaluating clustering performance on six graph datasets with a 0.6 missing rate and setting the number of clusters to the ground-truth number of classes, but does not provide specific training/test/validation dataset splits for nodes or graph structure.
Hardware Specification	Yes	All experiments are performed on a system equipped with a 24GB RTX 3090 GPU and 64GB RAM.
Software Dependencies	Yes	All experiments are implemented using Python 3.9 and PyTorch 1.12.
Experiment Setup	Yes	Unless otherwise specified, we set the number of propagation hops to K = 7 and the missing attribute rate to 0.6. For fair comparison, all downstream clustering methods follow the default hyperparameter configurations used in their original implementations. The number of clusters is set to the ground-truth number of classes for each dataset.
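
The Experiment Setup row reports a missing attribute rate of 0.6, but this summary does not say how the attributes are removed. The following is a minimal sketch of one common reading, assuming that a node is either fully observed or fully missing and that missing attributes are replaced by zeros. The function name mask_node_attributes, the zero fill value, and the seed are hypothetical and are not taken from the paper.

```python
import random

# Values quoted in the Experiment Setup row.
NUM_HOPS = 7        # K, the number of propagation hops (not used in this sketch)
MISSING_RATE = 0.6  # assumed here to be the fraction of nodes with no attributes


def mask_node_attributes(features, missing_rate=MISSING_RATE, seed=0):
    """Return (masked_features, missing_ids) for a list of per-node feature lists.

    Assumption: missing nodes lose their whole attribute vector, which is
    replaced by zeros. The paper may use a different masking scheme.
    """
    if not 0.0 <= missing_rate <= 1.0:
        raise ValueError("missing_rate must be in [0, 1]")
    num_nodes = len(features)
    num_missing = int(round(missing_rate * num_nodes))
    rng = random.Random(seed)  # seeded so the same mask can be regenerated
    missing_ids = set(rng.sample(range(num_nodes), num_missing))
    masked = [
        [0.0] * len(row) if i in missing_ids else list(row)
        for i, row in enumerate(features)
    ]
    return masked, missing_ids


if __name__ == "__main__":
    # Toy input: 10 nodes with 2 attributes each.
    toy = [[float(i) + 1.0, float(i) + 1.5] for i in range(10)]
    masked, missing = mask_node_attributes(toy)
    print(len(missing), "of", len(toy), "nodes masked")
```

Fixing the seed matters for a reproduction attempt because the summary lists no dataset splits, so the set of masked nodes is the main source of run-to-run variation in a sketch like this one.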