Graph Consistency and Diversity Measurement for Federated Multi-View Clustering

Authors: Bohang Sun, Yongjian Deng, Yuena Lin, Qiuru Hai, Zhen Yang, Gengyu Lyu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on various datasets have demonstrated the effectiveness of our proposed method on clustering federated multi-view data. ... Experiments. Experimental Settings. Datasets: We employ six widely-used multi-view datasets for comparative studies... The Compared Methods: In order to verify the effectiveness of our proposed MGCD, we employ eight state-of-the-art multi-view clustering methods for comparative experiments... Metrics: There are four widely-used metrics applied to quantitatively evaluate the performance of multi-view clustering methods... Ablation Study: We conduct two series of ablation studies from the perspective of loss functions and model components."
Researcher Affiliation | Collaboration | Bohang Sun^1, Yongjian Deng^1, Yuena Lin^{1,2}, Qiuru Hai^1, Zhen Yang^1, Gengyu Lyu^1* — ^1 College of Computer Science, Beijing University of Technology; ^2 Idealism Beijing Technology Co., Ltd. EMAIL, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: The Training Process of MGCD.
Open Source Code | No | The paper states: "all compared methods are implemented according to the source codes released by the authors", but this refers to the source code of other methods, not the code for the method proposed in this paper. There is no explicit statement or link provided for the code related to MGCD.
Open Datasets | Yes | "We employ six widely-used multi-view datasets for comparative studies, including Mfeat (Wang, Yang, and Liu 2019), Scene (Fei-Fei and Perona 2005), Aloi (Li et al. 2023), Animal (Li et al. 2016), Cifar10 (Zhang et al. 2018) and Noisy MNIST (Peng et al. 2019). The specific characteristics of these datasets are recorded in Table 1."
Dataset Splits | No | The paper mentions training on "mini-batches of size 256" but does not specify how the overall datasets (Mfeat, Scene, Aloi, Animal, Cifar10, Noisy MNIST) were split into training, validation, or test sets for reproduction.
Hardware Specification | Yes | "All experiments are conducted on the same machine with the Intel(R) Xeon(R) Gold 6148 2.40GHz CPU, GeForce RTX 3090 GPUs, and 512GB RAM."
Software Dependencies | No | The paper mentions using the "Adam optimizer (Kingma and Ba 2014)" and the "PyTorch (Paszke et al. 2019) framework" but does not specify explicit version numbers for PyTorch or other libraries, which is required for reproducible software dependencies.
Experiment Setup | Yes | "The diversity encoder E_d^v, consistency encoder E_c^v and decoder D^v are formulated by four fully-connected layers and the dimensions are set to {D_v, 500, 500, 2000, 512}, {D_v, 500, 500, 2000, 512} and {512, 2000, 500, 500, D_v} respectively, where the activation function is ReLU. The f_d^v(·) in each client is composed of a fully-connected layer with dimensions {512, N}. To reduce the size of the server model, the dimensions of f_pre(·) and f_post(·) are respectively set as {N, 500, N} and {N, 500, N}, where N >> 500. At the first training epoch, we pre-train the dual autoencoder for 20 epochs in each client. Then, at the following training epochs, the clients and server iteratively train on mini-batches of size 256 by using the Adam optimizer (Kingma and Ba 2014) with a learning rate of 0.000001 in the PyTorch (Paszke et al. 2019) framework. The hyperparameters γ and λ are set to 100 and 0.001 respectively."
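
The per-client architecture quoted above can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' released code (none exists, per the Open Source Code row): the class and variable names (DualAutoencoder, D_v, N), the choice of D_v = 784, and the decision to feed only the consistency code to the decoder are my own assumptions — the quote specifies the layer widths, ReLU activations, batch size, optimizer, and learning rate, but not how the two latent codes are fused before decoding.

```python
# Hedged sketch of the per-client dual autoencoder from the Experiment Setup
# quote. Only the layer widths {D_v, 500, 500, 2000, 512} / {512, 2000, 500,
# 500, D_v}, the ReLU activations, the {512, N} head, the batch size of 256,
# and the Adam lr of 1e-6 come from the paper; everything else is assumed.
import torch
import torch.nn as nn


def mlp(dims):
    """Fully-connected stack with ReLU between hidden layers."""
    layers = []
    for i in range(len(dims) - 1):
        layers.append(nn.Linear(dims[i], dims[i + 1]))
        if i < len(dims) - 2:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)


class DualAutoencoder(nn.Module):
    def __init__(self, D_v, N):
        super().__init__()
        self.enc_div = mlp([D_v, 500, 500, 2000, 512])  # diversity encoder E_d^v
        self.enc_con = mlp([D_v, 500, 500, 2000, 512])  # consistency encoder E_c^v
        self.dec = mlp([512, 2000, 500, 500, D_v])      # decoder D^v
        self.f_d = nn.Linear(512, N)                    # f_d^v with dims {512, N}

    def forward(self, x):
        z_d = self.enc_div(x)
        z_c = self.enc_con(x)
        # Assumption: decode from the consistency code only; the paper's quote
        # does not say how (or whether) the two codes are combined here.
        return self.dec(z_c), self.f_d(z_d)


# D_v = 784 and N = 1000 are hypothetical placeholder values.
model = DualAutoencoder(D_v=784, N=1000)
opt = torch.optim.Adam(model.parameters(), lr=1e-6)  # lr from the quote
x = torch.randn(256, 784)                            # mini-batch of size 256
recon, graph_logits = model(x)
```

The {N, 500, N} bottleneck shapes of f_pre(·) and f_post(·) on the server suggest N is a per-sample dimension (the quote only states N >> 500), which is why the head output above is left as a generic N-dimensional vector rather than a cluster assignment.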