Jigsaw Game: Federated Clustering

Authors: Jinxuan Xu, Hong-You Chen, Wei-Lun Chao, Yuqian Zhang

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically demonstrate the robustness of FeCA under various federated scenarios on both synthetic and real-world data. Additionally, we extend FeCA to representation learning and present DeepFeCA, which combines DeepCluster and FeCA for unsupervised feature learning in the federated setting. We evaluate both FeCA and DeepFeCA on benchmark datasets, including S-sets (Fränti & Sieranoja, 2018), CIFAR (Krizhevsky et al., 2009), and Tiny-ImageNet (Le & Yang, 2015).
Researcher Affiliation | Academia | Jinxuan Xu (EMAIL), Department of Electrical and Computer Engineering, Rutgers University; Hong-You Chen (EMAIL), Department of Computer Science and Engineering, The Ohio State University; Wei-Lun Chao (EMAIL), Department of Computer Science and Engineering, The Ohio State University; Yuqian Zhang (EMAIL), Department of Electrical and Computer Engineering, Rutgers University
Pseudocode | Yes | Algorithm 1: Federated Centroid Aggregation (FeCA) ... Algorithm 2: FeCA Client Update ... Algorithm 3: FeCA Radius Assign (Theoretical) ... Algorithm 4: FeCA Radius Assign (Empirical) ... Algorithm 5: FeCA Server Aggregation ... Algorithm 6: DeepFeCA for Representation Learning
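Read alongside the algorithm list above, a minimal sketch of the overall client/server structure (local clustering on each client, centroid aggregation at the server) might look like the following. The use of plain Lloyd's k-means for both steps and all function names are illustrative assumptions; the paper's FeCA additionally involves radius assignment (Algorithms 3 and 4), which this sketch omits.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # recompute centroids, keeping the old one if a cluster empties
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids

def federated_centroid_aggregation(client_data, k):
    """Illustrative two-stage scheme: each client clusters locally,
    then the server pools the client centroids and re-clusters them
    into k global centroids (hypothetical simplification of FeCA)."""
    local = [kmeans(X, k) for X in client_data]   # client update
    pooled = np.vstack(local)                     # communicated to server
    return kmeans(pooled, k)                      # server aggregation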
Open Source Code | No | The paper states 'We modified the official implementation of DeepCluster-v2 (Caron et al., 2020)' but does not provide a release statement or link for its own source code.
Open Datasets | Yes | We evaluate both FeCA and DeepFeCA on benchmark datasets, including S-sets (Fränti & Sieranoja, 2018), CIFAR (Krizhevsky et al., 2009), and Tiny-ImageNet (Le & Yang, 2015).
Dataset Splits | Yes | To simulate non-IID data partitions, we follow Hsu et al. (2019) and split the data across clients using draws from Dirichlet(α); a smaller α yields a more heterogeneous split. We also include an IID setting, in which clients receive uniformly split subsets of the entire dataset. The number of clients is standardized to M = 10 for all experiments in this section.
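The Dirichlet split described above (following Hsu et al., 2019) can be sketched as follows. The function name and the way proportions are turned into cut points are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=10, alpha=0.5, seed=0):
    """Split sample indices across clients so that each class's examples
    are divided according to Dirichlet(alpha) proportions.
    Smaller alpha -> more heterogeneous (non-IID) client shards."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # per-client proportions for this class
        props = rng.dirichlet(alpha * np.ones(num_clients))
        # convert proportions to cut points in the shuffled index list
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, shard in zip(client_indices, np.split(idx, cuts)):
            client.extend(shard.tolist())
    return client_indices
```

With α = 0.5 the shards are noticeably imbalanced per class, while a large α (e.g. 100) approaches the IID setting described above.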
Hardware Specification | No | The paper does not provide specific hardware details, such as GPU/CPU models or memory, used for running the experiments.
Software Dependencies | No | The paper mentions models such as ResNet-18, methods such as DeepCluster-v2, and frameworks such as FedAvg, but does not list specific software libraries or their version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | We randomly initialize a ResNet-18 model and train it for 150 rounds, with all 10 clients fully participating. Each round, client models are trained for 5 local epochs with batch size 128.
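The communication structure implied by this setup (150 rounds, 10 fully participating clients, 5 local epochs, batch size 128, in a FedAvg-style loop as referenced in the Software Dependencies row) can be sketched generically. The helper names, the equal-weight averaging, and the abstraction of local training into a callback are assumptions for illustration.

```python
import copy

# Hyperparameters reported in the paper's experiment setup
ROUNDS, NUM_CLIENTS, LOCAL_EPOCHS, BATCH_SIZE = 150, 10, 5, 128

def fedavg_round(global_weights, clients, local_update):
    """One communication round: every client starts from the global
    weights, trains locally via `local_update`, and the server averages
    the returned weight dicts (equal client weighting assumed)."""
    updated = [local_update(copy.deepcopy(global_weights), c) for c in clients]
    return {k: sum(w[k] for w in updated) / len(updated)
            for k in global_weights}
```

In a full run, `fedavg_round` would be called `ROUNDS` times, with `local_update` performing `LOCAL_EPOCHS` passes over the client's shard at batch size `BATCH_SIZE`.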