Progressive Distribution Matching for Federated Semi-Supervised Learning

Authors: Dongping Liao, Xitong Gao, Yabo Xu, Cheng-Zhong Xu

AAAI 2025

Reproducibility (Variable: Result, followed by the LLM response)
Research Type: Experimental. "Through extensive experiments, we demonstrate the superiority of FedPDM on a variety of models and datasets compared with prior arts for FSSL. ... We present the main evaluation of FedPDM and compared methods under two FSSL scenarios in Table 1. ... Figure 4 illustrates the convergence curves of evaluated FSSL methods under Mixed."
Researcher Affiliation: Collaboration. 1 State Key Lab of IoTSC, Department of Computer and Information Science, University of Macau, Macau SAR, China; 2 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; 3 Shenzhen University of Advanced Technology, Shenzhen, China; 4 DataStory Information Technology Co., Ltd.
Pseudocode: Yes. "Algorithm 1: FedPDM: Progressive Distribution Matching. Server Update (C, R, T): ..."
Open Source Code: No. The paper does not explicitly state that the source code is available, nor does it provide a link to a code repository.
Open Datasets: Yes. "We conduct experiments on five prevalent datasets for FSSL: Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), SVHN (Netzer et al. 2011), CIFAR-10 (Krizhevsky, Hinton et al. 2009), CIFAR-100 (Krizhevsky, Hinton et al. 2009), and ISIC2018 (Codella et al. 2019)."
Dataset Splits: Yes. "We partition the labeled and unlabeled datasets separately to multiple sub-datasets according to Dirichlet distribution Dir(α), where α controls the heterogeneity degree of local clients. A small α indicates high data heterogeneity, while a large α gives class-balanced data partitions."
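The Dir(α)-based split described above is the standard per-class Dirichlet partitioning scheme used in federated learning benchmarks. A minimal NumPy sketch (not the authors' code; the function name, seed, and toy sizes are illustrative) shows how a small α concentrates each class on few clients while a large α balances classes:

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Partition sample indices across clients via per-class Dirichlet draws.

    A small alpha yields highly heterogeneous (non-IID) client datasets;
    a large alpha approaches class-balanced partitions.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        # Shuffle the indices belonging to class c.
        idx = rng.permutation(np.flatnonzero(labels == c))
        # Draw the proportion of class-c samples assigned to each client.
        props = rng.dirichlet([alpha] * n_clients)
        # Convert proportions to cut points and hand out contiguous shards.
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, shard in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(shard.tolist())
    return client_indices

# Toy usage: 1000 samples over 10 classes, split across 5 clients.
toy_labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(toy_labels, n_clients=5, alpha=0.5)
```

Per the quoted setup, the labeled and unlabeled datasets would each be passed through such a partitioner separately, so a client's labeled and unlabeled class mixes are drawn independently.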
Hardware Specification: No. The paper does not provide specific details about the hardware used for experiments (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies: No. The paper mentions using the SGD optimizer but does not specify versions of any programming languages or libraries (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup: Yes. "For all experiments, we used the SGD optimizer with batch size 64, momentum 0.9, and weight decay 5e-4. We utilized ResNet18 (He et al. 2016) for the main evaluation. We set the number of communication rounds T as 400 and local epochs e as 5. For Mixed, we set the learning rate as 0.03. For Pure, the learning rates for labeled and unlabeled clients are respectively set as 0.03 and 0.02. A cosine learning rate scheduler was used, with the total decay rounds set to 400. ... To select high-quality pseudo-labels, we set the confidence threshold τ = 0.95 on all datasets without further tuning."
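The fixed confidence threshold τ = 0.95 mentioned in the setup is the usual FixMatch-style filter: an unlabeled sample receives a pseudo-label only if the model's softmax confidence exceeds τ. A hedged NumPy sketch (a generic illustration, not FedPDM's actual selection rule; function name and toy logits are assumptions):

```python
import numpy as np

def select_pseudo_labels(logits, tau=0.95):
    """Keep only predictions whose softmax confidence exceeds tau.

    Returns (mask, pseudo): mask marks confident unlabeled samples,
    pseudo holds the argmax class per sample (valid where mask is True).
    """
    # Stable softmax over the class dimension.
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    conf = probs.max(axis=1)
    return conf >= tau, probs.argmax(axis=1)

# Toy usage: one very confident prediction, one near-uniform one.
toy_logits = np.array([[8.0, 0.0, 0.0],    # confident -> kept
                       [1.0, 0.9, 0.8]])   # low confidence -> dropped
mask, pseudo = select_pseudo_labels(toy_logits, tau=0.95)
```

Only the masked samples would then contribute to the unsupervised loss during each client's local epochs.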