Data-Free Black-Box Federated Learning via Zeroth-Order Gradient Estimation
Authors: Xinge Ma, Jin Wang, Xuejie Zhang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on large-scale image classification datasets and network architectures demonstrate the superiority of FedZGE in terms of data heterogeneity, model heterogeneity, communication efficiency, and privacy protection. Experimental Setup Datasets. The evaluation is conducted on two image classification datasets commonly used in FL research: 1) CIFAR-10 (Krizhevsky 2009); 2) CIFAR-100 (Krizhevsky 2009). |
| Researcher Affiliation | Academia | Xinge Ma, Jin Wang*, Xuejie Zhang School of Information Science and Engineering Yunnan University Kunming, China EMAIL, EMAIL |
| Pseudocode | Yes | See Appendix A for detailed algorithmic procedures of the proposed FedZGE framework. |
| Open Source Code | Yes | Code: https://github.com/maxinge8698/FedZGE |
| Open Datasets | Yes | The evaluation is conducted on two image classification datasets commonly used in FL research: 1) CIFAR-10 (Krizhevsky 2009); 2) CIFAR-100 (Krizhevsky 2009). |
| Dataset Splits | Yes | To simulate data heterogeneity among clients, we follow prior work (Hsu, Qi, and Brown 2019) to heterogeneously partition the training set of each dataset among clients using a Dirichlet distribution Dir(α), where α is a concentration parameter that controls the degree of non-IID, with smaller values indicating more heterogeneous data distribution. To ensure reliable performance evaluation, we run the experiments three times with different random seeds and report the average accuracy with standard deviation of the global model on the original test set. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | Settings. The experiments are performed under two distinct FL settings with the key hyperparameters α={1, 0.1}, K=10, ε=1, T=100, B=500, and q=10: 1) homogeneous FL setting, where clients are forced to replicate homogeneous local models with the same architecture as the global model. We employ three types of network architectures to explore the effects of model scaling, including ResNet-18, ResNet-34, and ResNet-50 (He et al. 2016); 2) heterogeneous FL setting, where clients are allowed to independently design heterogeneous local models. We employ ResNet-50 as the global model and allocate ResNet-18, ResNet-34, and ResNet-50 as the local models to clients in a ratio of 3:3:4. See Appendix C for full implementation details. |
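The Dirichlet-based heterogeneous partition described under "Dataset Splits" (following Hsu, Qi, and Brown 2019) can be sketched as follows. This is a generic illustration, not the paper's released code; the function name `dirichlet_partition` and its defaults are assumptions. For each class, client proportions are drawn from Dir(α) and that class's samples are split accordingly, so smaller α yields more skewed (non-IID) client label distributions.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=10, alpha=0.1, seed=0):
    """Partition sample indices among clients via a per-class Dirichlet prior.

    For each class c, draw client proportions p ~ Dir(alpha * 1) and assign
    that class's indices to clients according to p. Smaller alpha makes the
    per-client label distributions more heterogeneous.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Convert cumulative proportions into split points for this class.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, shard in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(shard.tolist())
    return client_indices
```

For example, partitioning CIFAR-10-style labels with α=0.1 and K=10 clients typically leaves each client dominated by a handful of classes, matching the paper's non-IID setting.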
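The title's "zeroth-order gradient estimation" refers to estimating gradients of a black-box function from value queries only. The sketch below is a standard Gaussian-smoothing two-point estimator with q random directions (matching the q=10 hyperparameter above in spirit), not necessarily FedZGE's exact formulation; the name `zo_gradient` and the smoothing radius `mu` are illustrative assumptions.

```python
import numpy as np

def zo_gradient(f, x, q=10, mu=1e-3, seed=0):
    """Two-point zeroth-order gradient estimate of a black-box f at x.

    Averages q Gaussian-direction finite differences:
        g ~= (1/q) * sum_i [(f(x + mu*u_i) - f(x)) / mu] * u_i,
    with u_i ~ N(0, I), whose expectation approaches grad f(x) as mu -> 0.
    """
    rng = np.random.default_rng(seed)
    g = np.zeros_like(x)
    fx = f(x)  # one query reused across all directions
    for _ in range(q):
        u = rng.standard_normal(x.size)
        g += (f(x + mu * u) - fx) / mu * u
    return g / q
```

On a smooth test function the estimate converges to the true gradient as q grows, which is why such estimators let a server update a model it can only query, never backpropagate through.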