pFedGPA: Diffusion-based Generative Parameter Aggregation for Personalized Federated Learning

Authors: Jiahao Lai, Jiaqi Li, Jian Xu, Yanru Wu, Boshi Tang, Siqi Chen, Yongfeng Huang, Wenbo Ding, Yang Li

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experimental results consistently demonstrate the superior performance of the proposed method across multiple datasets, surpassing baseline approaches." (Experiments Setup, Datasets and Local Models:) "We conduct image classification tasks and evaluate our method on three widely-used benchmark datasets: Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), EMNIST (Cohen et al. 2017), and CIFAR-10 (Krizhevsky and Hinton 2009)."
Researcher Affiliation | Academia | (1) Tsinghua Shenzhen International Graduate School, Tsinghua University; (2) Department of Computer Science and Engineering, The Chinese University of Hong Kong
Pseudocode | Yes | Algorithm 1: pFedGPA
Input: communication rounds T, initialization rounds I
Server executes:
 1: for each round t = 1, 2, ..., T do
 2:   if client i uploads its model θ_i then
 3:     Update the diffusion model with θ_i according to the loss L_ddpm
 4:     Update the local model θ̂_i using inversion by Eq. (15)
 5:     θ_i ← LocalUpdate(i, θ̂_i)
 6:   end if
 7:   if a new client k joins the network then
 8:     Initialize the local model as θ̂_k^0 = θ_k^0
 9:     for each round l = 1, 2, ..., I do
10:       θ_k^l ← LocalUpdate(k, θ̂_k^{l−1})
11:       ε_φ(θ_k^l) ← ε_φ(θ_k^l) − (1 + ω)(θ_k^{l−1} − θ_k^l)
12:       Update the local model θ̂_k^l using denoising sampling by Eq. (6) iteratively for s steps
13:     end for
14:     θ_k ← LocalUpdate(k, θ̂_k^I)
15:   end if
16: end for
LocalUpdate(i, θ̂_i):
1: Update the local model: θ_i ← θ̂_i
2: for each batch (x, y) ∈ D_i do
3:   θ_i ← θ_i − λ ∇_θ ℓ_i(F_i(θ_i; x), y)
4: end for
5: return θ_i
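The LocalUpdate subroutine in Algorithm 1 is plain mini-batch SGD started from the server-provided parameters θ̂_i. A minimal sketch, with `grad_fn` standing in for ∇_θ ℓ_i(F_i(θ; x), y); the function names and signature are illustrative, not from the paper:

```python
import numpy as np

def local_update(theta_hat, batches, grad_fn, lr):
    """Sketch of LocalUpdate(i, theta_hat) from Algorithm 1:
    mini-batch SGD starting from the server-provided parameters."""
    theta = theta_hat.copy()                # 1: theta_i <- theta_hat_i
    for x, y in batches:                    # 2: for each batch (x, y) in D_i
        theta -= lr * grad_fn(theta, x, y)  # 3: theta_i <- theta_i - lr * grad
    return theta                            # 5: return theta_i
```

For instance, with a least-squares gradient `grad_fn = lambda th, x, y: x.T @ (x @ th - y) / len(y)`, repeated passes over the batches recover the true weights of a linear model.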
Open Source Code | No | The paper does not provide an explicit statement of code release or a link to a code repository. It only mentions that "The complete details of the model are provided in the Appendix," which refers to model architecture details, not source-code availability.
Open Datasets | Yes | (Experiments Setup, Datasets and Local Models:) "We conduct image classification tasks and evaluate our method on three widely-used benchmark datasets: Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), EMNIST (Cohen et al. 2017), and CIFAR-10 (Krizhevsky and Hinton 2009)."
Dataset Splits | Yes | (Data Partitioning:) "For the heterogeneous data distribution setting, we adopt the approach proposed in (Karimireddy et al. 2020b; Zhang et al. 2021), ensuring that all clients have equal data sizes. A portion of the data (s%, default 20%) is uniformly sampled from all classes, while the remaining (100 − s)% is sampled from a set of dominant classes specific to each client. Clients are grouped based on their dominant classes, though this grouping is unknown to the server. Additionally, the size of the local training data is kept small, specifically at 600 samples per client, to emphasize the necessity of FL. The testing data on each client is drawn from the same distribution as the training data."
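The partitioning quoted above can be sketched as follows: s% of a client's 600 samples are drawn uniformly from all classes, and the remaining (100 − s)% from the client's dominant classes. This is a minimal illustration; the function and argument names are our own, not the paper's.

```python
import numpy as np

def sample_client_indices(labels, dominant_classes, n_samples=600, s=0.2, seed=0):
    """Draw one client's sample indices: a uniform part over all classes
    plus a dominant part restricted to the client's dominant classes."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    idx = np.arange(len(labels))
    n_uniform = int(round(s * n_samples))
    # s% sampled uniformly from the whole pool
    uniform_part = rng.choice(idx, size=n_uniform, replace=False)
    # remaining (100 - s)% sampled only from the dominant classes
    dominant_pool = idx[np.isin(labels, dominant_classes)]
    dominant_part = rng.choice(dominant_pool, size=n_samples - n_uniform,
                               replace=False)
    return np.concatenate([uniform_part, dominant_part])
```

With the default s = 0.2, at least 80% of each client's 600 samples come from its dominant classes, reproducing the label skew described in the quote.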
Hardware Specification | Yes | (Practical Considerations:) "In our experiments, training a round of the diffusion model takes about an hour on a single Nvidia 4090 24GB GPU, with the entire FL process completing in four hours."
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., a Python version, or library versions such as PyTorch or TensorFlow).
Experiment Setup | Yes | (Training Details:) "All local models are trained using mini-batch SGD as the optimizer, with 2 epochs per round and a batch size of 50. The number of global communication rounds is set to 200 for Fashion-MNIST and EMNIST, and 300 for CIFAR-10. For Fashion-MNIST, the entire model is generated, while for EMNIST and CIFAR-10, only the final two fully connected layers are generated. We report the average test accuracy across clients. For pFedGPA, we trained the diffusion models using parameters collected from the last 20 rounds, generating new parameters in the final round."
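The reported hyperparameters can be collected into a single configuration sketch for reproduction attempts; the dictionary keys below are our own naming, while the values are taken directly from the quoted Training Details.

```python
# Hyperparameters reported in the paper's Training Details
# (dictionary structure and key names are our own).
PFEDGPA_CONFIG = {
    "optimizer": "SGD",
    "local_epochs_per_round": 2,
    "batch_size": 50,
    "communication_rounds": {
        "Fashion-MNIST": 200,
        "EMNIST": 200,
        "CIFAR-10": 300,
    },
    # Which part of the local model the diffusion model generates:
    "generated_layers": {
        "Fashion-MNIST": "entire model",
        "EMNIST": "final two fully connected layers",
        "CIFAR-10": "final two fully connected layers",
    },
    # Diffusion model trained on parameters from the last 20 rounds.
    "diffusion_history_rounds": 20,
}
```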