Personalized Federated Learning of Probabilistic Models: A PAC-Bayesian Approach
Authors: Mahrokh Ghoddousi Boroujeni, Andreas Krause, Giancarlo Ferrari-Trecate
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate PAC-PFL on Gaussian Process (GP) regression and Bayesian Neural Network (BNN) classification as representative examples of probabilistic models. Our experiments demonstrate that PAC-PFL yields accurate and well-calibrated predictions (c1), even in highly heterogeneous (c2) and data-poor (c3) scenarios. |
| Researcher Affiliation | Academia | Mahrokh G. Boroujeni EMAIL Institute of Mechanical Engineering EPFL, Switzerland Andreas Krause EMAIL Department of Computer Science ETH Zürich, Switzerland Giancarlo Ferrari-Trecate EMAIL Institute of Mechanical Engineering EPFL, Switzerland |
| Pseudocode | Yes | Algorithm 1 PAC-PFL executed by the server... Algorithm 2 Client_Update for client i with dataset Si... Algorithm 3 Differentially private PAC-PFL with 1 SVGD particle executed by the server |
| Open Source Code | Yes | The codebase for our algorithm is available on https://sites.google.com/view/pac-pfl. ... The source code for our PAC-PFL implementation using GP is accessible within the same Google Drive repository. Upon acceptance, we intend to make the source code for BNN publicly available. To facilitate the use of our software, we have incorporated a demonstration Jupyter Notebook in the source code repository. |
| Open Datasets | Yes | We employ the FEMNIST dataset, which is curated and maintained by the LEAF project (Caldas et al., 2019). ... The PV dataset can be accessed via the following link: https://drive.google.com/drive/folders/153MeAlntN4VORHdgYQ3wG3OylW0SlBf9?usp=sharing. ... The EMNIST dataset is detailed in Appendix 8.5. |
| Dataset Splits | Yes | We utilize the original train-test split provided with the data, without any additional preprocessing. ... The first dataset comprises the initial two weeks of June 2018, which provides a total of 150 samples for each client. The second dataset encompasses the data from both June and July 2018, resulting in 610 training samples per client. For all experiments, the test dataset consists of the data from June and July 2019. |
| Hardware Specification | No | The paper does not state the hardware used for its experiments (e.g., GPU/CPU models, memory). It only alludes in passing to general computing environments such as running 'on a server' or 'on a cluster'. |
| Software Dependencies | No | The paper mentions the 'pvlib Python library (Holmgren et al., 2022)' and refers to other implementations, but does not provide specific version numbers for key software components or libraries (e.g., Python, PyTorch, TensorFlow, CUDA versions) used for its own experiments. |
| Experiment Setup | Yes | For all neural networks, we explore structures with the same number of neurons per layer. The number of neurons per layer can take values of 2^n for n ∈ {1, ..., 6}, and we consider 2 or 4 hidden layers. For PAC-PFL, we employ 4 SVGD particles and set k = 4. The parameter β is set to the number of samples for each client, β = m_i. ... In all PV experiments, we set the hyper-prior mean for the neural network weights and biases to 0 and the hyper-prior mean for the noise standard deviation to 0.4. |
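The hyperparameter search space quoted in the row above can be sketched as a small grid. The snippet below is a minimal illustration, assuming widths of 2^n for n ∈ {1, ..., 6}, depths of 2 or 4, and β tied to the client's sample count m_i; the helper name `make_config` and the field names are hypothetical, not taken from the paper's codebase:

```python
from itertools import product

# Widths 2^n for n in {1, ..., 6}, per the reported search space.
widths = [2 ** n for n in range(1, 7)]  # [2, 4, 8, 16, 32, 64]
depths = [2, 4]                         # number of hidden layers considered

NUM_SVGD_PARTICLES = 4  # SVGD particles used by PAC-PFL
K = 4                   # the reported k = 4

def make_config(width, depth, m_i):
    """Assemble one candidate configuration (illustrative structure).

    beta is tied to the client's sample count m_i, as stated in the paper.
    """
    return {
        "width": width,
        "depth": depth,
        "svgd_particles": NUM_SVGD_PARTICLES,
        "k": K,
        "beta": m_i,  # beta = m_i for client i
    }

# e.g. m_i = 150, matching the two-week PV training split per client
grid = [make_config(w, d, m_i=150) for w, d in product(widths, depths)]
print(len(grid))  # 6 widths x 2 depths = 12 configurations
```

This makes the size of the architecture search explicit: 6 width choices times 2 depth choices yields 12 network structures per setting.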