Generalization in Federated Learning: A Conditional Mutual Information Framework
Authors: Ziqiao Wang, Cheng Long, Yongyi Mao
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations confirm that our evaluated CMI bounds are non-vacuous and accurately capture the generalization behavior of FL algorithms. To verify our results, we conduct FL experiments using FedAvg (McMahan et al., 2017) on two datasets. We conduct image classification experiments on two datasets: MNIST (LeCun et al., 2010) and CIFAR-10 (Krizhevsky, 2009). |
| Researcher Affiliation | Academia | 1School of Computer Science and Technology, Tongji University, Shanghai, China 2Department of Applied Physics and Applied Mathematics, Columbia University, New York, USA 3School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada. |
| Pseudocode | No | The paper describes the methodology mathematically and textually but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | We adapt the code from https://github.com/hrayrhar/f-CMI for supersample construction and CMI computation, and we use the FL training code from https://github.com/vaseline555/Federated-Learning-in-PyTorch. |
| Open Datasets | Yes | We conduct image classification experiments on two datasets: MNIST (LeCun et al., 2010) and CIFAR-10 (Krizhevsky, 2009). |
| Dataset Splits | Yes | Additionally, we apply a pathological non-IID data partitioning scheme as in McMahan et al. (2017): data are sorted by label, split into 200 shards of size 300, and each client is randomly assigned 2 shards, and we evaluate prediction error as our performance metric... When analyzing generalization behavior concerning the sample size n, we fix the superclient size at 100, leading to 50 participating clients and 50 non-participating clients randomly selected by V. The sample size per client varies within n ∈ {10, 50, 100, 250}. When analyzing generalization behavior with respect to the number of participating clients K, we set n = 100 for MNIST and n = 50 for CIFAR-10, varying the number of clients as K ∈ {10, 20, 30, 50}. |
| Hardware Specification | Yes | All experiments are performed using NVIDIA A100 GPUs with 40 GB of memory. |
| Software Dependencies | No | We adapt the code from https://github.com/hrayrhar/f-CMI for supersample construction and CMI computation, and we use the FL training code from https://github.com/vaseline555/Federated-Learning-in-PyTorch. No specific version numbers for libraries or frameworks are provided. |
| Experiment Setup | Yes | Each local training algorithm Ai trains this model using full-batch GD with an initial learning rate of 0.1, which decays by a factor of 0.01 every 10 steps. At each FL round, clients train locally for 5 epochs before sending their models to the central server. The entire training process spans 300 communication rounds between clients and the central server... Each local training algorithm Ai trains the CNN model using SGD with a mini-batch size of 50 and follows the same learning rate schedule as in the MNIST experiment. As in the MNIST setup, clients train locally for five epochs per round before sending their models to the central server, with training spanning 300 communication rounds. |
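The Dataset Splits row describes the pathological non-IID partition of McMahan et al. (2017): sort the data by label, cut it into 200 shards of 300 samples, and hand each client 2 random shards. A minimal sketch of that scheme, assuming a 60,000-sample MNIST-like label array and illustrative function/parameter names (this is not the authors' code):

```python
import numpy as np

def pathological_non_iid_split(labels, num_clients=100, num_shards=200,
                               shard_size=300, seed=0):
    """Sketch of the pathological non-IID partition (McMahan et al., 2017):
    sort indices by label, cut into equal shards, assign shards randomly
    so each client gets num_shards // num_clients shards (2 in the paper)."""
    rng = np.random.default_rng(seed)
    order = np.argsort(labels, kind="stable")               # indices sorted by label
    shards = order[: num_shards * shard_size].reshape(num_shards, shard_size)
    shard_ids = rng.permutation(num_shards)                 # random shard assignment
    per_client = num_shards // num_clients
    return {
        c: np.concatenate([shards[s]
                           for s in shard_ids[c * per_client:(c + 1) * per_client]])
        for c in range(num_clients)
    }

# Toy usage: 60,000 balanced labels (10 classes) -> 100 clients, 600 samples each
labels = np.repeat(np.arange(10), 6000)
clients = pathological_non_iid_split(labels)
print(len(clients), len(clients[0]),
      max(len(np.unique(labels[idx])) for idx in clients.values()))
```

With balanced labels each shard holds a single class, so every client sees at most 2 classes, which is the extreme label skew the partition is designed to produce.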
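The Experiment Setup row quotes a step-decay schedule: an initial learning rate of 0.1 multiplied by 0.01 every 10 steps. A minimal self-contained sketch of that piecewise-constant schedule (the paper's code presumably uses a framework scheduler such as PyTorch's `StepLR`; the function below is only an illustration):

```python
def stepped_lr(step, initial_lr=0.1, decay=0.01, step_size=10):
    """Piecewise-constant schedule: multiply the lr by `decay`
    once every `step_size` optimization steps."""
    return initial_lr * decay ** (step // step_size)

# lr is 0.1 for steps 0-9, 0.001 for steps 10-19, 1e-5 for steps 20-29, ...
print([stepped_lr(s) for s in (0, 9, 10, 20)])
```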