Generalization in Federated Learning: A Conditional Mutual Information Framework

Authors: Ziqiao Wang, Cheng Long, Yongyi Mao

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluations confirm that our evaluated CMI bounds are non-vacuous and accurately capture the generalization behavior of FL algorithms. To verify our results, we conduct FL experiments using FedAvg (McMahan et al., 2017). We conduct image classification experiments on two datasets: MNIST (LeCun et al., 2010) and CIFAR-10 (Krizhevsky, 2009).
Researcher Affiliation | Academia | 1 School of Computer Science and Technology, Tongji University, Shanghai, China; 2 Department of Applied Physics and Applied Mathematics, Columbia University, New York, USA; 3 School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada.
Pseudocode | No | The paper describes the methodology mathematically and textually but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | We adapt the code from https://github.com/hrayrhar/f-CMI for supersample construction and CMI computation, and we use the FL training code from https://github.com/vaseline555/Federated-Learning-in-PyTorch.
Open Datasets | Yes | We conduct image classification experiments on two datasets: MNIST (LeCun et al., 2010) and CIFAR-10 (Krizhevsky, 2009).
Dataset Splits | Yes | Additionally, we apply a pathological non-IID data partitioning scheme as in McMahan et al. (2017): data are sorted by label, split into 200 shards of size 300, and each client is randomly assigned 2 shards, and we evaluate prediction error as our performance metric... When analyzing generalization behavior concerning the sample size n, we fix the superclient size at 100, leading to 50 participating clients and 50 non-participating clients randomly selected by V. The sample size per client varies within n ∈ {10, 50, 100, 250}. When analyzing generalization behavior with respect to the number of participating clients K, we set n = 100 for MNIST and n = 50 for CIFAR-10, varying the number of clients as K ∈ {10, 20, 30, 50}.
Hardware Specification | Yes | All experiments are performed using NVIDIA A100 GPUs with 40 GB of memory.
Software Dependencies | No | We adapt the code from https://github.com/hrayrhar/f-CMI for supersample construction and CMI computation, and we use the FL training code from https://github.com/vaseline555/Federated-Learning-in-PyTorch. No specific version numbers for libraries or frameworks are provided.
Experiment Setup | Yes | Each local training algorithm Ai trains this model using full-batch GD with an initial learning rate of 0.1, which decays by a factor of 0.01 every 10 steps. At each FL round, clients train locally for 5 epochs before sending their models to the central server. The entire training process spans 300 communication rounds between clients and the central server... Each local training algorithm Ai trains the CNN model using SGD with a mini-batch size of 50 and follows the same learning rate schedule as in the MNIST experiment. As in the MNIST setup, clients train locally for five epochs per round before sending their models to the central server, with training spanning 300 communication rounds.
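The report quotes the paper's use of the f-CMI code for supersample construction. As a rough illustration of what that step involves, the sketch below builds a generic supersample: n disjoint index pairs plus a Bernoulli(1/2) membership vector that picks, per pair, which element is trained on and which is held out. The function name and signature are hypothetical, not taken from the f-CMI repository.

```python
import numpy as np

def build_supersample(dataset_size, n, seed=0):
    """Draw n disjoint index pairs and a Bernoulli(1/2) membership vector.

    For pair i, bit u_i selects the element used for training; the other
    element is held out for evaluating the CMI bound. Hypothetical sketch
    of a supersample construction, not the authors' exact code.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(dataset_size, size=(n, 2), replace=False)  # n disjoint pairs
    u = rng.integers(0, 2, size=n)            # membership bits U_i
    train_idx = idx[np.arange(n), u]          # indices selected for training
    heldout_idx = idx[np.arange(n), 1 - u]    # held-out ("ghost") indices
    return train_idx, heldout_idx, u
```

The training and held-out index sets are disjoint by construction, which is what makes the paired supersample usable for estimating evaluated-CMI quantities.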
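The pathological non-IID split quoted in the Dataset Splits row (sort by label, cut into 200 shards of size 300, assign 2 shards per client) can be sketched as follows. This is a generic reimplementation of the McMahan et al. (2017) scheme under those stated numbers; the function name and randomization details are assumptions, not the paper's code.

```python
import numpy as np

def pathological_partition(labels, num_shards=200, shard_size=300,
                           shards_per_client=2, seed=0):
    """Sort indices by label, cut into shards, assign shards to clients.

    Sketch of the pathological non-IID split of McMahan et al. (2017)
    with the shard sizes quoted in the report; details are assumed.
    """
    rng = np.random.default_rng(seed)
    order = np.argsort(labels, kind="stable")  # indices sorted by label
    shards = order[: num_shards * shard_size].reshape(num_shards, shard_size)
    perm = rng.permutation(num_shards)         # random shard-to-client assignment
    num_clients = num_shards // shards_per_client
    return {
        c: np.concatenate(
            shards[perm[c * shards_per_client:(c + 1) * shards_per_client]]
        )
        for c in range(num_clients)
    }
```

With 200 shards and 2 shards per client this yields 100 clients of 600 samples each, and because shards are cut from label-sorted data, each client typically sees at most two classes.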
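The Experiment Setup row quotes a step-decay learning rate and FedAvg aggregation across 300 rounds. The sketch below shows the two pieces in isolation: a step-decay schedule (reading "decays by a factor of 0.01 every 10 steps" as multiplying the rate by 0.01, an interpretation on our part) and the FedAvg server step that averages client parameters weighted by local sample counts. Both functions are illustrative, not the authors' training code.

```python
import numpy as np

def lr_at_step(step, lr0=0.1, gamma=0.01, every=10):
    """Step-decay schedule: multiply the rate by `gamma` every `every` steps.

    Assumes "decays by a factor of 0.01" means multiplication by 0.01
    (equivalent to a StepLR-style schedule); this reading is an assumption.
    """
    return lr0 * gamma ** (step // every)

def fedavg_aggregate(client_states, client_sizes):
    """FedAvg server step: sample-size-weighted average of client parameters.

    `client_states` is a list of dicts mapping parameter names to arrays.
    """
    total = sum(client_sizes)
    return {
        k: sum(s[k] * (m / total) for s, m in zip(client_states, client_sizes))
        for k in client_states[0]
    }
```

In a full run, each client would take its local GD/SGD steps under the schedule, send its state to the server, and the server would call the aggregation once per communication round, 300 times in total.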