Adaptive Clipping for Differentially Private Federated Learning in Interpolation Regimes

Authors: Takumi Fukami, Tomoya Murata, Kenta Niwa

TMLR 2025

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental — "Numerical experiments confirm the superiority of our adaptive clipping algorithm over standard DP optimization with fixed clipping radius in federated learning settings." "Additionally, we conduct numerical experiments to demonstrate the superiority of the proposed method over non-adaptive clipping approaches." (Sec. 5, Numerical Experiments)
Researcher Affiliation: Industry — Takumi Fukami (EMAIL), NTT Social Information Laboratories; Tomoya Murata (EMAIL), NTT DATA Mathematical Systems Inc.; Kenta Niwa (EMAIL), NTT Communication Science Laboratories.
Pseudocode: Yes — "The concrete procedures of our proposed method are given in Algorithm 1. In each round, we first compute C_r. ... The concrete procedures of DP-FedAvg are given in Algorithm 2 for record-level centralized DP."
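The quoted procedure (a DP-FedAvg-style round with a per-round clipping radius C_r) can be sketched as follows. The paper's exact rule for computing C_r is not reproduced in this report, so `adapt_radius` below uses a generic quantile-based adaptation as a placeholder; all function names and constants are illustrative, not the authors' implementation.

```python
import numpy as np

def clip(update, C):
    """Project an update onto the L2 ball of radius C."""
    norm = np.linalg.norm(update)
    return update if norm <= C else update * (C / norm)

def dp_fedavg_round(global_w, client_updates, C, sigma, rng):
    """One DP-FedAvg-style round: clip each client update, average,
    and add Gaussian noise scaled to the clipping radius C."""
    clipped = [clip(u, C) for u in client_updates]
    avg = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, sigma * C / len(client_updates), size=avg.shape)
    return global_w - (avg + noise)

def adapt_radius(C, client_updates, target_quantile=0.5, lr=0.2):
    """Placeholder adaptive rule (NOT the paper's C_r update): nudge C
    so that roughly a target fraction of updates falls within the ball."""
    norms = np.array([np.linalg.norm(u) for u in client_updates])
    frac_le = np.mean(norms <= C)  # fraction already within radius C
    return C * np.exp(-lr * (frac_le - target_quantile))
```

With `sigma = 0` the round reduces to plain FedAvg on clipped updates, which makes the clipping behavior easy to check in isolation.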
Open Source Code: No — The paper does not contain any explicit statement about code availability, such as a link to a repository or a declaration that code is provided in supplementary materials.
Open Datasets: Yes — "T2) Bank Marketing (n_train = 45,211), T3) MNIST (n_train = 3,000), and T4) Fashion-MNIST (n_train = 3,000) (Xiao et al., 2017)." Bank Marketing: https://archive.ics.uci.edu/dataset/222/bank+marketing
Dataset Splits: No — "The training dataset, consisting of n_train samples, was partitioned across P = 2 clients such that each client holds n_min = n_train/P samples. Test accuracy was recorded at the center server at the end of each round. As additional experiments, we conducted evaluations with P = 4, 6 clients, as well as experiments using a non-convex model. These results are provided in Appendix E.2 and Appendix E.3, respectively."
Hardware Specification: No — The paper mentions "CPU/GPU memory usage [GB]" in Table 6 but does not specify the models or types of CPUs or GPUs used for the experiments.
Software Dependencies: No — The paper does not specify any software dependencies with version numbers.
Experiment Setup: Yes — "Hyperparameter Tuning: We configured (R, K) for each dataset to ensure that the model parameters approached convergence, using a minibatch size of b = 100 and τ = 1. ... The results obtained with the optimal combination of (C, Ĝ) and η to achieve the lowest training loss are summarized in Subsec. 5.2. ... For the convex model, we employed a two-layer Multi-Layer Perceptron (MLP) with fixed (i.e., untrainable) first-layer weights to ensure convexity. ... In our setup, the hidden layer dimension was set to 128 units for T2) and 512 units for T3), T4). ... We employed MobileNet as the model architecture, with η = 0.3 and C = Ĝ = 0.05."