Certifiably Robust Model Evaluation in Federated Learning under Meta-Distributional Shifts

Authors: Amir Najafi, Samin Mahdizadeh Sani, Farzan Farnia

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Our work is mainly theoretical; in any case, we present a series of experiments on real-world datasets to show the tightness and computability of our bounds in practice. First, we outline our client generation model and present a number of non-robust risk CDF guarantees. A more complete set of experiments with complementary explanations can be found in Appendix G. We simulated a federated learning scenario with n = 1000 nodes, where each node contains 1000 local samples. The experiments were conducted using four different datasets: CIFAR-10 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011), EMNIST (Cohen et al., 2017), and ImageNet (Russakovsky et al., 2015).
Researcher Affiliation Academia 1Department of Computer Engineering, Sharif University of Technology, Tehran, Iran (Corresponding author) 2Department of Electrical and Computer Engineering, University of Tehran, Tehran, Iran 3Department of Computer Science and Engineering, The Chinese University of Hong Kong (CUHK), Hong Kong. Correspondence to: Amir Najafi <EMAIL>, Samin Mahdizadeh Sani <EMAIL>, Farzan Farnia <EMAIL>.
Pseudocode Yes Algorithm 1 Server-side Bisection Algorithm
Require: K, ε, δ; input h, tolerance Δ > 0, and a poly(K, log(1/Δ)) query budget for Q̂V_k, for all k ∈ [K]
1: Initialize a ← min ℓ(·) (or 0), b ← max ℓ(·) (or 1)
2: while b − a > Δ do
3:   t ← (a + b)/2
4:   Solve convex feasibility problem:
5:   Find ρ_1, …, ρ_K ≥ ε/K such that
6:   Σ_{k∈[K]} ρ_k ≤ ε(1 + √(1/δ)/K)
7:   (1/K) Σ_{k∈[K]} Q̂V_k(h, ρ_k) ≥ t
8:   if problem is feasible then
9:     a ← t
10:   else
11:     b ← t
12:   end if
13: end while
output: upper bound b
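The bisection routine quoted above can be sketched in Python. Purely for illustration, each client's certified quantile estimate Q̂V_k is assumed to be affine and non-decreasing in its radius ρ_k (intercept c_k, slope s_k), so the inner convex feasibility step collapses to a closed-form greedy budget allocation; the `slack` parameter stands in for the paper's δ-dependent budget inflation. All names and the affine assumption are illustrative, not the paper's actual oracle.

```python
import numpy as np

def max_avg_qv(c, s, eps, slack):
    """Maximise (1/K) * sum_k (c_k + s_k * rho_k) subject to
    rho_k >= eps/K and sum_k rho_k <= eps * (1 + slack).
    With an affine, non-decreasing qv_k, the spare budget all
    goes to the client with the steepest slope s_k."""
    K = len(c)
    rho = np.full(K, eps / K)                  # lower bounds rho_k >= eps/K
    spare = eps * (1.0 + slack) - rho.sum()    # remaining shift budget
    rho[int(np.argmax(s))] += spare            # greedy: steepest client
    return float(np.mean(c + s * rho))

def bisection_upper_bound(c, s, eps, slack, tol=1e-4, a=0.0, b=1.0):
    """Sketch of Algorithm 1: bisect on the threshold t; t is
    'feasible' when some admissible allocation of radii drives the
    average quantile estimate to at least t."""
    while b - a > tol:
        t = 0.5 * (a + b)
        if max_avg_qv(c, s, eps, slack) >= t:  # feasible: bound lies above t
            a = t
        else:
            b = t
    return b                                   # certified upper bound
```

For instance, with two clients `c = [0.2, 0.3]`, `s = [1.0, 2.0]`, `eps = 0.1`, `slack = 0.5`, the spare budget lands on the second client and the bisection converges to the attainable maximum of the average quantile.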
Open Source Code Yes The project code is available at: github.com/samin-mehdizadeh/Robust-Evaluation-DKW
Open Datasets Yes The experiments were conducted using four different datasets: CIFAR-10 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011), EMNIST (Cohen et al., 2017), and ImageNet (Russakovsky et al., 2015).
Dataset Splits Yes We simulated a federated learning scenario with n = 1000 nodes, where each node contains 1000 local samples. The experiments were conducted using four different datasets: CIFAR-10 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011), EMNIST (Cohen et al., 2017), and ImageNet (Russakovsky et al., 2015). ... Figure 4 illustrates our bounds on the risk CDF of unseen clients with no shifts. We selected 100 nodes from the population and considered 400 other nodes as unseen clients.
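The node-generation and split described in this excerpt can be sketched as follows; the dataset size and the with-replacement sampling are assumptions for illustration, not details confirmed by the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_nodes, samples_per_node = 1000, 1000
dataset_size = 50_000  # e.g. the CIFAR-10 training set (assumed)

# Each simulated client holds 1000 local sample indices
# (drawn with replacement here purely for illustration).
clients = [rng.choice(dataset_size, size=samples_per_node)
           for _ in range(n_nodes)]

# Split the population: 100 observed nodes vs. 400 unseen clients.
perm = rng.permutation(n_nodes)
observed, unseen = perm[:100], perm[100:500]
```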
Hardware Specification No No specific hardware details are provided in the paper.
Software Dependencies No No specific software dependencies with version numbers are listed in the paper.
Experiment Setup Yes Feature Distribution Shift: ... The standard deviation varies based on the dataset: 0.05 for CIFAR-10 and SVHN, 0.1 for EMNIST, and 0.01 for ImageNet. ... Label Distribution Shift: ... In our experiments, we use α = 0.4. ... Resolutions: ... The Dirichlet α coefficients for the first (source) meta-distribution range from 0.4 to 0.7 for the four lower resolutions and from 0.7 to 1 for the four higher resolutions. For the second (target) meta-distribution, the ranges are reversed: 0.7 to 1 for the lower resolutions and 0.4 to 0.7 for the higher resolutions. ... Colors: The color intensity of the images varies from 0.00 (gray-scale) to 1.00 (fully colored). For the source meta-distribution, the α coefficients range from 0 to 0.5 for images with color intensity below 0.5, and from 0.5 to 1 for images above 0.5.
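The two shift mechanisms quoted above can be sketched as follows: a symmetric Dirichlet draw with α = 0.4 for per-client label proportions, and additive Gaussian noise with the per-dataset standard deviations from the setup description. The toy image batch and client count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Label distribution shift: per-client class proportions drawn from a
# symmetric Dirichlet with alpha = 0.4 (smaller alpha -> more skew).
n_classes, alpha, n_clients = 10, 0.4, 5
class_props = rng.dirichlet(np.full(n_classes, alpha), size=n_clients)

# Feature distribution shift: additive Gaussian noise whose standard
# deviation depends on the dataset (values from the setup description).
noise_std = {"cifar10": 0.05, "svhn": 0.05, "emnist": 0.1, "imagenet": 0.01}
x = rng.random((4, 32, 32, 3))  # toy image batch, values in [0, 1)
x_shifted = x + rng.normal(0.0, noise_std["cifar10"], size=x.shape)
```

Each row of `class_props` sums to one and gives the label mix of one simulated client; drawing local samples according to those proportions induces the label shift across clients.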