Certifiably Robust Model Evaluation in Federated Learning under Meta-Distributional Shifts
Authors: Amir Najafi, Samin Mahdizadeh Sani, Farzan Farnia
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our work is mainly theoretical; in any case, we present a series of experiments on real-world datasets to show the tightness and computability of our bounds in practice. First, we outline our client generation model and present a number of non-robust risk CDF guarantees. A more complete set of experiments with complementary explanations can be found in Appendix G. We simulated a federated learning scenario with n = 1000 nodes, where each node contains 1000 local samples. The experiments were conducted using four different datasets: CIFAR-10 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011), EMNIST (Cohen et al., 2017), and ImageNet (Russakovsky et al., 2015). |
| Researcher Affiliation | Academia | 1Department of Computer Engineering, Sharif University of Technology, Tehran, Iran (Corresponding author) 2Department of Electrical and Computer Engineering, University of Tehran, Tehran, Iran 3Department of Computer Science and Engineering, The Chinese University of Hong Kong (CUHK), Hong Kong. Correspondence to: Amir Najafi <EMAIL>, Samin Mahdizadeh Sani <EMAIL>, Farzan Farnia <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 (Server-side Bisection Algorithm). Require: K, ε, δ, input h, tolerance τ > 0, and a poly(K, log(1/τ)) query budget for each Q̂_Vk, k ∈ [K]. 1: Initialize a ← min ℓ(·) (or 0), b ← max ℓ(·) (or 1). 2: while b − a > τ do 3: t ← (a + b)/2. 4: Solve the convex feasibility problem: find ρ1, …, ρK ≥ ε/K satisfying the (ε, δ, K)-dependent budget constraint on Σ_{k∈[K]} ρk such that (1/K) Σ_{k∈[K]} Q̂_Vk(h, ρk) ≥ t. 5: if the problem is feasible then a ← t else b ← t. 6: end while. Output: upper bound b. |
| Open Source Code | Yes | The project code is available at: github.com/samin-mehdizadeh/Robust-Evaluation-DKW |
| Open Datasets | Yes | The experiments were conducted using four different datasets: CIFAR-10 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011), EMNIST (Cohen et al., 2017), and ImageNet (Russakovsky et al., 2015). |
| Dataset Splits | Yes | We simulated a federated learning scenario with n = 1000 nodes, where each node contains 1000 local samples. The experiments were conducted using four different datasets: CIFAR-10 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011), EMNIST (Cohen et al., 2017), and ImageNet (Russakovsky et al., 2015). ... Figure 4 illustrates our bounds on the risk CDF of unseen clients with no shifts. We selected 100 nodes from the population and considered 400 other nodes as unseen clients. |
| Hardware Specification | No | No specific hardware details are provided in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers are listed in the paper. |
| Experiment Setup | Yes | Feature Distribution Shift: ... The standard deviation varies based on the dataset: 0.05 for CIFAR-10 and SVHN, 0.1 for EMNIST, and 0.01 for ImageNet. ... Label Distribution Shift: ... In our experiments, we use α = 0.4. ... Resolutions: ... The Dirichlet α coefficients for the first (source) meta-distribution range from 0.4 to 0.7 for the four lower resolutions and from 0.7 to 1 for the four higher resolutions. For the second (target) meta-distribution, the ranges are reversed: 0.7 to 1 for the lower resolutions and 0.4 to 0.7 for the higher resolutions. ... Colors: The color intensity of the images varies from 0.00 (gray-scale) to 1.00 (fully colored). For the source meta-distribution, the α coefficients range from 0 to 0.5 for images with color intensity below 0.5, and from 0.5 to 1 for images above 0.5. |
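The server-side bisection in Algorithm 1 reduces to a standard pattern: binary-search over a candidate threshold t, delegating each step to a feasibility oracle. A minimal sketch of that control flow follows; the `feasible` callable stands in for the paper's convex feasibility subproblem (finding an admissible allocation ρ1, …, ρK whose averaged quantile-risk estimate meets t), which is not reimplemented here.

```python
def bisection_upper_bound(feasible, lo=0.0, hi=1.0, tol=1e-3):
    """Bisection over a threshold t, following the structure of the
    paper's Algorithm 1 (server-side bisection).

    feasible(t) -- placeholder for the convex feasibility subproblem:
    returns True iff some allocation rho_1..rho_K exists whose averaged
    risk-quantile estimate reaches threshold t.
    """
    a, b = lo, hi                      # a = min loss (or 0), b = max loss (or 1)
    while b - a > tol:
        t = (a + b) / 2.0
        if feasible(t):
            a = t                      # t is achievable: raise the lower end
        else:
            b = t                      # infeasible: tighten the upper bound
    return b                           # certified upper bound
```

For instance, with the toy oracle `lambda t: t <= 0.42` the routine converges to a value within `tol` of 0.42, illustrating that the loop needs only O(log(1/tol)) oracle calls, consistent with the stated poly(K, log 1/τ) query budget.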
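The experiment setup simulates n = 1000 clients with a Dirichlet(α) label-distribution shift (α = 0.4). A common way to realize this is to draw, for each class, a Dirichlet weight vector over clients and split that class's samples proportionally. The sketch below illustrates this under stated assumptions: the helper name, output format, and proportional-rounding scheme are ours, not taken from the paper's code.

```python
import random
from collections import defaultdict

def make_federated_clients(labels, n_clients=1000, alpha=0.4, seed=0):
    """Partition a labeled dataset into non-IID client shards via a
    Dirichlet(alpha) label-distribution shift (alpha = 0.4 in the paper's
    experiments). Returns a list of index lists, one per client.
    NOTE: illustrative sketch, not the authors' implementation."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    shards = [[] for _ in range(n_clients)]
    for idx in by_class.values():
        rng.shuffle(idx)
        # Dirichlet(alpha) weights via normalized Gamma(alpha, 1) draws.
        g = [rng.gammavariate(alpha, 1.0) for _ in range(n_clients)]
        total = sum(g)
        start = 0
        for client in range(n_clients):
            take = round(g[client] / total * len(idx))
            shards[client].extend(idx[start:start + take])
            start += take
        shards[-1].extend(idx[start:])  # hand any rounding remainder to the last client
    return shards
```

Smaller α concentrates each class on fewer clients (stronger shift), while large α approaches an IID split; sweeping α over ranges such as 0.4–0.7 versus 0.7–1 is exactly how the source and target meta-distributions above are made to differ.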