Byzantine-Robust and Hessian-Free Federated Bilevel Optimization
Authors: Shruti P Maralappanavar, Bharath B N
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we present experimental results for the data hyper-cleaning application under various attacks and corroborate our theoretical findings. In particular, we consider the following attacks: (i) Bit Flipping (BF), (ii) Label Flipping (LF), (iii) Inner Product Manipulation (IPM), and (iv) A Little is Enough (ALIE), and show that both the gradient and the constraint violation decrease and converge to a constant with an increasing number of communication rounds. ... Next, we present the experimental results for the Rob-Fed BOB algorithm and corroborate the theoretical findings made in this paper for N = 16 nodes: |
| Researcher Affiliation | Academia | Shruti Maralappanavar EMAIL Department of Electrical, Electronics and Communication Engineering Indian Institute of Technology Dharwad Dharwad, India B. N. Bharath EMAIL Department of Electrical, Electronics and Communication Engineering Indian Institute of Technology Dharwad Dharwad, India |
| Pseudocode | Yes | Algorithm 1 Rob-Fed BOB Algorithm — 1: Initialize x^0 ∈ R^{d1}, y^0 ∈ R^{d2} 2: for r = 0, 1, 2, ..., R−1 do 3: Send x^r, y^r to each node 4: Set z^{r,0} = y^r 5: for t = 0, 1, ..., T−1 do 6: for k ∈ N in parallel do 7: Send ∇_y g_k(x^r, y^{r,t}), k ∈ N 8: end for 9: y^{r,t+1} = y^{r,t} − γ ∇_y g_ag(x^r, y^{r,t}) 10: Send y^{r,t+1} to all nodes 11: end for 12: Set y^{r,T} = y^{r,T} 13: Server updates x^{r+1} and y^{r+1} using equation 8 and equation 7, respectively. 14: end for Output: x^R and y^R |
| Open Source Code | No | The paper does not provide a specific link to source code, nor does it contain an explicit statement about the release of code for the described methodology. |
| Open Datasets | Yes | We train a linear model on the MNIST dataset with N = 16 nodes. |
| Dataset Splits | No | The paper states: "We train a linear model on the MNIST dataset with N = 16 nodes. We divide the dataset into equal parts among G good nodes, and ensured that the data distribution is heterogeneous." And for applications like Data Hyper-cleaning: "D_k^{train} := {(a_{k,i}, b_{k,i})}_{i=1}^{m} is the noisy training data set and D_k^{val} := {(a_{k,i}, b_{k,i})}_{i=1}^{n} denotes a clean validation data set." However, it does not provide specific global percentages or absolute counts for training, validation, and test splits. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments. It only mentions training a linear model on the MNIST dataset. |
| Software Dependencies | No | The paper does not provide specific software names along with their version numbers that would be necessary to replicate the experiments. |
| Experiment Setup | Yes | Suppose assumptions 1-3 hold; then for the aggregator RAgg, Algorithm 1 achieves the stated bound ... for constant learning rates η ≤ 1/L_h, β ≤ 1/L_h, and T ≥ (2/(γ μ_g)) log(8 λ² R L_{g,max}² l_{g,max}² / μ_g²) ... We have chosen B = 3 and B = 6 for a total of N = 16 nodes, which result in approximately α = 0.2 and α = 0.4, respectively. ... Figure 5: Effect of λ on the convergence of Rob-Fed BOB (see (a)) and violation (see (b)) under BF attack in the log scale for the data hyper-cleaning application. (λ = 1 vs. λ = 300) ... Figure 6: Effect of T on the convergence of Rob-Fed BOB (see (a)) and violation (see (b)) under LF attack in the log scale for the data hyper-cleaning application. (T = 1 vs. T = 10) |
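The nested loop structure of Algorithm 1 quoted in the table can be sketched as follows. This is a minimal illustration, not the paper's implementation: the robust aggregator RAgg is stood in by a coordinate-wise median, the per-node inner gradients and the outer gradient are toy quadratic placeholders, and the server updates corresponding to the paper's equations 7 and 8 (not reproduced here) are approximated by plain gradient steps.

```python
import numpy as np

# Hypothetical dimensions and hyperparameters; the paper uses N = 16 nodes.
N, d1, d2 = 16, 4, 4          # nodes, outer dim, inner dim
R, T = 20, 10                 # outer rounds R, inner rounds T
eta, beta, gamma = 0.1, 0.1, 0.1


def grad_y_g(k, x, y):
    """Placeholder inner-objective gradient at node k (toy quadratic)."""
    return y - 0.1 * x + 0.01 * k


def grad_x_f(x, y):
    """Placeholder outer-objective gradient at the server (toy quadratic)."""
    return x + 0.5 * y


def robust_agg(grads):
    """Stand-in for the paper's RAgg: a coordinate-wise median,
    which ignores a minority of arbitrarily corrupted gradients."""
    return np.median(np.stack(grads), axis=0)


x, y = np.zeros(d1), np.zeros(d2)
for r in range(R):
    z = y.copy()                            # line 4: z^{r,0} = y^r
    for t in range(T):                      # inner loop, lines 5-11
        grads = [grad_y_g(k, x, z) for k in range(N)]
        z = z - gamma * robust_agg(grads)   # line 9, robust aggregation
    y = y - beta * (y - z)                  # crude stand-in for equation 7
    x = x - eta * grad_x_f(x, y)            # crude stand-in for equation 8
```

Under a BF- or IPM-style attack, a minority of the `grads` entries would be replaced by adversarial vectors before aggregation; the median stand-in keeps the update bounded in that case, which is the qualitative behavior the paper's RAgg guarantees.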