A Non-parametric View of FedAvg and FedProx: Beyond Stationary Points
Authors: Lili Su, Jiaming Xu, Pengkun Yang
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments further corroborate our theoretical findings. Specifically, in Section 7.1 we demonstrate that both FedAvg and FedProx can achieve low estimation errors despite the failure of reaching stationary points. The same phenomenon persists when minibatches are used in local updates. In Sections 7.2 and 7.3, we adapt the experiment setup to allow for unbalanced local data partition, covariate heterogeneity, and model heterogeneity. For both FedAvg and FedProx, we empirically observe that the federation gains are large when a client has a small local data size and the data heterogeneity is moderate, matching our theoretical predictions. In Section 7.4, we fit nonlinear models and confirm that both FedAvg and FedProx continue to attain nearly optimal estimation rates. |
| Researcher Affiliation | Academia | Lili Su (Electrical and Computer Engineering, Northeastern University); Jiaming Xu (The Fuqua School of Business, Duke University); Pengkun Yang (Center for Statistical Science, Tsinghua University) |
| Pseudocode | No | The paper describes algorithms like FedAvg and FedProx verbally and with mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific links to source code repositories or an explicit statement about releasing code in supplementary materials or otherwise. |
| Open Datasets | No | The paper describes generating synthetic datasets for its numerical experiments. For example, in Section 7.1, it states: "The local design matrices Xi are independent random matrices with i.i.d. N(0, 1) entries." In Section 7.4, it states: "In the experiments, we generate covariates xij using the uniform grid." No pre-existing public datasets are used or explicitly made available. |
| Dataset Splits | Yes | In Section 7.1: "We let M = 25, d = 100, and ni = 500." In Section 7.2: "We choose M = 20, d = 100, ni = 50 for half of the clients, and ni = 500 for the remaining clients." In Section 7.3: "We choose M = 20, d = 100, σ = 0.5, ni = 50 for half of the clients, and ni = 500 for the remaining clients." In Section 7.4: "We consider M = 20 clients and equal size local dataset ni ∈ {1, 2, . . . , 10}". |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | In Section 7.1: "For both FedAvg and FedProx, we choose the step size η = 0.1." In Section 7.2: "We choose M = 20, d = 100, ni = 50 for half of the clients, and ni = 500 for the remaining clients." In Section 7.3: "we rescale the stepsize by choosing η = 0.1/(d/r) for the stability of local iterations." In Section 7.4: "We run FedAvg and FedProx with the same stepsize η = 0.1 as before." |
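For context, the Section 7.1 setup quoted above (FedAvg on synthetic linear regression with M = 25, d = 100, ni = 500, i.i.d. N(0, 1) design matrices) can be sketched as follows. This is a minimal illustration, not the authors' code: the number of local steps, the number of communication rounds, the noise level, the ground-truth model, and the per-sample step-size scaling are all assumptions introduced here.

```python
import numpy as np

# Hedged sketch of the Section 7.1 setup: FedAvg on synthetic linear
# regression. Local-step/round counts, noise level, ground truth, and
# the step-size scaling by n_i are assumptions, not from the paper.
rng = np.random.default_rng(0)
M, d, n_i = 25, 100, 500      # clients, dimension, local sample size (Sec. 7.1)
eta = 0.1 / n_i               # paper reports eta = 0.1; per-sample scaling assumed
local_steps, rounds = 5, 50   # assumed values

theta_star = rng.standard_normal(d) / np.sqrt(d)        # assumed ground truth
X = [rng.standard_normal((n_i, d)) for _ in range(M)]   # i.i.d. N(0, 1) designs
y = [Xi @ theta_star + 0.1 * rng.standard_normal(n_i) for Xi in X]

theta = np.zeros(d)
for _ in range(rounds):
    local_models = []
    for Xi, yi in zip(X, y):
        w = theta.copy()
        for _ in range(local_steps):          # local gradient steps (FedAvg)
            w -= eta * Xi.T @ (Xi @ w - yi)   # gradient of 0.5 * ||Xi w - yi||^2
        local_models.append(w)
    theta = np.mean(local_models, axis=0)     # server-side averaging

est_error = np.linalg.norm(theta - theta_star)
```

Swapping the inner update for `w -= eta * (Xi.T @ (Xi @ w - yi) + mu * (w - theta))` with a proximal weight `mu` would give the analogous FedProx sketch.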