A Non-parametric View of FedAvg and FedProx: Beyond Stationary Points

Authors: Lili Su, Jiaming Xu, Pengkun Yang

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments further corroborate our theoretical findings. Specifically, in Section 7.1 we demonstrate that both FedAvg and FedProx can achieve low estimation errors despite failing to reach stationary points. The same phenomenon persists when minibatches are used in local updates. In Sections 7.2 and 7.3, we adapt the experiment setup to allow for unbalanced local data partition, covariate heterogeneity, and model heterogeneity. For both FedAvg and FedProx, we empirically observe that the federation gains are large when a client has a small local data size and the data heterogeneity is moderate, matching our theoretical predictions. In Section 7.4, we fit nonlinear models and confirm that both FedAvg and FedProx continue to attain nearly optimal estimation rates.
Researcher Affiliation | Academia | Lili Su (EMAIL), Electrical and Computer Engineering, Northeastern University; Jiaming Xu (EMAIL), The Fuqua School of Business, Duke University; Pengkun Yang (EMAIL), Center for Statistical Science, Tsinghua University
Pseudocode | No | The paper describes algorithms like FedAvg and FedProx verbally and with mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any links to source code repositories, nor an explicit statement about releasing code in supplementary materials or otherwise.
Open Datasets | No | The paper describes generating synthetic datasets for its numerical experiments. For example, in Section 7.1, it states: "The local design matrices Xi are independent random matrices with i.i.d. N(0, 1) entries." In Section 7.4, it states: "In the experiments, we generate covariates xij using the uniform grid." No pre-existing public datasets are used or explicitly made available.
Dataset Splits | Yes | In Section 7.1: "We let M = 25, d = 100, and ni = 500." In Section 7.2: "We choose M = 20, d = 100, ni = 50 for half of the clients, and ni = 500 for the remaining clients." In Section 7.3: "We choose M = 20, d = 100, σ = 0.5, ni = 50 for half of the clients, and ni = 500 for the remaining clients." In Section 7.4: "We consider M = 20 clients and equal size local datasets with ni ∈ {1, 2, ..., 10}".
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running its experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | In Section 7.1: "For both FedAvg and FedProx, we choose the step size η = 0.1." In Section 7.2: "We choose M = 20, d = 100, ni = 50 for half of the clients, and ni = 500 for the remaining clients." In Section 7.3: "we rescale the stepsize by choosing η = 0.1/(d/r) for the stability of local iterations." In Section 7.4: "We run FedAvg and FedProx with the same stepsize η = 0.1 as before."
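The Section 7.1 setup quoted above (M = 25 clients, d = 100, ni = 500, local design matrices with i.i.d. N(0, 1) entries, step size η = 0.1) is concrete enough to sketch in code. The following is a minimal illustrative reimplementation, not the authors' code: the ground-truth parameter, noise level, number of local steps, and number of communication rounds are all hypothetical choices made here for the sketch, and local updates use full-batch gradient descent on a least-squares loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes follow Section 7.1 of the paper: M clients, dimension d,
# n_i observations per client. Everything else is an assumption.
M, d, n_i = 25, 100, 500
theta_star = rng.normal(size=d) / np.sqrt(d)  # hypothetical ground truth

# Local design matrices with i.i.d. N(0, 1) entries, as described in the paper;
# responses are linear in theta_star plus small Gaussian noise (assumed here).
X = [rng.normal(size=(n_i, d)) for _ in range(M)]
y = [X[m] @ theta_star + 0.1 * rng.normal(size=n_i) for m in range(M)]

def local_update(theta, Xm, ym, eta=0.1, steps=10):
    """Full-batch gradient steps on the local least-squares loss."""
    for _ in range(steps):
        grad = Xm.T @ (Xm @ theta - ym) / len(ym)
        theta = theta - eta * grad
    return theta

# FedAvg: each round, every client runs local steps from the current
# global model, and the server averages the resulting local models.
theta = np.zeros(d)
for _ in range(30):
    local_models = [local_update(theta, X[m], y[m]) for m in range(M)]
    theta = np.mean(local_models, axis=0)

print("estimation error:", np.linalg.norm(theta - theta_star))
```

With homogeneous local data as above, the averaged iterate drives the estimation error ‖θ − θ*‖ down toward the noise floor, consistent with the paper's observation that FedAvg attains low estimation error; reproducing the heterogeneity effects of Sections 7.2-7.3 would require the unbalanced splits described in the table.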