Federated Conditional Stochastic Optimization
Authors: Xidong Wu, Jianhui Sun, Zhengmian Hu, Junyi Li, Aidong Zhang, Heng Huang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on various tasks validate the efficiency of these algorithms. ... The experiments are run on CPU machines with AMD EPYC 7513 32-Core Processors as well as NVIDIA RTX A6000. |
| Researcher Affiliation | Academia | Xidong Wu, Department of ECE, University of Pittsburgh, Pittsburgh, PA 15213; Jianhui Sun, Computer Science, University of Virginia, Charlottesville, VA 22903; Zhengmian Hu, Computer Science, University of Maryland, College Park, MD 20742; Junyi Li, Department of ECE, University of Pittsburgh, Pittsburgh, PA 15213; Aidong Zhang, Computer Science, University of Virginia, Charlottesville, VA 22903; Heng Huang, Computer Science, University of Maryland, College Park, MD 20742 |
| Pseudocode | Yes | Algorithm 1 FCSG and FCSG-M Algorithm ... Algorithm 2 Acc-FCSG-M Algorithm (a hedged sketch of the FCSG update appears below this table) |
| Open Source Code | Yes | The code is available, and the Federated Online AUPRC maximization task follows [38]. https://github.com/xidongwu/Federated-Minimax-and-Conditional-Stochastic-Optimization/tree/main https://github.com/xidongwu/D-AUPRC |
| Open Datasets | Yes | We apply our methods to few-shot image classification on the Omniglot [24, 10]. ... We choose MNIST dataset and CIFAR-10 datasets. |
| Dataset Splits | Yes | We divide the characters into train/validation/test as 1028/172/423 by Torchmeta [7], and tasks are evenly partitioned into disjoint sets and distributed randomly among 16 clients (a sketch of this partitioning appears below this table). |
| Hardware Specification | Yes | The experiments are run on CPU machines with AMD EPYC 7513 32-Core Processors as well as NVIDIA RTX A6000. |
| Software Dependencies | No | The paper mentions 'Torchmeta [7]' and 'PyTorch' (within the reference for Torchmeta) but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We carefully tune hyperparameters for both methods. λ = 0.001 and α = 10. We run a grid search for the learning rate and choose the learning rate in the set {0.01, 0.005, 0.001}. β in FCSG-M is chosen from the set {0.001, 0.01, 0.1, 0.5, 0.9}. The local update step is set as 50. ... For all methods, the model is trained using a single gradient step with a learning rate of 0.4. The model was evaluated using 3 gradient steps [10]. Then we use grid search and carefully tune other hyperparameters for each method. We choose the learning rate from the set {0.1, 0.05, 0.01} and η as 1 [11]. We select the inner-state momentum coefficient for Local-SCGD and Local-SCGDM from {0.1, 0.5, 0.9} and the outer momentum coefficient for Local-SCGDM, FCSG-M, and Acc-FCSG-M from {0.1, 0.5, 0.9} (a generic grid-search sketch appears below this table). |
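The pseudocode row cites Algorithm 1 (FCSG and FCSG-M) and Algorithm 2 (Acc-FCSG-M) without reproducing them. Below is a minimal sketch of one client's FCSG-style local steps, assuming the standard conditional stochastic optimization objective F(x) = E_ξ[f_ξ(E_{η|ξ}[g_η(x, ξ)])] and its biased minibatch gradient estimator. It is not the authors' implementation; `sample_outer`, `sample_inner`, `g`, `grad_g`, and `grad_f` are hypothetical placeholders.

```python
import numpy as np

def fcsg_local_steps(x, K, lr, sample_outer, sample_inner, g, grad_g, grad_f, m=8):
    """One client's K local FCSG-style steps (hedged sketch, not the paper's exact algorithm).

    Objective: F(x) = E_xi[ f_xi( E_{eta|xi}[ g(x, xi, eta) ] ) ].
    The biased CSO estimator averages m inner samples to approximate the
    inner expectation before applying the chain rule.
    """
    for _ in range(K):
        xi = sample_outer()                              # outer sample xi
        etas = [sample_inner(xi) for _ in range(m)]      # inner minibatch eta ~ P(.|xi)
        inner = np.mean([g(x, xi, eta) for eta in etas], axis=0)
        jac = np.mean([grad_g(x, xi, eta) for eta in etas], axis=0)  # (d_y, d_x) Jacobian
        x = x - lr * jac.T @ grad_f(inner, xi)           # chain rule: (dg/dx)^T f'(inner)
        # (FCSG-M would additionally maintain a momentum buffer here.)
    return x

def server_average(client_params):
    """Periodic federated averaging of client iterates after K local steps."""
    return np.mean(client_params, axis=0)
```

Per the quoted algorithm names, FCSG-M adds a β-weighted momentum term to this update and Acc-FCSG-M a variance-reduced (accelerated) estimator; both are omitted here for brevity.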
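For the dataset-splits row, the 1028/172/423 character split is produced by Torchmeta in the paper; the sketch below only mirrors the described even, disjoint, random assignment of tasks to 16 clients and uses hypothetical task IDs.

```python
import random

def partition_tasks(tasks, num_clients=16, seed=0):
    """Evenly partition tasks into disjoint sets, one per client (sketch)."""
    rng = random.Random(seed)
    tasks = list(tasks)
    rng.shuffle(tasks)                        # random assignment, as described
    return [tasks[i::num_clients] for i in range(num_clients)]

# Hypothetical usage: the 1028/172/423 character split is handled upstream
# (e.g., by Torchmeta); here we only spread the meta-training tasks over 16 clients.
client_tasks = partition_tasks(range(1028))
assert sum(len(t) for t in client_tasks) == 1028
```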
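Finally, the experiment-setup row describes grid searches over learning rates and momentum coefficients. A generic sweep over the quoted grids might look as follows; `train_and_eval` is a hypothetical stand-in for the actual federated training loop.

```python
from itertools import product

# Grids quoted from the paper's setup.
learning_rates = [0.01, 0.005, 0.001]
betas = [0.001, 0.01, 0.1, 0.5, 0.9]   # FCSG-M momentum coefficients

def grid_search(train_and_eval):
    """Return the (lr, beta) pair with the best validation score (sketch)."""
    best, best_score = None, float("-inf")
    for lr, beta in product(learning_rates, betas):
        score = train_and_eval(lr=lr, beta=beta)   # hypothetical training call
        if score > best_score:
            best, best_score = (lr, beta), score
    return best, best_score
```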