GADMM: Fast and Communication Efficient Framework for Distributed Machine Learning

Authors: Anis Elgabli, Jihong Park, Amrit S. Bedi, Mehdi Bennis, Vaneet Aggarwal

JMLR 2020

Reproducibility Variable: Result (followed by the LLM response)

Research Type: Experimental
  "To validate our theoretical foundations, we numerically evaluate the performance of GADMM in linear and logistic regression tasks, compared with the following benchmark algorithms."

Researcher Affiliation: Academia
  "Anis Elgabli EMAIL Jihong Park EMAIL Amrit S. Bedi EMAIL Mehdi Bennis EMAIL Vaneet Aggarwal EMAIL"

Pseudocode: Yes
  "The detailed steps of the proposed algorithm are summarized in Algorithm 1. ... The detailed steps of D-GADMM is described in Algorithm 2."

Open Source Code: No
  No explicit statement or link for the paper's source code is provided.

Open Datasets: Yes
  "All simulations are conducted using the synthetic and real datasets described in (Dua and Graff, 2017; Chen et al., 2018). ... Next, the real data tests linear and logistic regression tasks with Body Fat (252 samples, 14 features) and Derm (358 samples, 34 features) datasets (Dua and Graff, 2017), respectively. ... Dheeru Dua and Casey Graff. UCI machine learning repository, 2017. URL http://archive.ics.uci.edu/ml."

Dataset Splits: No
  The paper mentions that data is "evenly split into workers" (e.g., "We consider 1,200 samples with 50 features, which are evenly split into workers.") and gives the number of workers for the synthetic and real datasets, but it does not provide training/test/validation splits (percentages or counts) or reference standard splits for model evaluation.

Hardware Specification: No
  The paper discusses communication-cost metrics, bandwidth (B = 2 MHz), noise spectral density (N0 = 1e-6), and power consumption for communication, but it does not specify the hardware used for computation (e.g., CPU or GPU models, or memory) in the machine learning experiments.

Software Dependencies: No
  "For the tuning parameters, we use the setup in (Chen et al., 2018)." No software dependencies with version numbers are provided for the implementation of GADMM or D-GADMM.

Experiment Setup: Yes
  "For linear regression with the synthetic dataset, Fig. 2 shows that all variants of GADMM with ρ = 3, 5, and 7 achieve the target objective error of 10^-4 in less than 1,000 iterations... For logistic regression, Figs. 4 and 5 validate that GADMM outperforms the benchmark algorithms... D-GADMM, ρ = 1, coherence time = 15 iter... D-GADMM, refresh rate = 1, D-GADMM, refresh rate = 10, D-GADMM, refresh rate = 50."
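Since the paper releases no code, the quoted setup (linear regression, data evenly split across workers arranged in a chain, alternating head/tail updates with penalty ρ) can only be approximated. Below is a minimal NumPy sketch of a GADMM-style chain for least squares under those assumptions; `gadmm_linear_regression` and all variable names are illustrative inventions, not the authors' implementation.

```python
import numpy as np

def gadmm_linear_regression(X_parts, y_parts, rho=3.0, iters=500):
    """Illustrative GADMM-style chain for least squares (not the paper's code).

    Worker n holds (X_parts[n], y_parts[n]) and a local model theta[n],
    with chain constraints theta[n] = theta[n+1]. Even-indexed ("head")
    workers update first, then odd-indexed ("tail") workers, so each worker
    only ever communicates with its two neighbours.
    """
    N = len(X_parts)
    d = X_parts[0].shape[1]
    theta = [np.zeros(d) for _ in range(N)]
    lam = [np.zeros(d) for _ in range(N - 1)]  # dual for theta[n] = theta[n+1]

    def local_update(n):
        # Closed-form minimizer of the local augmented Lagrangian:
        # (X^T X + c*rho*I) theta = X^T y + lam[n-1] - lam[n]
        #                           + rho * (sum of neighbour models),
        # where c is the number of neighbours (1 at the chain ends, else 2).
        X, y = X_parts[n], y_parts[n]
        rhs = X.T @ y
        c = 0
        if n > 0:
            rhs += lam[n - 1] + rho * theta[n - 1]
            c += 1
        if n < N - 1:
            rhs += -lam[n] + rho * theta[n + 1]
            c += 1
        theta[n] = np.linalg.solve(X.T @ X + c * rho * np.eye(d), rhs)

    for _ in range(iters):
        for n in range(0, N, 2):   # head workers update in parallel
            local_update(n)
        for n in range(1, N, 2):   # then tail workers, using fresh neighbours
            local_update(n)
        for n in range(N - 1):     # dual ascent on each chain constraint
            lam[n] += rho * (theta[n] - theta[n + 1])
    return theta
```

On a small noiseless synthetic problem (e.g., four workers with 40 Gaussian samples of dimension 5 each), the per-worker models reach consensus at the centralized least-squares solution well within the iteration budgets quoted above.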