Robust and Communication-Efficient Collaborative Learning
Authors: Amirhossein Reisizadeh, Hossein Taheri, Aryan Mokhtari, Hamed Hassani, Ramtin Pedarsani
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we numerically evaluate the performance of the proposed QuanTimed-DSGD method described in Algorithm 1 for solving a class of non-convex decentralized optimization problems. In particular, we compare the total run-time of the QuanTimed-DSGD scheme with the ones for three benchmarks which are briefly described below. ... We carry out two sets of experiments over CIFAR-10 and MNIST datasets... |
| Researcher Affiliation | Academia | Amirhossein Reisizadeh, ECE Department, University of California, Santa Barbara, EMAIL; Hossein Taheri, ECE Department, University of California, Santa Barbara, EMAIL; Aryan Mokhtari, ECE Department, The University of Texas at Austin, EMAIL; Hamed Hassani, ESE Department, University of Pennsylvania, EMAIL; Ramtin Pedarsani, ECE Department, University of California, Santa Barbara, EMAIL |
| Pseudocode | Yes | Algorithm 1 QuanTimed-DSGD at node i |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We carry out two sets of experiments over CIFAR-10 and MNIST datasets, where each worker is assigned with a sample set of size m = 200 for both datasets. |
| Dataset Splits | No | The paper mentions using CIFAR-10 and MNIST datasets and assigns `m=200` samples per worker, but it does not explicitly provide information on train/validation/test splits (percentages, counts, or predefined citations). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers, such as programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | In experiments over CIFAR-10, step-sizes are fine-tuned as (α, ε) = (0.08/T^{1/6}, 14/T^{1/2}) for QuanTimed-DSGD and Q-DSGD, and α = 0.015 for DSGD and Asynchronous DSGD. In MNIST experiments, step-sizes are fine-tuned to (α, ε) = (0.3/T^{1/6}, 15/T^{1/2}) for QuanTimed-DSGD and Q-DSGD, and α = 0.2 for DSGD. We implement the unbiased low-precision quantizer in (7) with various quantization levels s, and we let Tc denote the communication time of a p-vector without quantization (16-bit precision). The communication time for a quantized vector is then scaled proportionally to the quantization level. In order to ensure that the expected batch size used in each node is a target positive number b, we choose the deadline Td = b/E[V], where V ∼ Uniform(10, 90) is the random computation speed. The communication graph is a random Erdős–Rényi graph with edge connectivity pc = 0.4 and n = 50 nodes. The weight matrix is designed as W = I − L/δ, where L is the Laplacian matrix of the graph and δ > λmax(L)/2. |
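The experiment setup mentions an "unbiased low precision quantizer in (7) with various quantization levels s" but the report does not reproduce the formula. A minimal sketch, assuming the standard QSGD-style stochastic quantizer (each coordinate randomly rounded on an s-level grid so the estimator is unbiased); `quantize` is a hypothetical helper name, not from the paper, and only NumPy is assumed:

```python
import numpy as np

def quantize(x, s, rng=None):
    """Unbiased stochastic low-precision quantizer with s levels (QSGD-style).

    Each coordinate x_i is mapped to ||x|| * sign(x_i) * xi_i, where xi_i
    randomly rounds |x_i|/||x|| to one of the grid points {0, 1/s, ..., 1}
    with probabilities chosen so that E[quantize(x, s)] = x.
    """
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(x)
    if norm == 0.0:
        return np.zeros_like(x)
    ratio = np.abs(x) / norm            # normalized magnitudes, in [0, 1]
    lower = np.floor(ratio * s)         # grid index l such that l/s <= ratio
    prob_up = ratio * s - lower         # rounding up w.p. prob_up => unbiased
    xi = (lower + (rng.random(x.shape) < prob_up)) / s
    return norm * np.sign(x) * xi
```

Each quantized coordinate takes one of s+1 magnitudes, so it can be encoded with far fewer bits than the 16-bit full-precision baseline — this is what makes the communication time scale with the quantization level s.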
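The weight-matrix construction described in the setup (W built from the Laplacian of a random Erdős–Rényi graph) can be sketched as follows; a minimal illustration assuming only NumPy, with `erdos_renyi_adjacency` and `mixing_matrix` as hypothetical helper names and the scaling symbol δ reconstructed from the stated condition δ > λmax(L)/2:

```python
import numpy as np

def erdos_renyi_adjacency(n, pc, rng):
    """Symmetric adjacency matrix of a random Erdos-Renyi graph G(n, pc)."""
    upper = np.triu(rng.random((n, n)) < pc, k=1)  # independent upper-tri edges
    return (upper | upper.T).astype(float)

def mixing_matrix(adj, delta=None):
    """Consensus weight matrix W = I - L/delta.

    L is the graph Laplacian (degree matrix minus adjacency). Any
    delta > lambda_max(L)/2 keeps every eigenvalue of W inside (-1, 1],
    so W is a symmetric, doubly stochastic mixing matrix.
    """
    L = np.diag(adj.sum(axis=1)) - adj
    if delta is None:
        # pick delta just above the threshold lambda_max(L)/2
        delta = np.linalg.eigvalsh(L)[-1] / 2 + 1e-6
    return np.eye(len(adj)) - L / delta

# The paper's reported configuration: n = 50 nodes, edge connectivity 0.4.
rng = np.random.default_rng(0)
W = mixing_matrix(erdos_renyi_adjacency(n=50, pc=0.4, rng=rng))
```

Because L·1 = 0, every row of W sums to one, and the eigenvalue condition on δ is exactly what keeps the decentralized averaging step contractive.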