cpSGD: Communication-efficient and differentially-private distributed SGD
Authors: Naman Agarwal, Ananda Theertha Suresh, Felix Xinnan X. Yu, Sanjiv Kumar, Brendan McMahan
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We trained a three-layer model (60 hidden nodes each with ReLU activation) on the infinite MNIST dataset [8] with 25M data points and 25M clients. At each step 10,000 clients send their data to the server. The results are in Figure 2. |
| Researcher Affiliation | Industry | Naman Agarwal, Google Brain, Princeton, NJ 08540; Ananda Theertha Suresh, Google Research, New York, NY; Felix Yu, Google Research, New York, NY; Sanjiv Kumar, Google Research, New York, NY; H. Brendan McMahan, Google Research, Seattle, WA |
| Pseudocode | No | The paper describes the steps of its mechanisms but does not provide a formal pseudocode block or algorithm box. |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We trained a three-layer model (60 hidden nodes each with ReLU activation) on the infinite MNIST dataset [8] with 25M data points and 25M clients. |
| Dataset Splits | No | The paper mentions using the 'infinite MNIST dataset' and training for 'one epoch', but it does not specify explicit train/validation/test splits, percentages, or sample counts. |
| Hardware Specification | No | The paper refers to 'mobile devices' as clients, but it does not specify the hardware (e.g., GPU models, CPU types) used by the authors to run their experiments. |
| Software Dependencies | No | The paper cites 'Tensorflow' [1] as a basic building block, but it does not specify a version number for TensorFlow or any other software dependencies used in their implementation. |
| Experiment Setup | Yes | We trained a three-layer model (60 hidden nodes each with ReLU activation) on the infinite MNIST dataset [8] with 25M data points and 25M clients. At each step 10,000 clients send their data to the server... k is the number of quantization levels, and m is the parameter of the binomial noise (p = 0.5, s = 1). The baseline is without quantization and differential privacy. δ = 10⁻⁹. We note that we trained the model with exactly one epoch. |
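The experiment-setup row references the paper's core mechanism: each client quantizes its update to k levels and adds Binomial(m, p) noise before sending it to the server. A minimal sketch of that flow is below; the function names, the clipping step, and the stochastic-rounding details are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def cpsgd_client_message(grad, k=16, m=128, p=0.5, s=1.0, clip=1.0):
    """Sketch of a cpSGD-style client update: stochastic k-level
    quantization followed by additive Binomial(m, p) noise.
    All parameter defaults here are illustrative, except p = 0.5
    and s = 1, which the paper states."""
    g = np.clip(grad, -clip, clip)
    # Map [-clip, clip] onto k quantization levels.
    scaled = (g + clip) / (2 * clip) * (k - 1)
    low = np.floor(scaled)
    # Stochastic rounding: round up with probability (scaled - low),
    # so the quantized value is unbiased in expectation.
    q = low + (rng.random(g.shape) < (scaled - low))
    # Binomial noise for differential privacy, centered at its mean m*p.
    noise = rng.binomial(m, p, size=g.shape)
    return q + s * (noise - m * p)

def server_aggregate(messages, k=16, clip=1.0):
    """Average client messages and map back to gradient space."""
    avg = np.mean(np.stack(messages), axis=0)
    return avg / (k - 1) * (2 * clip) - clip
```

Because both the rounding and the noise are zero-mean, the server-side average over many clients concentrates around the true mean gradient, which is what lets the noise per client be large enough for privacy while keeping the aggregate accurate.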