Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression
Authors: Zhize Li, Peter Richtárik
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6 Experiments: In this section, we demonstrate the performance of our accelerated method CANITA (Algorithm 1) and previous methods QSGD and DIANA (the theoretical convergence results of these algorithms can be found in Table 1) with different compression operators on the logistic regression problem, $\min_{x \in \mathbb{R}^d} f(x) := \frac{1}{n}\sum_{i=1}^{n}\log\bigl(1+\exp(-b_i a_i^\top x)\bigr)$ (14), where $\{a_i, b_i\}_{i=1}^{n} \in \mathbb{R}^d \times \{\pm 1\}$ are data samples. We use three standard datasets: a9a, mushrooms, and w8a in the experiments. All datasets are downloaded from LIBSVM [4]. In Figures 1–3, we compare our CANITA with QSGD and DIANA with three compression operators: random sparsification (left), natural compression (middle), and random quantization (right) on three datasets: a9a (Figure 1), mushrooms (Figure 2), and w8a (Figure 3). The x-axis and y-axis represent the number of communication bits and the training loss, respectively. |
| Researcher Affiliation | Academia | Zhize Li KAUST EMAIL Peter Richtárik KAUST EMAIL |
| Pseudocode | Yes | Algorithm 1 Distributed compressed accelerated ANITA method (CANITA) |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We use three standard datasets: a9a, mushrooms, and w8a in the experiments. All datasets are downloaded from LIBSVM [4]. |
| Dataset Splits | No | The paper mentions using 'three standard datasets: a9a, mushrooms, and w8a' but does not specify any explicit training, validation, or test split percentages or sample counts for these datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or memory specifications used for running experiments. |
| Software Dependencies | No | The paper mentions using LIBSVM but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | In our experiments, we directly use the theoretical stepsizes and parameters for all three algorithms: QSGD [1, 24], DIANA [12], our CANITA (Algorithm 1). To compare with the settings of DIANA and CANITA, we use local gradients (not stochastic gradients) in QSGD. |
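The experimental setup quoted above (the logistic regression objective in Eq. (14) plus an unbiased gradient compressor) can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: it uses synthetic data in place of the a9a/mushrooms/w8a LIBSVM datasets, and a rand-$k$ sparsifier as one example of the compression operators the paper compares.

```python
import numpy as np

def logistic_loss(x, A, b):
    """Eq. (14): f(x) = (1/n) * sum_i log(1 + exp(-b_i * a_i^T x))."""
    z = -b * (A @ x)                 # elementwise -b_i * a_i^T x
    return np.mean(np.log1p(np.exp(z)))

def random_sparsification(g, k, rng):
    """Rand-k compressor: keep k coordinates chosen uniformly at random
    and rescale by d/k, so the compressor is unbiased: E[C(g)] = g."""
    d = g.size
    mask = np.zeros(d)
    idx = rng.choice(d, size=k, replace=False)
    mask[idx] = 1.0
    return (d / k) * mask * g

# Synthetic stand-in for a LIBSVM dataset: features A, labels b in {-1, +1}.
rng = np.random.default_rng(0)
n, d = 100, 10
A = rng.standard_normal((n, d))
b = rng.choice([-1.0, 1.0], size=n)

x = np.zeros(d)
loss_at_zero = logistic_loss(x, A, b)       # equals log(2) at x = 0
full_grad = A.T @ (-b * 0.5) / n            # gradient of f at x = 0
compressed = random_sparsification(full_grad, k=3, rng=rng)
```

In the compressed-communication setting the paper studies, each worker would send a compressed gradient like `compressed` (few nonzero coordinates, hence few bits) instead of `full_grad`; unbiasedness of the compressor is what the convergence analyses of QSGD, DIANA, and CANITA rely on.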