Adaptive Compression for Communication-Efficient Distributed Training

Authors: Maksim Makarenko, Elnur Gasanov, Abdurakhmon Sadiev, Rustem Islamov, Peter Richtárik

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose Adaptive Compressed Gradient Descent (AdaCGD), a novel optimization algorithm for communication-efficient training of supervised machine learning models with an adaptive compression level. ... In this work, we use a setup similar to that described in (Richtárik et al., 2022). Namely, we aim to solve the logistic regression problem with a nonconvex regularizer. ... Figure 1 compares the performance of our proposed algorithm, AdaCGD, with other popular 3PC methods.
Researcher Affiliation | Academia | Maksim Makarenko (EMAIL), King Abdullah University of Science and Technology; Elnur Gasanov (EMAIL), King Abdullah University of Science and Technology; Rustem Islamov (EMAIL), Institut Polytechnique de Paris; Abdurakhmon Sadiev (EMAIL), King Abdullah University of Science and Technology; Peter Richtárik (EMAIL), King Abdullah University of Science and Technology
Pseudocode | Yes | Algorithm 1: DCGD method with master compression
Open Source Code | No | The paper does not provide an explicit link to source code, nor does it state that the code will be made publicly available; it only mentions that simulations are implemented in Python.
Open Datasets | Yes | In training we use the LIBSVM (Chang & Lin, 2011) datasets phishing, a1a, and a9a.
Dataset Splits | Yes | Each dataset has been split into n = 20 equal parts, each representing a different client.
Hardware Specification | Yes | We implemented all simulations in Python 3.8 and ran them on a cluster of 48 nodes with Intel(R) Xeon(R) Gold 6230R CPUs.
Software Dependencies | Yes | All simulations are implemented in Python 3.8.
Experiment Setup | Yes | In training we use the LIBSVM (Chang & Lin, 2011) datasets phishing, a1a, and a9a. Each dataset has been split into n = 20 equal parts, each representing a different client. ... We fine-tuned the stepsize of each considered algorithm over a set of multiples of the corresponding theoretical stepsize, ranging from 2^0 to 2^8. ... We chose the Top-k operator as our compressor of choice. For EF21 and CLAG, we used the Top-1 compressor. ... For AdaCGD, we chose an array of compressors that varied from full compression (skipping communication) to zero compression (sending the full gradient), with a step of 5. We used the communication cost of the algorithm as the stopping criterion for all experiments.
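For reference, the logistic regression problem with a nonconvex regularizer used in this line of work (e.g., Richtárik et al., 2022) typically takes the following form; the exact regularizer shown here is an assumption, since the report only quotes the problem class:

```latex
\min_{x \in \mathbb{R}^d} \; f(x)
  = \frac{1}{m} \sum_{i=1}^{m} \log\!\left(1 + \exp\!\left(-b_i a_i^\top x\right)\right)
  + \lambda \sum_{j=1}^{d} \frac{x_j^2}{1 + x_j^2}
```

where (a_i, b_i) are training pairs with labels b_i in {-1, +1}, and the second term is a smooth nonconvex regularizer with strength lambda > 0.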
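The pseudocode row references Algorithm 1 (DCGD with compression). A minimal sketch of distributed compressed gradient descent with a Top-k sparsifier is given below; the helper names, toy quadratic losses, and the specific compressor are illustrative assumptions, not the paper's exact pseudocode.

```python
def top_k(v, k):
    """Top-k sparsifier: keep the k largest-magnitude entries, zero the rest."""
    idx = sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)[:k]
    out = [0.0] * len(v)
    for j in idx:
        out[j] = v[j]
    return out

def dcgd(grads, x0, step, k, iters):
    """Sketch of distributed compressed gradient descent.

    grads: list of per-client gradient functions. Each client sends
    top_k(grad_i(x)) to the master, which averages the messages and
    takes a gradient step. Compression of the broadcast back to the
    clients is omitted for brevity.
    """
    x = list(x0)
    n = len(grads)
    d = len(x)
    for _ in range(iters):
        msgs = [top_k(g(x), k) for g in grads]               # client -> master
        avg = [sum(m[j] for m in msgs) / n for j in range(d)]
        x = [x[j] - step * avg[j] for j in range(d)]
    return x

# Toy usage: two clients with quadratic losses f_i(x) = ||x - c_i||^2 / 2.
c1, c2 = [1.0, 0.0, 3.0], [3.0, 0.0, 1.0]
g1 = lambda x: [x[j] - c1[j] for j in range(3)]
g2 = lambda x: [x[j] - c2[j] for j in range(3)]
x = dcgd([g1, g2], [0.0, 0.0, 0.0], step=0.5, k=2, iters=50)
# x converges toward the average minimizer [2.0, 0.0, 2.0].
```

With k = 2 each client drops its smallest-magnitude coordinate per round, yet the iterates still converge on this toy problem, which is the intuition behind communication-efficient sparsified methods.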
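The n = 20 client split described above can be sketched as follows; partitioning into equal contiguous shards (dropping any trailing remainder) is an assumption, since the report does not say how non-divisible dataset sizes are handled.

```python
def split_clients(samples, n_clients=20):
    """Partition a dataset into n_clients equal contiguous shards,
    one per simulated client; a trailing remainder is dropped so
    all shards have identical size."""
    size = len(samples) // n_clients
    return [samples[i * size:(i + 1) * size] for i in range(n_clients)]

# Usage: 100 samples split across 20 simulated clients, 5 samples each.
shards = split_clients(list(range(100)), n_clients=20)
```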
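The stepsize tuning described above (multiples of the theoretical stepsize over a power-of-two range) can be generated as below; treating the grid as exactly 2^0 through 2^8 is an assumption based on the tuning description.

```python
def stepsize_grid(theoretical, max_power=8):
    """Candidate stepsizes: the theoretical stepsize scaled by powers
    of two, 2**0 through 2**max_power (range assumed from the report)."""
    return [theoretical * 2 ** p for p in range(max_power + 1)]

# Usage: a hypothetical theoretical stepsize of 0.01 yields 9 candidates.
grid = stepsize_grid(0.01)
```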