ErrorCompensatedX: error compensation for variance reduced algorithms
Authors: Hanlin Tang, Yao Li, Ji Liu, Ming Yan
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we train ResNet-50 (He et al., 2016) on CIFAR10, which consists of 50000 training images and 10000 testing images across 10 classes. We run the experiments on eight workers, each having a 1080Ti GPU. The batch size on each worker is 16 and the total batch size is 128. ... Figure 2: Epoch-wise convergence comparison on ResNet-50 for Momentum SGD (left column), STORM (middle column), and IGT (right column) with different communication implementations. |
| Researcher Affiliation | Collaboration | Hanlin Tang, Department of Computer Science, University of Rochester; Yao Li, Department of Mathematics, Michigan State University; Ji Liu, Kuaishou Technology; Ming Yan, Department of Computational Mathematics, Science and Technology, and Department of Mathematics, Michigan State University |
| Pseudocode | Yes | Algorithm 1 ErrorCompensatedX for general A(x; ξ) |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | In this section, we train ResNet-50 (He et al., 2016) on CIFAR10, which consists of 50000 training images and 10000 testing images across 10 classes. |
| Dataset Splits | No | The paper states '50000 training images and 10000 testing images' for CIFAR-10 but does not specify a validation set split. |
| Hardware Specification | Yes | We run the experiments on eight workers, each having a 1080Ti GPU. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies used in the experiments. |
| Experiment Setup | Yes | The batch size on each worker is 16 and the total batch size is 128. ... We use the 1-bit compression in Tang et al. (2019), which leads to an overall 96% of communication volume reduction. ... We grid search the best learning rate from {0.5, 0.1, 0.001} and c0 from {0.1, 0.05, 0.001}, and find that the best learning rate is 0.01 with c0 = 0.05 for both original STORM and IGT. ... We set β = 0.3 for the low-pass filter in all cases. |
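The setup row above pairs error compensation with the 1-bit compressor of Tang et al. (2019). The following is a minimal sketch of one error-compensated compressed update, assuming a scaled-sign compressor; the function names and the exact compressor form are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def one_bit_compress(v):
    """Scaled-sign 1-bit compressor (sketch): keep only the sign of
    each entry, scaled by the mean magnitude so the compressed vector
    preserves the average entry size. Illustrative, not the paper's
    exact operator."""
    scale = np.mean(np.abs(v))
    return scale * np.sign(v)

def error_compensated_step(grad, error, lr):
    """One error-compensated communication step (sketch).

    `error` is the compression residual kept locally from the
    previous round; it is added back before compressing, so the
    error made in each round is corrected in later rounds.
    Returns the model update to apply and the new residual.
    """
    corrected = grad + error              # re-inject last round's residual
    compressed = one_bit_compress(corrected)
    new_error = corrected - compressed    # residual carried to next round
    return -lr * compressed, new_error
```

By construction, the residual plus the transmitted vector always equals the corrected gradient, so no information is permanently lost to compression; it is merely delayed, which is what makes the 96% communication-volume reduction quoted above compatible with convergence.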