Collaborative Compressors in Distributed Mean Estimation with Limited Communication Budget

Authors: Harsh Vardhan, Arya Mazumdar

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform a simulation for DME with our schemes as the dissimilarities vary and compare the three different error metrics from above with various existing baselines (Fig 2a-2c). We also used our DME subroutines in the downstream tasks of KMeans, power iteration, and linear regression on real (and federated) datasets (Fig 2d-2i).
Researcher Affiliation | Academia | Harsh Vardhan EMAIL Department of Computer Science and Engineering, University of California, San Diego; Arya Mazumdar EMAIL Halicioğlu Data Science Institute, University of California, San Diego
Pseudocode | Yes | Algorithm 1 Noisy Sign; Algorithm 2 Hadamard1DEnc; Algorithm 3 Hadamard Multi Dim; Algorithm 4 Sparse Reg; Algorithm 5 One Bit
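To illustrate the kind of one-bit-per-coordinate compression the listed algorithms build on, here is a minimal sketch of distributed mean estimation with a scaled-sign quantizer. This is a generic simplification, not the paper's Noisy Sign (which uses additional randomness); the names `scaled_sign` and `dme` are hypothetical.

```python
import numpy as np

def scaled_sign(x):
    """Quantize to one bit per coordinate plus a single scalar scale.

    Simplified stand-in for sign-based compressors such as the paper's
    Algorithm 1 (Noisy Sign); the real scheme adds noise before signing.
    """
    return np.mean(np.abs(x)) * np.sign(x)

def dme(client_vectors, compress=scaled_sign):
    """Distributed mean estimation: the server averages compressed vectors."""
    return np.mean([compress(v) for v in client_vectors], axis=0)
```

For a vector `[1, -2, 3]` the mean absolute value is 2, so the client transmits the equivalent of `[+2, -2, +2]`; the server's estimate is the average of such messages across clients.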
Open Source Code | Yes | We provide the code in the supplementary material and all the experiments took 6 days to run on a single 20 core machine with 25 GB RAM.
Open Datasets | Yes | For KMeans and power iteration, we run on MNIST (LeCun & Cortes, 2010) and FEMNIST (Caldas et al., 2018) datasets... For linear regression, we run gradient descent on UJIndoorLoc (Torres-Sospedra et al., 2014)... For logistic regression, we run gradient descent on the HAR (Reyes-Ortiz et al., 2012) dataset
Dataset Splits | Yes | FEMNIST is a real federated dataset where each client has handwritten digits from a different person. ... For MNIST and UJIndoorLoc, we split the dataset uniformly into m chunks, one per client. ... For logistic regression for binary classification, we select the last 2 classes of the HAR and label them with 1. We split the dataset into m = 20 clients iid.
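The iid split described above (shuffle, then partition uniformly into m chunks, one per client) can be sketched as follows; `split_iid` is a hypothetical helper name, and the exact shuffling used in the paper's code may differ.

```python
import numpy as np

def split_iid(X, y, m, seed=0):
    """Shuffle the dataset and partition it uniformly into m client chunks (iid)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    # np.array_split tolerates len(X) not divisible by m.
    return [(X[part], y[part]) for part in np.array_split(idx, m)]

# Toy arrays standing in for the HAR features and binary labels.
X = np.arange(200, dtype=float).reshape(100, 2)
y = np.arange(100) % 2
clients = split_iid(X, y, m=20)  # 20 clients of 5 examples each
```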
Hardware Specification | Yes | We provide the code in the supplementary material and all the experiments took 6 days to run on a single 20 core machine with 25 GB RAM.
Software Dependencies | No | The paper does not explicitly state any specific software dependencies with version numbers.
Experiment Setup | Yes | The communication budget is 2375 * 25 bits/client for each compressor every round. It is often not possible for two different forms of compressors, like sparsification methods and quantization methods, to achieve the exact same communication budget even after tuning; hence we allow a small range of communication budgets. ... For KMeans and power iteration, we set m = 50. ... We run 20 iterations of Lloyd's algorithm (Lloyd, 1982) for KMeans and 30 power iterations. ... For both the linear regression datasets, we run 50 iterations of GD. ... We run distributed gradient descent with learning rate 0.001 for T = 200 iterations on the logistic loss... We compare against all baselines in Table 2 for 5 random seeds and report the methods which perform the best. For DME experiments, the shaded region corresponds to 2 standard deviations around the mean, while for all downstream tasks, it corresponds to 1 standard deviation.
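The distributed gradient descent loop described in the setup (each client computes a local logistic-loss gradient, sends a compressed version, and the server applies the estimated mean) might be sketched as below. The function names and the identity `compress` default are illustrative assumptions; the paper plugs its DME compressors into the `compress` slot.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def distributed_gd(client_data, lr=0.001, T=200, compress=lambda g: g):
    """Distributed GD on the logistic loss.

    Each round, every client computes its local gradient, compresses it
    (placeholder for the paper's DME compressors), and the server updates
    the model with the mean of the received gradients.
    """
    d = client_data[0][0].shape[1]
    w = np.zeros(d)
    for _ in range(T):
        grads = []
        for X, y in client_data:
            # Local logistic-loss gradient on this client's chunk.
            g = X.T @ (sigmoid(X @ w) - y) / len(y)
            grads.append(compress(g))
        w -= lr * np.mean(grads, axis=0)  # server-side mean estimate
    return w
```

With the identity compressor this reduces to full-batch GD on the pooled data; swapping in a DME subroutine for `compress` recovers the communication-limited setting studied in the paper.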