reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Communication-Constrained Distributed Quantile Regression with Optimal Statistical Guarantees

Authors: Kean Ming Tan, Heather Battey, Wen-Xin Zhou

JMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	A thorough simulation study further elucidates our findings. Keywords: Communication efficiency; convolution smoothing; data heterogeneity; decentralized learning; distributed inference; multiplier bootstrap; quantile regression. [...] 4. Numerical Studies
Researcher Affiliation	Academia	Kean Ming Tan EMAIL Department of Statistics University of Michigan Ann Arbor MI, 48109, USA. Heather Battey EMAIL Department of Mathematics Imperial College London London, SW7 2AZ, U.K. Wen-Xin Zhou EMAIL Department of Mathematical Sciences University of California, San Diego La Jolla, CA 92093, USA.
Pseudocode	Yes	Algorithm 1 Distributed Quantile Regression via Convolution Smoothing. [...] Algorithm 2 Gradient descent with Barzilai-Borwein stepsize (GD-BB) for solving (8). [...] Algorithm 3 Local adaptive majorize-minimize (LAMM) algorithm for solving (39). [...] Algorithm 4 Efficient Distributed Quantile Regression via Two-Step Conquer.
Open Source Code	No	The paper does not provide an explicit statement about releasing its own source code or a link to a code repository. It mentions using the existing R packages 'conquer' and 'quantreg' but does not release code of its own for the described methodology.
Open Datasets	No	To generate the data, we consider two types of heteroscedastic models: [...] where x_i is generated from a multivariate uniform distribution on the cube 3^{1/2} [-1, 1]^{p+1} [...] The random noise is generated from a t-distribution with 2 degrees of freedom, denoted by t_2.
Dataset Splits	Yes	The regularization parameter λ > 0 is selected using a validation set of size n = 200 for easier illustration and comparison. Note that Wang et al. (2017) use 60% of data for training, 20% as held-out validation set for tuning the parameters, and the remaining 20% for testing.
Hardware Specification	No	The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies	No	We employ the R packages conquer and quantreg to compute the conquer and standard QR estimators, respectively.
Experiment Setup	Yes	For the bandwidths, we set h = 2.5 {(p + log N)/N}^{1/3} and b = 2.5 {(p + log n)/n}^{1/3} according to the theoretical analysis in Section 2.2. To generate the data, we consider two types of heteroscedastic models: [...] Table 1 presents the results when n = 300, p = 10, m ∈ {50, 100, 200, 400, 600, 1000}, and τ = 0.8, averaged over 100 trials.
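
The bandwidth rule quoted in the Experiment Setup row can be evaluated directly for the Table 1 configuration. The following is a minimal sketch, not the authors' code: the helper name `bandwidth` is hypothetical, `log` is assumed to be the natural logarithm, and the total sample size is assumed to be N = n * m (local sample size times number of machines), which the excerpt above does not state explicitly.

```python
import math


def bandwidth(dim, sample_size, const=2.5):
    """Evaluate const * {(dim + log(sample_size)) / sample_size}^(1/3).

    Hypothetical helper; assumes the natural logarithm.
    """
    return const * ((dim + math.log(sample_size)) / sample_size) ** (1.0 / 3.0)


# Settings quoted for Table 1.
n, p = 300, 10
machines = [50, 100, 200, 400, 600, 1000]

# Local bandwidth b depends only on the local sample size n.
b = bandwidth(p, n)

# Global bandwidth h depends on the total sample size N.
rows = []
for m in machines:
    N = n * m  # assumption: N is the pooled sample size across m machines
    rows.append((m, N, bandwidth(p, N)))

print(f"b = {b:.4f} (n = {n}, p = {p})")
for m, N, h in rows:
    print(f"m = {m:5d}  N = {N:7d}  h = {h:.4f}")
```

Under these assumptions h shrinks as m grows while b stays fixed, since only h depends on the pooled sample size.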