Communication-Constrained Distributed Quantile Regression with Optimal Statistical Guarantees

Authors: Kean Ming Tan, Heather Battey, Wen-Xin Zhou

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | A thorough simulation study further elucidates our findings. Keywords: Communication efficiency; convolution smoothing; data heterogeneity; decentralized learning; distributed inference; multiplier bootstrap; quantile regression. [...] 4. Numerical Studies
Researcher Affiliation | Academia | Kean Ming Tan EMAIL Department of Statistics University of Michigan Ann Arbor MI, 48109, USA. Heather Battey EMAIL Department of Mathematics Imperial College London London, SW7 2AZ, U.K. Wen-Xin Zhou EMAIL Department of Mathematical Sciences University of California, San Diego La Jolla, CA 92093, USA.
Pseudocode | Yes | Algorithm 1 Distributed Quantile Regression via Convolution Smoothing. [...] Algorithm 2 Gradient descent with Barzilai-Borwein stepsize (GD-BB) for solving (8). [...] Algorithm 3 Local adaptive majorize-minimize (LAMM) algorithm for solving (39). [...] Algorithm 4 Efficient Distributed Quantile Regression via Two-Step Conquer.
Open Source Code | No | The paper does not provide an explicit statement about releasing its own source code or a link to a code repository. It mentions using existing R packages 'conquer' and 'quantreg' but not proprietary code for the described methodology.
Open Datasets | No | To generate the data, we consider two types of heteroscedastic models: [...] where $x_i$ is generated from a multivariate uniform distribution on the cube $3^{1/2}[-1, 1]^{p+1}$ [...] The random noise is generated from a t-distribution with 2 degrees of freedom, denoted by $t_2$.
Dataset Splits | Yes | The regularization parameter $\lambda > 0$ is selected using a validation set of size $n = 200$ for easier illustration and comparison. Note that Wang et al. (2017) use 60% of data for training, 20% as held-out validation set for tuning the parameters, and the remaining 20% for testing.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies | No | We employ the R packages conquer and quantreg to compute the conquer and standard QR estimators, respectively.
Experiment Setup | Yes | For the bandwidths, we set $h = 2.5\{(p + \log N)/N\}^{1/3}$ and $b = 2.5\{(p + \log n)/n\}^{1/3}$ according to the theoretical analysis in Section 2.2. To generate the data, we consider two types of heteroscedastic models: [...] Table 1 presents the results when $n = 300$, $p = 10$, $m \in \{50, 100, 200, 400, 600, 1000\}$, and $\tau = 0.8$, averaged over 100 trials.
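
The bandwidth rule quoted in the Experiment Setup row can be evaluated directly for the Table 1 configuration. The following is a minimal Python sketch, not the authors' code. It assumes that N denotes the pooled sample size N = m * n across m machines that each hold n observations, which the excerpt above does not state explicitly. The function name `bandwidth` and the variable names are hypothetical.

```python
import math


def bandwidth(dim, sample_size, const=2.5):
    """Evaluate const * {(dim + log(sample_size)) / sample_size}^(1/3)."""
    return const * ((dim + math.log(sample_size)) / sample_size) ** (1.0 / 3.0)


# Table 1 configuration quoted above.
n = 300                                   # local sample size per machine
p = 10                                    # covariate dimension
machines = [50, 100, 200, 400, 600, 1000]

# Local bandwidth b depends only on the per-machine sample size n.
b = bandwidth(p, n)

# Global bandwidth h uses the pooled sample size.
# ASSUMPTION: N = m * n (not stated in the excerpt).
h_by_m = {m: bandwidth(p, m * n) for m in machines}

print(f"b = {b:.4f}")
for m in machines:
    print(f"m = {m:4d}, N = {m * n:6d}, h = {h_by_m[m]:.4f}")
```

Under that assumption, h shrinks as more machines are added while b stays fixed, because b depends only on n.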