Communication-Constrained Distributed Quantile Regression with Optimal Statistical Guarantees
Authors: Kean Ming Tan, Heather Battey, Wen-Xin Zhou
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | A thorough simulation study further elucidates our findings. Keywords: Communication efficiency; convolution smoothing; data heterogeneity; decentralized learning; distributed inference; multiplier bootstrap; quantile regression. [...] 4. Numerical Studies |
| Researcher Affiliation | Academia | Kean Ming Tan EMAIL Department of Statistics University of Michigan Ann Arbor MI, 48109, USA. Heather Battey EMAIL Department of Mathematics Imperial College London London, SW7 2AZ, U.K. Wen-Xin Zhou EMAIL Department of Mathematical Sciences University of California, San Diego La Jolla, CA 92093, USA. |
| Pseudocode | Yes | Algorithm 1 Distributed Quantile Regression via Convolution Smoothing. [...] Algorithm 2 Gradient descent with Barzilai-Borwein stepsize (GD-BB) for solving (8). [...] Algorithm 3 Local adaptive majorize-minimize (LAMM) algorithm for solving (39). [...] Algorithm 4 Efficient Distributed Quantile Regression via Two-Step Conquer. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its own source code or a link to a code repository. It mentions using existing R packages 'conquer' and 'quantreg' but not proprietary code for the described methodology. |
| Open Datasets | No | To generate the data, we consider two types of heteroscedastic models: [...] where $x_i$ is generated from a multivariate uniform distribution on the cube $3^{1/2}[-1, 1]^{p+1}$ [...] The random noise is generated from a t-distribution with 2 degrees of freedom, denoted by $t_2$. |
| Dataset Splits | Yes | The regularization parameter λ > 0 is selected using a validation set of size n = 200 for easier illustration and comparison. Note that Wang et al. (2017) use 60% of data for training, 20% as held-out validation set for tuning the parameters, and the remaining 20% for testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | We employ the R packages conquer and quantreg to compute the conquer and standard QR estimators, respectively. |
| Experiment Setup | Yes | For the bandwidths, we set $h = 2.5 \{(p + \log N)/N\}^{1/3}$ and $b = 2.5 \{(p + \log n)/n\}^{1/3}$ according to the theoretical analysis in Section 2.2. To generate the data, we consider two types of heteroscedastic models: [...] Table 1 presents the results when $n = 300$, $p = 10$, $m \in \{50, 100, 200, 400, 600, 1000\}$, and $\tau = 0.8$, averaged over 100 trials. |
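
The bandwidth formulas quoted in the Experiment Setup row can be evaluated directly for the reported configuration. The sketch below is illustrative only and is not the authors' code (the paper uses the R packages conquer and quantreg). It assumes that N is the total sample size, N = n × m, with n observations on each of m machines; the excerpt above does not state this explicitly. The helper name `bandwidth` is hypothetical.

```python
import math


def bandwidth(dim, sample_size, constant=2.5):
    """Evaluate constant * ((dim + log(sample_size)) / sample_size) ** (1/3)."""
    return constant * ((dim + math.log(sample_size)) / sample_size) ** (1.0 / 3.0)


# Values quoted in the Experiment Setup row.
n, p = 300, 10
machines = [50, 100, 200, 400, 600, 1000]

# Local bandwidth b depends only on the per-machine sample size n.
b = bandwidth(p, n)

# ASSUMPTION: total sample size N = n * m (not stated in the excerpt).
global_bandwidths = {m: bandwidth(p, n * m) for m in machines}

for m, h in global_bandwidths.items():
    print(f"m={m:5d}  N={n * m:7d}  h={h:.4f}  b={b:.4f}")
```

Under this assumption the global bandwidth h shrinks as the number of machines grows, while the local bandwidth b stays fixed because it depends only on n and p.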