Distributed Sparse Regression via Penalization

Authors: Yao Ji, Gesualdo Scutari, Ying Sun, Harsha Honnappa

JMLR 2023

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental
LLM Response: In this section, we provide experiments on synthetic and real data. The former are instrumental to validating our theoretical findings. More specifically, we validate the following theoretical results. (i) On the statistical-error front, we show that with a proper choice of γ, the solution of the distributed formulation (4) achieves the statistical accuracy of the centralized LASSO estimator (Theorem 7); we also validate the dependence of γ on the dimension d [cf. (22)]. On the computational front, (ii) we demonstrate that the DGD-like algorithm (28) exhibits linear convergence up to statistical precision; (iii) we also validate the scaling of the communication rounds with the network size m, as predicted by (57). (iv) Finally, we illustrate the speed-accuracy dilemma, as shown in Theorem 13.
Researcher Affiliation: Academia
LLM Response: Yao Ji (EMAIL), School of Industrial Engineering, Purdue University, West Lafayette, IN 47906, USA; Gesualdo Scutari (EMAIL), School of Industrial Engineering, Purdue University, West Lafayette, IN 47906, USA; Ying Sun (EMAIL), School of Electrical Engineering and Computer Science, The Pennsylvania State University, State College, PA 16802, USA; Harsha Honnappa (EMAIL), School of Industrial Engineering, Purdue University, West Lafayette, IN 47906, USA
Pseudocode: No
LLM Response: The paper describes the algorithm in mathematical terms (e.g., equation (28) for the update rule) and refers to it as the 'proximal-gradient algorithm' or 'DGD-like algorithm', but it does not present formally structured pseudocode in a dedicated block with a label such as 'Algorithm 1'.
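Although the paper provides no pseudocode, the described update, a DGD-like proximal-gradient iteration combining neighbor averaging, a local gradient step, and ℓ1 soft-thresholding, can be sketched as below. This is a generic reconstruction under stated assumptions (function names, step size, and update ordering are ours), not the paper's exact equation (28).

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (coordinate-wise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def dgd_prox_lasso(X_parts, y_parts, W, lam, step, n_iters):
    """Generic DGD-like proximal-gradient sketch for distributed LASSO.

    Agent i holds (X_i, y_i); theta has shape (m, d), one row per agent.
    Update: theta_i <- prox_{step*lam}( sum_j W_ij theta_j - step * grad_i(theta_i) ).
    """
    m = len(X_parts)
    d = X_parts[0].shape[1]
    theta = np.zeros((m, d))
    for _ in range(n_iters):
        mixed = W @ theta                          # consensus/mixing step
        grads = np.stack([
            X.T @ (X @ theta[i] - y) / len(y)      # local least-squares gradient
            for i, (X, y) in enumerate(zip(X_parts, y_parts))
        ])
        theta = soft_threshold(mixed - step * grads, step * lam)
    return theta
```

With a sufficiently small step size the iterates converge linearly to a neighborhood of the centralized LASSO solution, which is the behavior the report says the paper validates numerically.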
Open Source Code: No
LLM Response: The paper provides an attribution license ('CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v24/21-1333.html.') but contains no statement or link indicating that the authors have released source code for the methodology described in the paper.
Open Datasets: Yes
LLM Response: We test our findings on the data set eyedata in the Normal Beta-Prime package (Bai and Ghosh, 2019).
Dataset Splits: Yes
LLM Response: We randomly divide the data set into a training sample set of size Ntrain = 80 and a test data set of size Ntest = 40. We partition the training data into m = 10 subsets.
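The reported protocol (random 80/40 train/test split, training rows partitioned across m = 10 agents) can be sketched as follows; the function name, equal-size partitioning, and seeding are assumptions, since the report does not specify them.

```python
import numpy as np

def split_and_partition(X, y, n_train, m, seed=0):
    """Randomly split (X, y) into train/test, then partition training rows
    across m agents. Mirrors the reported sizes (n_train = 80, test = 40,
    m = 10); equal-size partitioning is an assumption."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(y))
    train_idx, test_idx = perm[:n_train], perm[n_train:]
    X_tr, y_tr = X[train_idx], y[train_idx]
    # np.array_split distributes the n_train rows as evenly as possible.
    parts = [(X_tr[idx], y_tr[idx])
             for idx in np.array_split(np.arange(n_train), m)]
    return parts, (X[test_idx], y[test_idx])
```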
Hardware Specification: Yes
LLM Response: All the experiments were run on a server equipped with an Intel(R) Xeon(R) CPU E5-2699A v4 @ 2.40GHz.
Software Dependencies: No
LLM Response: The paper mentions using the 'Normal Beta-Prime package (Bai and Ghosh, 2019)' but does not specify its version or any other software dependencies with version numbers.
Experiment Setup: Yes
LLM Response: Experimental setup (synthetic data): The ground truth θ is set by randomly sampling a multivariate Gaussian N(0, I_d) and thresholding the d − s smallest elements to zero. The noise vector w is assumed to be multivariate Gaussian N(0, 0.25 I_N). We construct X ∈ R^{N×d} by independently generating each row x_i ∈ R^d, adopting the following procedure (Agarwal et al., 2012): let z_1, ..., z_{d−1} be i.i.d. standard normal random variables, set x_{i,1} = z_1 / sqrt(1 − 0.25^2) and x_{i,t+1} = 0.25 x_{i,t} + z_t, for t = 1, 2, ..., d − 1. It can be verified that all the eigenvalues of Σ = cov(x_i) lie within the interval [0.64, 2.85]. We partition (X, y) as X = [X_1^T, X_2^T, ..., X_m^T]^T and y = [y_1^T, ..., y_m^T]^T; agent i owns the data set portion (X_i, y_i), and there are m agents in total. We simulate an undirected graph G following the Erdős-Rényi model G(m, p), where m is the number of agents and p is the probability that an edge is independently included in the graph. The coefficients of the matrix W are chosen according to the lazy Metropolis rule (Olshevsky, 2017). All results use Monte Carlo averaging over 30 repetitions. 1) Statistical accuracy verification (Theorem 7): We set N = 220, m = 20, d = 400, and consider two types of graphs, namely a fully connected graph and a weakly connected graph, the latter generated as an Erdős-Rényi graph with edge probability p = 0.1, resulting in ρ ≈ 0.973.
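The quoted generation procedure, a sparse Gaussian ground truth, AR(1)-correlated design rows, Gaussian noise, and a lazy Metropolis mixing matrix, can be sketched as below. The seeding and the magnitude-based thresholding of θ are assumptions; the lazy Metropolis formula is the standard one (off-diagonal weight 1 / (2 max(deg_i, deg_j)) on edges, diagonal chosen so rows sum to one).

```python
import numpy as np

def make_synthetic(N, d, s, rho=0.25, noise_var=0.25, seed=0):
    """Generate (X, y, theta) following the quoted setup: theta is Gaussian
    with its d - s smallest-magnitude entries set to zero; each row of X
    follows the AR(1)-style recursion with coefficient 0.25."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(d)
    theta[np.argsort(np.abs(theta))[: d - s]] = 0.0   # keep s nonzeros
    z = rng.standard_normal((N, d - 1))               # z_1, ..., z_{d-1}
    X = np.empty((N, d))
    X[:, 0] = z[:, 0] / np.sqrt(1.0 - rho**2)         # x_{i,1}
    for t in range(d - 1):
        X[:, t + 1] = rho * X[:, t] + z[:, t]         # x_{i,t+1}
    y = X @ theta + np.sqrt(noise_var) * rng.standard_normal(N)
    return X, y, theta

def lazy_metropolis_weights(adj):
    """Lazy Metropolis mixing matrix for a 0/1 adjacency matrix adj."""
    deg = adj.sum(axis=1)
    m = adj.shape[0]
    W = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            if adj[i, j]:
                W[i, j] = 1.0 / (2.0 * max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()                    # rows sum to one
    return W
```

The resulting W is symmetric and doubly stochastic, which is what the spectral quantity ρ (≈ 0.973 for the weakly connected graph) is computed from.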