Sharp Oracle Inequalities for Square Root Regularization

Authors: Benjamin Stucky, Sara van de Geer

JMLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Based on a simulation we illustrate some advantages of the square root SLOPE." (Abstract) ... "The goal of this simulation is to see how the estimation and prediction errors for the square root LASSO and the square root SLOPE behave under some Gaussian designs. ... The results can be found in Tables 1, 2, 3 and 4."
Researcher Affiliation | Academia | Benjamin Stucky (EMAIL), Seminar for Statistics, ETH Zürich, Rämistrasse 101, 8092 Zürich, Switzerland; Sara van de Geer (EMAIL), Seminar for Statistics, ETH Zürich, Rämistrasse 101, 8092 Zürich, Switzerland
Pseudocode | Yes | Algorithm 1: sr-SLOPE
  input: β₀ a starting parameter vector, λ a desired penalty level with a decreasing sequence, Y the response vector, X the design matrix.
  output: β̂_{sr-SLOPE} = arg min_{β ∈ R^p} ( ‖Y − Xβ‖_n + λ J_λ(β) )
  1 for i ← 0 to i_stop do
  2     σ_{i+1} ← ‖Y − Xβ_i‖_n
  3     β_{i+1} ← arg min_{β ∈ R^p} { ‖Y − Xβ‖²_n + σ_{i+1} λ J_λ(β) }
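The alternating scheme in Algorithm 1 can be sketched in Python. This is not the authors' implementation: the function names (`sr_slope`, `prox_sorted_l1`), the choice of proximal gradient (ISTA) for the inner arg min, the iteration counts, and the convention ‖v‖_n = ‖v‖₂/√n are all my assumptions. The prox of the sorted ℓ1 norm is computed with the standard stack-based isotonic-regression routine used for SLOPE.

```python
import numpy as np

def prox_sorted_l1(v, lam):
    """Prox of the sorted l1 norm J_lam (lam a decreasing sequence),
    via the standard stack-based pool-adjacent-violators routine."""
    sgn = np.sign(v)
    z = np.abs(v)
    order = np.argsort(z)[::-1]            # positions of |v| in decreasing order
    d = z[order] - lam                     # pair largest |v| with largest weight
    sums, counts = [], []
    for val in d:                          # enforce nonincreasing block averages
        s, c = val, 1
        while sums and sums[-1] / counts[-1] <= s / c:
            s += sums.pop(); c += counts.pop()
        sums.append(s); counts.append(c)
    blocks = np.concatenate([np.full(c, max(s / c, 0.0))
                             for s, c in zip(sums, counts)])
    out = np.zeros_like(z)
    out[order] = blocks                    # undo the sorting
    return sgn * out

def sr_slope(X, Y, lam, n_outer=20, n_inner=200):
    """Sketch of Algorithm 1: alternate the scale update sigma = ||Y - X beta||_n
    with the scaled SLOPE problem, solved here by proximal gradient (assumption)."""
    n, p = X.shape
    beta = np.zeros(p)
    step = n / (2 * np.linalg.norm(X, 2) ** 2)   # 1/L for grad of ||Y - X beta||^2 / n
    for _ in range(n_outer):
        sigma = np.linalg.norm(Y - X @ beta) / np.sqrt(n)
        for _ in range(n_inner):
            grad = 2 * X.T @ (X @ beta - Y) / n
            beta = prox_sorted_l1(beta - step * grad, step * sigma * lam)
    return beta
```

With a constant weight sequence the sorted-ℓ1 prox reduces to ordinary soft thresholding, which gives a quick sanity check on `prox_sorted_l1`.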
Open Source Code | No | "For the square root LASSO we have used the R-Package flare by Li et al. (2014)."
Open Datasets | No | "The design matrix X is chosen with the rows being fixed i.i.d. realizations from N(0, Σ). Here the covariance matrix Σ has a Toeplitz structure Σ_{i,j} = 0.9^{|i−j|}. We choose i.i.d. Gaussian errors ε with a variance of σ² = 1."
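The quoted design is simulated, not loaded from a dataset, so it can be regenerated directly. A minimal sketch, assuming the usual Cholesky construction for correlated Gaussian rows (the seed and generator are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 500
# Toeplitz covariance: Sigma_{i,j} = 0.9^{|i-j|}
Sigma = 0.9 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
# Rows of X are i.i.d. N(0, Sigma): map standard normals through a Cholesky factor
X = rng.standard_normal((n, p)) @ np.linalg.cholesky(Sigma).T
eps = rng.standard_normal(n)   # i.i.d. Gaussian errors with sigma^2 = 1
```

This AR(1)-type Toeplitz matrix is positive definite, so the Cholesky factorization always succeeds.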
Dataset Splits | No | "We consider a high-dimensional linear regression model: Y = Xβ⁰ + ε, with n = 100 response variables and p = 500 unknown parameters." ... "We use r = 100 repetitions to calculate the ℓ1 estimation error, the sorted ℓ1 estimation error and the ℓ2 prediction error."
Hardware Specification | No | The paper does not provide specific hardware details for running the experiments.
Software Dependencies | Yes | "For the square root LASSO we have used the R-Package flare by Li et al. (2014)."
Experiment Setup | Yes | "We consider a high-dimensional linear regression model: Y = Xβ⁰ + ε, with n = 100 response variables and p = 500 unknown parameters. The design matrix X is chosen with the rows being fixed i.i.d. realizations from N(0, Σ). Here the covariance matrix Σ has a Toeplitz structure Σ_{i,j} = 0.9^{|i−j|}. We choose i.i.d. Gaussian errors ε with a variance of σ² = 1. ... As for the definition of the sorted ℓ1 norm, we chose a regular decreasing sequence from 1 to 0.1 with length 500. We use r = 100 repetitions..."
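The weight sequence and the three reported error metrics are simple to write down. A hedged sketch: the function name `errors` is mine, and the ‖·‖_n = ‖·‖₂/√n scaling of the prediction error is an assumption, since the quoted text does not fix the normalization.

```python
import numpy as np

# "regular decreasing sequence from 1 to 0.1 with length 500"
lam = np.linspace(1.0, 0.1, 500)

def errors(X, beta_hat, beta0, lam):
    """l1 estimation error, sorted-l1 estimation error J_lam, and
    l2 prediction error (||.||_n scaling assumed) for one repetition."""
    diff = beta_hat - beta0
    l1 = np.sum(np.abs(diff))
    sorted_l1 = np.sum(lam * np.sort(np.abs(diff))[::-1])  # decreasing weights on sorted |.|
    pred = np.linalg.norm(X @ diff) / np.sqrt(X.shape[0])
    return l1, sorted_l1, pred
```

In the paper's setup these three quantities would be averaged over the r = 100 repetitions, each with freshly drawn errors ε.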