Alternating Linearization for Structured Regularization Problems

Authors: Xiaodong Lin, Minh Pham, Andrzej Ruszczyński

JMLR 2014

Reproducibility Variable Result LLM Response
Research Type: Experimental. "Finally, we present numerical results for several synthetic and real-world examples, including a three-dimensional fused lasso problem, which illustrate the scalability, efficacy, and accuracy of the method. Keywords: lasso, fused lasso, nonsmooth optimization, operator splitting." "Section 5 presents simulation results and real data examples, which illustrate the efficacy, accuracy, and scalability of the alternating linearization method."
Researcher Affiliation: Academia. Xiaodong Lin, Department of Management Science and Information Systems, Rutgers University, Piscataway, NJ 08854; Minh Pham, Statistical and Applied Mathematical Sciences Institute (SAMSI), Durham, NC 27707; Andrzej Ruszczyński, Department of Management Science and Information Systems, Rutgers University, Piscataway, NJ 08854.
Pseudocode: Yes. Algorithm 1 (Alternating Linearization).
Open Source Code: No. The paper states that ALIN was implemented as a MATLAB script and that the Augmented Lagrangian and Alternating Direction methods were also implemented in MATLAB, but there is no statement or link indicating that this code is open source or publicly available.
Open Datasets: Yes. "To make a fair comparison between the methods, we run FISTA on each instance of the problem. FISTA is set to run to tol = 10^-5 or 5,000 iterations, whichever comes first. Then ALIN, SpaRSA, and SPG are set to run until the objective function values obtained are as good as that of FISTA. We set a parameter τ = 0.1‖Xᵀy‖∞ and chose values of λ = τ, 10^-1 τ, 5·10^-2 τ, 10^-2 τ, and 10^-3 τ. We allow SpaRSA to run its monotone and continuation features. Continuation is a special feature of SpaRSA for cases when the parameter λ is small: SpaRSA computes the solutions for larger values of λ and uses them to find solutions for smaller values of λ. We did not let SpaRSA use its special de-bias feature, since it involves removing zero coefficients to reduce the size of the data set, which would be unfair to the other competing methods. In Table 1, we report the average time elapsed (in CPU seconds) and the standard deviation over 20 runs. The performance of ALIN is comparable to the other methods. In terms of running time, ALIN does better than all competing methods for all but the largest λ considered. For the large value of λ, ALIN performs worse than FISTA and SpaRSA; in this case the solution is fairly close to the starting point 0, so the overhead cost of the update steps and the f-subproblem slows ALIN down. When the value of λ decreases, the benefits of these steps become more evident, and ALIN outperforms the other methods in running time by factors of two to three." (Section 5.1). "We further compare the efficiency of these methods by a validation experiment. All the features of the previous experiment remain the same, except that the noise variance is increased from 10^-4 to 10^-1 (which should favor higher regularization). We generate n = 2^11 samples, where half of them are used as the training set and the other half for testing." (Section 5.1, paragraph 2).
"In this study we present the results on analyzing the CGH data using the fused lasso penalty. CGH is a technique for measuring DNA copy numbers of selected genes on the genome. [...] Tibshirani and Wang (2008) applied the fused lasso signal approximator for detecting such copy number variations." (Section 5.3). "The deblurring results on a standard example (Lena) are shown in Figure 6; similar deblurring results from ALIN and FISTA are observed." (Section 5.4). "The functional MRI data are collected from 313 children with ages 5 to 18 (Schmithorst et al., 2006)." (Section 5.5)
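For context on the Section 5.3 application: the fused lasso signal approximator penalizes both the magnitudes of the coefficients and their successive differences, which is what makes it suited to detecting piecewise-constant copy-number variation. A minimal helper evaluating that objective (the function name and example values are ours, not the paper's):

```python
import numpy as np

def fused_lasso_objective(beta, y, lam1, lam2):
    """Fused lasso signal approximator objective:
    0.5*||y - beta||^2 + lam1*sum|beta_i| + lam2*sum|beta_i - beta_{i-1}|.
    lam1 encourages sparsity; lam2 encourages piecewise-constant fits."""
    fit = 0.5 * np.sum((y - beta) ** 2)
    sparsity = lam1 * np.sum(np.abs(beta))
    fusion = lam2 * np.sum(np.abs(np.diff(beta)))
    return fit + sparsity + fusion
```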
Dataset Splits: Yes. "We generate n = 2^11 samples, where half of them are used as the training set and the other half for testing."
Hardware Specification: Yes. "All these studies were performed on an AMD 2.6 GHz, 4 GB RAM computer using MATLAB."
Software Dependencies: No. The paper mentions using MATLAB for the implementation and refers to various algorithms and solvers: FISTA (implemented in C), SpaRSA, SPG, SQOPT, SLEP, BREGMAN, ADMM, TVAL, and Genlasso. However, no version numbers are provided for MATLAB or for any of these libraries/solvers.
Experiment Setup: Yes. "To make a fair comparison between the methods, we run FISTA on each instance of the problem. FISTA is set to run to tol = 10^-5 or 5,000 iterations, whichever comes first. Then ALIN, SpaRSA, and SPG are set to run until the objective function values obtained are as good as that of FISTA. We set a parameter τ = 0.1‖Xᵀy‖∞ and chose values of λ = τ, 10^-1 τ, 5·10^-2 τ, 10^-2 τ, and 10^-3 τ." (Section 5.1). "We first run ADMM using the stopping criteria of Boyd et al. (2010) with tol = 10^-6 or 30 iterations, since each iteration of ADMM can be expensive." (Section 5.2). "The f-subproblem is solved using the block coordinate descent method and the h-subproblem is solved using the preconditioned conjugate gradient method with tol = 10^-5, as discussed previously." (Section 5.4). "We let FISTA run for 100 iterations with the monotone feature, which results in a monotonic decrease of the objective function, and the tolerance is set to 10^-5. Then we run ALIN to the same objective function value. TVAL, unfortunately, cannot reach the same objective function value, so we let TVAL run 10,000 iterations or to tol = 10^-5, whichever comes first." (Section 5.4). "The optimal tuning parameters are 0.2 for both the fused lasso and lasso penalties." (Section 5.5). "As suggested by Wahlberg et al. (2012), we choose ρ = λ to keep the algorithm stable in the implementation." (Section 5.5).
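The Section 5.1 tuning grid and the run-to-a-reference-objective comparison protocol can be sketched as follows. Both helper names are ours; `lambda_grid` assumes the norm in τ = 0.1‖Xᵀy‖ is the ∞-norm, which is the usual lasso scaling:

```python
import numpy as np

def lambda_grid(X, y):
    """Regularization grid from Section 5.1: tau = 0.1*||X^T y||_inf
    (infinity norm assumed), then lambda = tau, 1e-1*tau, 5e-2*tau,
    1e-2*tau, and 1e-3*tau."""
    tau = 0.1 * np.max(np.abs(X.T @ y))
    return [c * tau for c in (1.0, 1e-1, 5e-2, 1e-2, 1e-3)]

def run_to_target(step, obj, x0, target, max_iter=5000):
    """Mirror of the comparison protocol: iterate a solver's `step`
    until its objective value is as good as a reference value
    (e.g., the one FISTA reached), or until max_iter."""
    x = x0
    for k in range(1, max_iter + 1):
        x = step(x)
        if obj(x) <= target:
            return x, k
    return x, max_iter
```

This mirrors the stated protocol only in outline; timing each solver under `run_to_target` with the same target objective is what Table 1 of the paper reports.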