Scalable First-order Method for Certifying Optimal k-Sparse GLMs

Authors: Jiachang Liu, Soroosh Shafiee, Andrea Lodi

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on synthetic and real-world datasets show that our approach significantly accelerates dual bound computations and is highly effective in providing optimality certificates for large-scale problems. We evaluate our proposed methods using both synthetic and real-world datasets to address three key empirical questions: (1) How fast is our customized PAVA algorithm in evaluating prox_g compared to existing solvers? (2) How fast is our proposed FISTA method in calculating the lower bounds compared to existing solvers? (3) How fast is our customized BnB algorithm compared to existing solvers?
Researcher Affiliation Academia 1School of Operations Research and Information Engineering, Cornell University, Ithaca, NY, USA. 2Jacobs Technion-Cornell Institute, Cornell Tech and Technion-IIT, New York, NY, USA. Correspondence to: Jiachang Liu <EMAIL>, Soroosh Shafiee <EMAIL>, Andrea Lodi <EMAIL>.
Pseudocode Yes Algorithm 1: Customized PAVA to solve prox_{ρg}(µ) ... Algorithm 2: Algorithm to compute g(β) ... Algorithm 3: Main algorithm to solve problem (5) ... Algorithm 4: Up-and-Down Block Algorithm for Merging in PAVA
Open Source Code Yes Implementations discussed in this paper are available at https://github.com/jiachangliu/OKGLM.
Open Datasets Yes For the real-world datasets, we use the dataset cancer drug response (Liu et al., 2020) for linear regression and DOROTHEA (Asuncion & Newman, 2007) for logistic regression.
Dataset Splits Yes For the results in the main paper, we set the n-to-p ratio to be 1.0, the box constraint M to be 2, the number of nonzero coefficients k (also the cardinality constraint) to be 10, and the ℓ2 regularization coefficient λ2 to be 1.0. ... The k values for both real-world datasets are selected by first performing 5-fold cross-validation with a heuristic sparse learning algorithm.
Hardware Specification Yes When investigating how much GPUs can accelerate our computation, we ran the experiments with both CPU and GPU implementations on Nvidia A100s. For everything else, we ran the experiments with the CPU implementation on AMD Milan with CPU speed 2.45 GHz and 8 cores.
Software Dependencies Yes We implement our algorithms in Python. For baselines, we compare with the following state-of-the-art commercial and open-source SOCP solvers: Gurobi (Gurobi Optimization, LLC, 2025), MOSEK (MOSEK ApS, 2025), SCS (O'Donoghue et al., 2023), and Clarabel (Goulart & Chen, 2024), with the Python package cvxpy (Diamond & Boyd, 2016) as the interface to these solvers.
Experiment Setup Yes Input: number of iterations T, coefficient λ2 for the ℓ2 regularization, and step size L (Lipschitz-continuity parameter of ∇F(β)) ... For the cardinality constraint k, we set k = 10 for both synthetic datasets... For the cancer drug response dataset, we set k = 5. For DOROTHEA, we set k = 10. ... For the ℓ2 regularization coefficient, we set λ2 = 1. For the box constraint, we set M = 2 for the synthetic datasets and DOROTHEA. ... For the cancer drug response dataset, we set M = 5.
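The rows above reference a customized PAVA routine for evaluating the proximal operator prox_{ρg}. The paper's customized variant is not reproduced here; as a point of comparison, a textbook Pool-Adjacent-Violators routine for weighted isotonic regression (the classical algorithm the paper customizes) can be sketched as:

```python
import numpy as np

def pava(y, w=None):
    """Pool Adjacent Violators: project y onto the set of non-decreasing
    sequences under weighted least squares. Returns the isotonic fit."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    # Each block stores [weighted mean, total weight, block length].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge adjacent blocks while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(w1 * m1 + w2 * m2) / wt, wt, n1 + n2])
    # Expand each pooled block back to its original positions.
    return np.concatenate([np.full(n, m) for m, _, n in blocks])
```

For example, `pava([3, 1, 2])` pools the violating pair (3, 1) into mean 2, yielding the non-decreasing fit `[2, 2, 2]`. The paper's Algorithm 4 (Up-and-Down Block merging) refines the merging step of this basic scheme.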