Stochastic Interior-Point Methods for Smooth Conic Optimization with Applications
Authors: Chuan He, Zhanwang Deng
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now conduct numerical experiments to evaluate the performance of our proposed SIPMs. For SIPM-ME, we consider two versions: SIPM-ME+ with increasing batch sizes and SIPM-ME1 with a fixed batch size. We compare our SIPMs against a deterministic variant of Algorithm 1 with full-batch gradients (IPM-FG) and other popular methods on robust linear regression (Section 4.1), multi-task relationship learning (Section 4.2), and clustering data streams (Section 4.3). We evaluate the approximate solutions found by our SIPMs using two measures: the relative objective value f(x_k)/f(x_0), and the relative estimated stationarity error ‖m_k + Aᵀλ_k‖_{x_k} / ‖m_0 + Aᵀλ_0‖_{x_0}. |
| Researcher Affiliation | Academia | Chuan He EMAIL Department of Mathematics, Linköping University, Linköping, SE-581 83, Sweden; Zhanwang Deng EMAIL Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China |
| Pseudocode | Yes | In what follows, we propose an SIPM framework in Algorithm 1 for solving problem (3). Subsequently, we will employ four distinct stochastic estimators to construct {m_k}_{k≥0}, with specific schemes provided in (16), (22), (29), and (38), respectively. We remark that computations involving ∇²B(x_k)⁻¹ can be performed efficiently for common nonlinear cones K. For example, when K is the second-order cone, ∇²B(x_k)⁻¹ can be computed analytically (e.g., see Alizadeh and Goldfarb (2003)), and (A∇²B(x_k)⁻¹Aᵀ)⁻¹v for any v ∈ ℝ^m can be efficiently evaluated using Cholesky factorization. When K is the semidefinite cone, we have that ∇²B(X_k)⁻¹[V] = X_k V X_k holds for any n×n symmetric matrix V, and that A∇²B(X_k)⁻¹Aᵀ can be efficiently computed by exploiting the sparsity of A (see Toh et al. (1999) for details). Algorithm 1: An SIPM framework. Input: starting point x_0 ∈ int Ω, nonincreasing step sizes {η_k}_{k≥0} ⊂ (0, s̄_η] with s̄_η ∈ (0, 1), nonincreasing barrier parameters {µ_k}_{k≥0} ⊂ (0, 1]. |
| Open Source Code | Yes | All experiments are carried out using Matlab 2024b on a standard PC with 3.20 GHz AMD R7 5800H microprocessor with 16GB of memory. The code to reproduce our numerical results is available at https://github.com/ChuanH6/SIPM. |
| Open Datasets | Yes | We consider two typical regression datasets, namely, wine-quality and energy-efficiency, from the UCI repository (see archive.ics.uci.edu/datasets). |
| Dataset Splits | No | The paper mentions simulating missing values by randomly deleting 25% of the features in 50% of the training samples, creating subsets for multi-task learning, and applying "drifts" to data streams. However, it does not explicitly provide standard training/validation/test splits with percentages, absolute counts, or references to predefined splits for model evaluation reproducibility. |
| Hardware Specification | Yes | All experiments are carried out using Matlab 2024b on a standard PC with 3.20 GHz AMD R7 5800H microprocessor with 16GB of memory. |
| Software Dependencies | Yes | All experiments are carried out using Matlab 2024b on a standard PC with 3.20 GHz AMD R7 5800H microprocessor with 16GB of memory. |
| Experiment Setup | Yes | For SIPMs, we set the barrier function as B(u, t) = −ln(t² − ‖u‖²), associated with the second-order cone Q^{d+1} := {(u, t) ∈ ℝ^d × ℝ₊ : ‖u‖ ≤ t}. For SIPM-ME1, SIPM-PM, SIPM-EM, SIPM-RM, and SALM-RM, we set the maximum number of epochs as 10,000, and the batch size as 200 and 500 for the wine-quality and energy-efficiency datasets, respectively. For SIPM-ME+, we initialize the batch size as 1 and increase it by 1 per iteration. For a fair comparison, we set the maximum number of iterations for SIPM-ME+ and IPM-FG such that the total number of data points used during training equals that of the other methods in 10,000 epochs. For all methods, we set the initial point (w_0, v_0, θ_0) to (0, 1, 1). We set the other hyperparameters, including step sizes, momentum parameters, and barrier parameters, as diminishing sequences of the form {(k + 1)^{−α}}_{k≥0}, with the exponent α tuned via grid search to best suit each method in terms of computational performance. |
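The semidefinite-cone identity quoted in the Pseudocode row, ∇²B(X)⁻¹[V] = X V X for the log-det barrier B(X) = −ln det X, follows from ∇²B(X)[V] = X⁻¹ V X⁻¹ and can be sanity-checked numerically. The sketch below is illustrative only; it uses NumPy rather than the authors' MATLAB code, and all names are ours:

```python
import numpy as np

# For the log-det barrier B(X) = -ln det(X) on the semidefinite cone,
# the Hessian acts as a linear map V -> X^{-1} V X^{-1}, so its inverse
# is the map V -> X V X, as stated in the paper's remark.
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
X = M @ M.T + n * np.eye(n)      # a positive definite point in the cone's interior
V = rng.standard_normal((n, n))
V = (V + V.T) / 2                # a symmetric direction

Xinv = np.linalg.inv(X)
hess_V = Xinv @ V @ Xinv         # Hessian applied to V: grad^2 B(X)[V]
recovered = X @ hess_V @ X       # inverse Hessian applied back: should return V
assert np.allclose(recovered, V)
```

Because the inverse Hessian is available in closed form here, applying it costs only two matrix products, which is why the paper notes that A∇²B(X_k)⁻¹Aᵀ can exploit sparsity of A.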
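The Experiment Setup row fixes the barrier B(u, t) = −ln(t² − ‖u‖²) on the second-order cone. A minimal sketch of that barrier and its gradient, assuming the standard closed forms (function names are ours, not from the authors' repository, and Python stands in for the paper's MATLAB):

```python
import numpy as np

def soc_barrier(u, t):
    """Log barrier for the second-order cone {(u, t) : ||u|| <= t}.

    B(u, t) = -ln(t^2 - ||u||^2); finite only in the cone's interior.
    """
    gap = t**2 - np.dot(u, u)
    if gap <= 0:
        return np.inf  # (u, t) lies outside the interior of the cone
    return -np.log(gap)

def soc_barrier_grad(u, t):
    """Gradient of B with respect to (u, t), stacked as one vector.

    d/du B = 2u / (t^2 - ||u||^2),  d/dt B = -2t / (t^2 - ||u||^2).
    """
    gap = t**2 - np.dot(u, u)
    return np.concatenate([2.0 * u / gap, [-2.0 * t / gap]])
```

The gradient blows up as (u, t) approaches the cone's boundary (gap → 0), which is exactly the repelling behavior an interior-point step relies on.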