SOREL: A Stochastic Algorithm for Spectral Risks Minimization
Authors: Yuze Ge, Rujun Jiang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on real datasets show that our algorithm outperforms existing ones in most cases, both in terms of runtime and sample complexity. |
| Researcher Affiliation | Academia | Yuze Ge & Rujun Jiang, School of Data Science, Fudan University |
| Pseudocode | Yes | Algorithm 1 SOREL |
| Open Source Code | Yes | The source code is available at https://github.com/SXFXuz/SOREL. |
| Open Datasets | Yes | Five tabular regression benchmarks are used for the least squares loss: yacht (Tsanas & Xifara, 2012), energy (Baressi Šegota et al., 2019), concrete (Yeh, 2006), kin8nm (Akujuobi & Zhang, 2017), power (Tüfekci, 2014). |
| Dataset Splits | Yes | We split the training set and test set in a 4:1 ratio and used five-fold cross-validation to report the average results on the test set. |
| Hardware Specification | Yes | We run all experiments on a laptop with 16.0 GB RAM and Intel i7-1360P 2.20 GHz CPU. |
| Software Dependencies | No | The paper states: "All algorithms are implemented in Python 3.8." This provides a programming language and its version but does not list any specific versioned libraries or solvers used, which is required for a 'Yes' classification. |
| Experiment Setup | Yes | For the selection of step size $\alpha$, we set the random seed $s \in \{1, \ldots, S\}$. For a single seed $s$, we calculate the average training loss of the last ten epochs, denoted by $L_s(\alpha)$. We choose $\alpha$ that minimizes $\frac{1}{S} \sum_{s=1}^{S} L_s(\alpha)$, where $\alpha \in$ {1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2, 1e-1, 3e-1}. For LSVRG, we set the length of an epoch to $n$. For SOREL, we set $T_k = m_k = n$. Moreover, we set batch size to 64 for all algorithms with mini-batching. For SOREL, we follow the parameter values given in Theorem 1. In particular, we set $\theta_k = k/(k+1)$ and $\tau_k = 20n/(k+1)$ in all experiments. Therefore, there are only two parameters $\alpha$ and $\eta_k$ left to tune. We set $\eta_k = C/(k+1)n$ and choose $C$ from {1e-2, 2e-2, 4e-2, 1e-1, 2e-1, 4e-1, 1e0, 2e0, 4e0, 1e1}... |
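
The step-size selection rule quoted in the Experiment Setup row can be sketched as a small grid search. This is a minimal illustration, not the authors' code: `select_step_size`, `sorel_schedules`, and `toy_train` are hypothetical names, and `toy_train` is a synthetic stand-in for a real training run that only produces a loss curve to exercise the selection logic. The schedules for θ_k and τ_k follow the formulas quoted in the table; η_k is left out of the sketch.

```python
import math
import random
from statistics import mean

# Step-size grid quoted in the Experiment Setup row.
STEP_SIZES = [1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2, 1e-1, 3e-1]


def sorel_schedules(k, n):
    """Return (theta_k, tau_k) as quoted: theta_k = k/(k+1), tau_k = 20n/(k+1)."""
    theta_k = k / (k + 1)
    tau_k = 20 * n / (k + 1)
    return theta_k, tau_k


def select_step_size(train_fn, step_sizes, num_seeds, last_epochs=10):
    """Pick the step size minimizing the seed-averaged final training loss.

    train_fn(alpha, seed) is assumed (hypothetically) to return a list of
    per-epoch training losses. For each seed s, L_s(alpha) is the mean of
    the last `last_epochs` entries; the score is the mean of L_s over seeds.
    """
    scores = {}
    for alpha in step_sizes:
        per_seed = [
            mean(train_fn(alpha, seed)[-last_epochs:])
            for seed in range(1, num_seeds + 1)
        ]
        scores[alpha] = mean(per_seed)
    best_alpha = min(scores, key=scores.get)
    return best_alpha, scores


def toy_train(alpha, seed, epochs=30):
    """Synthetic stand-in for a training run (NOT a real optimizer).

    The loss curve is lowest near alpha = 1e-2 by construction, with a
    decaying term over epochs and small seed-dependent noise.
    """
    rng = random.Random(seed)
    base = (math.log10(alpha) + 2.0) ** 2
    return [base + 1.0 / (e + 1) + 0.01 * rng.random() for e in range(epochs)]


best_alpha, scores = select_step_size(toy_train, STEP_SIZES, num_seeds=3)
```

In practice `toy_train` would be replaced by a function that runs the algorithm under test with the given step size and seed and returns its per-epoch training losses.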