SOREL: A Stochastic Algorithm for Spectral Risks Minimization

Authors: Yuze Ge, Rujun Jiang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on real datasets show that our algorithm outperforms existing ones in most cases, both in terms of runtime and sample complexity.
Researcher Affiliation | Academia | Yuze Ge & Rujun Jiang, School of Data Science, Fudan University
Pseudocode | Yes | Algorithm 1 SOREL
Open Source Code | Yes | The source code is available at https://github.com/SXFXuz/SOREL.
Open Datasets | Yes | Five tabular regression benchmarks are used for the least squares loss: yacht (Tsanas & Xifara, 2012), energy (Baressi Šegota et al., 2019), concrete (Yeh, 2006), kin8nm (Akujuobi & Zhang, 2017), power (Tüfekci, 2014).
Dataset Splits | Yes | We split the training set and test set in a 4:1 ratio and used five-fold cross-validation to report the average results on the test set.
Hardware Specification | Yes | We run all experiments on a laptop with 16.0 GB RAM and Intel i7-1360P 2.20 GHz CPU.
Software Dependencies | No | The paper states: "All algorithms are implemented in Python 3.8." This provides a programming language and its version but does not list any specific versioned libraries or solvers used, which is required for a 'Yes' classification.
Experiment Setup | Yes | For the selection of the step size α, we set the random seed s ∈ {1, ..., S}. For a single seed s, we calculate the average training loss over the last ten epochs, denoted by L_s(α). We choose the α that minimizes (1/S) Σ_{s=1}^{S} L_s(α), where α ∈ {1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2, 1e-1, 3e-1}. For LSVRG, we set the length of an epoch to n. For SOREL, we set T_k = m_k = n. Moreover, we set the batch size to 64 for all algorithms with mini-batching. For SOREL, we follow the parameter values given in Theorem 1. In particular, we set θ_k = k/(k+1) and τ_k = 20n/(k+1) in all experiments. Therefore, there are only two parameters, α and η_k, left to tune. We set η_k = C/(k+1)n and choose C from {1e-2, 2e-2, 4e-2, 1e-1, 2e-1, 4e-1, 1e0, 2e0, 4e0, 1e1}...
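
The step-size selection rule in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: `toy_train` is a hypothetical stand-in for a real training run, the helper names are invented for this example, and the seed list is arbitrary. Only the α grid, the "last ten epochs" average, and the θ and τ schedules are taken from the row.

```python
import math
from statistics import mean

# Step-size grid quoted in the Experiment Setup row.
ALPHA_GRID = [1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2, 1e-1, 3e-1]


def theta(k):
    """Schedule theta_k = k / (k + 1) quoted in the row."""
    return k / (k + 1)


def tau(k, n):
    """Schedule tau_k = 20 n / (k + 1) quoted in the row (n = training-set size)."""
    return 20 * n / (k + 1)


def seed_score(epoch_losses, last=10):
    """L_s(alpha): average training loss over the last `last` epochs of one run."""
    return mean(epoch_losses[-last:])


def select_step_size(train_fn, alphas, seeds):
    """Return the alpha minimizing the seed-averaged score, plus all scores.

    `train_fn(alpha, seed)` is assumed to return a list of per-epoch training losses.
    """
    scores = {
        a: mean([seed_score(train_fn(a, s)) for s in seeds])
        for a in alphas
    }
    best = min(scores, key=scores.get)
    return best, scores


def toy_train(alpha, seed, epochs=20):
    """Hypothetical loss curve (NOT a real optimizer): best at alpha = 1e-2."""
    gap = abs(math.log10(alpha) - math.log10(1e-2))
    return [gap + 0.01 * seed + 1.0 / (e + 1) for e in range(epochs)]


best_alpha, scores = select_step_size(toy_train, ALPHA_GRID, seeds=[1, 2, 3])
```

In practice `toy_train` would be replaced by a function that runs the optimizer under test with the given step size and seed and records the training loss after each epoch. The constant C in the η_k schedule could be tuned with the same helper over its own grid.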
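
One possible reading of the Dataset Splits row is that each of five folds serves once as the test set while the other four form the training set, which yields the 4:1 ratio. The sketch below follows that reading. It is an assumption, not the authors' procedure: the shuffling, the seed, and the function names are all hypothetical.

```python
import random


def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold.

    Assumption: indices are shuffled once with a fixed seed, then dealt
    round-robin into folds, so fold sizes differ by at most one.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def average_test_metric(evaluate, n_samples):
    """Average a user-supplied `evaluate(train, test)` metric over the folds."""
    results = [evaluate(tr, te) for tr, te in five_fold_splits(n_samples)]
    return sum(results) / len(results)


splits = list(five_fold_splits(100))
```

Here `evaluate` stands for any function that fits a model on the training indices and returns a test-set metric; the reported number would then be the mean over the five folds.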