Dual Feature Reduction for the Sparse-group Lasso and its Adaptive Variant

Authors: Fabio Feser, Marina Evangelou

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through synthetic and real data studies, it is shown that DFR drastically reduces the computational cost under many different scenarios. DFR applies two layers of screening through the application of dual norms and subdifferentials.
Researcher Affiliation | Academia | Department of Mathematics, Imperial College London, London, UK. Correspondence to: Fabio Feser <EMAIL>.
Pseudocode | Yes | Algorithm A1: Dual Feature Reduction (DFR) for SGL. Algorithm A2: Dual Feature Reduction (DFR) for aSGL.
Open Source Code | Yes | DFR is implemented in the dfr R package (Feser, 2024), available on CRAN.
Open Datasets | Yes | Table A16: Dataset information for the six datasets used in the real data analysis. brca1 ... Source (National Cancer Institute, 1988) [2]; scheetz ... Source (Scheetz et al., 2006) [2]; trust-experts ... Source (Salomon et al., 2021) [3]; adenoma ... Source (Sabates-Bellver et al., 2007) [4]; celiac ... Source (Heap et al., 2009) [4]; tumour ... Source (Pei et al., 2009; Ellsworth et al., 2013; Li et al., 2016) [4]. [2] Downloaded on 08/2024 from https://iowabiostat.github.io/data-sets/. [3] Downloaded on 08/2024 from https://github.com/dajmcdon/sparsegl. [4] Downloaded on 08/2024 from https://www.ncbi.nlm.nih.gov/.
Dataset Splits | Yes | Table A11: The improvement factor for the strong rules applied to synthetic data, under the linear and logistic models, with 10-fold CV, with standard errors.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | Yes | DFR is implemented in the dfr R package (Feser, 2024), available on CRAN.
Experiment Setup | Yes | Table A2: Default model, data, and algorithm parameters for the synthetic and real data analyses. α: 0.95; b1 = b2 (aSGL only): 0.1; Path length (l): 50 (Synthetic), 100 (Real); Path termination (λ_l): 0.1λ_1 (Synthetic), 0.2λ_1 (Real); Path shape: Log-linear; Maximum iterations: 5000 (Synthetic), 10000 (Real); Backtracking (ATOS only): 0.7; Maximum backtracking iterations (ATOS only): 100; Convergence tolerance: 10^-5
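
The Experiment Setup row describes a log-linear regularisation path of l values that ends at a fixed fraction of the first value, and a mixing parameter α of 0.95. The following is a minimal Python sketch of what those settings could mean in practice. It is not the dfr package's code. The function names, the starting value λ_1 = 1.0, and the toy coefficient vector are illustrative assumptions, and the penalty uses the standard sparse-group lasso form α‖β‖_1 + (1 − α) Σ_g √p_g ‖β^(g)‖_2, which is assumed to match the paper's definition.

```python
import math


def log_linear_path(lambda_1, length, termination_ratio):
    """Return `length` values running from lambda_1 down to
    termination_ratio * lambda_1, equally spaced on the log scale.

    Hypothetical helper for illustration; not part of the dfr package.
    """
    if length < 2:
        raise ValueError("length must be at least 2")
    if lambda_1 <= 0 or termination_ratio <= 0:
        raise ValueError("lambda_1 and termination_ratio must be positive")
    log_start = math.log(lambda_1)
    log_end = math.log(termination_ratio * lambda_1)
    step = (log_end - log_start) / (length - 1)
    return [math.exp(log_start + k * step) for k in range(length)]


def sgl_penalty(beta, groups, alpha):
    """Standard sparse-group lasso penalty (assumed form):
    alpha * ||beta||_1 + (1 - alpha) * sum_g sqrt(p_g) * ||beta^(g)||_2,
    where `groups` is a list of index lists and p_g is the group size.
    """
    l1_term = sum(abs(b) for b in beta)
    group_term = sum(
        math.sqrt(len(g)) * math.sqrt(sum(beta[i] ** 2 for i in g))
        for g in groups
    )
    return alpha * l1_term + (1 - alpha) * group_term


# Synthetic-data defaults from Table A2: l = 50, lambda_l = 0.1 * lambda_1.
# lambda_1 = 1.0 is an arbitrary placeholder; in practice it is data-dependent.
path = log_linear_path(lambda_1=1.0, length=50, termination_ratio=0.1)

# Toy coefficients split into two groups, with alpha = 0.95 as in Table A2.
penalty = sgl_penalty([3.0, 4.0, 0.0], groups=[[0, 1], [2]], alpha=0.95)
print(len(path), penalty)
```

For the real-data defaults, the same helper would be called with length=100 and termination_ratio=0.2. With α = 0.95 the penalty is dominated by the ℓ1 term, so the group term contributes only a small share of the total.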