reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Dual Feature Reduction for the Sparse-group Lasso and its Adaptive Variant

Authors: Fabio Feser, Marina Evangelou

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through synthetic and real data studies, it is shown that DFR drastically reduces the computational cost under many different scenarios. DFR applies two layers of screening through the application of dual norms and subdifferentials.
Researcher Affiliation	Academia	Department of Mathematics, Imperial College London, London, UK. Correspondence to: Fabio Feser <EMAIL>.
Pseudocode	Yes	Algorithm A1: Dual Feature Reduction (DFR) for SGL; Algorithm A2: Dual Feature Reduction (DFR) for aSGL
Open Source Code	Yes	DFR is implemented in the dfr R package (Feser, 2024), available on CRAN.
Open Datasets	Yes	Table A16: Dataset information for the six datasets used in the real data analysis. brca1 ... Source (National Cancer Institute, 1988)^2; scheetz ... Source (Scheetz et al., 2006)^2; trust-experts ... Source (Salomon et al., 2021)^3; adenoma ... Source (Sabates-Bellver et al., 2007)^4; celiac ... Source (Heap et al., 2009)^4; tumour ... Source (Pei et al., 2009; Ellsworth et al., 2013; Li et al., 2016)^4. Footnotes: ^2 downloaded on 08/2024 from https://iowabiostat.github.io/data-sets/. ^3 downloaded on 08/2024 from https://github.com/dajmcdon/sparsegl. ^4 downloaded on 08/2024 from https://www.ncbi.nlm.nih.gov/.
Dataset Splits	Yes	Table A11: The improvement factor for the strong rules applied to synthetic data, under the linear and logistic models, with 10-fold CV, with standard errors.
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies	Yes	DFR is implemented in the dfr R package (Feser, 2024), available on CRAN.
Experiment Setup	Yes	Table A2: Default model, data, and algorithm parameters for the synthetic and real data analyses. α: 0.95; b1 = b2 (aSGL only): 0.1; Path length (l): 50 (Synthetic), 100 (Real); Path termination (λ_l): 0.1λ_1 (Synthetic), 0.2λ_1 (Real); Path shape: Log-linear; Maximum iterations: 5000 (Synthetic), 10000 (Real); Backtracking (ATOS only): 0.7; Maximum backtracking iterations (ATOS only): 100; Convergence tolerance: 10^-5
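
The path settings in the Experiment Setup row (path length, termination ratio, log-linear shape) can be illustrated with a minimal Python sketch. It is not taken from the paper or from the dfr package: the function name `log_linear_path` is hypothetical, and the starting value `lambda_1 = 1.0` is a placeholder, since the actual starting value is not given in this entry.

```python
import math


def log_linear_path(lambda_1, termination_ratio, path_length):
    """Return path_length values evenly spaced on a log scale,
    from lambda_1 down to termination_ratio * lambda_1.

    Hypothetical helper for illustration; not part of the dfr package.
    """
    if path_length < 2:
        raise ValueError("path_length must be at least 2")
    if lambda_1 <= 0 or termination_ratio <= 0:
        raise ValueError("lambda_1 and termination_ratio must be positive")
    log_start = math.log(lambda_1)
    log_end = math.log(termination_ratio * lambda_1)
    step = (log_end - log_start) / (path_length - 1)
    return [math.exp(log_start + k * step) for k in range(path_length)]


# Defaults reported in the Experiment Setup row; lambda_1 = 1.0 is a placeholder.
synthetic_path = log_linear_path(lambda_1=1.0, termination_ratio=0.1, path_length=50)
real_path = log_linear_path(lambda_1=1.0, termination_ratio=0.2, path_length=100)
```

Under these assumptions, consecutive values in each path differ by a constant multiplicative factor, which is one common reading of a "log-linear" path shape.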