skscope: Fast Sparsity-Constrained Optimization in Python

Authors: Zezhi Wang, Junxian Zhu, Xueqin Wang, Jin Zhu, Huiyang Peng, Peng Chen, Anran Wang, Xiaoke Zhang

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Numerical experiments reveal the available solvers in skscope can achieve up to 80x speedup on the competing relaxation solutions obtained via the benchmarked convex solver. skscope is published on the Python Package Index (PyPI) and Conda, and its source code is available at: https://github.com/abess-team/skscope. Keywords: sparsity-constrained optimization, automatic differentiation, nonlinear optimization, high-dimensional data, Python... We conducted a comprehensive comparison among the sparse-learning solvers employed in skscope and two alternative approaches."
Researcher Affiliation | Academia | (1) Department of Statistics and Finance/International Institute of Finance, School of Management, University of Science and Technology of China, Hefei, Anhui, China; (2) Saw Swee Hock School of Public Health, National University of Singapore, Singapore; (3) Department of Statistics, London School of Economics and Political Science, London, UK
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It provides Python code snippets as usage examples (Figures 1 and 2), but these are not pseudocode for the underlying algorithms implemented in skscope.
Open Source Code | Yes | "skscope is published on the Python Package Index (PyPI) and Conda, and its source code is available at: https://github.com/abess-team/skscope."
Open Datasets | No | "The datasets are generated using the make_glm_data function implemented in the abess package. For all tasks, the non-zero coefficients are randomly chosen from {1, ..., p}. The mean metrics are computed over 10 replications with standard deviation in parentheses."
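The row above describes synthetic data generated with abess's make_glm_data. As a hedged illustration of the kind of data described (Gaussian design, a few randomly placed non-zero coefficients, additive noise), here is a minimal NumPy sketch; the function name and defaults are invented for illustration and are not the abess implementation:

```python
import numpy as np

def make_sparse_linear_data(n=100, p=20, k=3, noise=0.1, seed=0):
    """Toy stand-in for a sparse linear-model data generator:
    Gaussian design matrix, k randomly placed non-zero coefficients."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    support = rng.choice(p, size=k, replace=False)  # random support positions
    beta[support] = rng.uniform(1.0, 2.0, size=k)   # non-zero coefficients
    y = X @ beta + noise * rng.standard_normal(n)   # noisy responses
    return X, y, beta

X, y, beta = make_sparse_linear_data()
```

The paper's experiments average such replications (10 or 100 runs) by varying the seed.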
Dataset Splits | No | The paper mentions data generation and replications for experiments (e.g., "The mean metrics are computed over 10 replications" and "The results are the average of 100 replications"), but it does not specify explicit training/test/validation dataset splits for any fixed dataset.
Hardware Specification | Yes | "We conducted a comprehensive comparison among the sparse-learning solvers employed in skscope and two alternative approaches. The first competing approach solves (1) by recruiting the widely-used mixed-integer optimization solver, GUROBI. We compare this approach assuming the optimal s of (1) is known and present the results in Table A3. The second approach utilizes the ℓ1 relaxation of (1), implemented using the open-source solver, cvxpy (Diamond and Boyd, 2016). The comparison with cvxpy assumes the optimal s of (1) is unknown and searches with information criteria. The corresponding results are reported in Table A4. These comparisons covered a wide range of concrete SCO problems and were performed on an Ubuntu platform with an Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz and 64 GB RAM."
Software Dependencies | Yes | "skscope can run on most Linux distributions, macOS, and Windows 32- or 64-bit with Python (version 3.9)... The dependencies of skscope are minimal and just include standard Python libraries such as numpy and scikit-learn; additionally, two powerful and well-maintained libraries, jax and nlopt (Frostig et al., 2018; Johnson, 2014), are used for obtaining AD and solving unconstrained nonlinear optimization, respectively... GUROBI: version 10.0.2; cvxpy: version 1.3.1; skscope: version 0.1.8."
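The jax dependency noted above is what lets skscope differentiate user-written objectives automatically rather than requiring hand-derived gradients. A minimal sketch of that mechanism in plain jax (independent of skscope; the objective here is a made-up example):

```python
import jax
import jax.numpy as jnp

# A smooth objective written with jax.numpy, the style skscope accepts.
def objective(params):
    target = jnp.array([1.0, 2.0, 3.0])
    return jnp.sum((params - target) ** 2)

grad_fn = jax.grad(objective)   # AD: exact gradient, no manual derivation
g = grad_fn(jnp.zeros(3))       # gradient at the origin: 2 * (0 - target)
```

nlopt then consumes such gradients when solving the unconstrained subproblems.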
Experiment Setup | Yes | "solver = GraspSolver(10, 3)... solver = ScopeSolver(len(x), 10)... TimeLimit is set to 1000. Note that optimization may not immediately stop upon hitting TimeLimit. ...information criteria are used for selecting the optimal one. Specifically, for linear regression, the special information criterion (Zhu et al., 2020) is used; for logistic regression, the Ising model, and non-linear feature selection, the generalized information criterion (Zhu et al., 2023) is employed; and for trend filtering and robust feature selection, the Bayesian information criterion (Wen et al., 2023) is used."
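The snippets quoted above construct solvers with a dimensionality and a sparsity level s, the two quantities that define problem (1). To make the underlying template concrete, here is a hedged pure-NumPy sketch of iterative hard thresholding, a generic baseline for sparsity-constrained least squares; it is NOT the SCOPE or GraSP algorithm implemented in skscope, just an illustration of the constraint ||b||_0 <= s:

```python
import numpy as np

def iht_least_squares(X, y, s, step=None, iters=200):
    """Iterative hard thresholding for min ||y - X b||^2 s.t. ||b||_0 <= s.
    Generic SCO baseline, not skscope's SCOPE/GraSP solvers."""
    n, p = X.shape
    if step is None:
        # conservative step size from the largest singular value of X
        step = 1.0 / (np.linalg.norm(X, 2) ** 2)
    b = np.zeros(p)
    for _ in range(iters):
        grad = X.T @ (X @ b - y)           # gradient of the squared loss
        b = b - step * grad                # gradient descent step
        keep = np.argsort(np.abs(b))[-s:]  # indices of the s largest entries
        mask = np.zeros(p, dtype=bool)
        mask[keep] = True
        b[~mask] = 0.0                     # project back onto the sparsity set
    return b

# Recover a 3-sparse signal from noiseless measurements.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
beta = np.zeros(20)
beta[[2, 7, 11]] = [1.5, -2.0, 1.0]
y = X @ beta
b_hat = iht_least_squares(X, y, s=3)
```

skscope's solvers follow the same "optimize subject to a support-size constraint" template but select supports via splicing (ScopeSolver) or gradient support pursuit (GraspSolver), with gradients supplied by jax.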