On the Stability of Feature Selection Algorithms

Authors: Sarah Nogueira, Konstantinos Sechidis, Gavin Brown

JMLR 2017

Reproducibility assessment (Variable: Result, followed by the LLM response):
Research Type: Experimental. "We present a rigorous statistical treatment for this issue. In particular, with this work we consolidate the literature and provide (1) a deeper understanding of existing work based on a small set of properties, and (2) a clearly justified statistical approach with several novel benefits. This approach serves to identify a stability measure obeying all desirable properties, and (for the first time in the literature) allowing confidence intervals and hypothesis tests on the stability, enabling rigorous experimental comparison of feature selection algorithms." Supporting section titles: "5. Empirical Validation of the Statistical Tools" and "6. Experiments".
Researcher Affiliation: Academia. "Sarah Nogueira EMAIL, Konstantinos Sechidis EMAIL, Gavin Brown EMAIL. School of Computer Science, University of Manchester, Manchester M13 9PL, UK."
Pseudocode: No. The paper describes methods and concepts mathematically and textually, but does not include any explicitly labeled pseudocode or algorithm blocks with structured, code-like steps.
Open Source Code: Yes. "The code in R and Matlab at github.com/nogueirs/JMLR2018 for the proposed measure and associated statistical tools. The code for all experiments is also available, enabling reproducible research. A Python package and a demonstration notebook using the package at github.com/nogueirs/JMLR2018/tree/master/python/. A demonstration web page at www.cs.man.ac.uk/~gbrown/stability."
Open Datasets: Yes. "We use a synthetic data set (Kamkar et al., 2015): a binary classification problem with 2000 instances and d = 100 features, where only the first 50 features are relevant to the target class. ... In this section, we use our proposed measure to quantify just how stable their stable set can be. To that end, we look at how much the final set picked by Stability Selection varies in the context of LASSO, and we will show on 4 data sets that it indeed yields more stable results (in the sense of Φ̂(Z)) than its non-ensemble version (LASSO). ... Figure 12 compares the two approaches for variable selection on 4 data sets: three binary classification problems (Spambase/Sonar/Madelon) and one regression (Boston housing)."
Dataset Splits: Yes. "We take 2000 samples and divide them into 1000 for model selection (i.e. to select the regularizing parameter λ) and 1000 for selection of the final set of features. ... It proposes to apply LASSO to M random sub-samples of size n/2 of the original data set (where n is the sample size) for a set of regularizing parameters λ ∈ Λ, where Λ is a subset of ℝ+."
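The subsampling scheme quoted above (apply a selector to M random sub-samples of size n/2, then keep features whose selection frequency clears a threshold) can be sketched as follows. This is a minimal illustration, not the authors' released code; `select_fn`, a hypothetical placeholder, stands in for LASSO at a fixed λ and returns a boolean mask of selected features.

```python
import numpy as np

def stability_selection(X, y, select_fn, M=100, pi_thr=0.6, rng=None):
    """Sketch of Stability Selection's subsampling loop.

    select_fn(X_sub, y_sub) -> length-d boolean/0-1 mask of selected features.
    Returns the mask of features selected in at least pi_thr of the M runs.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    counts = np.zeros(d)
    for _ in range(M):
        # random sub-sample of size n/2, drawn without replacement
        idx = rng.choice(n, size=n // 2, replace=False)
        counts += np.asarray(select_fn(X[idx], y[idx]), dtype=float)
    return counts / M >= pi_thr  # keep features selected often enough
```

The returned mask is the "stable set"; sweeping λ ∈ Λ and aggregating the per-λ frequencies, as the paper describes, is a straightforward extension of this loop.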
Hardware Specification: No. The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or memory amounts) used for running its experiments. It mentions software environments like R, Matlab, and Python but lacks hardware specifications.
Software Dependencies: No. The paper mentions the use of R, Matlab, and a Python package for its code, but does not specify version numbers for these software environments or for any specific libraries or dependencies.
Experiment Setup: Yes. "We use L1-regularized logistic regression, where λ is the regularizing parameter influencing the number of features selected: as λ increases, more and more coefficients become equal to zero, and therefore fewer features are selected. ... We study 4 degrees of redundancy: ρ = 0 (no redundancy, the features are independent of each other), ρ = 0.3 (low redundancy), ρ = 0.5 (medium) and ρ = 0.8 (high). We apply L1-logistic regression to M = 100 bootstrap samples of the data set. ... Stability Selection possesses 3 hyperparameters: (1) the cut-off value πthr, (2) the average number of features selected qΛ over all values of λ ∈ Λ, and (3) the set of regularizing parameters Λ (where the last two hyperparameters are dependent). We used the values suggested by the original authors: that is, πthr ∈ (0.6, 0.9) and qΛ around 0.8d."
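The setup above repeatedly runs a feature selector on M bootstrap samples and then scores the agreement between the M selected sets with the paper's stability measure Φ̂(Z). As a reference for what is being computed, here is a minimal sketch of Φ̂ from its published definition (a binary matrix Z with one row per run); it is not the authors' released code.

```python
import numpy as np

def stability(Z):
    """Phi_hat stability of a binary selection matrix Z (M runs x d features).

    1 minus the mean per-feature selection variance, normalised by the
    variance expected if each run selected k_bar features uniformly at random.
    """
    Z = np.asarray(Z, dtype=float)
    M, d = Z.shape
    p = Z.mean(axis=0)                 # selection frequency of each feature
    s2 = M / (M - 1) * p * (1 - p)     # unbiased sample variance per feature
    k_bar = Z.sum(axis=1).mean()       # average number of features selected
    return 1 - s2.mean() / ((k_bar / d) * (1 - k_bar / d))
```

Identical selections across all runs give Φ̂ = 1, while values near 0 indicate selections no more consistent than chance (the measure can go negative for systematically anti-correlated selections).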