Stabilizing black-box model selection with the inflated argmax
Authors: Melissa Adrian, Jake A Soloff, Rebecca Willett
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate this method in (a) a simulation in which strongly correlated covariates make standard LASSO model selection highly unstable, (b) a Lotka-Volterra model selection problem focused on identifying how competition in an ecosystem influences species abundances, (c) a graph subset selection problem using cell-signaling data from proteomics, and (d) unsupervised k-means clustering. In these settings, the proposed method yields stable, compact, and accurate collections of selected models, outperforming a variety of benchmarks. |
| Researcher Affiliation | Academia | Melissa Adrian (EMAIL), Data Science Institute, University of Chicago; Jake A. Soloff (EMAIL), Department of Statistics, University of Michigan; Rebecca Willett (EMAIL), Department of Statistics, University of Chicago; Department of Computer Science, University of Chicago; NSF-Simons National Institute for Theory and Mathematics in Biology |
| Pseudocode | Yes | Algorithm 1: Bagged model selection (Breiman, 1996a;b); Algorithm 2: Computing the number of clusters |
| Open Source Code | No | The paper mentions using the 'pysindy' Python package and the 'sklearn.tree.DecisionTreeClassifier' function, which are third-party tools. There is no explicit statement or link provided by the authors for the release of their own source code for the methodology described in this paper. |
| Open Datasets | Yes | We generate synthetic datasets... We provide further details of this data generation process in D.1. ... We compute LOO stability results from a flow cytometry dataset in Sachs et al. (2005). ... Mouse embryonic stem cells were sequenced for their gene expression... (Veleslavov and Stumpf, 2020). |
| Dataset Splits | Yes | N = 100 trials (i.e., independent datasets) are independently generated according to the same data generation process. ... We perform a grid search across two parameters to find a combination that leads to a low validation MSE. The 5-fold validation MSE is measured as... We compute LOO stability results from a flow cytometry dataset. |
| Hardware Specification | No | In our experiments, we utilize a cluster computing system to distribute parallel jobs across CPU nodes. No specific CPU models, memory, or other detailed hardware specifications are provided. |
| Software Dependencies | No | We utilize the pysindy Python package (Kaptanoglu et al., 2022; de Silva et al., 2020) for their implementations of these methods... Our experiments use the default hyperparameters in the sklearn.tree.DecisionTreeClassifier function. No specific version numbers are provided for these software components. |
| Experiment Setup | Yes | We choose the hyperparameter combination that should give sparser models (larger λ and larger ω) in the case of tied validation MSE, which leads us to choose λ = 0.01 and ω = 0.18. Based on this figure, the best choice of λ is λ = 77, which we keep constant throughout our experiments in Section 5.2.1. In our generated example, the maximum number of clusters M = 29, and the slope tolerance ω = 5. Our experiments use the default hyperparameters in the sklearn.tree.DecisionTreeClassifier function. |
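
The paper's core recipe (Algorithm 1, "Bagged model selection") combines Breiman-style bagging with a selection rule over the bagged votes. As a rough illustration of that idea, not the authors' implementation, the sketch below fits LASSO on bootstrap resamples, tallies how often each support set is selected, and returns every support whose selection frequency is within `eps` of the maximum. The `eps` threshold is a simplified stand-in for the paper's inflated argmax, and all function and parameter names here are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso

def bagged_lasso_selection(X, y, n_boot=200, alpha=0.1, eps=0.05, seed=None):
    """Sketch of bagged model selection: fit LASSO on bootstrap
    resamples, count how often each support set is chosen, and keep
    every support within `eps` of the top selection frequency
    (a crude proxy for the inflated argmax)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    counts = {}
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)              # bootstrap resample
        model = Lasso(alpha=alpha).fit(X[idx], y[idx])
        support = tuple(np.flatnonzero(np.abs(model.coef_) > 1e-8))
        counts[support] = counts.get(support, 0) + 1
    freqs = {s: c / n_boot for s, c in counts.items()}
    best = max(freqs.values())
    return {s for s, f in freqs.items() if f >= best - eps}
```

On strongly correlated covariates, as in the paper's simulation (a), a single LASSO fit flips between near-equivalent supports across datasets; the bagged vote counts make that instability visible, and returning a *set* of near-top supports is what stabilizes the selection.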
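
The experiment setup row describes a grid search that breaks ties in validation MSE in favor of sparser models (larger λ and larger ω). A minimal sketch of that tiebreak rule, assuming a lexicographic preference for larger (λ, ω) among (near-)tied pairs and a hypothetical `tol` for what counts as a tie:

```python
def pick_sparser_on_ties(results, tol=1e-9):
    """Given {(lam, omega): validation_mse}, return the pair with the
    lowest MSE, breaking near-ties (within `tol`) in favor of larger
    lam and omega, i.e. the combination expected to give sparser
    models, as described in the experiment setup."""
    best_mse = min(results.values())
    tied = [k for k, v in results.items() if v <= best_mse + tol]
    return max(tied)  # lexicographic max: largest lam, then largest omega
```

For example, if (λ, ω) = (0.001, 0.10) and (0.01, 0.18) achieve the same 5-fold validation MSE, this rule selects (0.01, 0.18), matching the paper's reported choice of λ = 0.01 and ω = 0.18.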