Revisiting minimum description length complexity in overparameterized models

Authors: Raaz Dwivedi, Chandan Singh, Bin Yu, Martin Wainwright

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Via an array of simulations and real-data experiments, we show that a data-driven Prac-MDL-COMP informs hyper-parameter tuning for optimizing test MSE with ridge regression in limited data settings, sometimes improving upon cross-validation and (always) saving computational costs.
Researcher Affiliation | Collaboration | Raaz Dwivedi (EMAIL), Department of Operations Research & Information Engineering, Cornell Tech, Cornell University, New York City, NY; Chandan Singh (EMAIL), Microsoft Research, Seattle, WA; Bin Yu (EMAIL), Department of Statistics, and Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA; Martin Wainwright (EMAIL), Department of Electrical Engineering and Computer Sciences, and Mathematics, Massachusetts Institute of Technology, Cambridge, MA
Pseudocode | No | The paper describes algorithms and methods in text and mathematical formulas but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured steps.
Open Source Code | Yes | Code and documentation for easily reproducing the results are provided at github.com/csinva/mdl-complexity.
Open Datasets | Yes | Databases are taken from PMLB (Olson et al., 2017; Vanschoren et al., 2013), a repository of diverse tabular databases for benchmarking machine-learning algorithms. ... functional magnetic-resonance imaging (fMRI), as they are shown natural movies (Nishimoto et al., 2011).
Dataset Splits | Yes | The test set consists of 25% of the entire dataset. The training data consists of 7,200 time points and the test data consists of 540 time points, where at each time point a subject is watching a video clip. Moreover, this criterion can provide computational savings, especially while training overparameterized models, in contrast to vanilla K-fold cross-validation (since computation is only required for a single fold).
Hardware Specification | No | The paper does not specify any particular GPU models, CPU models, or other detailed hardware specifications used for running the experiments. It only implies the use of computing resources for running simulations and experiments.
Software Dependencies | No | Linear models (ridge) and kernel methods are fit using scikit-learn (Pedregosa et al., 2011), and optimization for hyper-parameter tuning (see (34)) is performed using SciPy (Virtanen et al., 2020). For the neural tangent kernel computation, we use the neural-tangents library (Novak et al., 2020) with its default parameters.
Experiment Setup | Yes | We tune the parameter λ over 20 values equally spaced on a log scale from 10^-3 to 10^6. We vary the number of covariates (d) used for fitting the model and report the results for d/n ∈ {1/10, 1/2, 1, 2, 10} (noting that we have a misspecified model when fitting with d < 50 features). For a given dataset, we fix d to be the number of features, and we vary n downwards from its maximum value (by subsampling the dataset) to construct instances with different values of the ratio d/n. The hyperparameter λ takes on 10 values equally spaced on a log scale between 10^-3 and 10^3. In all fMRI experiments, λ takes on 40 values equally spaced on a log scale between 10^0 and 10^6. For the neural tangent kernel computation, we use the neural-tangents library (Novak et al., 2020) with its default parameters (ReLU nonlinearity, two hidden linear layers with hidden size 512).
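The quoted setup (a 75/25 split, a log-spaced λ grid, and ridge fits via scikit-learn) can be sketched as follows. This is an illustrative reconstruction on synthetic data, not the authors' code: the data, grid size, and selection-by-test-MSE shortcut are assumptions for demonstration, whereas the paper selects λ via Prac-MDL-COMP or cross-validation.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the paper's experiments use PMLB and fMRI datasets.
rng = np.random.default_rng(0)
n, d = 100, 200  # overparameterized regime, d/n = 2
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + rng.normal(size=n)

# Hold out 25% of the dataset as the test set, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 20 values of lambda equally spaced on a log scale from 1e-3 to 1e6.
lambdas = np.logspace(-3, 6, num=20)

# Fit ridge regression at each lambda and record test MSE.
mses = [
    mean_squared_error(
        y_test, Ridge(alpha=lam).fit(X_train, y_train).predict(X_test)
    )
    for lam in lambdas
]
best_lam = lambdas[int(np.argmin(mses))]
print(f"best lambda on this grid: {best_lam:.3g}")
```

The other grids quoted above follow the same pattern, e.g. `np.logspace(-3, 3, num=10)` for the 10-value grid and `np.logspace(0, 6, num=40)` for the fMRI experiments.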