No Free Lunch from Random Feature Ensembles: Scaling Laws and Near-Optimality Conditions
Authors: Benjamin Samuel Ruben, William Lingxiao Tong, Hamza Tahir Chaudhry, Cengiz Pehlevan
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate monotonicity with P and N in Fig. 1, where we plot E^g_K as a function of both sample size P and the network size N in ensembles of ReLU random feature models applied to a binarized CIFAR-10 image classification task (see Appendix E.2). Numerically, we verify that error monotonicity with P and N holds at the level of a 0-1 loss on the predicted classes of held-out test examples for both score-averaging and majority-vote ensembling over the predictors (see Fig. S3). |
| Researcher Affiliation | Academia | 1Biophysics PhD Program, Harvard University, Cambridge, MA 02138, USA 2John A. Paulson School of Engineering and Applied Science, Harvard University, Cambridge, MA 02138, USA 3Center for Brain Science, Harvard University, Cambridge, MA 02138, USA 4Kempner Institute, Harvard University, Cambridge, MA 02138, USA. |
| Pseudocode | No | The paper describes methods and derivations using mathematical equations and prose (e.g., in Section 2, 'Preliminaries' and Appendix A), but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All code used to generate the figures presented in this work is publicly available at https://github.com/benruben87/RandomFeatureEnsembles.git. |
| Open Datasets | Yes | We demonstrate monotonicity with P and N in Fig. 1, where we plot E^g_K as a function of both sample size P and the network size N in ensembles of ReLU random feature models applied to a binarized CIFAR-10 image classification task (see Appendix E.2). |
| Dataset Splits | No | The paper mentions using "training sets" of MNIST and CIFAR-10, and also refers to "held-out test examples". However, it does not provide specific details on the exact percentages, sample counts, or methodology used for creating training, validation, and test splits needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide specific details such as GPU models, CPU types, or other hardware specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'neural tangents library (Novak et al., 2019)' and the 'scipy library (Virtanen et al., 2020)'. However, it does not provide specific version numbers for these software dependencies, which are required for a reproducible description. |
| Experiment Setup | Yes | We fix N = 256 and vary both P and K. Color corresponds to the regularization λ. Markers show numerical experiments and dotted lines show theoretical predictions. Error is monotonically decreasing with P provided that the regularization λ is tuned to its optimal value. |
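The experimental setup quoted above (ensembles of K ReLU random-feature ridge regressors with regularization λ, combined by score-averaging or majority vote on a binary classification task) can be sketched as follows. This is an illustrative reconstruction, not the authors' released code: the function names, the ±1 label convention, and all hyperparameter values below are assumptions.

```python
import numpy as np

def random_feature_scores(X_train, y_train, X_test, n_features, lam, rng):
    """Fit one ReLU random-feature ridge regressor; return test-set scores."""
    d = X_train.shape[1]
    # Random projection with variance 1/d, followed by a ReLU nonlinearity.
    W = rng.standard_normal((d, n_features)) / np.sqrt(d)
    relu = lambda Z: np.maximum(Z @ W, 0.0)
    Phi = relu(X_train)  # (P, N) feature matrix
    # Ridge readout: w = (Phi^T Phi + lam * I)^{-1} Phi^T y
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_features), Phi.T @ y_train)
    return relu(X_test) @ w

def ensemble_predict(X_train, y_train, X_test, K, n_features, lam, seed=0):
    """Ensemble K independent predictors; return both combining rules."""
    rng = np.random.default_rng(seed)
    scores = np.stack([
        random_feature_scores(X_train, y_train, X_test, n_features, lam, rng)
        for _ in range(K)
    ])
    score_avg = np.sign(scores.mean(axis=0))           # score-averaging
    majority = np.sign(np.sign(scores).sum(axis=0))    # majority vote
    return score_avg, majority
```

With odd K the majority vote is always decisive; test error under the 0-1 loss is then `np.mean(predictions != y_test)` for either combining rule.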