Enhanced Feature Learning via Regularisation: Integrating Neural Networks and Kernel Methods

Authors: Bertille FOLLAIN, Francis BACH

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments confirm our optimisation intuitions; BKerNN outperforms kernel ridge regression and compares favourably to a one-hidden-layer neural network with ReLU activations in various settings and on real data sets.
Researcher Affiliation | Academia | Bertille Follain (EMAIL), Inria, Département d'Informatique de l'École Normale Supérieure, PSL Research University, 48 rue Barrault, 75013 Paris, France; Francis Bach (EMAIL), Inria, Département d'Informatique de l'École Normale Supérieure, PSL Research University, 48 rue Barrault, 75013 Paris, France
Pseudocode | Yes | Section 3.1.3 (Algorithm Pseudocode): We now have all the components necessary to provide the pseudocode (Algorithm 1) of the proposed method BKerNN, specifically for the square loss.
Open Source Code | Yes | The source code, along with all necessary scripts to reproduce the experiments, is available at https://github.com/BertilleFollain/BKerNN.
Open Datasets | Yes | In Experiment 6, we evaluate the R² scores, defined in Equation (15), of four methods: BKRR, BKerNN with concave variable regularisation, BKerNN with concave feature regularisation, and ReLUNN, across 17 real-world data sets. These data sets were obtained from the tabular benchmark numerical regression suite via the OpenML platform, as described by Grinsztajn et al. (2022).
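The R² score referenced above (Equation (15) in the paper) is presumably the standard coefficient of determination; a minimal sketch under that assumption, equivalent to scikit-learn's `r2_score`:

```python
import numpy as np

def r2_score(y_true, y_pred):
    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    # Equals 1 for perfect predictions, 0 for predicting the mean of y_true.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot
```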
Dataset Splits | Yes | The training set consisted of 214 samples and the test set of 1024. We used 412 training samples and 1024 test samples, with data dimensionality d = 20 and k = 5 relevant features. Each data set was processed to include only numerical variables and rescaled to have centred covariates with standard deviation equal to one. The data sets were uniformly cropped to contain 400 training samples and 100 testing samples (except for a few data sets; see Appendix B.5).
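The preprocessing described above (centred covariates with unit standard deviation, then a fixed-size train/test crop) can be sketched as follows; the function name and the random permutation for the split are illustrative assumptions, not the paper's script:

```python
import numpy as np

def preprocess(X, y, n_train=400, n_test=100, rng=None):
    # Rescale each covariate to zero mean and unit standard deviation,
    # then crop to a fixed train/test split (sizes from the paper's Experiment 6).
    rng = np.random.default_rng(rng)
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    idx = rng.permutation(len(X))
    train, test = idx[:n_train], idx[n_train:n_train + n_test]
    return X[train], y[train], X[test], y[test]
```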
Hardware Specification | No | The paper does not describe the specific hardware (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies | No | The BKerNN implementation in Python is fully compatible with Scikit-learn (Pedregosa et al., 2011), ensuring seamless integration with existing machine learning workflows.
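Scikit-learn compatibility means the estimator exposes the usual fit/predict interface, so it can be dropped into pipelines and cross-validation. The stand-in below is a plain closed-form ridge regressor, not BKerNN; it is shown only to illustrate that convention, and the class name and solver are my own:

```python
import numpy as np

class RidgeRegressor:
    """Minimal estimator following scikit-learn's fit/predict convention
    (illustrative stand-in; BKerNN exposes the same interface)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # ridge regularisation strength

    def fit(self, X, y):
        X, y = np.asarray(X, float), np.asarray(y, float)
        d = X.shape[1]
        # Closed-form ridge solution: (X^T X + alpha I)^{-1} X^T y.
        self.coef_ = np.linalg.solve(X.T @ X + self.alpha * np.eye(d), X.T @ y)
        return self  # returning self enables chaining, as scikit-learn expects

    def predict(self, X):
        return np.asarray(X, float) @ self.coef_
```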
Experiment Setup | Yes | Training was set to 20 iterations and the stepsize parameter γ was set to 500, with backtracking enabled. Regularisation parameter candidates were λ ∈ {0.05, 0.1, 0.5, 1, 1.5} × 2 max_{i ∈ [n]} ‖x_i‖² / n. Once the regularisation parameters had been selected, we trained from scratch for 200 iterations, with the other parameters kept as before.
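The candidate grid λ ∈ {0.05, 0.1, 0.5, 1, 1.5} × 2 max_{i ∈ [n]} ‖x_i‖² / n, reconstructed from the setup above, can be computed as follows (the helper name is illustrative):

```python
import numpy as np

def lambda_candidates(X, factors=(0.05, 0.1, 0.5, 1, 1.5)):
    # Each candidate is factor * 2 * max_i ||x_i||^2 / n, where the max runs
    # over the n training samples (rows of X).
    n = X.shape[0]
    scale = 2.0 * np.max(np.sum(X ** 2, axis=1)) / n
    return [f * scale for f in factors]
```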