Enhanced Feature Learning via Regularisation: Integrating Neural Networks and Kernel Methods

Authors: Bertille FOLLAIN, Francis BACH

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments confirm our optimisation intuitions; BKerNN outperforms kernel ridge regression and compares favourably to a one-hidden-layer neural network with ReLU activations in various settings and on real data sets.
Researcher Affiliation | Academia | Bertille Follain (EMAIL), Inria, Département d'Informatique de l'École Normale Supérieure, PSL Research University, 48 rue Barrault, 75013 Paris, France; Francis Bach (EMAIL), Inria, Département d'Informatique de l'École Normale Supérieure, PSL Research University, 48 rue Barrault, 75013 Paris, France
Pseudocode | Yes | Section 3.1.3 (Algorithm Pseudocode): We now have all the components necessary to provide the pseudocode (Algorithm 1) of the proposed method BKerNN, specifically for the square loss.
Open Source Code | Yes | The source code, along with all necessary scripts to reproduce the experiments, is available at https://github.com/BertilleFollain/BKerNN.
Open Datasets | Yes | In Experiment 6, we evaluate the R² scores, defined in Equation (15), of four methods: BKRR, BKerNN with concave variable regularisation, BKerNN with concave feature regularisation, and ReLUNN, across 17 real-world data sets. These data sets were obtained from the tabular benchmark numerical regression suite via the OpenML platform, as described by Grinsztajn et al. (2022).
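The R² score referenced above (Equation (15) in the paper) is presumably the standard coefficient of determination; a minimal sketch under that assumption, equivalent to scikit-learn's `r2_score`:

```python
import numpy as np

def r2_score(y_true, y_pred):
    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    # Equals 1 for perfect predictions, 0 for predicting the mean of y_true.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot
```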
Dataset Splits | Yes | The training set consisted of 214 samples and the test set of 1024. We used 412 training samples and 1024 test samples, with data dimensionality d = 20 and k = 5 relevant features. Each data set was processed to include only numerical variables and rescaled to have centred covariates with standard deviation equal to one. The data sets were uniformly cropped to contain 400 training samples and 100 testing samples (except for a few data sets; see Appendix B.5).
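The preprocessing described above (centred covariates with unit standard deviation, then a fixed-size train/test crop) can be sketched as follows; the function name and the random permutation for the split are illustrative assumptions, not the paper's script:

```python
import numpy as np

def preprocess(X, y, n_train=400, n_test=100, rng=None):
    # Rescale each covariate to zero mean and unit standard deviation,
    # then crop to a fixed train/test split (sizes from the paper's Experiment 6).
    rng = np.random.default_rng(rng)
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    idx = rng.permutation(len(X))
    train, test = idx[:n_train], idx[n_train:n_train + n_test]
    return X[train], y[train], X[test], y[test]
```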
Hardware Specification | No | The paper does not describe the specific hardware (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies | No | The BKerNN implementation in Python is fully compatible with Scikit-learn (Pedregosa et al., 2011), ensuring seamless integration with existing machine learning workflows.
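Scikit-learn compatibility means the estimator exposes the usual fit/predict interface, so it can be dropped into pipelines and cross-validation. The stand-in below is a plain closed-form ridge regressor, not BKerNN; it is shown only to illustrate that convention, and the class name and solver are my own:

```python
import numpy as np

class RidgeRegressor:
    """Minimal estimator following scikit-learn's fit/predict convention
    (illustrative stand-in; BKerNN exposes the same interface)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # ridge regularisation strength

    def fit(self, X, y):
        X, y = np.asarray(X, float), np.asarray(y, float)
        d = X.shape[1]
        # Closed-form ridge solution: (X^T X + alpha I)^{-1} X^T y.
        self.coef_ = np.linalg.solve(X.T @ X + self.alpha * np.eye(d), X.T @ y)
        return self  # returning self enables chaining, as scikit-learn expects

    def predict(self, X):
        return np.asarray(X, float) @ self.coef_
```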
Experiment Setup | Yes | Training was set to 20 iterations and the stepsize parameter γ was set to 500, with backtracking enabled. Regularisation parameter candidates were λ ∈ {0.05, 0.1, 0.5, 1, 1.5} × 2 max_{i ∈ [n]} ‖x_i‖² / n. Once the regularisation parameters had been selected, we trained from scratch for 200 iterations, with the other parameters kept as before.
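The candidate grid λ ∈ {0.05, 0.1, 0.5, 1, 1.5} × 2 max_{i ∈ [n]} ‖x_i‖² / n, reconstructed from the setup above, can be computed as follows (the helper name is illustrative):

```python
import numpy as np

def lambda_candidates(X, factors=(0.05, 0.1, 0.5, 1, 1.5)):
    # Each candidate is factor * 2 * max_i ||x_i||^2 / n, where the max runs
    # over the n training samples (rows of X).
    n = X.shape[0]
    scale = 2.0 * np.max(np.sum(X ** 2, axis=1)) / n
    return [f * scale for f in factors]
```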