Optimal Rates of Kernel Ridge Regression under Source Condition in Large Dimensions
Authors: Haobo Zhang, Yicheng Li, Weihao Lu, Qian Lin
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Motivated by studies of neural networks, particularly the neural tangent kernel theory, we investigate the large-dimensional behavior of kernel ridge regression (KRR)... We first establish the exact order (both upper and lower bounds) of the generalization error of KRR for the optimally chosen regularization parameter λ. Furthermore, we show that KRR is minimax optimal when 0 < s ≤ 1, whereas for s > 1, KRR fails to achieve minimax optimality, exhibiting the saturation effect. Our results illustrate that the convergence rate w.r.t. dimension d varying along γ exhibits a periodic plateau behavior, and the convergence rate w.r.t. sample size n exhibits a multiple descent behavior. (A minimal KRR sketch illustrating this setup follows the table.) |
| Researcher Affiliation | Academia | Haobo Zhang (EMAIL), Yicheng Li (EMAIL), Weihao Lu (EMAIL), Qian Lin (EMAIL); Department of Statistics and Data Science, Tsinghua University |
| Pseudocode | No | The paper primarily presents mathematical derivations, theorems, and proofs. There are no explicitly labeled sections or figures for "Pseudocode" or "Algorithm", nor are there any structured code-like blocks describing a procedure. |
| Open Source Code | No | The paper does not contain any explicit statements regarding the release of source code for the methodology described, nor does it provide links to any code repositories in the main text. |
| Open Datasets | No | This is a theoretical paper focusing on mathematical analysis of Kernel Ridge Regression. It does not conduct experiments on specific datasets and therefore does not provide access information for any publicly available or open dataset. The paper discusses theoretical settings like the "unit sphere S^d" and "square-integrable function space", but these are mathematical constructs, not empirical datasets. |
| Dataset Splits | No | The paper is theoretical and does not perform experiments that would require dataset splits. Therefore, no information on training/test/validation splits is provided. |
| Hardware Specification | No | The paper focuses on theoretical analysis and does not describe any experimental implementation. Consequently, there is no mention of specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not describe any experimental implementation. As such, it does not list specific software dependencies with version numbers. |
| Experiment Setup | No | This paper is purely theoretical, presenting mathematical proofs and analysis. It does not include an experimental setup, hyperparameters, or training configurations for any practical implementation. |
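
The paper itself ships no pseudocode, source code, or experiments (see the rows above), but the object it studies — kernel ridge regression with an inner-product kernel on the sphere and a tuned regularization parameter λ — is straightforward to sketch. The Python snippet below is a minimal, hypothetical illustration rather than the authors' implementation: the kernel exp(⟨x, z⟩), the target function sin(3·x₁), the noise level, and the choice d = 20, γ = 1.5 (so that n ≈ d^γ) are assumptions made for the example, not values taken from the paper.

```python
import numpy as np

def sample_sphere(n, d, rng):
    """Draw n points uniformly from the unit sphere in R^d."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def inner_product_kernel(X, Z):
    """A simple inner-product kernel k(x, z) = exp(<x, z>), used here only
    as a stand-in for the dot-product kernels analyzed in the paper."""
    return np.exp(X @ Z.T)

def krr_predict(X, y, X_test, lam):
    """Kernel ridge regression: f_hat(x) = k(x, X) (K + n*lam*I)^{-1} y."""
    n = X.shape[0]
    K = inner_product_kernel(X, X)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return inner_product_kernel(X_test, X) @ alpha

rng = np.random.default_rng(0)
d, gamma = 20, 1.5                  # hypothetical dimension and scaling exponent
n = int(d ** gamma)                 # sample size n ~ d^gamma, the paper's regime
X = sample_sphere(n, d, rng)
X_test = sample_sphere(2000, d, rng)
f_star = lambda Z: np.sin(3 * Z[:, 0])          # hypothetical target function
y = f_star(X) + 0.1 * rng.standard_normal(n)    # noisy labels

# Sweep the regularization parameter and report the test error for each lambda,
# mimicking the "optimally chosen lambda" that the paper analyzes.
for lam in [1e-4, 1e-3, 1e-2, 1e-1]:
    mse = np.mean((krr_predict(X, y, X_test, lam) - f_star(X_test)) ** 2)
    print(f"lambda = {lam:.0e}  test MSE = {mse:.4f}")
```

The paper's results concern how the optimally tuned λ and the resulting generalization error scale as n and d grow together along n ≍ d^γ; the sweep above only imitates that tuning step at a single (n, d), since the paper reports no empirical results.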