Bagging in overparameterized learning: Risk characterization and risk monotonization

Authors: Pratik Patil, Jin-Hong Du, Arun Kumar Kuchibhotla

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type: Theoretical. LLM Response: In this paper, we study the prediction risk of variants of bagged predictors under the proportional asymptotics regime, in which the ratio of the number of features to the number of observations converges to a constant. Specifically, we propose a general strategy to analyze the prediction risk under squared error loss of bagged predictors using classical results on simple random sampling. ... From a technical perspective, during the course of our risk analysis of the bagged ridge and ridgeless predictors, we derive novel deterministic equivalents for ridge resolvents with random Tikhonov-type regularization. ... Our discussion will primarily revolve around the case of ridgeless predictors for the sake of illustration. While it is possible to obtain similar results for ridge predictors, the resulting expressions would be more involved.
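The subagged ridgeless predictor that the quoted abstract analyzes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, the base predictor is min-norm least squares via the pseudoinverse, and the bags are simple random subsamples of a fixed size k averaged with plain (AVG) centering.

```python
import numpy as np

def ridgeless_fit(X, y):
    """Min-norm least squares (ridgeless) coefficients via the pseudoinverse."""
    return np.linalg.pinv(X) @ y

def subagged_predict(X, y, X_test, k, M, seed=None):
    """Average the ridgeless predictor over M simple random subsamples of size k."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    preds = np.zeros(X_test.shape[0])
    for _ in range(M):
        idx = rng.choice(n, size=k, replace=False)  # simple random sampling without replacement
        beta = ridgeless_fit(X[idx], y[idx])
        preds += X_test @ beta
    return preds / M  # AVG centering: plain average over bags
```

Because the pseudoinverse returns the minimum-norm solution, the same sketch covers both the underparameterized (k > p) and overparameterized (k < p) regimes per bag.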
Researcher Affiliation: Academia. LLM Response: Pratik Patil, EMAIL, Department of Statistics, University of California, Berkeley, Berkeley, CA 94720, USA. Jin-Hong Du, EMAIL, Department of Statistics and Data Science & Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA. Arun Kumar Kuchibhotla, EMAIL, Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
Pseudocode: Yes. LLM Response: Algorithm 1: Cross-validation for subagging or splagging.
Open Source Code: Yes. LLM Response: The source code generating all the experimental illustrations in this paper can be accessed at https://jaydu1.github.io/overparameterized-ensembling/bagging/.
Open Datasets: No. LLM Response: Under model (M-AR1-LI) when ρ_ar1 = 0.25 and σ^2 = 1. ... Under model (M-ISO-LI) when ρ^2 = 1 and σ^2 = 1. ... The paper uses data generated from specific theoretical models (M-AR1-LI, M-ISO-LI) rather than referencing external publicly available datasets. The description outlines the mathematical properties of the simulated data.
Dataset Splits: Yes. LLM Response: For each value of M, the points denote finite-sample risks averaged over 100 dataset repetitions, with n = 1000, n_te = 63, and p = nϕ. ... Algorithm 1: Cross-validation for subagging or splagging. ... 1: Data splitting: Randomly split D_n into training set D_tr and test set D_te as: D_tr = {(x_i, y_i) : i ∈ S_tr} and D_te = {(x_j, y_j) : j ∈ S_te}, where S_te ⊆ [n] with |S_te| = n_te, and S_tr = [n] \ S_te.
Hardware Specification: No. LLM Response: The paper mentions 'finite-sample risks averaged over 100 dataset repetitions' and various numerical illustrations, but it does not specify any particular hardware (e.g., GPU or CPU models, or cloud computing resources) used to run these computations.
Software Dependencies: No. LLM Response: The paper provides a link to source code for the experimental illustrations, but it does not explicitly list the software dependencies or their version numbers (e.g., Python, NumPy, PyTorch) that would be required to reproduce the results.
Experiment Setup: Yes. LLM Response: In demonstrating the proposed procedure for bagged ridge and ridgeless predictors, we thoroughly investigate the oracle properties of the optimal subsample size and provide an in-depth comparison between different bagging variants. ... The left and the right panels correspond to the cases when SNR = 0.33 (ρ_ar1 = 0.25) and SNR = 0.6 (ρ_ar1 = 0.5), respectively. ... Algorithm 1: Cross-validation for subagging or splagging. Input: a dataset D_n = {(x_i, y_i) ∈ R^p × R : 1 ≤ i ≤ n}; a positive integer n_te < n (number of test samples); a base prediction procedure f̂; a real number ν ∈ (0, 1) (bag-size unit parameter); a natural number M (number of bags); a centering procedure CEN ∈ {AVG, MOM}; and a real number η when CEN = MOM.
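The cross-validation loop described in the quoted Algorithm 1 inputs can be sketched as follows. This is a hedged sketch under stated assumptions, not the paper's algorithm: the grid of candidate subsample sizes, the squared-error test metric, and the function names are illustrative; only the AVG centering branch is shown (the MOM branch with parameter η and the exact bag-size parameterization via ν are omitted).

```python
import numpy as np

def cv_subagging(X, y, n_te, k_grid, M, base_fit, seed=None):
    """Pick the subsample size k minimizing test risk of the subagged predictor.

    base_fit(X, y) returns a coefficient vector; predictions are X @ beta.
    Uses AVG centering (plain averaging over bags)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    perm = rng.permutation(n)
    te, tr = perm[:n_te], perm[n_te:]            # data splitting: D_te and D_tr
    X_tr, y_tr, X_te, y_te = X[tr], y[tr], X[te], y[te]
    risks = {}
    for k in k_grid:
        preds = np.zeros(n_te)
        for _ in range(M):                        # M bags via simple random sampling
            idx = rng.choice(len(tr), size=k, replace=False)
            beta = base_fit(X_tr[idx], y_tr[idx])
            preds += X_te @ beta
        preds /= M                                # AVG centering
        risks[k] = np.mean((y_te - preds) ** 2)   # squared-error test risk
    k_hat = min(risks, key=risks.get)             # subsample size with smallest risk
    return k_hat, risks
```

A usage example under the same assumptions, with a ridgeless base predictor: `cv_subagging(X, y, n_te=63, k_grid=[100, 250, 500], M=10, base_fit=lambda A, b: np.linalg.pinv(A) @ b)`.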