Benign Overfitting in Out-of-Distribution Generalization of Linear Models
Authors: Shange Tang, Jiayun Wu, Jianqing Fan, Chi Jin
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We simulate the covariate shift discussed in section 3, where the overall magnitude of the target covariance matrix's minor directions is comparable to that of the source. ... The results are shown in Figures 1a and 1b. The fast rate O(1/n) of minimum norm interpolation is confirmed, as the log-log plot of excess risk versus n has a slope near -1 across all combinations of T and tr[U]/tr[V]. ... Figure 1c presents the results. As expected, PCR nearly achieves the fast rate of O(1/n), with the log-log slope of excess risk versus n being -0.99. |
| Researcher Affiliation | Academia | Department of Operations Research and Financial Engineering, Princeton University; Department of Computer Science and Technology, Tsinghua University; Department of Electrical and Computer Engineering, Princeton University |
| Pseudocode | No | The paper describes methods like Ridge Regression and Principal Component Regression using mathematical formulations and prose, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | No | In the experiment, data is generated according to these conditions. $y = x^\top \beta + \epsilon$, where $\beta \in \mathbb{R}^{k+n^2}$, with $\beta_{k} = (1/k)^\top$, $\beta_{-k} = 0$ and $k = 10$. The noise $\epsilon$ follows a centered Gaussian distribution with variance 0.1, and $x$ is drawn from a multivariate normal distribution with zero mean and a source covariance matrix $\Sigma_S = \mathrm{diag}(I_k, n^{-1.5} I_{n^2})$. |
| Dataset Splits | Yes | For each pair of T and tr[U]/tr[V], we generate training samples of various sizes n and 1000 test samples. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific details about ancillary software components, including library or solver names with version numbers, used to replicate the experiment. |
| Experiment Setup | Yes | $y = x^\top \beta + \epsilon$, where $\beta \in \mathbb{R}^{k+n^2}$, with $\beta_{k} = (1/k)^\top$, $\beta_{-k} = 0$ and $k = 10$. The noise $\epsilon$ follows a centered Gaussian distribution with variance 0.1, and $x$ is drawn from a multivariate normal distribution with zero mean and a source covariance matrix $\Sigma_S = \mathrm{diag}(I_k, n^{-1.5} I_{n^2})$. ... We run minimum norm interpolation (ridgeless regression) with $\lambda = 0$. |
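
Since the paper releases no code, the setup quoted in the Experiment Setup row can be approximated with a minimal sketch like the one below. It is an illustration under stated assumptions, not the authors' implementation. The helper names (`make_beta`, `sample`, `min_norm_interpolator`, `excess_risk`) are hypothetical, and the first k coefficients are read as each equal to 1/k, which is an interpretation of the quoted text. The target covariance used for testing is a placeholder, because the quoted excerpt does not specify it.

```python
import numpy as np


def make_beta(n, k=10):
    """True coefficients: first k entries set to 1/k (assumed reading), rest zero."""
    beta = np.zeros(k + n**2)
    beta[:k] = 1.0 / k
    return beta


def sample(m, n, k, beta, minor_scale, rng, noise_var=0.1):
    """Draw m rows x ~ N(0, diag(I_k, minor_scale * I_{n^2})), y = x^T beta + eps."""
    std = np.sqrt(np.concatenate([np.ones(k), np.full(n**2, minor_scale)]))
    X = rng.standard_normal((m, k + n**2)) * std
    y = X @ beta + np.sqrt(noise_var) * rng.standard_normal(m)
    return X, y


def min_norm_interpolator(X, y):
    """Ridgeless (lambda = 0) fit: the minimum l2-norm beta satisfying X beta = y."""
    return np.linalg.pinv(X) @ y


def excess_risk(beta_hat, beta, X_test):
    """Empirical excess risk: mean squared gap between noiseless predictions."""
    return float(np.mean((X_test @ (beta_hat - beta)) ** 2))


rng = np.random.default_rng(0)
n, k = 30, 10
beta = make_beta(n, k)

# Source distribution as quoted: Sigma_S = diag(I_k, n^{-1.5} I_{n^2}).
X_train, y_train = sample(n, n, k, beta, n**-1.5, rng)

# Placeholder target (assumption): same block structure and scale as the source.
X_test, _ = sample(1000, n, k, beta, n**-1.5, rng)

beta_hat = min_norm_interpolator(X_train, y_train)
print(excess_risk(beta_hat, beta, X_test))
```

Repeating the last block over a grid of n and fitting a line to log excess risk against log n would give a slope comparable in spirit to the one reported in the Research Type row, though the exact value depends on the target covariance, which is only assumed here.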