Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees
Authors: Gautam Chandrasekaran, Adam Klivans, Lin Lin Lee, Konstantinos Stavropoulos
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We give the first provably efficient algorithms for learning neural networks with respect to distribution shift. We work in the Testable Learning with Distribution Shift framework (TDS learning) of Klivans et al. (2024a), where the learner receives labeled examples from a training distribution and unlabeled examples from a test distribution and must either output a hypothesis with low test error or reject if distribution shift is detected. No assumptions are made on the test distribution. All prior work in TDS learning focuses on classification, while here we must handle the setting of nonconvex regression. Our results apply to real-valued networks with arbitrary Lipschitz activations and work whenever the training distribution has strictly sub-exponential tails. For training distributions that are bounded and hypercontractive, we give a fully polynomial-time algorithm for TDS learning one-hidden-layer networks with sigmoid activations. We achieve this by importing classical kernel methods into the TDS framework using data-dependent feature maps and a type of kernel matrix that couples samples from both train and test distributions. |
| Researcher Affiliation | Academia | Gautam Chandrasekaran, Adam R. Klivans, Lin Lin Lee, Konstantinos Stavropoulos The University of Texas at Austin EMAIL EMAIL |
| Pseudocode | Yes | Algorithm 1: TDS Regression via the Kernel Method. Input: parameters M, R, B, A, C, ℓ ≥ 1, ϵ, δ ∈ (0, 1), and sample access to D, D′_x. Algorithm 2: TDS Regression via Uniform Approximation. Input: parameters ϵ > 0, δ ∈ (0, 1), R ≥ 1, M ≥ 1, and sample access to D, D′_x. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper discusses various types of theoretical distributions like 'training distribution', 'test distribution', 'bounded distributions', and 'strictly sub-exponential tails'. However, it does not mention or provide access information for any specific publicly available or open datasets used for empirical evaluation. |
| Dataset Splits | No | The paper is theoretical and does not present empirical experiments on specific datasets. Therefore, it does not provide any information regarding training/test/validation dataset splits. |
| Hardware Specification | No | The paper is theoretical and focuses on algorithm design, proofs, and theoretical guarantees. It does not describe any empirical experiments or specify the hardware used to run such experiments. |
| Software Dependencies | No | The paper is theoretical and focuses on algorithm design and theoretical analysis. It does not describe any empirical experiments, and thus, no specific software dependencies with version numbers are mentioned. |
| Experiment Setup | No | The paper is theoretical, presenting algorithms and proofs without empirical evaluation. Consequently, it does not detail any experimental setup, including hyperparameters or system-level training settings. |
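The Pseudocode row above describes a test-then-learn structure: the learner checks whether the test marginal deviates from the training marginal and either rejects (shift detected) or outputs a kernel-based hypothesis. The sketch below is a minimal, hypothetical illustration of that shape only, not the paper's Algorithm 1: it uses a simple RBF kernel, a biased MMD-style statistic as the shift test, and kernel ridge regression as the learner, with all names, thresholds, and parameter choices made up for the example.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between row-sample sets A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def tds_regression(X, y, X_test, threshold=0.1, ridge=1e-3):
    """Hypothetical TDS-style test-then-learn procedure.

    Returns ("reject", None) if a distribution shift is detected,
    otherwise ("accept", predictor) with a kernel ridge regressor fit
    on the labeled training sample.
    """
    Kxx = rbf_kernel(X, X)
    Kyy = rbf_kernel(X_test, X_test)
    Kxy = rbf_kernel(X, X_test)
    # Shift test: a (biased) squared-MMD estimate between the two marginals.
    mmd2 = Kxx.mean() + Kyy.mean() - 2.0 * Kxy.mean()
    if mmd2 > threshold:
        return "reject", None
    # Learn: kernel ridge regression on the labeled training data.
    alpha = np.linalg.solve(Kxx + ridge * np.eye(len(X)), y)
    return "accept", lambda Z: rbf_kernel(Z, X) @ alpha

# Toy data: labels from a single tanh "neuron"; one matching and one
# strongly shifted test marginal (entirely synthetic).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.tanh(X @ np.array([1.0, -1.0, 0.5]))
same = rng.normal(size=(200, 3))
shifted = rng.normal(loc=3.0, size=(200, 3))
print(tds_regression(X, y, same)[0])
print(tds_regression(X, y, shifted)[0])
```

The paper's actual test couples train and test samples inside one kernel matrix and comes with certified guarantees; this sketch only conveys the accept/reject control flow of the TDS framework.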