Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees

Authors: Gautam Chandrasekaran, Adam Klivans, Lin Lin Lee, Konstantinos Stavropoulos

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We give the first provably efficient algorithms for learning neural networks with respect to distribution shift. We work in the Testable Learning with Distribution Shift framework (TDS learning) of Klivans et al. (2024a), where the learner receives labeled examples from a training distribution and unlabeled examples from a test distribution and must either output a hypothesis with low test error or reject if distribution shift is detected. No assumptions are made on the test distribution. All prior work in TDS learning focuses on classification, while here we must handle the setting of nonconvex regression. Our results apply to real-valued networks with arbitrary Lipschitz activations and work whenever the training distribution has strictly sub-exponential tails. For training distributions that are bounded and hypercontractive, we give a fully polynomial-time algorithm for TDS learning one-hidden-layer networks with sigmoid activations. We achieve this by importing classical kernel methods into the TDS framework, using data-dependent feature maps and a type of kernel matrix that couples samples from both the train and test distributions.
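The TDS interface the abstract describes (fit on labeled training data, then either output a hypothesis with low test error or reject when shift is detected) can be sketched in a few lines. This is only an illustrative stand-in, not the paper's algorithm: the least-squares fit, the moment-based shift check, and the `tolerance` parameter are all assumptions made for the sketch.

```python
import numpy as np

def tds_learn(train_X, train_y, test_X, tolerance=0.25):
    """Illustrative TDS-style learner: fit a hypothesis on the labeled
    training sample, test the unlabeled marginals for distribution
    shift, then either return the hypothesis or reject (return None)."""
    # Fit a simple least-squares hypothesis on the training data.
    w, *_ = np.linalg.lstsq(train_X, train_y, rcond=None)
    # Crude shift check on the first two moments of the two marginals
    # (a stand-in for the paper's certifiable tests).
    mean_gap = np.linalg.norm(train_X.mean(axis=0) - test_X.mean(axis=0))
    cov_gap = np.linalg.norm(np.cov(train_X.T) - np.cov(test_X.T))
    if mean_gap > tolerance or cov_gap > tolerance:
        return None  # reject: distribution shift detected
    return lambda x: x @ w  # accept: output the hypothesis
```

The key feature of the framework survives even in this toy version: the shift test uses only unlabeled test examples, and no assumption is placed on the test distribution.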
Researcher Affiliation | Academia | Gautam Chandrasekaran, Adam R. Klivans, Lin Lin Lee, Konstantinos Stavropoulos; The University of Texas at Austin
Pseudocode | Yes | Algorithm 1: TDS Regression via the Kernel Method. Input: parameters M, R, B, A, C, ℓ ≥ 1, ϵ, δ ∈ (0, 1), and sample access to D and D_X. Algorithm 2: TDS Regression via Uniform Approximation. Input: parameters ϵ > 0, δ ∈ (0, 1), R ≥ 1, M ≥ 1, and sample access to D and D_X.
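As a rough illustration of the "kernel matrix that couples samples from both train and test distributions" mentioned in the abstract, the sketch below builds a single RBF Gram matrix over the pooled samples and reads an MMD-style shift statistic off its blocks. The RBF kernel, the `gamma` parameter, and the MMD² statistic are assumptions for this sketch; the paper's actual algorithms use data-dependent feature maps and different tests.

```python
import numpy as np

def rbf_gram(A, B, gamma=0.5):
    """RBF kernel matrix with K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def coupled_shift_statistic(train_X, test_X, gamma=0.5):
    """Build one Gram matrix over pooled train/test samples and compute
    an MMD^2-style statistic from its blocks (an illustrative stand-in
    for the paper's coupled kernel-matrix test)."""
    n = len(train_X)
    Z = np.vstack([train_X, test_X])
    K = rbf_gram(Z, Z, gamma)  # one matrix coupling both samples
    Ktt, Kss, Kts = K[:n, :n], K[n:, n:], K[:n, n:]
    # Small when the two marginals agree; large under shift.
    return Ktt.mean() + Kss.mean() - 2.0 * Kts.mean()
```

A large value of the statistic would trigger the "reject" branch of a TDS-style learner, while a small value would let it certify the hypothesis on the test marginal.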
Open Source Code | No | The paper contains no explicit statement about releasing source code for the described methodology, nor a link to a code repository.
Open Datasets | No | The paper discusses theoretical distributions ('training distribution', 'test distribution', bounded distributions, distributions with strictly sub-exponential tails) but does not mention or provide access information for any publicly available dataset used for empirical evaluation.
Dataset Splits | No | The paper is theoretical and presents no empirical experiments, so it provides no information on training/validation/test splits.
Hardware Specification | No | The paper focuses on algorithm design, proofs, and theoretical guarantees; it describes no empirical experiments and specifies no hardware.
Software Dependencies | No | The paper describes no empirical experiments, and no software dependencies or version numbers are mentioned.
Experiment Setup | No | The paper presents algorithms and proofs without empirical evaluation, so no experimental setup, hyperparameters, or system-level training settings are detailed.