Stochastic Approximation with Decision-Dependent Distributions: Asymptotic Normality and Optimality
Authors: Joshua Cutler, Mateo Díaz, Dmitriy Drusvyatskiy
JMLR 2024 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We analyze a stochastic approximation algorithm for decision-dependent problems, wherein the data distribution used by the algorithm evolves along the iterate sequence. The primary examples of such problems appear in performative prediction and its multiplayer extensions. We show that under mild assumptions, the deviation between the average iterate of the algorithm and the solution is asymptotically normal, with a covariance that clearly decouples the effects of the gradient noise and the distributional shift. Moreover, building on the work of Hájek and Le Cam, we show that the asymptotic performance of the algorithm with averaging is locally minimax optimal. Keywords: stochastic approximation, decision-dependent distributions, performative prediction, asymptotic normality, local asymptotic minimax optimality |
| Researcher Affiliation | Academia | Joshua Cutler EMAIL Department of Mathematics University of Washington Seattle, WA 98195, USA Mateo Díaz EMAIL Department of Applied Mathematics and Statistics Johns Hopkins University Baltimore, MD 21218, USA Dmitriy Drusvyatskiy EMAIL Department of Mathematics University of Washington Seattle, WA 98195, USA |
| Pseudocode | Yes | Algorithm 1 Stochastic Forward-Backward Method (SFB) Input: initial x0 X and step size sequence (ηt)t ≥ 0 ∈ (0, ∞) Step t ≥ 0: Sample zt ∼ D(xt) Set xt+1 = proj X xt − ηt G(xt, zt) |
| Open Source Code | Yes | Visit https://github.com/mateodd25/Asymptotic-normality-in-performative-prediction for code. |
| Open Datasets | No | Consider the problem corresponding to G(x, z) = xℓ(x, z) with ℓ(x, z) = 1/2 ||x - z||^2 and D(x1, x2) = N(ρ(x2, x1), I2). A simple computation shows Σ = I2 and W = [1, ρ; ρ, 1]. |
| Dataset Splits | No | The paper uses a synthetic problem for illustration purposes and does not describe the use of any real-world datasets with specified training/test/validation splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the algorithm illustrations. |
| Software Dependencies | No | The paper does not explicitly list any software dependencies with version numbers. |
| Experiment Setup | Yes | We run algorithm SFB 400 times using ηt = t −3/4 for 106 iterations. The first row depicts the resulting average iterates laid over the confidence regions (plotted in logarithmic scale) corresponding to the asymptotic normal distribution. |