Stochastic Approximation with Decision-Dependent Distributions: Asymptotic Normality and Optimality

Authors: Joshua Cutler, Mateo Díaz, Dmitriy Drusvyatskiy

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We analyze a stochastic approximation algorithm for decision-dependent problems, wherein the data distribution used by the algorithm evolves along the iterate sequence. The primary examples of such problems appear in performative prediction and its multiplayer extensions. We show that under mild assumptions, the deviation between the average iterate of the algorithm and the solution is asymptotically normal, with a covariance that clearly decouples the effects of the gradient noise and the distributional shift. Moreover, building on the work of Hájek and Le Cam, we show that the asymptotic performance of the algorithm with averaging is locally minimax optimal. Keywords: stochastic approximation, decision-dependent distributions, performative prediction, asymptotic normality, local asymptotic minimax optimality
Researcher Affiliation | Academia | Joshua Cutler (EMAIL), Department of Mathematics, University of Washington, Seattle, WA 98195, USA; Mateo Díaz (EMAIL), Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD 21218, USA; Dmitriy Drusvyatskiy (EMAIL), Department of Mathematics, University of Washington, Seattle, WA 98195, USA
Pseudocode | Yes | Algorithm 1: Stochastic Forward-Backward Method (SFB). Input: initial x0 ∈ X and step size sequence (ηt)t≥0 ⊂ (0, ∞). Step t ≥ 0: sample zt ∼ D(xt); set xt+1 = projX(xt − ηt G(xt, zt)).
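The SFB recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the callables `sample` (draws z ∼ D(x)), `grad` (evaluates G(x, z)), `proj` (projection onto X), and `eta` (step size schedule) are placeholder names.

```python
import numpy as np

def sfb(sample, grad, proj, x0, steps, eta):
    """Stochastic Forward-Backward method (Algorithm 1), a minimal sketch.

    sample(x) draws z ~ D(x) from the decision-dependent distribution,
    grad(x, z) evaluates G(x, z), proj(x) projects onto the set X,
    and eta(t) returns the step size at iteration t.
    Returns the final iterate and the running average of iterates.
    """
    x = np.asarray(x0, dtype=float)
    avg = np.zeros_like(x)
    for t in range(steps):
        z = sample(x)                      # z_t ~ D(x_t)
        x = proj(x - eta(t) * grad(x, z))  # forward-backward step
        avg += (x - avg) / (t + 1)         # running average of iterates
    return x, avg
```

The running average is the quantity whose deviation from the solution the paper shows to be asymptotically normal.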
Open Source Code | Yes | Visit https://github.com/mateodd25/Asymptotic-normality-in-performative-prediction for code.
Open Datasets | No | Consider the problem corresponding to G(x, z) = ∇x ℓ(x, z) with ℓ(x, z) = (1/2)‖x − z‖² and D(x1, x2) = N(ρ(x2, x1), I2). A simple computation shows Σ = I2 and W = [1, ρ; ρ, 1].
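The synthetic example can be checked empirically. The sketch below (with ρ = 0.5 as an illustrative value, not taken from the paper) samples from the stated distribution map at the solution x* = 0 and verifies by Monte Carlo that the expected gradient vanishes there and that the gradient-noise covariance is Σ = I2.

```python
import numpy as np

rho = 0.5  # illustrative coupling strength (assumption; any |rho| < 1)

def sample_D(x, rng, n=1):
    # z ~ D(x1, x2) = N(rho * (x2, x1), I2): the mean swaps and scales
    # the decision variable, making the distribution decision-dependent.
    return rho * x[::-1] + rng.normal(size=(n, 2))

def G(x, z):
    # G(x, z) = grad_x l(x, z) with l(x, z) = 0.5 * ||x - z||^2.
    return x - z

rng = np.random.default_rng(0)
x_star = np.zeros(2)                    # equilibrium: E G(x*, z) = 0
z = sample_D(x_star, rng, n=200_000)
g = G(x_star, z)

print(g.mean(axis=0))                   # ≈ (0, 0): x* = 0 is stationary
print(np.cov(g.T))                      # ≈ I2, i.e. Σ = I2 as stated
```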
Dataset Splits | No | The paper uses a synthetic problem for illustration purposes and does not describe the use of any real-world datasets with specified training/test/validation splits.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the algorithm illustrations.
Software Dependencies | No | The paper does not explicitly list any software dependencies with version numbers.
Experiment Setup | Yes | We run algorithm SFB 400 times using ηt = t^(−3/4) for 10^6 iterations. The first row depicts the resulting average iterates laid over the confidence regions (plotted in logarithmic scale) corresponding to the asymptotic normal distribution.
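A scaled-down sketch of this Monte Carlo setup on the paper's quadratic example: 40 runs of 10^4 iterations rather than 400 runs of 10^6, with ρ = 0.5 as an illustrative value (the solution of the quadratic example is x* = 0). These run counts and ρ are assumptions made here for a quick demonstration, not the paper's settings.

```python
import numpy as np

# Scaled-down averaging experiment: 40 runs of 10^4 SFB iterations
# instead of the paper's 400 runs of 10^6 (rho = 0.5 is illustrative).
rho, runs, T = 0.5, 40, 10_000
rng = np.random.default_rng(1)

averages = np.empty((runs, 2))
for r in range(runs):
    x = rng.normal(size=2)                       # random initialization
    avg = np.zeros(2)
    for t in range(1, T + 1):
        z = rho * x[::-1] + rng.normal(size=2)   # z_t ~ D(x_t)
        x = x - t ** -0.75 * (x - z)             # SFB step, eta_t = t^(-3/4)
        avg += (x - avg) / t                     # running iterate average
    averages[r] = avg

# Averaged iterates concentrate around x* = 0 at the 1/sqrt(T) scale,
# consistent with the asymptotic normality of the averaged iterate.
print(averages.mean(axis=0))
```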