Stronger Neyman Regret Guarantees for Adaptive Experimental Design

Authors: Georgy Noarov, Riccardo Fogliato, Martin Andres Bertran, Aaron Roth

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We first present the results for the non-contextual setting and then turn to the analysis of the performance for the contextual algorithm. Our code is available at the following link: https://github.com/amazon-science/adaptive-abtester. 5.1. Non-Contextual Experiments. Tasks: We compare our method Clip OGDSC with Clip OGD0 (Dai et al., 2023) on multiple tasks. Below, we show two key datasets (one synthetic and one real-world) used in our experiments, with full details in Appendix E. [...] Figure 1 shows the Neyman regret across these settings, matching our theoretical expectations: when σ = 0.1, the regret of Clip OGDSC drops to 0 quickly, but for larger σ, the regret remains high and converges later.
Researcher Affiliation | Collaboration | ¹Department of Computer and Information Science, University of Pennsylvania; ²Amazon Web Services. Correspondence to: Georgy Noarov <EMAIL>.
Pseudocode | Yes | Algorithm 1 Clip OGD (Dai et al., 2023) [...] Algorithm 2 AMGATE: Multigroup Adaptive Design [...] Algorithm 3 ASOLO: SOLO FTRL (Orabona & Pál, 2018) [...] Algorithm 4 ASOLO: Instantiation for scale-free sleeping experts [...] Algorithm 5 AOLO SE: Sleeping Experts to OLO Reduction (Orabona, 2024) [...] Algorithm 6 ASOLO SE: Sleeping Experts Algorithm [...] Algorithm 7 General Multigroup Adaptive Design
Open Source Code | Yes | Our code is available at the following link: https://github.com/amazon-science/adaptive-abtester.
Open Datasets | Yes | The second dataset comes from Egypt's largest microfinance organization (Groh & McKenzie, 2016), covering 2,961 clients. [...] We also present experiments on the ASOS Digital Experiments Dataset (Liu et al., 2021), and on question-answering tasks for large language models, including BIG-Bench (Srivastava et al., 2023), in the Appendix.
Dataset Splits | No | The ASOS Digital Experiments Dataset (Liu et al., 2021) [...] This structure naturally creates 4 subsets of rows (each with about 6,000 rows). We treat each subset as a separate dataset and feed each of these 4 pairs of treatment and control outcome sequences into Clip OGDSC and Clip OGD0. [...] The synthetic dataset is generated as follows: y_t(i) ~ N(μ_i, σ²) i.i.d. for t = 1, ..., T and i = 0, 1, with μ_0 = 1 and μ_1 = 2. [...] We vary σ_i ∈ ℝ₊ to showcase where our method succeeds and where it struggles.
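The synthetic generation described in this row can be sketched in a few lines. This is a minimal illustration, assuming the flattened formula reads y_t(i) ~ N(μ_i, σ²) i.i.d.; the function name and defaults (T = 10,000, σ = 0.1, the seed) are hypothetical and not taken from the paper.

```python
import random

def simulate_outcomes(T=10_000, mu=(1.0, 2.0), sigma=0.1, seed=0):
    """Draw i.i.d. potential outcomes y_t(i) ~ N(mu_i, sigma^2)
    for arms i = 0 (control) and i = 1 (treatment)."""
    rng = random.Random(seed)
    return [
        tuple(rng.gauss(m, sigma) for m in mu)  # (y_t(0), y_t(1))
        for _ in range(T)
    ]

outcomes = simulate_outcomes()
mean0 = sum(y[0] for y in outcomes) / len(outcomes)  # should be near mu_0 = 1
mean1 = sum(y[1] for y in outcomes) / len(outcomes)  # should be near mu_1 = 2
```

Increasing `sigma` in this sketch mirrors the paper's stress test: larger outcome variance makes the Neyman-optimal allocation harder to track, which is where the regret is reported to converge later.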
Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments. It mentions 'simulation' but no GPU/CPU models, memory, or cloud instance types.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers for reproducing the experiments.
Experiment Setup | Yes | Hyperparameter Choices: Throughout the experiments, we use the following hyperparameters. For our method, we set η_t = 2/t and δ_t = 1/h(t), where the clipping function is h(t) = exp((log(t + 2))^(1/4)). For Clip OGD0, we follow Dai et al. (2023) with a constant learning rate η_t = 1/√T and clipping rate δ_t = 0.5 t^(−1/5) log T.
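The hyperparameter schedules quoted above can be sketched directly. This is an illustrative sketch, assuming the flattened exponent in the clipping function reads h(t) = exp((log(t + 2))^(1/4)); the function names are hypothetical.

```python
import math

def eta(t):
    """Learning rate for the paper's method: eta_t = 2 / t."""
    return 2.0 / t

def h(t):
    """Clipping function h(t) = exp((log(t + 2))^(1/4))."""
    return math.exp(math.log(t + 2) ** 0.25)

def delta(t):
    """Clipping rate delta_t = 1 / h(t)."""
    return 1.0 / h(t)

# Both schedules decay in t: the step size shrinks quickly (2/t),
# while the clipping rate shrinks slowly, since h grows only as
# exp of a fourth root of log t.
rates = [(t, eta(t), delta(t)) for t in (1, 10, 100)]
```

Note the contrast with the Clip OGD0 baseline, whose learning rate is constant in t (it depends only on the horizon T).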