Stochastic Online Instrumental Variable Regression: Regrets for Endogeneity and Bandit Feedback

Authors: Riccardo Della Vecchia, Debabrota Basu

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We present experimental results for both online regression and linear bandits in Figures 2a, 2b, and 3. We compare the performance of O2SLS and Online Ridge Regression (Ridge). For LBEs, we compare the performance of OFUL-IV and OFUL (Abbasi-Yadkori, Pal, and Szepesvari 2011a). ... Summary of Results. 1. Regression: O2SLS outperforms Ridge in all the settings, and the performance gain increases with increasing values of ρ, i.e. the level of endogeneity. 2. Bandits: OFUL builds a confidence ellipsoid centered at β_t^Ridge, while OFUL-IV uses O2SLS to build an accurate estimate and an ellipsoid containing β. Figure 2 indicates that OFUL-IV incurs lower regret than OFUL.
Researcher Affiliation Academia Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France (EMAIL, EMAIL)
Pseudocode Yes Algorithm 1: O2SLS ... Algorithm 2: OFUL-IV
Open Source Code No The paper does not contain any explicit statement about releasing source code or a link to a code repository.
Open Datasets No For different datasets with endogeneity, we experimentally show the efficiency of O2SLS and OFUL-IV. ... For further experiments and results with both synthetic and real data, we refer to Appendix D. ... Summary of Results. 1. Regression: O2SLS outperforms Ridge in all the settings, and the performance gain increases with increasing values of ρ, i.e. the level of endogeneity.
Dataset Splits No We induce endogeneity in the problem in the following arbitrary way: by setting η_t = ρ ε_{t,1} + η̃_t, where ε_{t,1} denotes the first component of the vector ε_t. We then control the level of endogeneity of the two stages through ρ. We choose d_x ∈ {2, 5, 8} and d_z ∈ {4, 10, 16}, respectively. ... At each time t (and, in the LBE setting, also for every arm a), we sample the vectors z_t ~ N_{d_z}(0, I_{d_z}) (z_{t,a} ~ N_{d_z}(0, I_{d_z})), the noise vector ε_t ~ N_{d_x}(0, I_{d_x}), and the scalar noise η_t = η̃_t + ρ ε_{t,1}, where η̃_t ~ N(0, 1). The paper describes synthetic data generation but does not provide specific train/test/validation splits for any dataset, including the mentioned 'real data'.
Hardware Specification No The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory amounts, or cloud computing specifications) used for running its experiments.
Software Dependencies No The paper does not provide any specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup Yes We choose d_x ∈ {2, 5, 8} and d_z ∈ {4, 10, 16}, respectively. ... We run all algorithms with the same regularisation parameter, i.e. λ = 0.1. We repeat our experiments 20 times. We average the results, and for each algorithm, we report the mean and standard deviation of the cumulative regret (shaded areas correspond to one standard deviation).
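Since the paper releases no code, the quoted data-generation and setup details above can be sketched independently. The following is a minimal, hypothetical reconstruction (not the authors' implementation): it draws instruments, covariates, and correlated noise as described (ρ controls endogeneity, λ = 0.1 regularisation), then compares a batch ridge fit against a batch two-stage least-squares (2SLS) fit; the online O2SLS and OFUL-IV algorithms are not reproduced here, and the true parameters β and Θ are arbitrary illustration values.

```python
# Hypothetical sketch of the synthetic endogenous setup described above.
# Batch estimators only; NOT the paper's online O2SLS / OFUL-IV algorithms.
import numpy as np

rng = np.random.default_rng(0)
T, d_x, d_z = 5000, 2, 4   # smallest dimension pair from the quoted setup
rho, lam = 0.9, 0.1        # endogeneity level and regularisation (λ = 0.1)

beta = rng.normal(size=d_x)          # true second-stage parameter (illustrative)
Theta = rng.normal(size=(d_x, d_z))  # true first-stage map z -> x (illustrative)

Z = rng.normal(size=(T, d_z))                 # instruments z_t ~ N(0, I_dz)
eps = rng.normal(size=(T, d_x))               # first-stage noise eps_t ~ N(0, I_dx)
eta = rho * eps[:, 0] + rng.normal(size=T)    # eta_t = rho * eps_{t,1} + tilde(eta)_t
X = Z @ Theta.T + eps                         # endogenous covariates
y = X @ beta + eta                            # outcomes; Cov(x_t, eta_t) != 0

# Ridge estimate: biased under endogeneity, since E[x_t eta_t] != 0.
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d_x), X.T @ y)

# 2SLS: first regress X on the instruments Z, then y on the fitted X.
Gamma = np.linalg.solve(Z.T @ Z + lam * np.eye(d_z), Z.T @ X)
X_hat = Z @ Gamma
beta_2sls = np.linalg.solve(X_hat.T @ X_hat + lam * np.eye(d_x), X_hat.T @ y)

err_ridge = np.linalg.norm(beta_ridge - beta)
err_2sls = np.linalg.norm(beta_2sls - beta)
print(f"ridge error: {err_ridge:.3f}, 2SLS error: {err_2sls:.3f}")
```

With ρ = 0.9 the ridge error stays bounded away from zero (the endogeneity bias), while the 2SLS error shrinks with T, which mirrors the qualitative regression finding reported in the first row of the table.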