On Mitigating Affinity Bias through Bandits with Evolving Biased Feedback
Authors: Matthew Faw, Constantine Caramanis, Jessica Hoffmann
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we observe this same phenomenon for UCB (Auer et al., 2002), EXP3 (Auer et al., 1995), and EXP3-IX (Kocák et al., 2014) in Figure 34. ... In Figure 3, we demonstrate empirically that ignoring the bias structure of our problem leads to linear regret for many standard bandit algorithms... In Figures 6 and 7, we compare the performance of Algorithm 1 against two alternative algorithms, Efficient-UCBV (Mukherjee et al., 2018), and an implementation of LUCB (Jamieson & Nowak, 2014). |
| Researcher Affiliation | Collaboration | (1) Georgia Institute of Technology; (2) The University of Texas at Austin; (3) Google DeepMind. |
| Pseudocode | Yes | Algorithm 1: Elimination algorithm for unknown bias model. Algorithm 2: The Elimination-style algorithm for unknown bias model, with added notations. |
| Open Source Code | No | The paper does not contain an explicit statement about open-sourcing the code or a link to a code repository. |
| Open Datasets | No | We run each of these algorithms on a 2-armed Bernoulli bandit instance... We consider a 2-armed Bernoulli bandit instance... We consider a standard Gaussian bandit environment under bias model $f(x) = x^{\alpha}$. The paper describes simulated environments (Bernoulli bandit instances and Gaussian environments) and their parameters, rather than using external, publicly available datasets. |
| Dataset Splits | No | The paper uses simulated bandit environments rather than pre-existing datasets that would typically require training/test/validation splits. Therefore, information regarding dataset splits is not applicable and not provided. |
| Hardware Specification | No | All experiments were performed locally on a Mac operating system, using Python 3.9 and PyCharm. The description 'Mac operating system' is too general: it gives no hardware details such as CPU/GPU model or memory. |
| Software Dependencies | Yes | All experiments were performed locally on a Mac operating system, using Python 3.9 and PyCharm. |
| Experiment Setup | Yes | We run each of these algorithms on a 2-armed Bernoulli bandit instance, where $\mu_1 = 0.4 < 0.6 = \mu_2$, with bias structure $W_i(t) = \frac{T_i^{\mathrm{bias}}(t-1)}{t^{\mathrm{bias}} - 1}$, where the initial number of arm plays for each arm are: $T_2^0 = 10$, and we vary $T_1^0 \in \{1, 3, 5, 10, 15, 20, 25, 30, 40, 50, 70, 90, 200\}$. The time horizon is $n = 20{,}000$. Each experiment is repeated $r = 50$ times. ... We consider a 2-armed Bernoulli bandit instance, where $\mu_1 = 0.4 < 0.6 = \mu_2$, and the initial number of times each arm is played is $T_1^0 = 100$, $T_2^0 = 10$. We consider a time horizon $n = 200{,}000$, and repeat each experiment 40 times. |
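
The bias structure quoted in the Experiment Setup row can be read as each arm's share of all plays so far, initial plays included. The sketch below illustrates that reading only; it is not the authors' code. It assumes that the biased play count of arm i is its initial count plus the number of times it has been pulled, and that the denominator is the sum of those counts over both arms. The names `bias_weights` and `initial_weight_sweep` are hypothetical.

```python
from fractions import Fraction

# Values quoted in the Experiment Setup row.
T0_ARM2 = 10
T0_ARM1_GRID = [1, 3, 5, 10, 15, 20, 25, 30, 40, 50, 70, 90, 200]


def bias_weights(initial_plays, plays_so_far):
    """Return each arm's share of all plays (initial plus observed).

    Assumption (not confirmed by the source): the biased count of arm i is
    its initial count plus its pulls so far, and the normaliser is the sum
    of these biased counts over all arms.
    """
    biased_counts = [t0 + t for t0, t in zip(initial_plays, plays_so_far)]
    total = sum(biased_counts)
    return [Fraction(count, total) for count in biased_counts]


def initial_weight_sweep():
    """Weight of arm 1 before any pull, for every initial count in the grid."""
    return {
        t0: bias_weights((t0, T0_ARM2), (0, 0))[0]
        for t0 in T0_ARM1_GRID
    }


if __name__ == "__main__":
    for t0, w1 in initial_weight_sweep().items():
        print(f"T_1^0 = {t0:>3}: W_1 = {float(w1):.3f}, W_2 = {float(1 - w1):.3f}")
```

Under this reading, the sweep over the initial count of arm 1 changes how strongly the first observations favour the worse arm: with an initial count of 1 its starting weight is 1/11, and with 200 it is 20/21.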