Bayesian Causal Bandits with Backdoor Adjustment Prior

Authors: Jireh Huang, Qing Zhou

TMLR 2023

Reproducibility assessment — each entry lists the variable, the result, and the LLM response:
Research Type: Experimental — "We validate our proposed methodology in an extensive empirical study, demonstrating compelling cumulative regret performance against state-of-the-art standard algorithms as well as optimistic implementations of their causal variants that assume strong prior knowledge of the causal structure."
Researcher Affiliation: Academia — "Jireh Huang EMAIL, Department of Statistics, University of California, Los Angeles; Qing Zhou EMAIL, Department of Statistics, University of California, Los Angeles"
Pseudocode: Yes — "We present our proposed BBB methodology applied to Bayes-UCB, TS, and UCB in Algorithm 1."

Algorithm 1 BBB-Alg(T, A, D_0, c)
Require: horizon T, action set A, observational data D_0, confidence level c
 1: Compute the observational parent set posteriors (4)
 2: for all a ∈ A and Z ⊆ X \ X_a do
 3:     Compute π^0_{a|Z} according to (2)
 4: end for
 5: for t = 1, ..., T do
 6:     for all a ∈ A do
 7:         Compute criterion U_a(t) according to Alg:
            Bayes-UCB: U_a(t) = Q(1 − 1/(t (log T)^c), π^{t−1}_a) as in (8)
            TS: sample U_a(t) ~ π^{t−1}_a(µ_a)
            UCB: U_a(t) = E_{π^{t−1}_a}[µ_a] + c·sqrt(Var_{π^{t−1}_a}[µ_a] · log t) as in (7)
 8:     end for
 9:     Pull arm a_t ∈ argmax_{a ∈ A} U_a(t) and observe D^(t)
10:     for all Z ⊆ X \ X_a where a = a_t do
11:         Update π^t_{a|Z} according to π^t_{a|Z}(θ_a) ∝ p_{θ_a}(y_t) π^{t−1}_{a|Z}(θ_a)
12:     end for
13:     Compute or update the parent set posteriors (4)
14: end for
Open Source Code: Yes — "The complete code and instructions for reproducing our results have been made available at the following link: https://github.com/jirehhuang/bcb"
Open Datasets: Yes — "We then applied BBB with hybrid MCMC to CHILD, a moderately sized discrete reference network with p = 20 nodes for diagnosing congenital heart disease (Spiegelhalter, 1992)."
Dataset Splits: No — The paper does not explicitly provide training/validation/test dataset splits. It describes generating random CBN models and executing methods multiple times; for the CHILD network, it focuses on sequential decision-making over T = 5000 time steps rather than fixed data splits.
Hardware Specification: No — "While the simulations were executed across a cluster of heterogeneous nodes, it was not difficult to observe that the computational requirements of BBB vastly exceeded that of the considered competing methods."
Software Dependencies: No — "Order MCMC was implemented by extending the BiDAG package (Suter et al., 2021) to accommodate computing scores with ensemble data as described in Section 3. For each iteration of BBB, the structure posterior was estimated by conducting 10^4 iterations with a thinning interval of 10 and discarding the first 20% as burn-in steps."
Experiment Setup: Yes — "The best quantile constant in (8) was c = 0, in agreement with the empirical recommendation by Kaufmann et al. (2012). The best exploration parameter for UCB in (6) was c = 1/(2√2) for UCB(-) in the discrete setting. In the Gaussian setting, UCB and UCB preferred c = 1/2 and c = 1/√2, respectively, the latter of which we applied for BBB. We used standard uninformative priors for TS(-), with α_0 = β_0 = 1 for the Beta prior and m_0 = 0, ν_0 = 1, and u_0 = v_0 = 1 for the N-Γ⁻¹ prior."
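As a rough illustration of the three selection criteria in Algorithm 1, the sketch below computes Bayes-UCB, TS, and UCB indices for a single arm whose mean reward µ_a carries a Beta posterior (Bernoulli rewards). This is an assumption-laden sketch, not the paper's implementation: the function name is invented, the Bayes-UCB quantile is approximated by Monte Carlo rather than an exact inverse CDF, and the paper's two distinct constants (quantile constant in (8), exploration constant in (7)) are collapsed into one `c` argument here.

```python
import math
import random

def beta_criteria(alpha, beta, t, T, c=0.0, n_mc=20_000, rng=None):
    """Illustrative BBB-style indices for one arm with a Beta(alpha, beta)
    posterior over its mean reward (Bernoulli rewards assumed)."""
    rng = rng or random.Random(0)
    mean = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

    # Bayes-UCB: posterior quantile at level 1 - 1/(t (log T)^c),
    # approximated by sorting Monte Carlo draws from the posterior.
    level = 1.0 - 1.0 / (t * math.log(T) ** c)
    draws = sorted(rng.betavariate(alpha, beta) for _ in range(n_mc))
    bayes_ucb = draws[min(int(level * n_mc), n_mc - 1)]

    # TS: a single posterior draw serves as the index.
    ts = rng.betavariate(alpha, beta)

    # UCB: posterior mean plus c * sqrt(posterior variance * log t).
    ucb = mean + c * math.sqrt(var * math.log(t))
    return bayes_ucb, ts, ucb
```

With c = 0 the UCB index reduces to the posterior mean, and the Bayes-UCB level becomes 1 − 1/t, matching the quantile-constant recommendation quoted in the Experiment Setup entry.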
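The MCMC bookkeeping in the Software Dependencies entry (10^4 iterations, thinning interval of 10, first 20% discarded as burn-in) can be sketched as follows; the function name is illustrative and not from the BiDAG extension:

```python
def thin_chain(draws, burn_in_frac=0.2, thin=10):
    """Discard the first burn_in_frac of an MCMC chain as burn-in,
    then keep every thin-th draw of the remainder."""
    burn = int(len(draws) * burn_in_frac)
    return draws[burn::thin]

chain = list(range(10_000))   # stand-in for 10^4 MCMC iterations
kept = thin_chain(chain)      # 8000 post-burn-in draws, every 10th kept
```

Under these settings the structure posterior would be estimated from 800 retained samples per BBB iteration.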
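The uninformative Beta prior quoted in the Experiment Setup entry (α_0 = β_0 = 1) is conjugate for Bernoulli rewards, so the arm posterior updates in closed form. The snippet below is a standard conjugacy sketch, not code from the paper:

```python
def beta_update(alpha, beta, reward):
    """Conjugate Beta posterior update for a Bernoulli reward in {0, 1}."""
    return alpha + reward, beta + (1 - reward)

a, b = 1.0, 1.0                 # uninformative Beta(1, 1) prior
for r in [1, 0, 1, 1]:          # hypothetical observed Bernoulli rewards
    a, b = beta_update(a, b, r)
# posterior is Beta(4, 2) with posterior mean 4/6
```

The N-Γ⁻¹ prior quoted for the Gaussian setting admits an analogous closed-form update of (m_0, ν_0, u_0, v_0), which is omitted here for brevity.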