reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Simulating Counterfactuals

Authors: Juha Karvanen, Santtu Tikka, Matti Vihola

JAIR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The critical part of Algorithm 4 is the quality of the sample returned by Algorithm 3. In this section, the performance of Algorithm 3 is studied using randomly generated linear Gaussian SCMs. Importantly, we can analytically derive the true distribution for comparison in this particular scenario, see Online Appendix 2 for the details. In the simulation experiment, we generate linear Gaussian SCMs with random graph structure and random coefficients, apply Algorithm 3, and compare the simulated observations with the true conditional normal distribution. The parameters of the simulation and the performance measures are explained in Online Appendix 3. The key results are summarized in Table 1.
Researcher Affiliation	Academia	Juha Karvanen, Santtu Tikka, Matti Vihola; Department of Mathematics and Statistics, University of Jyväskylä, Finland
Pseudocode	Yes	Algorithm 1: An algorithm for simulating n observations from a u-monotonic SCM M = (V, U, F, p(u)) on the condition that the value of a continuous variable C is c. The optional argument D0 is an n-row data matrix containing the values of some variables V0 ⊆ V, U0 ⊆ U that precede C in the topological order and have already been fixed. Algorithm 2: An algorithm for simulating n observations from causal model M = (V, U, F, p(u)) on the condition that the value of discrete variable C is c. The optional argument D0 is an n-row data matrix containing the values of some variables V0 ⊆ V, U0 ⊆ U that precede C in the topological order and have already been fixed. Algorithm 3: An algorithm for simulating n observations from a u-monotonic SCM M = (V, U, F, p(u)) with respect to all continuous variables in C1, ..., CK under the conditions C = (C1 = c1) ∧ ... ∧ (CK = cK). The topological order of the variables in the conditions is C1 < C2 < ... < CK. The batch size n = n is used in Algorithm 2. Algorithm 4: An algorithm for simulating n observations from a counterfactual distribution under the conditions C = (C1 = c1) ∧ ... ∧ (CK = cK) in an SCM M = (V, U, F, p(u)) that is u-monotonic with respect to all continuous variables in the set {C1, ..., CK}. The topological order of the variables in the conditions is C1 < C2 < ... < CK. Algorithm 5: An algorithm for evaluating the fairness of a prediction model Ŷ(·) in a u-monotonic SCM M with sensitive variables S and response variables Y. The case to be considered is defined by conditions W and C, where W = (W1 = w1) ∧ ... ∧ (WM = wM) denotes the conditions for Pa(Y) \ S (the non-sensitive observed parents of the responses Y), and C = (C1 = c1) ∧ ... ∧ (CK = cK) denotes the conditions for some other variables (which may include S). The argument n defines the number of simulated counterfactual observations. Algorithm 6: ParticleFilter(M0:J, G1:J, n)
Open Source Code	Yes	The simulation code is available at https://github.com/JuhaKarvanen/simulating_counterfactuals. ...the R code for the example is available in the repository https://github.com/JuhaKarvanen/simulating_counterfactuals in the file fairness_example.R.
Open Datasets	Yes	We consider variables that are similar to those typically present in credit-scoring datasets, such as Statlog (German Credit Data, Hofmann, 1994).
Dataset Splits	No	To set up the example, we simulated training data from an SCM corresponding to the causal diagram of Figure 1 and fitted prediction models A, B, and C for the default risk using XGBoost (Chen & Guestrin, 2016; Chen, He, Benesty, Khotilovich, Tang, Cho, Chen, Mitchell, Cano, Zhou, Li, M., Xie, Lin, Geng, Li, Y., & Yuan, 2023). These models are opaque AI models for the fairness evaluator who can only see the probability of default predicted by the models. Algorithm 5 was applied to prediction models A, B, and C in 1000 cases that were again simulated from the same SCM.
Hardware Specification	No	CSC IT Center for Science, Finland, is acknowledged for computational resources.
Software Dependencies	Yes	Algorithms 1–5 are implemented in the R package R6causal (Karvanen, 2024) which contains R6 (Chang, 2021) classes and methods for SCMs. ...fitted prediction models A, B, and C for the default risk using XGBoost (Chen & Guestrin, 2016; Chen et al., 2023).
Experiment Setup	No	To set up the example, we simulated training data from an SCM corresponding to the causal diagram of Figure 1 and fitted prediction models A, B, and C for the default risk using XGBoost (Chen & Guestrin, 2016; Chen, He, Benesty, Khotilovich, Tang, Y., Cho, Chen, Mitchell, Cano, Zhou, Li, Xie, Lin, Geng, Li, & Yuan, 2023). These models are opaque AI models for the fairness evaluator who can only see the probability of default predicted by the models. Algorithm 5 was applied to prediction models A, B, and C in 1000 cases that were again simulated from the same SCM. In the algorithm, Ŷ(·) was one of the prediction models, M was the SCM whose causal diagram is depicted in Figure 1, sensitive variables S were gender and ethnicity, condition C contained all observed values of the case, and number of observations was n = 1000.
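
Illustrative sketch (not the authors' code): the Research Type excerpt describes comparing simulated observations with an analytically known conditional normal distribution in linear Gaussian SCMs. The Python snippet below shows that kind of check on a hypothetical two-variable model, X = U_X and C = b*X + U_C with U_X ~ N(0, 1) and U_C ~ N(0, s^2). The model, the function names, and the parameter values are assumptions made for illustration. The sampler is a plain importance-resampling scheme and is not claimed to reproduce Algorithm 1 or Algorithm 3 of the paper, whose reference implementation is the R package R6causal.

```python
import math
import random


def simulate_conditional(b, s, c, n, seed=0):
    """Approximate draws of X given C = c in the toy SCM
    X = U_X, C = b*X + U_C, U_X ~ N(0, 1), U_C ~ N(0, s^2).

    Hypothetical helper: importance resampling, not the paper's algorithm.
    """
    rng = random.Random(seed)
    # Step 1: simulate the background variable of X (here X = U_X).
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Step 2: for each draw, the value u_C = c - b*x is the only one that
    # makes C equal to c; weight the draw by the N(0, s^2) density there.
    # The normalizing constant is the same for all draws, so it is omitted.
    weights = [math.exp(-0.5 * ((c - b * x) / s) ** 2) for x in xs]
    # Step 3: resample with replacement in proportion to the weights.
    return rng.choices(xs, weights=weights, k=n)


def analytic_conditional(b, s, c):
    """Mean and variance of X given C = c from the bivariate normal formula."""
    var_c = b * b + s * s          # Var(C); Cov(X, C) = b because Var(X) = 1
    return b * c / var_c, s * s / var_c


def mean_and_variance(values):
    """Sample mean and (population) variance of a list of numbers."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v


if __name__ == "__main__":
    b, s, c = 0.8, 0.5, 1.0        # assumed illustrative parameters
    sample = simulate_conditional(b, s, c, n=20000, seed=1)
    print("simulated mean, variance:", mean_and_variance(sample))
    print("analytic  mean, variance:", analytic_conditional(b, s, c))
```

With a large enough n, the simulated mean and variance should land close to the analytic values; the paper's own experiment uses randomly generated graphs and coefficients and the performance measures described in its Online Appendix 3.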