Optimal Experiment Design for Causal Effect Identification

Authors: Sina Akbari, Jalal Etesami, Negar Kiyavash

JMLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Our evaluation consists of three parts. First, we evaluate our heuristic algorithms and the approximation version of Algorithm 2 against the exact algorithm, in terms of runtime and normalized regret, when solving the target query Q[S] is such that G[S] is a c-component. Throughout this section, the normalized regret of a solution A is defined as (C(A) C )/C , where C denotes the cost of the optimal minimum-cost solution. Second, we evaluate our exact approach, Algorithm 2 in terms of the number of hedges it requires to discover for recovering the optimal solution, against the number of total existing hedges formed for Q[S] in G. This indeed translates to the number of iterations of Algorithm 2. Finally, we compare the two algorithms proposed for minimum-cost intervention design when G[S] comprises multiple c-components, namely Algorithms 3 and 4, in terms of runtime.
Researcher Affiliation Academia Sina Akbari EMAIL EPFL, Lausanne, Switzerland Jalal Etestami EMAIL TUM, Munich, Germany Negar Kiyavash EMAIL EPFL, Lausanne, Switzerland
Pseudocode Yes Algorithm 1 Find Hhull(S, G), where G[S] is a c-component. Algorithm 2 Exact algorithm for minimum-cost intervention(S, G), where G[S] is a c-component. Algorithm 3 Naive general algorithm for minimum-cost intervention, when G[S] is not necessarily a c-component. Algorithm 4 MCFP-based approximation algorithm for minimum-cost intervention, when G[S] is not necessarily a c-component.
Open Source Code Yes The implementations of all the algorithms proposed in this work can be found at https://github.com/Sina Akbarii/ min_cost_intervention/tree/main.
Open Datasets Yes For evaluation, we generated causal graphs using the Erdos-Renyi generative model (Erd os and R enyi, 1960) as follows. We have evaluated our algorithms on a set of well-known graphs, which are the benchmark causal graphs in the causality literature. These graphs are obtained under the assumption of no latent variables. However, often the observed variables of a system are confounded by a hidden variable. We added a common confounder for each pair of variables in these graphs with probability q. We then ran our algorithms to find the min-cost intervention for identifying Q[S], where S is the last vertex in the causal order. We assumed that the cost of intervening on each variable is uniformly sampled from {1, 2, 3, 4}.
Dataset Splits No The paper describes how causal graphs were generated and parameters for random graphs were set (e.g., edge probabilities p and q, random cost assignments). It also mentions using benchmark causal graphs from literature (Barley, Water, Mehra structures) and modifying them. However, it does not specify any splits of these datasets into training, validation, or test sets for model evaluation. The focus is on generating or modifying graphs as inputs for algorithm performance evaluation, not on traditional dataset splitting.
Hardware Specification No The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments or simulations. It only discusses the theoretical time complexity of algorithms and presents runtime graphs in terms of seconds.
Software Dependencies No The paper mentions several algorithms and methods, such as 'Ford-Fulkerson, Edmonds-Karp, or push-relabel algorithm' and refers to the 'bnlearn.com' repository. However, it does not specify any software names with version numbers (e.g., Python 3.8, PyTorch 1.9, specific solver versions) that would be needed to replicate the experimental results.
Experiment Setup Yes For evaluation, we generated causal graphs using the Erdos-Renyi generative model (Erd os and R enyi, 1960) as follows. For a given number of vertices |V | = n, we fixed a causal order over the vertices. Then, directed edges were sampled with probability p = 0.35 and bidirected edges were sampled with probability q = 0.25 between the vertices, mutually independently. The set S was selected randomly among the last 5% of the vertices in the causal order such that G[S] is a c-component. Intervention costs of vertices were chosen independently at random from {1, 2, 3, 4}.