Robust Root Cause Diagnosis using In-Distribution Interventions

Authors: Lokesh Nagalapatti, Ashutosh Srivastava, Sunita Sarawagi, Amit Sharma

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on both synthetic and Pet Shop RCD benchmark datasets demonstrate that IDI consistently identifies true root causes more accurately and robustly than nine existing state-of-the-art RCD baselines. We then conduct experiments by systematically varying the SCM's complexity to demonstrate the cases where IDI's interventional approach outperforms the counterfactual approach and vice versa.
Researcher Affiliation | Collaboration | (1) Indian Institute of Technology Bombay; (2) International Institute of Information Technology Hyderabad; (3) Microsoft Research India
Pseudocode | Yes | The pseudocode for the multi-root-cause diagnosis algorithm is presented in Alg. 1 in the Appendix.
Open Source Code | No | Code will be released at https://github.com/nlokeshiisc/IDI_release.
Open Datasets | Yes | Experiments on both synthetic and Pet Shop RCD benchmark datasets demonstrate that IDI consistently identifies true root causes more accurately and robustly than nine existing state-of-the-art RCD baselines. Pet Shop (Hardt et al., 2024) is a recent dataset designed for benchmarking RCD methods in the cloud domain, featuring a call graph G that causally links key performance indicators (KPIs).
Dataset Splits | Yes | We generate n ∈ {25, 50, 100, 1000} training samples, along with 100 validation and 100 test samples, each with a unique root cause.
Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, memory amounts, or detailed computer specifications) were explicitly mentioned for running the experiments. The paper generally refers to 'cloud services' without further specifications.
Software Dependencies | No | We implemented IDI in the RCD library released by Pet Shop (Hardt et al., 2024). Pet Shop uses DoWhy (Sharma & Kiciman, 2020) and gcm (Blöbaum et al., 2022) for causal inference and PyRCA (Liu et al., 2023) for root cause analysis.
Experiment Setup | Yes | We sample the linear weights w1, w2, w3 from N(0, 1) and define the non-linear model as X4 = |X2|+exp( X3) 3 + ϵ4. We draw the exogenous variables ϵ1, ϵ2, ϵ3 from N(0, 1) and define the structural equations for the root nodes as Xi = fi(ϵi) = ϵi for i ∈ {1, 2, 3}. ... We fit the linear model using closed-form regression and train the non-linear model as a three-layer MLP with 10 hidden nodes and ReLU activations via gradient descent. For the toy experiment, we sample x^fix_j from its true distribution N(0, 1). ... To assess anomalies, we use the Z-score, defined for Xi as Z-score(xi) = |xi − µi| / σi, where µi and σi are the sample mean and standard deviation computed for the ith node in the training data. ... Exogenous variables follow a uniform distribution ϵi ∼ U[0, 1], making their standard deviation std(ϵi) ≈ 0.3. ... We start our search from 0 and increase them in steps of size 0.25 until the Z-score of the target node ϕn(xn) hits the anomaly threshold 3.
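
The Z-score test and the step search quoted in the Experiment Setup row can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the searched quantity is a single scalar shift, and every function name below, along with the linear target function in the usage lines, is hypothetical. The last helper only checks the quoted standard deviation of U[0, 1], which is 1/sqrt(12), roughly 0.29.

```python
import math
import statistics

ANOMALY_THRESHOLD = 3.0  # Z-score threshold quoted in the setup
STEP = 0.25              # search step size quoted in the setup


def z_score(x, mu, sigma):
    """Z-score(x) = |x - mu| / sigma, as defined in the setup."""
    return abs(x - mu) / sigma


def fit_stats(samples):
    """Sample mean and standard deviation of one node's training values."""
    return statistics.mean(samples), statistics.stdev(samples)


def smallest_anomalous_shift(target_fn, mu, sigma, max_steps=10_000):
    """Start at 0 and grow the shift by STEP until the target node's
    Z-score reaches ANOMALY_THRESHOLD.

    target_fn is a hypothetical stand-in that maps a shift to the value
    of the target node; it is not part of the source.
    """
    shift = 0.0
    for _ in range(max_steps):
        if z_score(target_fn(shift), mu, sigma) >= ANOMALY_THRESHOLD:
            return shift
        shift += STEP
    raise ValueError("anomaly threshold not reached within max_steps")


def uniform_std(low, high):
    """Standard deviation of U[low, high]: (high - low) / sqrt(12)."""
    return (high - low) / math.sqrt(12)


# Usage with made-up numbers: three training values for the target node
# and an assumed linear response of the target to the shift.
mu, sigma = fit_stats([-1.0, 0.0, 1.0])
shift = smallest_anomalous_shift(lambda s: 0.5 + 2.0 * s, mu, sigma)
```

Because the step size 0.25 is exactly representable in binary floating point, repeated addition does not drift, so the loop stops at the first grid point whose Z-score meets the threshold.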