Learning Multiscale Non-stationary Causal Structures

Authors: Gabriele D'Acunto, Gianmarco De Francisci Morales, Paolo Bajardi, Francesco Bonchi

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical assessment on synthetic datasets demonstrates that MN-CASTLE outperforms baseline methods in various experimental settings and is robust to model misspecification. Finally, we apply MN-CASTLE to identify the drivers of the natural gas prices in the US market.
Researcher Affiliation | Collaboration | Gabriele D'Acunto (DIAG, Sapienza University of Rome; Centai Institute, Turin, Italy); Gianmarco De Francisci Morales (Centai Institute, Turin, Italy); Paolo Bajardi (Centai Institute, Turin, Italy); Francesco Bonchi (Centai Institute, Turin, Italy)
Pseudocode | No | The paper describes methods and processes through textual explanations and conceptual diagrams (e.g., Figure 2 "Sampling an MN-DAG"), but it does not include formally structured pseudocode or an algorithm block.
Open Source Code | No | The paper mentions using existing open-source libraries and implementations for baselines (e.g., "we exploit the implementations of DirectLiNGAM and GOLEM provided by gCastle (Zhang et al., 2021a), whereas we resort to causal-learn for the implementation of CD-NOD", with GitHub links for these third-party tools). However, it does not provide an explicit statement or link for the open-source release of the MN-CASTLE methodology developed in this paper.
Open Datasets | Yes | In this section, we examine the key drivers of natural gas prices in the US market during the period spanning from January 1, 2018, to December 31, 2022. Our analysis considers several variables, including the price of natural gas (NG), crude oil (CO), deviations in gas storage (SD), rig counts targeting gas (RC), deviations from seasonal average values of gas consumption for cooling (CDD) and heating (HDD) environments, the crack spread between heating oil and crude oil (CS), and the economic uncertainty index (UI, Baker et al., 2016). We collected the data on a weekly basis... In detail, we downloaded Henry Hub natural gas futures prices (NG), WTI futures prices for crude oil (CO), New York Harbor No. 2 Heating Oil futures prices (HO), and US natural gas storage (ST) data from the website of the US Energy Information Administration (EIA). The crack spread was calculated as the difference between HO and CO, with HO being converted to dollars per barrel. The deviation of storage from the norm (SDD) was determined by comparing the ST value for a given week to the average value for the same week over the previous five years. We also downloaded rig counts (RC) data from Baker Hughes, and extracted deviations from seasonal average values of gas consumption for cooling (CDD) and heating (HDD) environments from the National Oceanic and Atmospheric Administration website. Finally, the economic uncertainty index (UI) was downloaded from the Federal Reserve Economic Data repository.
Dataset Splits | No | The paper describes generating synthetic datasets for evaluation (e.g., "For each possible combination (τ, µ) ∈ {0.0, 0.5, 0.9} × {0.0, 0.5, 0.9}, we generate 20 datasets that contain N time series each of length T.") and mentions using real-world data from various public sources for a case study. However, it does not specify the training/validation/test splits of these datasets that would be needed to reproduce the experiments.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions using several software tools and libraries (e.g., "Pyro (Bingham et al., 2019), a probabilistic programming language built on Python and PyTorch (Paszke et al., 2019)", "GPyTorch (Gardner et al., 2018)", and "R package mvLSW (Taylor et al., 2019)"), but it does not specify exact version numbers for these components, which are crucial for reproducibility.
Experiment Setup | Yes | MN-CASTLE: fraction of inducing points equal to 64%; K = K_RBF; in case τ = 0 (the estimated Ô_j is constant) we use as prior for λ_K a normal N(1·10³, 1·10⁻³); ρ = 0.05; number of iterations iter = 6·10² with 10 particles. MS-CASTLE: ℓ1 penalty parameter λ = 1·10⁻¹; pruning threshold γ = 5·10⁻²; Daubechies wavelet with filter length equal to 2; maximum value for dagness function h_tol = 1·10⁻⁸. GOLEM: pruning threshold γ = 5·10⁻²; number of iterations iter = 1·10⁴. DirectLiNGAM: pruning threshold γ = 5·10⁻². CD-NOD: independence test = Fisher's Z; significance level α = 95%.
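
The Open Datasets row describes two derived variables: the crack spread (HO minus CO, with HO converted to dollars per barrel) and the deviation of storage from its five-year same-week average. The following is a minimal sketch of that arithmetic, not the authors' code. It assumes that HO is quoted in dollars per gallon (hence the factor of 42 US gallons per barrel) and that "comparing" means taking a plain difference; the function names and the toy numbers are hypothetical.

```python
from statistics import mean

GALLONS_PER_BARREL = 42  # US gallons in one barrel of oil


def crack_spread(ho_usd_per_gallon, co_usd_per_barrel):
    """Heating oil minus crude oil, both expressed in dollars per barrel.

    Assumes HO is quoted in dollars per gallon and CO in dollars per barrel.
    """
    return ho_usd_per_gallon * GALLONS_PER_BARREL - co_usd_per_barrel


def storage_deviation(storage, year, week, lookback=5):
    """Storage for (year, week) minus the same-week mean over the previous years.

    `storage` maps (year, week) -> storage level. Using a difference (rather
    than, say, a ratio) is an assumption; the source only says "comparing".
    """
    history = [storage[(year - k, week)] for k in range(1, lookback + 1)]
    return storage[(year, week)] - mean(history)


# Made-up numbers, for illustration only: week 10 of the years 2013..2018.
toy_storage = {(2013 + i, 10): 2000 + 100 * i for i in range(6)}

print(crack_spread(2.0, 60.0))                   # 2.0 * 42 - 60.0
print(storage_deviation(toy_storage, 2018, 10))  # 2500 minus the 2013-2017 mean
```

With real data, the same two functions would be applied week by week to the downloaded series before any causal analysis.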
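
The synthetic evaluation quoted in the Dataset Splits row pairs every value of τ with every value of µ and generates 20 datasets per pair. A small sketch of how that grid could be enumerated is given below. The generator name, the dictionary keys, and the use of a per-dataset seed are hypothetical; the paper's actual data-generation code is not available.

```python
from itertools import product

LEVELS = (0.0, 0.5, 0.9)   # values taken by both tau and mu
DATASETS_PER_SETTING = 20  # datasets generated for each (tau, mu) pair


def experiment_runs():
    """Yield one record per synthetic dataset in the (tau, mu) grid.

    The `seed` field is a hypothetical way of telling the 20 datasets of a
    setting apart; the source does not say how they were seeded.
    """
    for tau, mu in product(LEVELS, LEVELS):
        for seed in range(DATASETS_PER_SETTING):
            yield {"tau": tau, "mu": mu, "seed": seed}


runs = list(experiment_runs())
print(len(runs))  # 9 settings x 20 datasets
print(runs[0])
```

Each record would then be handed to a dataset generator together with N and T, which the quoted passage leaves unspecified.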