Identifying Causal Direction via Variational Bayesian Compression

Authors: Quang-Duy Tran, Bao Duong, Phuoc Nguyen, Thin Nguyen

ICML 2025

Reproducibility assessment: each entry lists the variable, the assessed result, and the LLM's supporting response.
Research Type: Experimental. "Extensive experiments on both synthetic and real-world benchmarks in cause-effect identification demonstrate the effectiveness of our proposed method, showing promising performance enhancements on several datasets in comparison to most related methods."
Researcher Affiliation: Academia. "Deakin Applied Artificial Intelligence Initiative, Deakin University, Geelong, Australia. Correspondence to: Quang-Duy Tran <EMAIL>."
Pseudocode: Yes. "Algorithm 1: Forward pass of fully-connected Bayesian layer with Gaussian variational posteriors"
Open Source Code: No. The paper states that the implementation of fully-connected Bayesian layers is adapted from publicly available code (Footnote 3: 'Bayesian layers: https://github.com/KarenUllrich/Tutorial_BayesianCompressionForDL'), but it does not provide an explicit statement or link indicating that the complete COMIC implementation developed in this paper is open-sourced.
Open Datasets: Yes. "Accessing the Benchmarks: All the benchmarks are available in the repository of LOCI (Immer et al., 2023)." (Footnote 11: 'See Fn. 9', which refers to Footnote 9: 'LOCI: https://github.com/aleximmer/loci')
Dataset Splits: No. "All samples of each pair are utilized in the training process, i.e., we set the batch size to be equal to the number of available samples of each dataset." For each cause-effect pair, all available samples are used for training, so no separate training/validation/test splits are made.
Hardware Specification: Yes. "All the experiments are conducted on a workstation with an Intel Core i7 processor, 64 GB of memory, and 3 TB of storage, except for those involving GPLVM, which are executed on NVIDIA Tesla V100 GPUs."
Software Dependencies: No. The paper mentions software such as Python, PyTorch, R, rpy2, the Causal Discovery Toolbox (CDT), GPLVM, LOCI, and CAM, but it does not provide version numbers for these key components, which are needed for reproducibility.
Experiment Setup: Yes. "Each neural network includes one hidden layer with Dh = 50 nodes with the hyperbolic tangent (tanh) as the activation function and a fully-connected output layer. The natural exponential function is chosen as the positive link function... The neural networks of COMIC are optimized to minimize the variational Bayesian objective with the Adam optimizer (Kingma & Ba, 2015) and the cosine learning rate scheduler with a maximum learning rate of 10^-2, a minimum learning rate of 10^-6, and T = 2,500 training epochs. ...we choose TWU = 250. In addition, to avoid overfitting the hyperparameters of the priors, we pretrain the models and the hyperparameters with maximum a posteriori (MAP) estimation before the variational optimization for TWU = 2,500 epochs... All samples of each pair are utilized in the training process, i.e., we set the batch size to be equal to the number of available samples of each dataset."
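The paper's Algorithm 1 (forward pass of a fully-connected Bayesian layer with Gaussian variational posteriors) is not reproduced in this assessment. A minimal NumPy sketch of what such a forward pass typically looks like, using the local reparameterization trick, is given below; the dimensions, initialization values, and function names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def bayesian_linear_forward(x, w_mu, w_logvar, b_mu, b_logvar, rng):
    """Forward pass of a fully-connected Bayesian layer with Gaussian
    variational posteriors over weights and biases. Uses the local
    reparameterization trick: sample the Gaussian pre-activations
    directly instead of sampling the weights."""
    # Mean of the pre-activation distribution.
    act_mu = x @ w_mu + b_mu
    # Variance of the pre-activation distribution (inputs squared times
    # weight variances, plus bias variances).
    act_var = (x ** 2) @ np.exp(w_logvar) + np.exp(b_logvar)
    # Reparameterized sample: mu + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(act_mu.shape)
    return act_mu + np.sqrt(act_var) * eps

# Toy dimensions; Dh = 50 hidden units matches the reported setup.
D_in, D_h = 3, 50
x = rng.standard_normal((8, D_in))
w_mu = 0.1 * rng.standard_normal((D_in, D_h))
w_logvar = np.full((D_in, D_h), -6.0)  # small initial posterior variances
b_mu = np.zeros(D_h)
b_logvar = np.full(D_h, -6.0)

# Hidden activation with tanh, as in the reported architecture.
h = np.tanh(bayesian_linear_forward(x, w_mu, w_logvar, b_mu, b_logvar, rng))
print(h.shape)  # (8, 50)
```

In the variational-compression setting this layer is paired with a KL penalty between the Gaussian posteriors and their priors; that term is omitted here since only the forward pass is described in the quoted algorithm title.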
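The cosine learning rate schedule described in the experiment setup (maximum 10^-2, minimum 10^-6, T = 2,500 epochs) can be sketched as a small standalone function. This assumes the standard cosine-annealing formula; the authors' exact scheduler implementation is not given in the quoted text:

```python
import math

def cosine_lr(epoch, lr_max=1e-2, lr_min=1e-6, T=2500):
    """Cosine annealing: decays from lr_max at epoch 0 to lr_min at epoch T."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * epoch / T))

print(round(cosine_lr(0), 6))     # 0.01
print(round(cosine_lr(2500), 6))  # 1e-06
```

In a PyTorch setup like the one the paper describes, the equivalent is typically obtained with `torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=2500, eta_min=1e-6)` wrapped around an Adam optimizer initialized at the maximum learning rate.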