Identifying Causal Direction via Variational Bayesian Compression
Authors: Quang-Duy Tran, Bao Duong, Phuoc Nguyen, Thin Nguyen
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on both synthetic and real-world benchmarks in cause-effect identification demonstrate the effectiveness of our proposed method, showing promising performance enhancements on several datasets in comparison to most related methods. |
| Researcher Affiliation | Academia | Deakin Applied Artificial Intelligence Initiative, Deakin University, Geelong, Australia. Correspondence to: Quang-Duy Tran <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Forward pass of fully-connected Bayesian layer with Gaussian variational posteriors |
| Open Source Code | No | The paper states that the implementation of the fully-connected Bayesian layers is adapted from publicly available code (Footnote 3: 'Bayesian layers: https://github.com/KarenUllrich/Tutorial_BayesianCompressionForDL'), but it does not provide an explicit statement or link releasing the complete COMIC implementation developed in this paper. |
| Open Datasets | Yes | Accessing the Benchmarks: All the benchmarks are available in the repository of LOCI (Immer et al., 2023) (Footnote 11 refers via Footnote 9 to 'LOCI: https://github.com/aleximmer/loci'). |
| Dataset Splits | No | All samples of each pair are utilized in the training process, i.e., we set the batch size to be equal to the number of available samples of each dataset. For each cause-effect pair, every available sample is used for training, so no separate training/validation/test splits are made within a pair. |
| Hardware Specification | Yes | All the experiments are conducted on a workstation with an Intel Core i7 processor, 64 GB of memory, and 3 TB of storage, except for those involving GPLVM, which are executed on NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions software like Python, PyTorch, R, rpy2, Causal Discovery Toolkit (CDT), GPLVM, LOCI, and CAM, but it does not provide specific version numbers for these key software components as required for reproducibility. |
| Experiment Setup | Yes | Each neural network includes one hidden layer with Dh = 50 nodes with the hyperbolic tangent (tanh) as the activation function and a fully-connected output layer. The natural exponential function is chosen as the positive link function... The neural networks of COMIC are optimized to minimize the variational Bayesian objective with the Adam optimizer (Kingma & Ba, 2015) and the cosine learning rate scheduler with a maximum learning rate of 10^-2, a minimum learning rate of 10^-6, and T = 2,500 training epochs. ...we choose TWU = 250. In addition, to avoid overfitting the hyperparameters of the priors, we pretrain the models and the hyperparameters with maximum a posteriori (MAP) estimation before the variational optimization for TWU = 2,500 epochs... All samples of each pair are utilized in the training process, i.e., we set the batch size to be equal to the number of available samples of each dataset. |
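The pseudocode row quotes the paper's Algorithm 1, a forward pass of a fully-connected Bayesian layer with Gaussian variational posteriors. A minimal PyTorch sketch of such a layer is below, using the local reparameterization trick common in Bayesian compression work; the class name, initialization constants, and layer sizes are illustrative assumptions, not the paper's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BayesianLinear(nn.Module):
    """Fully-connected layer with factorized Gaussian variational
    posteriors over weights and biases (illustrative sketch)."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Variational posterior parameters: means and log-variances.
        self.w_mu = nn.Parameter(0.1 * torch.randn(out_features, in_features))
        self.w_logvar = nn.Parameter(torch.full((out_features, in_features), -6.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_logvar = nn.Parameter(torch.full((out_features,), -6.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Local reparameterization: sample the pre-activations directly.
        # Mean of the pre-activation under the weight posterior:
        act_mu = F.linear(x, self.w_mu, self.b_mu)
        # Variance of the pre-activation (element-wise independence):
        act_var = F.linear(x.pow(2), self.w_logvar.exp(), self.b_logvar.exp())
        eps = torch.randn_like(act_mu)
        return act_mu + act_var.sqrt() * eps


layer = BayesianLinear(3, 5)
x = torch.randn(4, 3)
y = layer(x)  # stochastic output: a fresh sample per forward pass
```

Sampling pre-activations rather than weights (the local reparameterization trick) lowers gradient variance and is the standard formulation in the Bayesian-compression code the paper adapts.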
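The quoted experiment setup (one hidden layer of 50 tanh units, Adam, a cosine schedule annealing from 10^-2 to 10^-6 over T = 2,500 epochs, full-batch training) can be sketched in PyTorch as follows. The model dimensions, data, and mean-squared-error loss are hypothetical placeholders standing in for the paper's variational Bayesian objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder model mirroring the described architecture:
# one hidden layer with D_h = 50 tanh units and a fully-connected output.
model = nn.Sequential(nn.Linear(1, 50), nn.Tanh(), nn.Linear(50, 1))

T = 2500  # training epochs, as quoted
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)  # max LR 10^-2
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=T, eta_min=1e-6)  # anneal down to min LR 10^-6

# Full-batch training: batch size equals the number of samples.
x = torch.randn(100, 1)  # synthetic stand-in data
y = torch.randn(100, 1)
for epoch in range(T):
    optimizer.zero_grad()
    # Placeholder MSE loss; the paper minimizes a variational
    # Bayesian objective instead.
    loss = F.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()
```

After T scheduler steps the learning rate reaches the configured minimum, matching the quoted cosine schedule endpoints.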