A Robust Method to Discover Causal or Anticausal Relation

Authors: Yu Yao, Yang Zhou, Bo Han, Mingming Gong, Kun Zhang, Tongliang Liu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Both theoretical analyses and empirical results on a variety of datasets demonstrate the effectiveness of our proposed method in determining the causal or anticausal direction of the data generative process. We evaluate RoCA estimator across 22 datasets. This includes 2 synthetic datasets (synCausal and synAnticausal), 3 bivariate datasets (for checking pairwise causal relations with only two variables), 14 multivariate datasets, and 3 image datasets (CIFAR10 (Krizhevsky et al., 2009), CIFAR10N (Wei et al., 2022), and Clothing1M (Xiao et al., 2015)).
Researcher Affiliation | Academia | 1 Sydney AI Centre, The University of Sydney; 2 TMLR Group, Hong Kong Baptist University; 3 RIKEN Center for Advanced Intelligence Project; 4 The University of Melbourne; 5 Carnegie Mellon University; 6 Mohamed bin Zayed University of Artificial Intelligence
Pseudocode | Yes | E PSEUDOCODE OF OUR INSTANCE-DEPENDENT NOISE GENERATION. Algorithm 1: Generation of Instance-dependent Noisy Labels
Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to any code repositories.
Open Datasets | Yes | We evaluate RoCA estimator across 22 datasets. This includes 2 synthetic datasets (synCausal and synAnticausal), 3 bivariate datasets (for checking pairwise causal relations with only two variables), 14 multivariate datasets, and 3 image datasets (CIFAR10 (Krizhevsky et al., 2009), CIFAR10N (Wei et al., 2022), and Clothing1M (Xiao et al., 2015)).
Dataset Splits | Yes | We generated two additional synthetic datasets, synCausal and synAnticausal, to validate our RoCA estimator. Each dataset consists of 20,000 instances with five attributes and one label. ... 3 image datasets (CIFAR10 (Krizhevsky et al., 2009), CIFAR10N (Wei et al., 2022), and Clothing1M (Xiao et al., 2015)).
Hardware Specification | No | The paper does not explicitly describe any specific hardware used for running its experiments (e.g., GPU models, CPU types, or memory).
Software Dependencies | No | The paper mentions software components like "K-means clustering method" and "SPICE clustering method" but does not specify any version numbers for these or other software dependencies.
Experiment Setup | Yes | Specifically, for each instance in the dataset, we compute its ℓ1 norm. These computed norms are stored in a vector A. Subsequently, we generate a vector P of length m, where each element represents a flip rate sampled from a truncated normal distribution ψ. The mean of the truncated normal distribution is the expected noise level to be injected, the variance is set to 1, the lower limit is 0, and the upper limit is 1. ... we uniformly sample 20 average noise levels from 0 to 0.5. ... We use K-means clustering method (Likas et al., 2003) for non-image datasets and the SPICE clustering method (Niu et al., 2021) for image datasets to obtain the pseudo labels Y. ... the latent dimension of the VAE is set to 10, and a ResNet-18 is used as the backbone for feature extraction.
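
The first steps of the noise-generation setup quoted above (the ℓ1-norm vector A, the flip-rate vector P, and the 20 average noise levels) can be sketched as follows. This is a minimal illustration, not the authors' Algorithm 1: the function names, the toy data, the rejection-sampling approach, and the reading of "uniformly sample" as evenly spaced values are all assumptions made here. The excerpt elides how A and P are combined to flip labels, so that step is left out.

```python
import random


def l1_norms(instances):
    """Return the l1 norm of each instance (the vector A in the quoted text)."""
    return [sum(abs(v) for v in x) for x in instances]


def sample_flip_rates(m, noise_level, rng, std=1.0, low=0.0, high=1.0):
    """Draw m flip rates from a normal(noise_level, std) truncated to [low, high].

    Uses rejection sampling (an assumption of this sketch): draw from the
    untruncated normal and keep only values that fall inside the interval.
    A variance of 1 corresponds to std=1.0.
    """
    rates = []
    while len(rates) < m:
        r = rng.gauss(noise_level, std)
        if low <= r <= high:
            rates.append(r)
    return rates


def noise_levels(count=20, low=0.0, high=0.5):
    """Evenly spaced average noise levels; one possible reading of
    'uniformly sample 20 average noise levels from 0 to 0.5'."""
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical toy data: 100 instances with five attributes each.
X = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(100)]

A = l1_norms(X)                                  # one norm per instance
levels = noise_levels()                          # 20 average noise levels
P = sample_flip_rates(len(X), levels[10], rng)   # one flip rate per instance
```

Rejection sampling keeps the sketch dependency-free; a library routine such as `scipy.stats.truncnorm` would be a reasonable substitute where SciPy is available.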