Learning Counterfactually Invariant Predictors

Authors: Francesco Quinzan, Cecilia Casolo, Krikamol Muandet, Yucen Luo, Niki Kilbertus

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experimental results demonstrate the effectiveness of CIP in enforcing counterfactual invariance across various simulated and real-world datasets including scalar and multi-variate settings." Cited sections: 4 Experiments (Baselines); 4.1 Synthetic experiments; 4.2 Image experiments; 4.3 Fairness with continuous protected attributes.
Researcher Affiliation | Collaboration | Francesco Quinzan: Department of Computer Science, The University of Oxford; Cecilia Casolo: Technical University of Munich, Helmholtz Munich, Munich Center for Machine Learning (MCML); Krikamol Muandet: CISPA Helmholtz Center for Information Security; Yucen Luo: Citadel Securities
Pseudocode | No | The paper describes its methods using mathematical formulations and figures for causal graphs but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide a link to a code repository.
Open Datasets | Yes | "In experiments, we build a semi-synthetic DGP assuming the graph in Fig. 1(e) for the UCI adult dataset (Kohavi & Becker, 1996)." "For concrete demonstration, we use the dSprites dataset (Matthey et al., 2017) consisting of simple black and white images of different shapes (squares, ellipses, . . . ), sizes, orientations, and locations."
Dataset Splits | Yes | "We generate 10k samples from the observational distribution in each setting and use an 80 to 20 train-test split. All metrics reported are on the test set."
Hardware Specification | No | The paper mentions training fully connected neural networks (MLPs) and CNNs but does not specify any hardware details such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper mentions using fully connected neural networks (MLPs) and CNNs but does not name specific software with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | "For all synthetic experiments, we train fully connected neural networks (MLPs) with MSE loss L_mse(Ŷ) as the predictive loss L in Eq. (1) for continuous outcomes Y. ... The HSCIC(Ŷ, {A, W} | S) term is computed as in Eq. (2) using a Gaussian kernel with amplitude 1.0 and length scale 0.1. The regularization parameter λ for the ridge regression coefficients is set to λ = 0.01."
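The setup row above describes an HSCIC (Hilbert-Schmidt Conditional Independence Criterion) penalty estimated with Gaussian kernels and kernel ridge regression. A minimal NumPy sketch of such an estimator is given below, assuming the standard kernel-ridge conditional-mean-embedding construction; the function names, the n-scaled ridge term, and treating the conditioning targets (e.g., A and W stacked) as one array `z` are illustrative choices, not details taken from the paper.

```python
import numpy as np

def gaussian_kernel(x, amplitude=1.0, length_scale=0.1):
    """Gram matrix of a Gaussian (RBF) kernel; x has shape (n, d)."""
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return amplitude * np.exp(-sq_dists / (2.0 * length_scale ** 2))

def hscic(y_hat, z, s, lam=0.01, amplitude=1.0, length_scale=0.1):
    """Empirical HSCIC(y_hat, z | s) via kernel ridge regression.

    Each per-sample term is the squared RKHS distance between the
    estimated joint conditional embedding and the product of the two
    marginal conditional embeddings, so the average is non-negative
    (up to floating-point error).
    """
    y_hat = y_hat.reshape(len(y_hat), -1)
    z = z.reshape(len(z), -1)
    s = s.reshape(len(s), -1)
    n = len(s)

    Ky = gaussian_kernel(y_hat, amplitude, length_scale)
    Kz = gaussian_kernel(z, amplitude, length_scale)
    Ks = gaussian_kernel(s, amplitude, length_scale)

    # Ridge-regression weights: column i holds the weights w_i of the
    # conditional mean embedding evaluated at s_i. The n-scaling of the
    # ridge term is one common convention, assumed here.
    W = np.linalg.solve(Ks + lam * n * np.eye(n), Ks)

    A = Ky @ W  # column i: Ky @ w_i
    B = Kz @ W  # column i: Kz @ w_i
    term1 = np.sum(W * ((Ky * Kz) @ W), axis=0)            # ||mu_joint||^2
    term2 = np.sum(W * (A * B), axis=0)                    # <mu_joint, mu_y (x) mu_z>
    term3 = np.sum(W * A, axis=0) * np.sum(W * B, axis=0)  # ||mu_y (x) mu_z||^2
    return float(np.mean(term1 - 2.0 * term2 + term3))
```

In a training loop of the kind the row describes, such a term would be added to the predictive loss as `L_mse + gamma * hscic(y_hat, z, s)` for some trade-off weight `gamma` (hypothetical name; the paper's Eq. (1) defines the actual combination).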