On Noise Abduction for Answering Counterfactual Queries: A Practical Outlook

Authors: Saptarshi Saha, Utpal Garain

TMLR 2022

Reproducibility assessment — each entry gives the variable, the assessed result, and the supporting LLM response:
Research Type: Experimental — "We report experimental results on both synthetic and real-world German Credit Dataset, showcasing the promise and usefulness of the proposed exogenous noise identification."
Researcher Affiliation: Academia — Both authors, Saptarshi Saha and Utpal Garain, are listed with the Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata (email addresses redacted in the source).
Pseudocode: No — The paper describes a "four-step procedure" for computing a counterfactual query in the SCM framework, presented as a numbered list within a paragraph in Section 5; it is not formatted as a distinct pseudocode or algorithm block.
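For context, the classic recipe for answering a counterfactual query in an SCM is abduction, action, prediction (the paper's Section 5 describes a four-step variant, which is not reproduced here). A minimal illustrative sketch on a toy additive-noise SCM, with all mechanisms and variable names hypothetical rather than taken from the paper:

```python
# Toy additive-noise SCM:  X := U_x,   Y := 2*X + U_y
# Counterfactual query: having observed (x, y), what would Y have been
# under the intervention do(X = x')?

def abduct(x_obs, y_obs):
    """Step 1 (abduction): invert the mechanisms to recover the exogenous noise."""
    u_x = x_obs               # X := U_x        =>  U_x = x
    u_y = y_obs - 2 * x_obs   # Y := 2X + U_y   =>  U_y = y - 2x
    return u_x, u_y

def counterfactual_y(u_y, x_new):
    """Steps 2-3 (action + prediction): intervene do(X = x_new), then re-run
    the mutilated model with the abducted noise held fixed."""
    return 2 * x_new + u_y

u_x, u_y = abduct(x_obs=1.0, y_obs=2.5)    # u_y = 0.5
y_cf = counterfactual_y(u_y, x_new=3.0)    # counterfactual Y = 6.5
```

The paper's contribution, as the title suggests, concerns abducting only the noise variables actually needed for the query rather than the full noise vector; the sketch above abducts everything for simplicity.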
Open Source Code: Yes — The code for reproducing the results is available at https://github.com/Saptarshi-Saha-1996/Noise-Abduction-for-Counterfactuals.
Open Datasets: Yes — "We report experimental results on both synthetic and real-world German Credit Dataset, showcasing the promise and usefulness of the proposed exogenous noise identification." Cited source: Dheeru Dua and Casey Graff. UCI Machine Learning Repository, 2017. URL http://archive.ics.uci.edu/ml.
Dataset Splits: No — The paper mentions generating 20000 data points for the synthetic dataset and distinguishes between 'seen' (training and validation) and 'unseen' (test) data: "By seen datapoints, we mean these are the datapoints used in training and validation. MSE in estimation of counterfactuals on unseen data points (test MSE) is given in Appendix C". However, it provides no percentages or absolute counts for the training/validation/test split of either the synthetic or the German Credit dataset, nor does it cite predefined splits or describe a splitting methodology.
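To illustrate the kind of specification the paper omits, here is a hypothetical split of the 20000 synthetic data points using only the standard library; the 80/10/10 fractions and the seed are assumptions for illustration, not values from the paper:

```python
import random

def split_indices(n, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle indices 0..n-1 and cut them into train/val/test partitions.
    The 80/10/10 defaults and seed are illustrative, not the paper's choices."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)     # fixed seed makes the split reproducible
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, test = split_indices(20000)  # 16000 / 2000 / 2000 indices
```

Reporting exactly such counts (or fractions plus a seed) is what would make the 'seen'/'unseen' evaluation reproducible.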
Hardware Specification: Yes — "Both models are trained for 1000 epochs using 12th Gen Intel(R) Core(TM) i9-12900KF CPU." "All instances of both models are trained for 500 epochs using NVIDIA RTX A5000 GPU."
Software Dependencies: No — "We use the Pyro (Bingham et al., 2019) probabilistic programming language (PPL) framework for the implementation of the flow-based SCM. Pyro is a PPL based on PyTorch (Paszke et al., 2019). Adam (Kingma & Ba, 2015) with batch-size 128, an initial learning rate of 10⁻³ is used for optimization purposes." Specific version numbers for Pyro and PyTorch are not provided.
Experiment Setup: Yes — "Adam (Kingma & Ba, 2015) with batch-size 128, an initial learning rate of 10⁻³ is used for optimization purposes. Both models are trained for 1000 epochs using 12th Gen Intel(R) Core(TM) i9-12900KF CPU." "Adam (Kingma & Ba, 2015) with a batch-size of 64, an initial learning rate of 3×10⁻⁴, and weight decay of 10⁻⁴ are used in training. We use a staircase learning rate schedule with decay milestones at 50% and 75% of the training duration. All instances of both models are trained for 500 epochs using NVIDIA RTX A5000 GPU."
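The staircase schedule quoted above can be sketched in plain Python. The base learning rate (3×10⁻⁴) and milestones (50% and 75% of 500 epochs) come from the paper's description; the decay factor of 0.1 is an assumption, as the paper does not state it:

```python
def staircase_lr(epoch, total_epochs=500, base_lr=3e-4, gamma=0.1):
    """Staircase learning-rate schedule with decay milestones at 50% and 75%
    of the training duration. gamma=0.1 is an assumed decay factor."""
    milestones = [int(0.5 * total_epochs), int(0.75 * total_epochs)]  # [250, 375]
    passed = sum(epoch >= m for m in milestones)  # milestones already reached
    return base_lr * gamma ** passed

staircase_lr(0)    # 3e-4 before any milestone
staircase_lr(250)  # decayed once, to roughly 3e-5
staircase_lr(400)  # decayed twice, to roughly 3e-6
```

In a PyTorch training loop this corresponds to a multi-step scheduler with milestones [250, 375] wrapped around the Adam optimizer.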