On Noise Abduction for Answering Counterfactual Queries: A Practical Outlook
Authors: Saptarshi Saha, Utpal Garain
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report experimental results on both synthetic and real-world German Credit Dataset, showcasing the promise and usefulness of the proposed exogenous noise identification. |
| Researcher Affiliation | Academia | Saptarshi Saha, Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata; Utpal Garain, Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata |
| Pseudocode | No | The paper describes a "four-step procedure" for computing a counterfactual query in the SCM framework, which is presented as a numbered list within a paragraph in Section 5. It is not formatted as a distinct pseudocode or algorithm block. |
| Open Source Code | Yes | The code for reproducing the results is available at https://github.com/Saptarshi-Saha-1996/Noise-Abduction-for-Counterfactuals. |
| Open Datasets | Yes | We report experimental results on both synthetic and real-world German Credit Dataset, showcasing the promise and usefulness of the proposed exogenous noise identification. Dheeru Dua and Casey Graff. UCI Machine Learning Repository, 2017. URL http://archive.ics.uci.edu/ml. |
| Dataset Splits | No | The paper mentions generating 20000 data points for the synthetic dataset and distinguishes between 'seen' (training and validation) and 'unseen' (test) data for evaluation, stating: "By seen datapoints, we mean these are the datapoints used in training and validation. MSE in estimation of counterfactuals on unseen data points (test MSE) is given in Appendix C". However, it does not provide specific percentages or absolute counts for how the data was split into training, validation, and test sets for either the synthetic or German Credit datasets, nor does it refer to predefined splits with citations or detailed splitting methodologies. |
| Hardware Specification | Yes | Both models are trained for 1000 epochs using 12th Gen Intel(R) Core(TM) i9-12900KF CPU. All instances of both models are trained for 500 epochs using NVIDIA RTX A5000 GPU. |
| Software Dependencies | No | We use the Pyro (Bingham et al., 2019) probabilistic programming language (PPL) framework for the implementation of the flow-based SCM. Pyro is a PPL based on PyTorch (Paszke et al., 2019). Adam (Kingma & Ba, 2015) with batch-size 128, an initial learning rate of 10^-3 is used for optimization purposes. Specific version numbers for Pyro, PyTorch, or Adam are not provided. |
| Experiment Setup | Yes | Adam (Kingma & Ba, 2015) with batch-size 128, an initial learning rate of 10^-3 is used for optimization purposes. Both models are trained for 1000 epochs using 12th Gen Intel(R) Core(TM) i9-12900KF CPU. Adam (Kingma & Ba, 2015) with a batch-size of 64, an initial learning rate of 3×10^-4, and weight decay of 10^-4 are used in training. We use a staircase learning rate schedule with decay milestones at 50% and 75% of the training duration. All instances of both models are trained for 500 epochs using NVIDIA RTX A5000 GPU. |
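The "four-step procedure" mentioned in the Pseudocode row builds on the classic abduction-action-prediction recipe for SCM counterfactuals. A minimal sketch of that recipe on a toy additive-noise SCM is below; the mechanisms and variable names are illustrative assumptions, not the paper's actual model.

```python
# Toy additive-noise SCM:  X := U_x,  Y := 2*X + U_y.
# Counterfactual query: "What would Y have been, had X been x_new,
# given that we observed (x_obs, y_obs)?"

def f_y(x, u_y):
    """Structural mechanism for Y (toy example, not from the paper)."""
    return 2 * x + u_y

def counterfactual_y(x_obs, y_obs, x_new):
    # 1. Abduction: invert the mechanism to recover the exogenous noise
    #    value consistent with the observed evidence.
    u_y = y_obs - 2 * x_obs
    # 2. Action: intervene do(X = x_new), replacing X's own mechanism.
    # 3. Prediction: push the abducted noise through the modified model.
    return f_y(x_new, u_y)

print(counterfactual_y(x_obs=1.0, y_obs=2.5, x_new=3.0))  # 6.5
```

The paper's contribution concerns abducting only a reduced set of exogenous noise variables rather than all of them; this sketch shows only the generic recipe that such a reduction would operate within.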
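The second training configuration quoted in the Experiment Setup row (Adam, batch size 64, lr 3×10^-4, weight decay 10^-4, staircase decay at 50% and 75% of 500 epochs) maps naturally onto PyTorch's `MultiStepLR`. A sketch under stated assumptions: the decay factor is not reported, so `gamma=0.1` here is a guess, and the `Linear` model is a stand-in for the actual flow-based SCM.

```python
import torch

model = torch.nn.Linear(10, 1)  # stand-in for the paper's flow-based SCM
epochs = 500

# Adam with the quoted hyperparameters.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4, weight_decay=1e-4)

# Staircase schedule: decay at 50% and 75% of training (epochs 250 and 375).
# gamma=0.1 is an assumption; the report does not state the decay factor.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer,
    milestones=[epochs // 2, epochs * 3 // 4],
    gamma=0.1,
)

for epoch in range(epochs):
    # ... real loop would iterate over batches of size 64, compute the
    # loss, and call loss.backward() before stepping ...
    optimizer.step()   # placeholder step for illustration
    scheduler.step()   # advance the staircase schedule once per epoch
```

With these assumptions the learning rate ends at 3e-4 × 0.1² = 3e-6 after both milestones have passed.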