No $D_{train}$: Model-Agnostic Counterfactual Explanations Using Reinforcement Learning
Authors: Xiangyu Sun, Raquel Aoki, Kevin H. Wilson
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the performance of NTD-CFE against four baselines on several datasets and find that, despite not having access to a training dataset, NTD-CFE finds CFEs that make significantly fewer and significantly smaller changes to the input time-series. These properties make CFEs more actionable, as the magnitude of change required to alter an outcome is vastly reduced. The code is available in the supplementary material. ... In this section, we provide qualitative examples and quantitative experiment results to demonstrate the effectiveness of NTD-CFE for multivariate data-series data. |
| Researcher Affiliation | Industry | Xiangyu Sun (RBC Borealis), Raquel Aoki (RBC Borealis), Kevin H. Wilson (RBC Borealis) |
| Pseudocode | Yes | Algorithm 1 NTD-CFE. Best viewed in color. Typical RL code is colored in gray. |
| Open Source Code | Yes | The code is available in the supplementary material. |
| Open Datasets | Yes | Nine real-world multivariate time-series datasets are used for evaluation (Appendix A for details). ... Life Expectancy: https://www.kaggle.com/datasets/vrec99/life-expectancy-2000-2015 ... NATOPS: http://www.timeseriesclassification.com/description.php?Dataset=NATOPS ... PEMS-SF: https://www.timeseriesclassification.com/description.php?Dataset=PEMS-SF ... Heartbeat: http://www.timeseriesclassification.com/description.php?Dataset=Heartbeat ... ERing: https://www.timeseriesclassification.com/description.php?Dataset=ERing ... Racket Sports: https://www.timeseriesclassification.com/description.php?Dataset=RacketSports ... Basic Motions: https://www.timeseriesclassification.com/description.php?Dataset=BasicMotions ... Japanese Vowels: https://www.timeseriesclassification.com/description.php?Dataset=JapaneseVowels ... Libras: https://www.timeseriesclassification.com/description.php?Dataset=Libras |
| Dataset Splits | No | The paper does not explicitly provide specific training/test/validation split percentages or sample counts for the datasets used in their experiments. While it discusses 'invalid samples' which are testing samples, and mentions baselines requiring training datasets, it does not detail how the data was partitioned for training and evaluating the predictive models. |
| Hardware Specification | No | All the experiments are conducted on CPU and with 32GB of RAM. No specific CPU models or other detailed hardware specifications are provided. |
| Software Dependencies | No | The paper mentions 'Adam (Kingma & Ba, 2014) is used as the optimizer' and 'LSTM' as a model type, but it does not specify version numbers for any programming languages, libraries, or frameworks used for implementation (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | We use a unique set of hyperparameter values for NTD-CFE throughout the paper, unless otherwise stated, without fine-tuning them: proximity weight $\lambda_{pxmt} = 0.001$; maximum number of interventions per episode $M_T = 100$; maximum number of episodes $M_E = 100$; discount factor $\gamma = 0.99$; learning rate $\alpha = 0.0001$; regularization weight $\lambda_{WD} = 0.0$. The RL policy network contains two hidden linear layers with 1000 and 100 neurons, respectively. Adam (Kingma & Ba, 2014) is used as the optimizer. |
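
The hyperparameters quoted in the Experiment Setup row can be gathered into one configuration object. The following is a minimal sketch, not the authors' code: the class name `NTDCFEConfig`, its field names, and the helpers `layer_shapes` and `parameter_count` are hypothetical, and the input and output dimensions of the policy network are placeholders because the summary does not state them. Only the numeric values and the two hidden layer widths come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NTDCFEConfig:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    proximity_weight: float = 0.001    # lambda_pxmt
    max_interventions: int = 100       # M_T, per episode
    max_episodes: int = 100            # M_E
    discount_factor: float = 0.99      # gamma
    learning_rate: float = 0.0001      # alpha, used with the Adam optimizer
    weight_decay: float = 0.0          # lambda_WD
    hidden_sizes: tuple = (1000, 100)  # two hidden linear layers


def layer_shapes(input_dim, output_dim, hidden_sizes):
    """Return (fan_in, fan_out) for each linear layer of the policy MLP."""
    sizes = [input_dim, *hidden_sizes, output_dim]
    return list(zip(sizes[:-1], sizes[1:]))


def parameter_count(shapes):
    """Count weights plus biases over all linear layers."""
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in shapes)


config = NTDCFEConfig()
# input_dim and output_dim are placeholders; the summary does not state them.
shapes = layer_shapes(input_dim=10, output_dim=4, hidden_sizes=config.hidden_sizes)
print(shapes)
print(parameter_count(shapes))
```

Freezing the dataclass keeps one fixed set of values in use across runs, which matches the statement that the hyperparameters were not fine-tuned per dataset.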