Data-centric Prediction Explanation via Kernelized Stein Discrepancy
Authors: Mahtab Sarvmaili, Hassan Sajjad, Ga Wu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct several qualitative and quantitative experiments to demonstrate various properties of HD-Explain and compare it with the existing example-based solutions. |
| Researcher Affiliation | Academia | Mahtab Sarvmaili, Hassan Sajjad, Ga Wu Department of Computer Science Dalhousie University EMAIL |
| Pseudocode | Yes | Appendix L (HD-EXPLAIN: EXPLANATION PROCESS): "The following algorithm shows HD-Explain in pseudocode. Algorithm 1: HD-Explain" |
| Open Source Code | Yes | Source code is available at https://github.com/MahtabSarvmaili/HDExplain. |
| Open Datasets | Yes | Datasets: We consider multiple disease classification tasks where diagnosis explanation is highly desired. We also introduced synthetic and benchmark classification datasets to deliver the main idea without the need for medical background knowledge. Concretely, we use CIFAR-10 (32×32×3), Brain Tumor (Magnetic Resonance Imaging, 128×128×3), Ovarian Cancer (Histopathology Images, 128×128×3) datasets, and SVHN (32×32×3). More details are listed in Appendix F. Table 2 (summary of datasets used in the paper, flattened in extraction): CIFAR-10, classification benchmark, image, 60,000 samples, 32×32×3, 10 classes, No, Yes; Brain Tumor MRI, benchmark, image, 7,023 samples, 128×128×3, 4 classes, Yes, Yes. |
| Dataset Splits | No | The paper describes how augmented test points were generated for evaluation, stating "We created 30 augmented test points for each training data point (>10,000 data points) in each dataset, resulting in more than 300,000 independent runs." However, it does not explicitly provide the train/validation/test splits for the original datasets (e.g., CIFAR-10, SVHN) in terms of percentages, counts, or references to predefined splits, beyond mentioning "CIFAR-10 is a small benchmark data with 50,000 training samples." |
| Hardware Specification | Yes | Appendix H (HARDWARE SETUP): "We ran all our experiments on a machine equipped with a GTX 1080 Ti GPU, a second-generation Ryzen 5 processor, and 32 GB of memory." |
| Software Dependencies | No | The paper mentions using "ResNet-18 as the backbone model architecture" but does not specify versions for any programming languages, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | No | The paper states "Our experiments use ResNet-18 as the backbone model architecture (with around 11 million trainable parameters) for all image datasets" and mentions data augmentations were conducted "including random cropping, rotation, shifting, horizontal flipping, and noise injection." However, it does not provide specific hyperparameters such as learning rate, batch size, optimizer details, number of epochs, or other system-level training configurations needed to reproduce the experiments. |
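The pseudocode row above quotes only the title of Algorithm 1, not its body. As a rough, hedged illustration of the general technique the paper's title names (a kernelized Stein discrepancy used to score training points as explanations), the sketch below ranks training points by a Stein-kernel similarity to a test point. All function names, the RBF kernel choice, and the use of precomputed score vectors are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def rbf_kernel_and_grads(x, y, bandwidth=1.0):
    """RBF kernel k(x, y) with its gradients w.r.t. x and y, and the
    trace of the mixed second derivative (needed by the Stein kernel)."""
    diff = x - y
    sq = float(np.dot(diff, diff))
    k = np.exp(-sq / (2 * bandwidth ** 2))
    grad_x = -diff / bandwidth ** 2 * k          # d k / d x
    grad_y = diff / bandwidth ** 2 * k           # d k / d y
    trace = (x.size / bandwidth ** 2 - sq / bandwidth ** 4) * k
    return k, grad_x, grad_y, trace

def stein_kernel(x, y, score_x, score_y, bandwidth=1.0):
    """Stein kernel u_p(x, y) of kernelized Stein discrepancy.

    score_x / score_y stand for grad log p at x and y; how these scores
    are obtained from a trained classifier is an assumption here.
    """
    k, gx, gy, trace = rbf_kernel_and_grads(x, y, bandwidth)
    return (np.dot(score_x, score_y) * k
            + np.dot(score_x, gy)
            + np.dot(score_y, gx)
            + trace)

def rank_training_points(test_x, test_score, train_xs, train_scores):
    """Rank training points by Stein-kernel similarity to the test point;
    higher values are treated as more 'explanatory' (illustrative choice)."""
    vals = np.array([stein_kernel(test_x, x, test_score, s)
                     for x, s in zip(train_xs, train_scores)])
    return np.argsort(vals)[::-1], vals
```

With zero scores and unit bandwidth, `stein_kernel(x, x, 0, 0)` reduces to the trace term `d / bandwidth**2`, which is a convenient sanity check for the kernel derivatives.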