Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Reconciling Predictive and Statistical Parity: A Causal Approach
Authors: Drago Plecko, Elias Bareinboim
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we demonstrate the importance of our findings on a real-world example. We now apply our approach in the context of criminal justice using the COMPAS dataset (Angwin et al. 2016), and demonstrate empirically the trade-off between SP and PP. |
| Researcher Affiliation | Academia | Drago Plečko and Elias Bareinboim Department of Computer Science, Columbia University, New York, NY 10027 EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Business Necessity Cookbook |
| Open Source Code | Yes | see https://github.com/dplecko/sp-to-pp/blob/main/sp-pp-compas.R for source code |
| Open Datasets | Yes | We now apply Alg. 1 to the COMPAS dataset (Angwin et al. 2016), as described in the following example. |
| Dataset Splits | No | The paper does not specify training/validation/test dataset splits. It mentions using the COMPAS dataset and bootstrap repetitions for confidence intervals, but not how the data were partitioned for model training and evaluation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU/CPU models or specific computational resources. |
| Software Dependencies | No | The paper mentions using the "fairadapt package" and "random forests" but does not specify version numbers for any software dependencies. |
| Experiment Setup | No | The paper does not provide specific details about the experimental setup, such as hyperparameter values, learning rates, batch sizes, or other system-level training settings for the models used. |
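
The table refers to a trade-off between statistical parity (SP) and predictive parity (PP) and to bootstrap repetitions for confidence intervals. The following is a minimal Python sketch of what such quantities could look like on toy data. It is not the authors' R code and not the paper's causal decomposition: the function names, the synthetic data generator, the observational gap definitions, and the percentile bootstrap are all assumptions made for illustration.

```python
import random


def make_toy_data(n=2000, seed=1):
    """Hypothetical synthetic records (x, yhat, y); not the COMPAS data."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        x = rng.randint(0, 1)  # protected attribute
        # The prediction rate depends on the group, so SP is violated.
        yhat = int(rng.random() < (0.6 if x == 1 else 0.4))
        # The outcome depends only on the prediction, so PP holds on average.
        y = int(rng.random() < (0.7 if yhat == 1 else 0.3))
        records.append((x, yhat, y))
    return records


def sp_gap(records):
    """Statistical parity gap: P(yhat=1 | x=1) - P(yhat=1 | x=0)."""
    rate = {}
    for group in (0, 1):
        preds = [yhat for x, yhat, _ in records if x == group]
        rate[group] = sum(preds) / len(preds)
    return rate[1] - rate[0]


def pp_gap(records):
    """Predictive parity gap: P(y=1 | yhat=1, x=1) - P(y=1 | yhat=1, x=0)."""
    ppv = {}
    for group in (0, 1):
        outcomes = [y for x, yhat, y in records if x == group and yhat == 1]
        ppv[group] = sum(outcomes) / len(outcomes)
    return ppv[1] - ppv[0]


def bootstrap_ci(records, statistic, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a statistic of the records."""
    rng = random.Random(seed)
    n = len(records)
    # Resample n records with replacement, n_boot times, and sort the results.
    stats = sorted(
        statistic([records[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    data = make_toy_data()
    print("SP gap:", sp_gap(data), "CI:", bootstrap_ci(data, sp_gap))
    print("PP gap:", pp_gap(data), "CI:", bootstrap_ci(data, pp_gap))
```

In this toy setup the predictor is close to predictive parity while clearly violating statistical parity, which illustrates in the simplest observational terms why the two criteria can pull apart. Fixing the seeds makes the intervals repeatable, the kind of detail the "Experiment Setup" row flags as unreported.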