XAudit : A Learning-Theoretic Look at Auditing with Explanations
Authors: Chhavi Yadav, Michal Moshkovitz, Kamalika Chaudhuri
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate our proposed auditors on standard datasets. Our experiments show that, unlike the worst case, typical anchors significantly reduce the number of queries needed to audit linear classifiers for feature sensitivity. Additionally, our proposed Anchor Augmentation technique helps reduce the query complexity over a no-anchor approach. Similarly, our experiments on decision trees demonstrate that the average number of queries needed to audit feature sensitivity is considerably lower than the number of nodes in the tree. In this section we conduct experiments on standard datasets to test some aspects of feature sensitivity auditing. |
| Researcher Affiliation | Collaboration | Chhavi Yadav, UC San Diego; Michal Moshkovitz, Bosch Center for AI; Kamalika Chaudhuri, UC San Diego and Meta AI |
| Pseudocode | Yes | Algorithm 1 Alg LCc : Auditing Linear Classifiers using Counterfactuals; Algorithm 2 Alg LCa : Auditing Linear Classifiers using Anchors; Algorithm 3 Alg DT : Auditing Decision Trees using decision path explanations; Algorithm 4 A General Auditor for finite hypothesis classes; Algorithm 5 check_stopping_condition(); Algorithm 6 picking_next_query(); Algorithm 7 update_search_space(); Algorithm 8 Alg LCc : Auditing Linear Classifiers using Counterfactuals; Algorithm 9 Alg LCa : Auditing Linear Classifiers using Anchors; Algorithm 10 findpath(x, P); Algorithm 11 perturb(x, p) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its own source code or a link to a code repository. It only mentions using 'the matlab code provided by Alabdulmohsin et al. (2015)' for some experiments. |
| Open Datasets | Yes | The datasets used in our experiments are Adult Income Becker and Kohavi (1996), Covertype Blackard (1998) and Credit Default Yeh (2016). |
| Dataset Splits | Yes | We use an 80-20 split to create the train-test sets. |
| Hardware Specification | No | All of our experiments run on a CPU, taking from a minute to 2-3 hours. |
| Software Dependencies | No | We learn both our linear and tree classifiers using scikit-learn. To run the anchor experiments, we use the matlab code provided by Alabdulmohsin et al. (2015). |
| Experiment Setup | Yes | We learn decision trees for each of the datasets using scikit-learn, which implements CART Breiman (2017) to construct the tree. All of these use the FoI to make predictions. We vary tree depth by fixing the max-depth hyperparameter. Then we freeze the tree and run Alg DT 1000 times. We report the average of the total queries required to audit across the 1000 runs. We set the augmentation size (number of anchor points augmented) to a maximum of 30. |
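The paper's Alg LCc is not reproduced in this report, but the core intuition behind auditing a linear classifier with counterfactual explanations can be illustrated: for f(x) = sign(w·x + b), the nearest counterfactual of a point is its orthogonal projection onto the decision hyperplane, so the displacement x_cf − x is parallel to w and reveals which features the model is sensitive to. A minimal sketch (the weight vector and the `counterfactual` helper are hypothetical illustrations, not the paper's algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([1.5, 0.0, -2.0])  # hidden model weights; feature 1 carries zero weight
b = 0.5

def counterfactual(x):
    # For a linear model, the closest counterfactual is the orthogonal
    # projection of x onto the hyperplane w·x + b = 0.
    return x - (w @ x + b) / (w @ w) * w

x = rng.normal(size=3)
direction = counterfactual(x) - x        # parallel to w
sensitive = ~np.isclose(direction, 0.0)  # a zero component means the feature plays no role
print(sensitive)
```

A single counterfactual query thus exposes the direction of w (up to sign), which is why explanations can drastically cut the number of queries an auditor needs compared to label-only access.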
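The experimental pipeline described in the table (80-20 train-test split, scikit-learn CART trees with a fixed max-depth) can be sketched as follows. The dataset here is a scikit-learn built-in stand-in rather than Adult, Covertype, or Credit Default, and everything beyond scikit-learn's own API is illustrative:

```python
from sklearn.datasets import load_breast_cancer  # stand-in for the paper's datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0  # the paper's 80-20 split
)

# Fix the tree depth via the max_depth hyperparameter, then freeze the tree;
# the paper compares average audit queries against the number of tree nodes.
clf = DecisionTreeClassifier(max_depth=5, random_state=0)
clf.fit(X_train, y_train)
print(clf.tree_.node_count)
```

After freezing the tree, the audit algorithm would be run repeatedly (1000 times in the paper) and the total queries averaged across runs.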