Personalized Algorithmic Recourse with Preference Elicitation
Authors: Giovanni De Toni, Paolo Viappiani, Stefano Teso, Bruno Lepri, Andrea Passerini
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical evaluation on real-world datasets highlights how PEAR produces high-quality personalized recourse in only a handful of iterations. |
| Researcher Affiliation | Academia | Giovanni De Toni EMAIL Augmented Intelligence Center, Fondazione Bruno Kessler, Italy DISI, University of Trento, Italy Paolo Viappiani EMAIL LAMSADE, CNRS, Université Paris-Dauphine, PSL, France Stefano Teso EMAIL CIMeC & DISI, University of Trento, Italy Bruno Lepri EMAIL Augmented Intelligence Center, Fondazione Bruno Kessler, Italy Andrea Passerini EMAIL DISI, University of Trento, Italy |
| Pseudocode | Yes | A high-level overview of PEAR is given in Fig. 1 and the pseudo-code is listed in Algorithm 1. Algorithm 1 The PEAR algorithm: h : S → {0, 1} is a classifier, s(0) ∈ S the initial state, A the available actions, p(w) the prior, T ≥ 1 the query budget, k ≥ 2 is the size of choice sets. Algorithm 2 Greedy procedure to efficiently compute a choice set O: s(t) ∈ S the current state, A the available actions, k ≥ 2 is the size of choice sets, D(t) the user choices so far. |
| Open Source Code | Yes | We implemented PEAR, the competitors and the black box classifiers using Python (>= 3.7) and PyTorch (Paszke et al., 2019). For reproducibility purposes, the code and the pre-trained models are freely available online (https://github.com/unitn-sml/pear-personalized-algorithmic-recourse). |
| Open Datasets | Yes | We evaluated our approach on two real-world datasets taken from the relevant literature: Give Me Some Credit (Kaggle, 2011) and Adult (Dua & Graff, 2017). |
| Dataset Splits | Yes | We then split the data into training (70%), validation (10%) and test (20%) sets. |
| Hardware Specification | Yes | All the experiments were run on a virtual machine running CentOS 7.6.18 with 165 cores and 25 GiB of RAM. |
| Software Dependencies | Yes | We implemented PEAR, the competitors and the black box classifiers using Python (>= 3.7) and PyTorch (Paszke et al., 2019). ... We used the original code for both FARE and CSCF, with minimal modifications to make them compatible with our experimental setting. For FACE, we used the implementation available in the CARLA library (Pawelczyk et al., 2021). ... We also manually performed additional standard data engineering tasks, such as removing entries with null values or checking for potential outliers. After the data cleaning and preprocessing steps, we kept the following features for each dataset: ... For Adult, we adopted the same action set used by De Toni et al. (2023), while for Give Me Some Credit we devised the functions ourselves. ... We one-hot encoded categorical features and we performed min-max normalization for the continuous features using scikit-learn (Pedregosa et al., 2011). |
| Experiment Setup | Yes | For PEAR, we vary the number of questions T to the user from 0 to 10. For T = 0, we initialize the weights with the expected value of the prior, E_p(w)[w], that represents a user-independent population-based prior. Moreover, we employ two user response models, the noiseless model (Eq. (11)), to check the effectiveness of our approach in the best-case scenario where the user can perfectly express their preferences, and the logistic model (Eq. (10)), to challenge our approach in a more realistic scenario. ... In our experiments, we set the number of simulations to 15 and 10 for Adult and Give Me Some Credit, respectively. We also set the maximum intervention length to 6 and 8, for Adult and Give Me Some Credit, respectively. The values c_act_cost and c_puct are also hyperparameters. We set them to 1 and 0.5 respectively, for both experiments. ... We optimize Eq. (14) via Adam and we set the learning rate to 0.001 for Adult, and 0.003 for Give Me Some Credit. ... During training, we set the number of simulations to 15 and 10, for Adult and Give Me Some Credit, respectively. The noise fraction is instead set to ϵP = 0.3 for both, with ηp = 0.3. At inference time, we add no noise (ϵP = 0) and the number of simulations is fixed at 5. ... We set the population size, p = 50, and the maximum number of generations, n = 25, for both Adult and Give Me Some Credit, to keep the computation time manageable. ... In both Adult and Give Me Some Credit, we pick only 10% of the total instances. We set the number of neighbours, k, to 50 and the distance threshold to ϵ = 1.0 for both datasets. |
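The preprocessing and split described above (one-hot encoding of categorical features, min-max normalization of continuous features with scikit-learn, and a 70%/10%/20% train/validation/test split) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the synthetic data, variable names, and random seeds are our own assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in data (illustrative only; the paper uses Adult and
# Give Me Some Credit after null-value removal and outlier checks).
rng = np.random.default_rng(0)
n = 1000
X_cont = rng.normal(size=(n, 3))          # continuous features
X_cat = rng.integers(0, 4, size=(n, 1))   # one categorical feature

# 70/10/20 split: carve out 20% test first, then 1/8 of the
# remaining 80% as validation (0.125 * 0.8 = 0.10 overall).
idx = np.arange(n)
idx_trval, idx_te = train_test_split(idx, test_size=0.20, random_state=0)
idx_tr, idx_va = train_test_split(idx_trval, test_size=0.125, random_state=0)

# Fit the transforms on the training portion only, then apply everywhere.
scaler = MinMaxScaler().fit(X_cont[idx_tr])
encoder = OneHotEncoder(handle_unknown="ignore").fit(X_cat[idx_tr])

def preprocess(rows):
    cont = scaler.transform(X_cont[rows])          # min-max to [0, 1]
    cat = encoder.transform(X_cat[rows]).toarray() # one-hot columns
    return np.hstack([cont, cat])

X_tr, X_va, X_te = preprocess(idx_tr), preprocess(idx_va), preprocess(idx_te)
print(len(idx_tr), len(idx_va), len(idx_te))  # 700 100 200
```

Fitting the scaler and encoder on the training split alone avoids leaking test-set statistics into the normalization, which matches standard scikit-learn practice.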
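The excerpt mentions a c_puct hyperparameter (set to 0.5) and a per-step simulation budget, which suggests an MCTS-style search over interventions. The paper's exact selection rule is not quoted here, so as a hedged illustration the snippet below shows the standard PUCT score (AlphaZero-style), where a c_puct constant of this kind balances exploitation against prior-guided exploration; the function name and example numbers are our own.

```python
import math

def puct_score(q, prior, n_parent, n_child, c_puct=0.5):
    """Standard PUCT selection score: exploitation term q plus an
    exploration bonus scaled by the policy prior and visit counts."""
    return q + c_puct * prior * math.sqrt(n_parent) / (1 + n_child)

# With c_puct = 0.5 (the value reported above), an unvisited action
# with a strong prior can outrank a moderately valued visited one:
s_visited = puct_score(q=0.6, prior=0.1, n_parent=15, n_child=5)
s_fresh = puct_score(q=0.0, prior=0.9, n_parent=15, n_child=0)
print(s_fresh > s_visited)  # True
```

Smaller c_puct values make the search greedier with respect to estimated action values, which matters when the simulation budget is as small as the 5–15 simulations reported above.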