Comparing Targeting Strategies for Maximizing Social Welfare with Limited Resources

Authors: Vibhhu Sharma, Bryan Wilder

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental In this work, we use data from 5 real-world RCTs in a variety of domains to empirically assess such choices. We find that when treatment effects can be estimated with high accuracy (which we simulate by allowing the model to partially observe outcomes in advance), treatment effect based targeting substantially outperforms risk-based targeting, even when treatment effect estimates are biased. Moreover, these results hold even when the policymaker has strong normative preferences for assisting higher-risk individuals.
Researcher Affiliation Academia Vibhhu Sharma & Bryan Wilder, Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15232, USA
Pseudocode No The paper describes methods and procedures in narrative text and mathematical formulas (e.g., Section 3.2, 3.3, Appendix A.3) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code Yes Reproducibility Statement: The supplementary material includes code including data preprocessing and experimentation for each of the datasets. We also detail our procedures in the Appendix A (dataset details) and in Section 4 (step by step experimental procedure).
Open Datasets Yes We conduct experiments on a variety of RCTs across different domains as detailed below: Targeting the Ultra Poor (TUP) in India (Banerjee et al., 2021): ... NSW (National Supported Work demonstration) Dataset (Dehejia & Wahba, 1999; 2002; LaLonde, 1986): ... Postoperative Pain Dataset: Patients undergoing operations like tracheal intubations often experience throat pain following treatment (McHardy & Chung, 1999). ... Acupuncture Dataset (Vickers et al., 2004): ... Tennessee's Student Teacher Achievement Ratio (STAR) project (Achilles et al., 2008):
Dataset Splits No A.1 EXPERIMENT DETAILS: Real Setting: We divide the RCT data into two splits such that one split is used for training nuisance functions and the other split is used entirely for evaluation. Semi-synthetic Setting: We divide the RCT into two splits such that we use each split to obtain treatment effect estimates for the other split and make maximal use of available data. While the paper mentions dividing data into 'two splits', it does not provide specific percentages, sample counts, or reproducible details for these splits.
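The two-split procedure quoted above can be sketched as follows. This is a minimal illustration, not the authors' code: the 50/50 split, the random seed, and the `fit`/`predict` callables are all assumptions, since the paper does not specify split proportions or sample counts.

```python
import numpy as np

def two_split_cross_fit(X, Y, fit, predict, rng=None):
    """Divide the data into two splits; fit a nuisance model on each
    split and predict on the other, so every unit receives an
    out-of-sample estimate (the 'semi-synthetic' variant above)."""
    rng = np.random.default_rng(rng)
    n = len(Y)
    idx = rng.permutation(n)
    a, b = idx[: n // 2], idx[n // 2 :]  # assumed 50/50 split
    preds = np.empty(n)
    preds[b] = predict(fit(X[a], Y[a]), X[b])
    preds[a] = predict(fit(X[b], Y[b]), X[a])
    return preds
```

For the "real setting" described above, one would instead use split `a` only for training and split `b` only for evaluation, without swapping.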
Hardware Specification No The paper does not provide any specific details about the hardware used for running the experiments, such as GPU/CPU models, memory, or cloud resources.
Software Dependencies No The paper mentions using a 'random forest regressor', a 'kernel regression method', and a 'doubly-robust estimator', but does not specify any software libraries (e.g., scikit-learn, PyTorch) or their version numbers.
Experiment Setup No The paper describes the general methodology for estimating treatment effects and baseline risk, including the use of doubly-robust estimators, random forest regressors, and kernel regression. It also describes how confounding is introduced and different welfare functions. However, it lacks specific hyperparameters for the machine learning models (e.g., number of trees in random forest, learning rates, kernel parameters) that would be necessary for exact reproduction of the experiments.
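The combination of a doubly-robust estimator with random forest nuisance models, as described above, can be sketched as follows. This is a hedged illustration under assumed defaults (scikit-learn random forests with default hyperparameters, a known RCT treatment probability `p`), precisely the settings the paper leaves unspecified:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def doubly_robust_scores(X, T, Y, p=0.5):
    """Per-unit doubly-robust treatment-effect scores for an RCT with
    known treatment probability p:

        phi_i = mu1(x_i) - mu0(x_i)
                + T_i * (Y_i - mu1(x_i)) / p
                - (1 - T_i) * (Y_i - mu0(x_i)) / (1 - p)

    Hyperparameters (e.g., number of trees) are scikit-learn defaults,
    an assumption rather than the paper's configuration.
    """
    mu1 = RandomForestRegressor(random_state=0).fit(X[T == 1], Y[T == 1])
    mu0 = RandomForestRegressor(random_state=0).fit(X[T == 0], Y[T == 0])
    m1, m0 = mu1.predict(X), mu0.predict(X)
    return m1 - m0 + T * (Y - m1) / p - (1 - T) * (Y - m0) / (1 - p)
```

In practice these scores would be computed with the cross-fitting split described in the Dataset Splits row, so that each unit's nuisance predictions come from a model trained on the other split.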