Locally Private Causal Inference for Randomized Experiments
Authors: Yuki Ohnishi, Jordan Awan
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we present simulation studies to evaluate the performance of our proposed frequentist and Bayesian methodologies for various privacy budgets, resulting in useful suggestions for performing causal inference for privatized data. |
| Researcher Affiliation | Academia | Yuki Ohnishi, Department of Biostatistics, Yale School of Public Health, New Haven, CT 06510, USA; Jordan Awan, Department of Statistics, Purdue University, West Lafayette, IN 47907, USA |
| Pseudocode | Yes | 4.2 Algorithm Outlines. Equation (5) motivates the Gibbs sampling procedures to obtain the draws from the posterior distribution of $\theta$. This section describes the key steps of the Gibbs sampler. Each step is derived from the corresponding components of (5). For inference of DPM parameters, denoted by $\theta = (\mu, \Sigma, u)$, we adopt an approximated blocked Gibbs sampler based on the truncation of the stick-breaking representation (Ishwaran and Zarepour, 2000), due to its simplicity. In this algorithm, we set a conservatively large upper bound, $K$, on the number of components that units potentially belong to. Let $C_i \in \{1, \ldots, K\}$ denote the latent class indicators with a multinomial distribution, $C_i \sim \mathrm{Multinomial}(u)$, where $u = (u_1, \ldots, u_K)$ denote the weights of all components of the DPM. More specific details about the DPM are provided in the Appendix. The algorithm proceeds as follows. 1. Given $Y_i(0), Y_i(1)$, draw each $W_i$ from $P(W_i = 1 \mid -) = \frac{r_1}{r_0 + r_1}$, where $r_w = P(\tilde{Y}_i \mid Y_i(w)) P(\tilde{W}_i \mid W_i = w) P(W_i = w)$ for $w = 0, 1$. 2. Given $\mu, \Sigma, u, C_i$ and $W_i$, draw each $Y_i(0)$ and $Y_i(1)$ according to: $P(Y_i(W_i) \mid -) \propto P(Y_i(W_i) \mid \mu_{C_i}^{W_i}, \Sigma_{C_i}^{W_i}) P(\tilde{Y}_i \mid Y_i(W_i))$ and $P(Y_i(1 - W_i) \mid -) \propto P(Y_i(1 - W_i) \mid \mu_{C_i}^{1 - W_i}, \Sigma_{C_i}^{1 - W_i})$. 3. Update model parameters via the blocked Gibbs sampler and calculate the estimands. |
| Open Source Code | No | The paper does not contain any explicit statements about making the source code available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We then apply our methodologies to a real-world causal inference task. We analyzed a randomized experiment that examined the impact of a cash transfer program on students' attendance rates (Barrera-Osorio et al., 2011). Conducted at San Cristobal in Colombia, the study recruited households with one to five school children, randomly assigning children to either participate in the cash transfer program or not with probability p = 0.628. |
| Dataset Splits | No | The paper describes data generation mechanisms for simulations and treatment assignment probabilities for real-world data, but it does not specify any training/test/validation splits or cross-validation strategies for model evaluation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to conduct the experiments, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper describes the methodology and algorithms but does not specify any software dependencies with version numbers, such as programming languages, libraries, or frameworks used for implementation. |
| Experiment Setup | Yes | We ran the MCMC algorithm for 100,000 iterations using a burn-in of 50,000. The iteration numbers were chosen after experimentation to deliver stable results over multiple runs. |
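
The Pseudocode row quotes two ingredients that are easy to misread in prose: the truncated stick-breaking weights u and the step-1 draw of each treatment indicator with probability r1 / (r0 + r1). The following is a minimal Python sketch of both, not the authors' implementation (the paper releases no code). The privacy mechanisms are assumptions made only for illustration: Laplace noise on the outcome and randomized response on the treatment indicator. The function names and the parameters `scale`, `keep_prob`, and `p_treat` are hypothetical.

```python
import numpy as np


def stick_breaking_weights(v):
    """Truncated stick-breaking: map K-1 break fractions in (0, 1) to K weights.

    The final break is fixed at 1 so that the K weights sum to one.
    """
    v = np.append(np.asarray(v, dtype=float), 1.0)
    # Length of stick remaining before each break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def laplace_pdf(y_tilde, y, scale):
    """Density of the privatized outcome given the true outcome (ASSUMED Laplace noise)."""
    return np.exp(-np.abs(y_tilde - y) / scale) / (2.0 * scale)


def rr_pmf(w_tilde, w, keep_prob):
    """Probability of the reported treatment given the true one (ASSUMED randomized response)."""
    return np.where(w_tilde == w, keep_prob, 1.0 - keep_prob)


def draw_w(y_tilde, w_tilde, y0, y1, p_treat, scale, keep_prob, rng):
    """Step 1 of the quoted sampler: draw W_i with P(W_i = 1 | rest) = r1 / (r0 + r1)."""
    r0 = laplace_pdf(y_tilde, y0, scale) * rr_pmf(w_tilde, 0, keep_prob) * (1.0 - p_treat)
    r1 = laplace_pdf(y_tilde, y1, scale) * rr_pmf(w_tilde, 1, keep_prob) * p_treat
    prob1 = r1 / (r0 + r1)
    w = (rng.random(prob1.shape) < prob1).astype(int)
    return w, prob1
```

In a full sampler, `draw_w` would be called once per iteration with the current imputed potential outcomes `y0` and `y1`, followed by the potential-outcome and blocked Gibbs updates of steps 2 and 3, which are not sketched here.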