Differentially Private Stochastic Expectation Propagation

Authors: Margarita Vinaroz, Mijung Park

TMLR 2022

Reproducibility assessment. Each row lists a variable, the assessed result, and the supporting excerpt(s) from the paper.
Research Type: Experimental. "Furthermore, we test the DP-SEP algorithm on both synthetic and real-world datasets and evaluate the quality of posterior estimates at different levels of guaranteed privacy." ... "6 Experiments: The purpose of this section is to evaluate the performance of DP-SEP on different tasks and datasets. First, we consider a mixture-of-Gaussians clustering problem on a synthetic dataset and test DP-SEP at different levels of privacy guarantees. In the second experiment, we consider a Bayesian neural network model for regression tasks and quantitatively compare our algorithm with other existing non-private methods for Bayesian inference."
Researcher Affiliation: Academia. Margarita Vinaroz (EMAIL), University of Tübingen and International Max Planck Research School for Intelligent Systems (IMPRS-IS); Mijung Park (EMAIL), University of British Columbia and CIFAR AI Chair at AMII.
Pseudocode: Yes. Algorithm 1 (EP), Algorithm 2 (SEP), Algorithm 3 (DP-SEP).
Open Source Code: Yes. "Our code is available at: https://github.com/mvinaroz/DP-SEP"
Open Datasets: Yes. "We also provide experimental results applied to a synthetic dataset for a mixture-of-Gaussians model and several real-world datasets for a Bayesian neural network model." ... "The datasets used in the experiments are publicly available at the UCI machine learning repository."
Dataset Splits: Yes. "We consider 90% of the original dataset, randomly subsampled without replacement, as a training dataset and the remaining 10% as a test dataset. All the training datasets are normalized to have zero mean and unit variance on their input features and targets." ... "for Year, where only one split is performed according to the recommendations of the dataset."
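The split protocol quoted above can be sketched as follows. This is a minimal illustration only; the function name, seed handling, and epsilon guard are hypothetical and not taken from the paper's released code.

```python
import numpy as np

def split_and_normalize(X, y, train_frac=0.9, seed=0):
    """Subsample a train/test split without replacement, then standardize
    features and targets using *training-set* statistics only."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(X.shape[0])          # sampling without replacement
    n_train = int(train_frac * X.shape[0])
    tr, te = perm[:n_train], perm[n_train:]

    X_train, X_test = X[tr], X[te]
    y_train, y_test = y[tr], y[te]

    # Zero mean / unit variance on input features and targets,
    # computed on the training split (small guard avoids division by zero).
    x_mean, x_std = X_train.mean(0), X_train.std(0) + 1e-8
    y_mean, y_std = y_train.mean(), y_train.std() + 1e-8
    X_train = (X_train - x_mean) / x_std
    X_test = (X_test - x_mean) / x_std
    y_train = (y_train - y_mean) / y_std
    return X_train, y_train, X_test, y_test, (y_mean, y_std)
```

Normalizing the test inputs with training statistics (rather than their own) avoids leaking test-set information into preprocessing; the (y_mean, y_std) pair is returned so predictions can be mapped back to the original target scale.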
Hardware Specification: No. "Acknowledgments: We thank the support, computational resources, and services provided by the Canada CIFAR AI Chairs program (at AMII) and the Digital Research Alliance of Canada (Compute Canada)."
Software Dependencies: No. "For this, we use the autodp package (https://github.com/yuxiangw/autodp) to compute the privacy parameter σ given our choice of ϵ, δ values and the number of times we access data while running our algorithm." ... "We derive the variational inference procedure in Sec. D in the Appendix and implement the differentially private version of it in PyTorch."
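For intuition about what that σ calibration computes, the classic single-release Gaussian-mechanism bound can be written down directly: σ ≥ √(2 ln(1.25/δ)) · Δ / ϵ, valid for ϵ ≤ 1. This is only a rough, hedged illustration; the paper uses autodp precisely because its accountant handles the many repeated data accesses of DP-SEP and yields a much tighter σ than this per-release bound. The helper name is hypothetical.

```python
import math

def gaussian_sigma(eps, delta, sensitivity=1.0):
    """Classic (loose) Gaussian-mechanism calibration for a SINGLE release:
    sigma = sqrt(2 ln(1.25/delta)) * sensitivity / eps, for eps <= 1.
    Under composition over many accesses, a tighter accountant (e.g. autodp)
    should be used instead of naively scaling this value."""
    if not (0 < eps <= 1) or not (0 < delta < 1):
        raise ValueError("classic bound assumes 0 < eps <= 1 and 0 < delta < 1")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / eps
```

With the paper's budget (ϵ = 1, δ = 10⁻⁵) and unit sensitivity, this bound gives σ ≈ 4.84 for one release; the calibrated value for a full DP-SEP run differs because the budget is spread over every data access.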
Experiment Setup: Yes. "For DP-SEP we set the clipping norm to C = 1. For SEP and DP-SEP, we fixed the damping value γ = 1, i.e., γ/N = 1/1000." ... "We set the number of hidden units to 100 for the Year and Protein datasets and to 50 for the other four UCI datasets we used." ... "We set the privacy budget to ϵ = 1 and δ = 10⁻⁵ in DP-SEP. In the DP-VI experiments we fixed δ = 10⁻⁵ and set the σ value to get a final ϵ ≤ 1. The detailed hyper-parameter settings for the VI and DP-VI experiments can be found in Table 6 and Table 7 in Sec. D.1 in the Appendix."
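The clipping norm C = 1 enters through the standard clip-then-add-Gaussian-noise step used by differentially private iterative methods. A minimal sketch, with the caveat that the function name and the sigma argument are placeholders (in DP-SEP, σ comes from the autodp accountant, and the quantity being clipped is the paper's site-parameter update, not shown here):

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, sigma=1.0, rng=None):
    """Clip a vector to L2 norm <= clip_norm, then add isotropic Gaussian
    noise with standard deviation sigma * clip_norm (the noise scales with
    the clipping bound, which caps the update's sensitivity)."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(update)
    clipped = update / max(1.0, norm / clip_norm)   # no-op if already small
    noise = rng.normal(0.0, sigma * clip_norm, size=update.shape)
    return clipped + noise
```

Clipping bounds any single data point's influence on the released update (its sensitivity), which is what makes the subsequent Gaussian noise sufficient for the (ϵ, δ) guarantee.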