Variational Bayes In Private Settings (VIPS)
Authors: Mijung Park, James Foulds, Kamalika Chaudhuri, Max Welling
JAIR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate the effectiveness of our method in CE and non-CE models including latent Dirichlet allocation, Bayesian logistic regression, and sigmoid belief networks, evaluated on real-world datasets. |
| Researcher Affiliation | Academia | Mijung Park EMAIL Max Planck Institute for Intelligent Systems, Department of Computer Science, University of Tübingen, Max-Planck-Ring 4, 72076 Tübingen, Germany James Foulds EMAIL Department of Information Systems, University of Maryland, Baltimore County, ITE 447, 1000 Hilltop Circle, Baltimore, MD 21250, USA Kamalika Chaudhuri EMAIL Department of Computer Science, University of California, San Diego, EBU3B 4110, University of California, San Diego, CA 92093, USA Max Welling EMAIL Amsterdam Machine Learning Lab (AMLAB), Informatics Institute, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, the Netherlands |
| Pseudocode | Yes | Algorithm 1 (Stochastic) Variational Bayes for CE family distributions, Algorithm 2 Private VIPS for CE family distributions, Algorithm 3 VIPS for LDA, Algorithm 4 (ϵtot, δtot)-DP VIPS for non-CE family with binomial likelihoods, Algorithm 5 VIPS for Bayesian logistic regression, Algorithm 6 VIPS for sigmoid belief networks |
| Open Source Code | Yes | Our code is available at https://github.com/mijungi/vips_code. |
| Open Datasets | Yes | We empirically demonstrate the effectiveness of our method in CE and non-CE models including latent Dirichlet allocation, Bayesian logistic regression, and sigmoid belief networks, evaluated on real-world datasets. ... We downloaded a random D = 400, 000 documents from Wikipedia to test our VIPS algorithm. ... We used the Stroke dataset, which was first introduced by Letham, Rudin, Mc Cormick & Madigan (2014) ... We used the Adult data (from the UCI data repository) ... We tested our VIPS algorithm for the SBN model on the binarized17 MNIST digit dataset |
| Dataset Splits | Yes | We randomly shuffled the data to make 5 pairs of training and test sets. For each set, we used 10,069 patient records as training data and the rest as test data. ... We used 80% of the original data for training and the rest for computing the AUC on test data. ... The MNIST digit dataset contains 60,000 training images of ten handwritten digits (0 to 9)... we selected 100 randomly selected test images from 10,000 test datapoints. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or cloud instance types) are explicitly mentioned in the paper for running experiments. The paper generally discusses 'stochastic learning' and 'big data setting' but without hardware specifications. |
| Software Dependencies | No | No specific software dependencies with version numbers are provided. The paper mentions 'Python implementation' for perplexity approximation but does not specify its version or any other library versions. |
| Experiment Setup | Yes | In our experiments, we use N = 500. ... We set a = 0.1 in our experiments... We used 50 topics and a vocabulary set of approximately 8000 terms. The algorithm was run for one epoch in each experiment. ... in which we varied the noise level σ ∈ {1.0, 1.1, 1.24, 1.5, 2} and the minibatch size S ∈ {5,000, 10,000, 20,000}. ... In each training step, we randomly selected data samples from the training set with a sampling rate of 0.004. We ran each of these algorithms for 100 training steps with 20 different random initializations with varying levels of noise. ... The resulting ϵ values for a fixed value δ = 10⁻³ are ϵtot = {0.8, 0.05, 0.025} for σ = {1, 6, 12}, respectively. ... we considered a one-hidden-layer SBN with 50 or 100 hidden units, i.e. K = {50, 100}. We varied the mini-batch size S ∈ {400, 800, 1600, 3200}. For a fixed σ = 1, ... In all of these cases, we set δtotal = 10⁻⁴. |
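The setup above varies a noise multiplier σ and a minibatch size S, which is the standard recipe for privatizing minibatch statistics with the Gaussian mechanism. The following is a minimal, hedged sketch of that pattern — per-example clipping to bound L2 sensitivity, then Gaussian noise scaled by σ — not the paper's exact VIPS update; the function name, `clip_norm` parameter, and NumPy implementation are illustrative assumptions.

```python
import numpy as np

def noisy_sufficient_stats(X, sigma=1.0, clip_norm=1.0, rng=None):
    """Illustrative Gaussian-mechanism perturbation of minibatch statistics.

    X : (S, d) array, one row of sufficient statistics per example.
    sigma : noise multiplier (cf. the paper's sigma in {1.0, 1.1, ...}).
    clip_norm : per-example L2 clipping bound, so the minibatch sum
        has L2 sensitivity clip_norm under add/remove of one example.
    """
    rng = np.random.default_rng() if rng is None else rng
    S = X.shape[0]
    # Clip each example's statistics to bound the sum's sensitivity.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X_clipped = X / np.maximum(1.0, norms / clip_norm)
    # Noise the sum with std sigma * clip_norm, then average over the batch.
    noise = rng.normal(0.0, sigma * clip_norm, size=X.shape[1])
    return (X_clipped.sum(axis=0) + noise) / S
```

Converting a chosen σ and sampling rate into the reported (ϵtot, δtot) values requires a privacy accountant (the paper tracks the cumulative loss across training steps), which this sketch deliberately omits.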