Semiparametric Mean Field Variational Bayes: General Principles and Numerical Issues

Authors: David Rohde, Matt P. Wand

JMLR 2016

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | One of the main outcomes of our numerical investigations is that fitting exponential family density functions via natural fixed-point iteration has some attractive properties. We use two examples to elucidate the general principles and numerical issues. The first, Example 1, involves a Bayesian model with a single parameter and, hence, is such that mean field approximation is not required. The simplicity of Example 1 allows a deep appreciation of the various issues with minimal notational overhead. Example 2 is the Bayesian Poisson mixed model treated in Wand (2014) and benefits from semiparametric mean field variational Bayes methodology. It demonstrates issues with high-dimensional optimization problems that are intrinsic to practical implementation. ... Based on Figure 5, we anticipate that natural fixed-point iteration is also very good in higher dimensions, and this is corroborated by experiments for Example 2 described in Section 4.3. ... Figure 6: Trace plots of log p(y; q, ξ)[(β,u)] and ρ Dg Ex2 for the version of the Poisson mixed model given by (48) with sample sizes m = 30 and n = 5. |
| Researcher Affiliation | Academia | David Rohde, School of Mathematical and Physical Sciences, University of Technology Sydney, P.O. Box 123, Broadway, 2007, Australia; Matt P. Wand, School of Mathematical and Physical Sciences, University of Technology Sydney, P.O. Box 123, Broadway, 2007, Australia |
| Pseudocode | Yes | Algorithm 1: Coordinate ascent algorithm for semiparametric mean field variational Bayes when Ξ is a finite parameter space. Algorithm 2: The general semiparametric mean field variational Bayes algorithm for restriction (8) with log p(D; q, ξ)[φ] defined with respect to factor graph of p(x, θ, φ) with stochastic nodes $\theta_1, \ldots, \theta_M$ and φ. Algorithm 3: The fixed-point iteration algorithm in generic form. Algorithm 4: The Newton-Raphson algorithm in generic form. Algorithm 5: The nonlinear conjugate gradient method for maximization of the function f with the Polak-Ribière form of the β parameter. |
| Open Source Code | No | The paper does not contain any explicit statements about providing source code, nor does it provide links to any code repositories for the methodology described. |
| Open Datasets | No | We simulated data from the n = 20 version of the Gumbel random sample model (14) with the value of φ set to 0. ... We simulated data according to the following special case of the Poisson mixed model: $y_{ij} \mid U_i \sim \text{Poisson}\{\exp(\beta_0 + \beta_1 x_{ij} + U_i)\}$, $U_i \mid \sigma^2 \sim N(0, \sigma^2)$, $1 \le i \le m$, $1 \le j \le n$, $\beta \sim N(0, \sigma_\beta^2 I)$, $\sigma^2 \mid a \sim$ Inverse-Gamma(1... |
| Dataset Splits | No | The paper uses simulated data for its examples and experiments, rather than pre-existing datasets that would typically have train/test/validation splits. For the Gumbel random sample, it states, "We simulated data from the n = 20 version of the Gumbel random sample model." For the Poisson mixed model, it says, "We simulated data according to the following special case of the Poisson mixed model...with sample sizes m = 30, n = 5." These describe data generation rather than dataset splitting. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the numerical experiments or simulations. |
| Software Dependencies | Yes | optimization of f SN Ex1 was accomplished using the Broyden-Fletcher-Goldfarb-Shanno quasi-Newton method via the optim() function in the R computing environment (R Development Core Team, 2016). |
| Experiment Setup | Yes | We simulated data from the n = 20 version of the Gumbel random sample model (14) with the value of φ set to 0. The hyperparameters were set to $\mu_\phi = 0$ and $\sigma_\phi^2 = 10^{10}$. ... The hyperparameters were set at $\sigma_\beta = A = 10^5$ and the sample sizes were m = 30, n = 5. ... The intractable integral in f SN Ex1 was approximated using a trapezoidal quadrature scheme similar to that described in Appendix B.2 of Wand et al. (2011). The limits of the trapezoidal grid were increased until the ratio of the global maximum and minimum absolute values of the integrand fell below $10^{-20}$. The number of grid points was then doubled until the relative difference between two successive iterations was less than $10^{-20}$. Multiple start locations and simulated annealing were used to locate global optima. |
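
The two-stage quadrature procedure quoted under Experiment Setup can be illustrated with a minimal Python sketch. It is not the authors' code (the paper reports working in R), and it does not reproduce the scheme of Wand et al. (2011). The function names `trapezoid` and `adaptive_trapezoid`, the starting grid, the iteration caps, and the reading of the stopping rule as "smallest absolute integrand value on the grid divided by the largest" are all assumptions made for illustration. The default `rel_tol` is also deliberately looser than the quoted $10^{-20}$, which lies below double-precision resolution.

```python
import math


def trapezoid(f, lo, hi, n):
    """Composite trapezoidal rule on n equal subintervals of [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h


def adaptive_trapezoid(f, centre=0.0, half_width=1.0, n=64,
                       tail_tol=1e-20, rel_tol=1e-10,
                       max_widenings=60, max_doublings=16):
    """Two-stage trapezoidal quadrature for a positive integrand that decays in the tails.

    Assumed reading of the quoted procedure (hypothetical defaults throughout):
    Stage 1 widens the limits until min|f| / max|f| on the grid is at most tail_tol.
    Stage 2 doubles the number of subintervals until successive estimates
    agree to within rel_tol in relative terms.
    """
    # Stage 1: widen the integration limits until the tails are negligible.
    for _ in range(max_widenings):
        lo, hi = centre - half_width, centre + half_width
        h = (hi - lo) / n
        values = [abs(f(lo + k * h)) for k in range(n + 1)]
        if min(values) <= tail_tol * max(values):
            break
        half_width *= 2.0
    else:
        raise RuntimeError("integrand did not decay within the widening limit")

    # Stage 2: double the grid until two successive estimates agree.
    previous = trapezoid(f, lo, hi, n)
    for _ in range(max_doublings):
        n *= 2
        current = trapezoid(f, lo, hi, n)
        if abs(current - previous) <= rel_tol * abs(current):
            return current
        previous = current
    raise RuntimeError("grid doubling did not converge")


if __name__ == "__main__":
    # Unnormalised standard normal kernel; the exact integral is sqrt(2 * pi).
    estimate = adaptive_trapezoid(lambda x: math.exp(-0.5 * x * x))
    print(estimate, math.sqrt(2.0 * math.pi))
```

The min/max ratio test is only meaningful when the integrand does not cross zero inside the grid, so this sketch suits density-like integrands and would need a different tail check otherwise.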