Variational Particle Approximations
Authors: Ardavan Saeedi, Tejas D. Kulkarni, Vikash K. Mansinghka, Samuel J. Gershman
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | DPVI is illustrated and evaluated via experiments on lattice Markov random fields, nonparametric Bayesian mixtures and block-models, and parametric as well as nonparametric hidden Markov models. Results include applications to real-world spike-sorting and relational modeling problems, and show that DPVI can offer appealing time/accuracy trade-offs compared to multiple alternatives. In this section, we compare the performance of DPVI to several widely used approximate inference algorithms, including particle filtering, Gibbs sampling and variational methods. We first present a didactic example to illustrate how DPVI can sometimes succeed where particle filtering fails. We then apply DPVI to four popular but intractable probabilistic models: the Dirichlet process mixture model (DPMM; Antoniak, 1974; Escobar and West, 1995), the infinite HMM (iHMM; Beal et al., 2002; Teh et al., 2006), the infinite relational model (IRM; Kemp et al., 2006) and the Ising model. |
| Researcher Affiliation | Collaboration | Ardavan Saeedi* (Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA); Tejas D. Kulkarni* (DeepMind, London); Vikash K. Mansinghka (Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA); Samuel J. Gershman (Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA) |
| Pseudocode | Yes | Algorithm 1 Discrete particle variational inference. /* N is the number of latent variables */ /* x^k is the set of all latent variables for the kth particle: x^k = {x^k_1, ..., x^k_N} */ /* M_n is the support of latent variable x_n */ Input: initial particle approximation Q with K particles, tolerance ϵ. while \|L[Q] − L[Q′]\| > ϵ do; for n = 1 to N do; for k = 1 to K do; copy particle k: x̃^k ← x^k; for m = 1 to M_n do; modify particle: x̃^k_n ← m; score x̃^k using Eq. 12; X ← X ∪ (x̃^k, f(x̃^k)); end for; end for; select the K unique particles from X with the largest scores; construct new particle approximation Q′(x) = Σ_{k=1}^K w_k δ[x, x^k]; compute variational bound L[Q′] using Eq. 10; end for; end while; return particle approximation Q′ |
| Open Source Code | No | The paper does not provide explicit access to source code or a statement about its availability. |
| Open Datasets | Yes | We used data collected from a multiunit recording from a human epileptic patient (Quiroga et al., 2004). We next analyzed a real-world data set, text taken from the beginning of Alice in Wonderland, with 31 observation symbols (letters). The animals data set analyzed in Kemp et al. (2006) was used for this task. This data set (Osherson et al., 1991) is a two-type data set R : T1 × T2 → {0, 1} with animals and features as its types; it contains 50 animals and 85 features. |
| Dataset Splits | Yes | We also provide quantitative results by calculating the held-out log-likelihood on an independent test set of spike waveforms. We used the first 1000 characters for training, and the subsequent 4000 characters for test. Our data set contains 30000 edits, from which we use 1000 data points as a held-out set. We removed 20% of the relations from the data set and computed the predictive log-likelihood for the held-out data. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | For illustration, we use the following parameters: α0 = 0.2, α1 = 0.1, β0 = 0.3, and β1 = 0.2. For the DPMM, we used a Normal likelihood with a Normal-Inverse-Gamma prior on the component parameters: ynd \| xn = k ∼ N(mkd, σ²kd), mkd ∼ N(0, σ²kd/τ), σ²kd ∼ IG(a, b), where d ∈ {1, 2} indexes observation dimensions and IG(a, b) denotes the Inverse-Gamma distribution with shape a and scale b. We used the following hyperparameter values: τ = 25, a = 1, b = 1, α = 0.5. We used the following hyperparameter values: ν = D + 1, Λ0 = I, τ = 0.01, α = 0.1. We fixed the hyperparameters α and γ to 1 for both DPVI and the particle filtering. For all the inference schemes, we set the hyperparameters α and γ to 1. We set the hyperparameters α and β to 1. |
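The extracted pseudocode (Algorithm 1) describes a deterministic expand/select/reweight sweep over discrete latent variables. The sketch below is a minimal Python illustration of that sweep, not the authors' implementation; the function name `dpvi_sweep`, the `log_joint` callable, and the weight normalization are all hypothetical stand-ins for the paper's Eqs. 10 and 12.

```python
import numpy as np

def dpvi_sweep(particles, log_joint, support, K):
    """One DPVI coordinate sweep over all N latent variables (hypothetical sketch).

    particles : (K, N) int array, K particles over N discrete latents
    log_joint : maps a length-N assignment to a log joint score (stand-in for Eq. 12)
    support   : list of admissible values for each latent variable
    """
    N = particles.shape[1]
    for n in range(N):
        # Expand: copy every particle and try each value m in the support of x_n.
        candidates = []
        for k in range(particles.shape[0]):
            for m in support[n]:
                x = particles[k].copy()
                x[n] = m
                candidates.append((tuple(x), log_joint(x)))
        # Deduplicate by state, keeping the (identical) score of each unique state.
        unique = dict(sorted(candidates, key=lambda c: c[1]))
        # Select the K unique candidates with the largest scores.
        best = sorted(unique.items(), key=lambda c: -c[1])[:K]
        particles = np.array([list(x) for x, _ in best])
        # Particle weights proportional to exp(score), normalized (stand-in for w_k).
        scores = np.array([s for _, s in best])
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
    return particles, weights
```

In this toy form the outer while-loop on the variational bound is omitted; a caller would repeat sweeps until the bound change falls below ϵ, as in the pseudocode.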
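The DPMM setup row specifies a Normal-Inverse-Gamma prior with τ = 25, a = 1, b = 1. As a concrete reading of that prior, here is a hedged sketch of drawing per-dimension component parameters; the helper name `sample_nig_component` is hypothetical, and the Inverse-Gamma draw uses the standard reciprocal-Gamma construction.

```python
import numpy as np

def sample_nig_component(rng, tau=25.0, a=1.0, b=1.0, D=2):
    """Draw per-dimension DPMM component parameters from the stated
    Normal-Inverse-Gamma prior: sigma2_kd ~ IG(a, b), m_kd ~ N(0, sigma2_kd / tau)."""
    # Inverse-Gamma(shape=a, scale=b) as the reciprocal of Gamma(shape=a, scale=1/b).
    sigma2 = 1.0 / rng.gamma(shape=a, scale=1.0 / b, size=D)
    m = rng.normal(loc=0.0, scale=np.sqrt(sigma2 / tau))
    return m, sigma2
```

With τ = 25 the component means concentrate near zero relative to the observation noise, which matches the prior's role of regularizing cluster locations in the 2-D mixture experiments.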