Particle Gibbs Split-Merge Sampling for Bayesian Inference in Mixture Models

Authors: Alexandre Bouchard-Côté, Arnaud Doucet, Andrew Roth

JMLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We investigate its performance experimentally on synthetic problems as well as on geolocation data. Our results show that for a given computational budget, the Particle Gibbs Split-Merge sampler empirically outperforms existing split merge methods. In this section, we demonstrate the performance of our methodology and compare it to standard alternatives. We use a series of synthetic datasets covering a large spectrum of cluster separateness, as well as real data coming from a geolocation application.
Researcher Affiliation | Academia | Department of Statistics, University of British Columbia; Department of Statistics, University of Oxford, United Kingdom; Department of Statistics and Ludwig Institute for Cancer Research, University of Oxford, United Kingdom
Pseudocode | Yes | Algorithm 1, Algorithm 2, Algorithm 3, Algorithm 4, Algorithm 5
Open Source Code | Yes | The code and instructions allowing to reproduce the experiments is available at https://github.com/aroth85/pgsm.
Open Datasets | Yes | First, the four datasets from Fränti and Virmajoki (2006) denoted S1-4. ... Finally, we used 64 (DIM064) and 128 (DIM128) dimensional Normal datasets from Fränti et al. (2006)... The dataset, described in more detail in Fränti et al. (2010)... The data is freely accessible from http://cs.joensuu.fi/sipu/datasets/.
Dataset Splits | Yes | To evaluate the performance of the samplers, we held-out a random but fixed 10% of each dataset.
Hardware Specification | No | Computing was supported by WestGrid and Compute Canada.
Software Dependencies | No | We have implemented the following three Dirichlet Process (DP) clustering samplers in the same Python codebase
Experiment Setup | Yes | For subsequent experiments we used 20 particles. ... We used a threshold of 0.5 in all other experiments. ... We use a Gamma(1, 0.1) prior and resample the value of the concentration parameters α0 using a standard auxiliary variable method (Escobar and West, 1995). The value of α0 is initialized to 1.0.
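
The concentration update quoted under Experiment Setup can be sketched as follows. This is a minimal illustration of the Escobar and West (1995) auxiliary-variable step, not code taken from the pgsm repository. The function name `resample_concentration`, its argument names, the cluster and data counts in the usage lines, and the reading of Gamma(1, 0.1) as shape 1 and rate 0.1 are all assumptions made here for illustration.

```python
import math
import random


def resample_concentration(alpha, num_clusters, num_points, rng,
                           prior_shape=1.0, prior_rate=0.1):
    """One auxiliary-variable update of the DP concentration parameter.

    Assumes a Gamma(prior_shape, prior_rate) prior in the shape/rate
    parameterization; the source does not say which one it uses.
    """
    # Auxiliary variable: eta | alpha ~ Beta(alpha + 1, n).
    eta = rng.betavariate(alpha + 1.0, num_points)
    # Posterior rate shared by both components of the Gamma mixture.
    rate = prior_rate - math.log(eta)
    # Odds of the component with shape a + k against the one with a + k - 1.
    odds = (prior_shape + num_clusters - 1.0) / (num_points * rate)
    shape = prior_shape + num_clusters
    if rng.random() >= odds / (1.0 + odds):
        shape -= 1.0
    # random.gammavariate takes (shape, scale), so the rate is inverted.
    return rng.gammavariate(shape, 1.0 / rate)


# Hypothetical usage: alpha starts at 1.0, as in the quoted setup; the
# cluster count and data size below are made-up illustration values.
rng = random.Random(0)
alpha = 1.0
for _ in range(100):
    alpha = resample_concentration(alpha, num_clusters=15,
                                   num_points=5000, rng=rng)
```

In a full sampler this step would run once per iteration, after the cluster assignments have been updated, with `num_clusters` taken from the current partition.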