reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Particle Gibbs Split-Merge Sampling for Bayesian Inference in Mixture Models

Authors: Alexandre Bouchard-Côté, Arnaud Doucet, Andrew Roth

JMLR 2017

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We investigate its performance experimentally on synthetic problems as well as on geolocation data. Our results show that for a given computational budget, the Particle Gibbs Split-Merge sampler empirically outperforms existing split merge methods. In this section, we demonstrate the performance of our methodology and compare it to standard alternatives. We use a series of synthetic datasets covering a large spectrum of cluster separateness, as well as real data coming from a geolocation application.
Researcher Affiliation	Academia	Department of Statistics, University of British Columbia; Department of Statistics, University of Oxford, United Kingdom; Department of Statistics and Ludwig Institute for Cancer Research, University of Oxford, United Kingdom
Pseudocode	Yes	Algorithm 1, Algorithm 2, Algorithm 3, Algorithm 4, Algorithm 5
Open Source Code	Yes	The code and instructions allowing to reproduce the experiments is available at https://github.com/aroth85/pgsm.
Open Datasets	Yes	First, the four datasets from Fränti and Virmajoki (2006) denoted S1-4. ... Finally, we used 64 (DIM064) and 128 (DIM128) dimensional Normal datasets from Fränti et al. (2006)... The dataset, described in more detail in Fränti et al. (2010)... The data is freely accessible from http://cs.joensuu.fi/sipu/datasets/.
Dataset Splits	Yes	To evaluate the performance of the samplers, we held-out a random but fixed 10% of each dataset.
Hardware Specification	No	Computing was supported by WestGrid and Compute Canada.
Software Dependencies	No	We have implemented the following three Dirichlet Process (DP) clustering samplers in the same Python codebase
Experiment Setup	Yes	For subsequent experiments we used 20 particles. ... We used a threshold of 0.5 in all other experiments. ... We use a Gamma(1, 0.1) prior and resample the value of the concentration parameters α0 using a standard auxiliary variable method (Escobar and West, 1995). The value of α0 is initialized to 1.0.
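
The Experiment Setup entry mentions resampling the concentration parameter α0 with the auxiliary variable method of Escobar and West (1995). The following is a minimal sketch of that textbook update, not the code from the authors' repository. It assumes Gamma(1, 0.1) is a shape/rate parameterization (the quoted text does not say), and the function name and arguments are hypothetical.

```python
import numpy as np


def resample_concentration(alpha, num_clusters, num_points, rng,
                           shape=1.0, rate=0.1):
    """One Escobar and West (1995) style update of the DP concentration.

    Assumes a Gamma(shape, rate) prior on alpha; whether the paper's
    Gamma(1, 0.1) is shape/rate or shape/scale is an assumption here.
    """
    # Auxiliary variable: eta | alpha ~ Beta(alpha + 1, n)
    eta = rng.beta(alpha + 1.0, num_points)

    # Posterior rate shared by both mixture components
    adjusted_rate = rate - np.log(eta)

    # Mixture weight, defined through its odds
    odds = (shape + num_clusters - 1.0) / (num_points * adjusted_rate)
    weight = odds / (1.0 + odds)

    # Pick the Gamma component, then draw the new alpha
    if rng.random() < weight:
        new_shape = shape + num_clusters
    else:
        new_shape = shape + num_clusters - 1.0

    # NumPy's gamma sampler takes a scale, so invert the rate
    return rng.gamma(new_shape, 1.0 / adjusted_rate)


# Usage: start from alpha0 = 1.0, as in the quoted setup, and run a few
# updates for an illustrative clustering of 100 points into 5 clusters.
rng = np.random.default_rng(0)
alpha = 1.0
samples = []
for _ in range(1000):
    alpha = resample_concentration(alpha, num_clusters=5, num_points=100, rng=rng)
    samples.append(alpha)
```

In a full sampler this update would be interleaved with the cluster assignment moves, with num_clusters taken from the current partition rather than held fixed as in the usage lines above.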