BayesPy: Variational Bayesian Inference in Python
Authors: Jaakko Luttinen
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This section demonstrates the key steps in using BayesPy. An artificial Gaussian mixture dataset is created by drawing 500 samples from two 2-dimensional Gaussian distributions. 200 samples have mean [2, 2] and 300 samples have mean [0, 0]:...The speed of the packages was compared by using two widely used models: a Gaussian mixture model (GMM) and principal component analysis (PCA). Both models were run for small and large artificial datasets....The results are summarized in Table 1. |
| Researcher Affiliation | Academia | Jaakko Luttinen, Department of Computer Science, Aalto University, Finland |
| Pseudocode | No | The paper includes Python code snippets demonstrating usage but does not contain structured pseudocode blocks or formal algorithms labeled as such. |
| Open Source Code | Yes | BayesPy is an open-source Python software package for performing variational Bayesian inference. It is based on the variational message passing framework and supports conjugate exponential family models....The package is released under the MIT license....The latest development version is available at GitHub (https://github.com/bayespy/bayespy), which is also the platform used for reporting bugs and making pull requests. |
| Open Datasets | No | An artificial Gaussian mixture dataset is created by drawing 500 samples from two 2-dimensional Gaussian distributions. 200 samples have mean [2, 2] and 300 samples have mean [0, 0]...For GMM, the small model used 10 clusters for 200 observations with 2 dimensions, and the large model used 40 clusters for 2000 observations with 10 dimensions. For PCA, the small model used a 10-dimensional latent space for 500 observations with 20 dimensions, and the large model used a 40-dimensional latent space for 2000 observations with 100 dimensions. The scripts for running the experiments are available as supplementary material. The datasets are artificial; their generation is either shown in an example or provided in the supplementary-material scripts. No pre-existing public dataset with concrete access information is used. |
| Dataset Splits | No | An artificial Gaussian mixture dataset is created by drawing 500 samples from two 2-dimensional Gaussian distributions. 200 samples have mean [2, 2] and 300 samples have mean [0, 0]. The paper describes the composition of an artificial dataset and the characteristics of other artificial datasets but does not provide specific training/testing/validation splits. |
| Hardware Specification | Yes | The experiments were run on a quad-core (i7-4702MQ) Linux computer. |
| Software Dependencies | No | It requires Python 3 and a few popular packages: NumPy, SciPy, matplotlib and h5py. Python 3 specifies only a major version; no point release is given, and no exact version numbers are stated for the other listed packages. |
| Experiment Setup | Yes | We construct a mixture model for the data and assume that the parameters, the cluster assignments and the true number of clusters are unknown. The model uses a maximum number of five clusters but the effective number of clusters will be determined automatically:...The unknown cluster means and precision matrices are given Gaussian and Wishart prior distributions:...The cluster assignments are categorical variables, and the cluster probabilities are given a Dirichlet prior distribution:...Before running the VMP algorithm, the symmetry in the model is broken by a random initialization of the cluster assignments:...The VMP algorithm updates the variables in turns and is run for 200 iterations or until convergence: |