A Particle-Based Variational Approach to Bayesian Non-negative Matrix Factorization
Authors: Muhammad A Masood, Finale Doshi-Velez
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On several real datasets, we obtain better particle approximations to the BNMF posterior in less time than baselines and demonstrate the significant role that multimodality plays in NMF-related tasks. |
| Researcher Affiliation | Academia | Muhammad A Masood EMAIL Harvard John A. Paulson School of Engineering and Applied Science Cambridge, MA 02138, USA Finale Doshi-Velez EMAIL Harvard John A. Paulson School of Engineering and Applied Science Cambridge, MA 02138, USA |
| Pseudocode | Yes | Algorithm 1: Particle-based Variational Inference for BNMF using Q-Transform. Input: data X, rank R_NMF, number of factorizations M. Step 1: Perform M repetitions of Algorithm 2 to get matrices {Q^m_A, Q^m_W}_{m=1}^{M}, or re-use them if previously constructed. Step 2: Apply the Q-Transform (Algorithm 3) to get initializations {A^m_0, W^m_0}_{m=1}^{M}. Step 3: Apply an NMF algorithm to get factorizations {A^m, W^m}_{m=1}^{M}. Step 4: Apply Algorithm 5 using a given BNMF model to get weights {w^m}_{m=1}^{M} for the approximate posterior. Output: discrete NMF posterior {w^m, A^m, W^m}_{m=1}^{M}. |
| Open Source Code | Yes | Code and demonstrations at https://github.com/dtak/Q-Transfer-Demo-public-/ |
| Open Datasets | Yes | Our datasets cover a range of different types and can be divided into three main categories (count data, grayscale face images, and hyperspectral images). Table 1 provides a description of each dataset as well as the rank used and a citation. The Autism dataset is of interest to the medical community for understanding disease subtypes in the Autism spectrum and is not publicly available; the remaining datasets are public and are considered standard benchmark datasets for NMF. |
| Dataset Splits | Yes | In our experiments, we hold out ten percent of the observations and report performance on both provided and held-out observations. |
| Hardware Specification | No | No specific hardware details like GPU/CPU models or cloud resources are mentioned in the paper, only general statements about memory requirements for certain algorithms. |
| Software Dependencies | No | The paper mentions software like 'scikit-learn', 'CVXPY' (with SCS), and 'autograd' but does not provide specific version numbers for these software dependencies as used in the experimental setup. It refers to 'default settings of scikit-learn (Pedregosa et al., 2011)' and 'Splitting Conic Solver (SCS) in the convex optimization package CVXPY (Diamond and Boyd, 2016)'. |
| Experiment Setup | Yes | Model: exponential-Gaussian model parameters: We set the standard deviation σ_X to be equal to the empirical standard deviation of a reference NMF. The exponential parameter was set to one for each entry in the basis and weights matrices (λ_{d,r} = λ_{r,n} = 1). Model: SILF model parameters: ...To set the threshold parameter ε for each dataset, we use an empirical approach where we find a collection of 50 high-quality factorizations under default settings of scikit-learn (Pedregosa et al., 2011). The objective function is evaluated for each of them, {f_i}_{i=1}^{50}, and ε = 1.2 max_i f_i. We set the remaining SILF likelihood sensitivity parameters β = 0.1, C = 2. For the prior, we identically set the exponential parameter for each entry: λ_{r,n} = 1. Inference: Generating Q-transform matrices for transfer: ...transfer rank and SVD rank R_T = R_SVD = 3. We generated twenty sets of synthetic data X_s ∈ R^{12×12}_+ using non-negative matrices of rank R_T with truncated Gaussian noise. For each synthetic dataset, we find five pairs of transformation matrices through random restarts. In all our experiments, the same set of M_max = 100 pairs of transformation matrices {Q^m_A, Q^m_W}_{m=1}^{100} are applied to each of the real datasets. |
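The held-out evaluation described in the Dataset Splits row can be sketched as an entry-wise mask over the data matrix. This is a minimal sketch under the assumption that "observations" means individual matrix entries; the matrix `X`, its shape, and the RNG seed are placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.random((30, 20))  # placeholder non-negative data matrix

# True = provided entry, False = held out (~10% of entries)
mask = rng.random(X.shape) < 0.9

# Held-out entries are hidden (NaN) during fitting; performance is then
# reported separately on the provided and held-out entries.
X_train = np.where(mask, X, np.nan)
held_out_frac = 1.0 - mask.mean()
```

Any factorization fit on `X_train` can then be scored by comparing its reconstruction against `X` restricted to the masked entries.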
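The empirical SILF threshold calibration quoted in the Experiment Setup row can be sketched with scikit-learn: fit 50 factorizations under default-style settings, evaluate the objective for each, and set ε = 1.2 × the largest value. The data matrix, rank, and `max_iter` below are placeholders (assumptions), and the Frobenius reconstruction error is assumed as the objective.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((30, 20))  # placeholder non-negative data
rank = 5                  # placeholder NMF rank

objectives = []
for seed in range(50):
    model = NMF(n_components=rank, init="random",
                random_state=seed, max_iter=500)
    W = model.fit_transform(X)     # basis weights (n_samples x rank)
    A = model.components_          # factors (rank x n_features)
    # Frobenius reconstruction error as the assumed objective f_i
    objectives.append(np.linalg.norm(X - W @ A, "fro"))

# Threshold from the worst of the 50 high-quality factorizations
epsilon = 1.2 * max(objectives)
```

The factor 1.2 and the count of 50 restarts come directly from the quoted setup; everything else is illustrative.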