Prior Specification for Exposure-based Bayesian Matrix Factorization
Authors: Zicong Zhu, Issei Sato
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this study, we present an enhanced method for specifying priors in Bayesian matrix factorization models. We improve the estimators by implementing an exposure-based model to better simulate data scarcity. Our method demonstrates significant accuracy improvements in hyperparameter estimation during synthetic experiments. We also explore the feasibility of applying this method to real-world datasets and provide insights into how the model's behavior adapts to varying levels of data sparsity. [...] We conducted experiments on synthetic datasets, demonstrating that our new estimators outperform existing methods, especially as the dataset becomes sparser. |
| Researcher Affiliation | Academia | Zicong Zhu (EMAIL), Department of Computer Science, The University of Tokyo; Issei Sato (EMAIL), Department of Computer Science, The University of Tokyo |
| Pseudocode | No | The paper describes the model definitions and derivations mathematically and textually, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We conducted additional experiments on the real-world MovieLens datasets (Harper & Konstan, 2015), which have been widely studied for recommender systems. |
| Dataset Splits | No | We first generate the synthetic data with the following 3 steps repeatedly: (1) We sample the matrices P and Q with the prior hyperparameters for particular specifications; (2) We recover the fully dense matrix R as the product of P and Q; (3) We sample the Bernoulli variables Oij with different sparsity levels and multiply them with each entry of the dense matrix R to obtain the sparse observation matrix Y. [...] We selected three MovieLens datasets of different sizes, from 100k records to 10m records. The datasets contain users' ratings of different movies on a 5-star scale, with half-star increments (0.5 stars to 5.0 stars). While the paper describes the generation of synthetic data and the characteristics of the MovieLens datasets, it does not specify explicit training/test/validation splits for its experiments or how the MovieLens data was partitioned for the evaluation of the estimators. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | We conduct the experiments with specifications A, D, and F because they are distinct from each other. The full specification setup defined by da Silva et al. (2023) is described in Table 4. In specification A, matrices P and Q share the same prior parameters, but their shape parameters are 10 times larger than their rate parameters. [...] Table 1 (Hyperparameter Initialization for Different Specifications): Spec. A: a=10, b=1, c=10, d=1, µp=10.0, σp=3.16, µq=10.0, σq=3.16, E[R]=2500.00, V[R]=55000.00; Spec. D: a=0.1, b=1, c=0.1, d=1, µp=0.1, σp=0.32, µq=0.1, σq=0.32, E[R]=0.25, V[R]=0.55; Spec. F: a=1, b=1, c=0.1, d=0.1, µp=1.0, σp=1.0, µq=1.0, σq=3.16, E[R]=25.00, V[R]=550.00. [...] Table 2 (Variables of Experiment Setups): Prior Spec. in [A, D, F]; K (num. of latent factors) in [25, 50, 75, 100, 125, 150]; Pobs. (parameter of the Bernoulli distribution) in Group 1: [1.0, 0.98, 0.96, 0.94, 0.92, 0.90], Group 2: [0.5, 0.4, 0.3, 0.2, 0.1], Group 3: [0.05, 0.04, 0.03, 0.02, 0.01]. |
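
The three-step synthetic data generation quoted above (sample P and Q from their priors, form the dense matrix R = PQᵀ, then apply a Bernoulli exposure mask) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes Gamma priors in the shape/rate parameterization, which matches the table values for specification A (a=10, b=1 gives µp = a/b = 10.0, σp = √a/b ≈ 3.16, and with K = 25, E[R] = K·µp·µq = 2500). The function name and signature are hypothetical.

```python
import numpy as np

def generate_synthetic(n_users, n_items, K, a, b, c, d, p_obs, rng):
    """Hypothetical sketch of the paper's 3-step synthetic data generation."""
    # Step 1: sample latent factor matrices from Gamma priors
    # (shape/rate parameterization; NumPy takes scale = 1/rate).
    P = rng.gamma(shape=a, scale=1.0 / b, size=(n_users, K))
    Q = rng.gamma(shape=c, scale=1.0 / d, size=(n_items, K))
    # Step 2: recover the fully dense matrix R as the product of P and Q.
    R = P @ Q.T
    # Step 3: sample Bernoulli exposure variables O_ij and mask R entrywise
    # to obtain the sparse observation matrix Y.
    O = rng.binomial(1, p_obs, size=R.shape)
    Y = O * R
    return Y, O, R

# Specification A with K = 25 and Pobs = 0.9 (Group 1/2 boundary region).
rng = np.random.default_rng(0)
Y, O, R = generate_synthetic(100, 80, K=25, a=10, b=1, c=10, d=1,
                             p_obs=0.9, rng=rng)
# R.mean() should be near E[R] = 2500 for this specification.
```

Lower values of `p_obs` (Groups 2 and 3 in Table 2) zero out more entries of Y, which is how the experiments simulate increasing data scarcity.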