Minorization-Maximization for Learning Determinantal Point Processes

Authors: Takahiro Kawashima, Hideitsu Hino

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In our experiments on both synthetic and real-world datasets, our method outperforms existing methods in most settings."
Researcher Affiliation | Academia | Takahiro Kawashima (Department of Statistical Science, The Graduate University for Advanced Studies, SOKENDAI); Hideitsu Hino (The Institute of Statistical Mathematics; RIKEN Center for Advanced Intelligence Project)
Pseudocode | Yes | "Algorithm 1: Minorization-Maximization (MM)"
Open Source Code | Yes | "Our code is available at https://github.com/ISMHinoLab/DPPMMEstimation."
Open Datasets | Yes | "We apply our method to the Nottingham music dataset, which was used in (Boulanger-Lewandowski et al., 2012; Osogami et al., 2018)." The Nottingham dataset is available at https://abc.sourceforge.net/NMD/. "Amazon baby registry has served as a benchmark for learning methods of DPPs since (Gillenwater et al., 2014)."
Dataset Splits | No | The paper describes generating synthetic datasets and collecting samples from existing datasets (Nottingham, Amazon Baby Registry). However, it does not specify any explicit training, validation, or test splits (e.g., percentages, sample counts, or references to predefined splits).
Hardware Specification | Yes | "All our experiments were run on a Linux Mint system with 32GB of RAM and an Intel Core i9-10900K CPU @ 3.70GHz."
Software Dependencies | No | "We implemented all the experiments in Julia, and all our experiments were run on a Linux Mint system with 32GB of RAM and an Intel Core i9-10900K CPU @ 3.70GHz." The paper names Julia as the implementation language but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | "We provide the following two initialization schemes with reference to (Mariet & Sra, 2015). WISHART: we sample an initial value from the Wishart distribution as L^(0) ~ W(N, I)/N. BASIC: we uniformly sample v_ij^(0) ~ U(0, 2/N) for i, j = 1, 2, ..., N and initialize as L^(0) = V^(0) V^(0)^T. We set the step size a = 1.3 for the fixed-point algorithm and the tolerance δ = 0.15 for the proposed MM algorithm. For T_acc < t, we use the default parameter a = 1 for the fixed-point algorithm, which monotonically increases the objective but applies no acceleration; the same scheme is used for the proposed MM. In Adam optimization, we employ the default values β1 = 0.999, β2 = 0.9 for the decay rates, and the machine epsilon ϵ = 10^-8. The acceleration steps T_acc of the fixed-point and MM algorithms and the learning rate η of Adam differ between the initialization schemes: T_acc = 5, η = 0.1 for WISHART initialization and T_acc = 10, η = 0.01 for BASIC initialization. In each experiment, we stop the learning when the criterion |f(L^(t)) − f(L^(t−1))| / |f(L^(t−1))| ≤ δ_tol is satisfied. We set δ_tol = 10^-4 as the relative tolerance for all the experiments reported below."
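The quoted setup describes two initialization schemes for the DPP kernel L and a relative-change stopping rule. A minimal sketch of those pieces, in Python/NumPy rather than the paper's Julia implementation; the function names and the seed are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed

def wishart_init(N: int) -> np.ndarray:
    """WISHART scheme: L0 ~ W(N, I) / N.
    A Wishart(N, I) sample with N degrees of freedom is G @ G.T
    for an N x N standard-normal matrix G."""
    G = rng.standard_normal((N, N))
    return (G @ G.T) / N

def basic_init(N: int) -> np.ndarray:
    """BASIC scheme: v_ij ~ U(0, 2/N), then L0 = V V^T (symmetric PSD)."""
    V = rng.uniform(0.0, 2.0 / N, size=(N, N))
    return V @ V.T

def converged(f_t: float, f_prev: float, tol: float = 1e-4) -> bool:
    """Stopping rule: |f(L_t) - f(L_{t-1})| / |f(L_{t-1})| <= tol."""
    return abs(f_t - f_prev) / abs(f_prev) <= tol
```

Both schemes yield a symmetric positive semidefinite L^(0), as required for a valid DPP kernel, which can be checked via `np.linalg.eigvalsh`.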