Minorization-Maximization for Learning Determinantal Point Processes
Authors: Takahiro Kawashima, Hideitsu Hino
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments on both synthetic and real-world datasets, our method outperforms existing methods in most settings. |
| Researcher Affiliation | Academia | Takahiro Kawashima, Department of Statistical Science, The Graduate University for Advanced Studies (SOKENDAI); Hideitsu Hino, The Institute of Statistical Mathematics / RIKEN Center for Advanced Intelligence Project |
| Pseudocode | Yes | Algorithm 1: Minorization-Maximization (MM) |
| Open Source Code | Yes | Our code is available at https://github.com/ISMHinoLab/DPPMMEstimation. |
| Open Datasets | Yes | We apply our method to the Nottingham music dataset (available at https://abc.sourceforge.net/NMD/), which was used in (Boulanger-Lewandowski et al., 2012; Osogami et al., 2018). ... Amazon baby registry has served as a benchmark for learning methods of DPPs since (Gillenwater et al., 2014). |
| Dataset Splits | No | The paper describes generating synthetic datasets and collecting samples from existing datasets (Nottingham, Amazon Baby Registry). However, it does not specify any explicit training, validation, or test dataset splits (e.g., percentages, sample counts, or references to predefined splits). |
| Hardware Specification | Yes | All our experiments were run on a Linux Mint system with 32GB of RAM and an Intel Core i9-10900K CPU @ 3.70GHz. |
| Software Dependencies | No | We implemented all the experiments in Julia. The paper names Julia as the implementation language but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We provide the following two initialization schemes with reference to (Mariet & Sra, 2015). WISHART: We sample an initial value from the Wishart distribution as L^(0) ~ W(N, I)/N. BASIC: We uniformly sample v^(0)_{ij} ~ U(0, 2/N) for i, j = 1, 2, ..., N and initialize as L^(0) = V^(0) V^(0)^T. We set the step size a = 1.3 for the fixed-point algorithm and the tolerance δ = 0.15 for the proposed MM algorithm. For Tacc < t, we use the default parameter a = 1 for the fixed-point algorithm, which monotonically increases the objective but applies no acceleration; the same scheme is used for the proposed MM. In Adam optimization, we employ the default decay rates β1 = 0.9, β2 = 0.999 and the machine epsilon ϵ = 10^-8. The acceleration step Tacc of the fixed-point and MM algorithms and the learning rate η of Adam differ between the initialization schemes: Tacc = 5, η = 0.1 for WISHART initialization and Tacc = 10, η = 0.01 for BASIC initialization. In each experiment, we stop the learning when the criterion \|f(L^(t)) − f(L^(t−1))\| / \|f(L^(t−1))\| ≤ δtol is satisfied, with the relative tolerance δtol = 10^-4 for all the experiments reported below. |
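The two initialization schemes and the relative-tolerance stopping rule described in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' Julia code; function names and the Gaussian-factor construction of the Wishart sample are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_wishart(N):
    """WISHART scheme: L(0) ~ W(N, I)/N.
    A Wishart sample with N degrees of freedom and identity scale is
    constructed as G @ G.T for an N x N standard Gaussian matrix G."""
    G = rng.standard_normal((N, N))
    return (G @ G.T) / N

def init_basic(N):
    """BASIC scheme: entries v(0)_ij ~ U(0, 2/N), then L(0) = V(0) V(0)^T."""
    V = rng.uniform(0.0, 2.0 / N, size=(N, N))
    return V @ V.T

def converged(f_curr, f_prev, tol=1e-4):
    """Stopping rule |f(L(t)) - f(L(t-1))| / |f(L(t-1))| <= tol,
    with tol = 1e-4 as in the reported experiments."""
    return abs(f_curr - f_prev) / abs(f_prev) <= tol
```

Both schemes return a symmetric positive semidefinite kernel, which is the required form of an L-ensemble DPP kernel.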