Accelerating Nuclear-norm Regularized Low-rank Matrix Optimization Through Burer-Monteiro Decomposition
Authors: Ching-pei Lee, Ling Liang, Tianyun Tang, Kim-Chuan Toh
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on real-world large-scale problems in recommendation systems, regularized kernel estimation, and molecular conformation confirm that BM-Global can indeed effectively escape spurious local minima at which existing BM approaches are stuck, and is an order of magnitude faster than state-of-the-art algorithms for low-rank matrix optimization problems involving a nuclear-norm regularizer. |
| Researcher Affiliation | Academia | Ching-pei Lee, Institute of Statistical Mathematics; Ling Liang, Department of Mathematics, University of Maryland at College Park; Tianyun Tang, Institute of Operations Research and Analytics, National University of Singapore; Kim-Chuan Toh, Department of Mathematics and Institute of Operations Research and Analytics, National University of Singapore |
| Pseudocode | Yes | Algorithm 1: BM-Global; Algorithm 2: LmSVD(L_t, U_{t,0}, p_t); Algorithm 3: The APG method for solving (QSDP) |
| Open Source Code | Yes | Based on this research, we have released an open-source package of the proposed BM-Global at https://www.github.com/leepei/BM-Global/. |
| Open Datasets | Yes | movielens100k: https://www.kaggle.com/prajitdatta/movielens-100k-dataset (we used the ua split). movielens10m: https://www.kaggle.com/smritisingh1997/movielens-10m-dataset (we used the ra split). Netflix: https://www.kaggle.com/netflix-inc/netflix-prize-data. yahoo-music: the R2 one at https://webscope.sandbox.yahoo.com/catalog.php?datatype=r. We consider problems with dissimilarity measures d_ij for 1 ≤ i, j ≤ n collected in Duin and Pekalska (2009). In our experiments, the data d_ij are scaled to the interval [0, 1]. Data available at http://prtools.tudelft.nl/Guide/37Pages/distools.html. In this experiment, we consider the molecules from the Protein Data Bank (see https://www.rcsb.org/) with given noisy and sparse distance data to simulate distances measurable by nuclear magnetic resonance (NMR) experiments. |
| Dataset Splits | Yes | For all data sets, we use their original training/test split. movielens100k: https://www.kaggle.com/prajitdatta/movielens-100k-dataset (we used the ua split). movielens10m: https://www.kaggle.com/smritisingh1997/movielens-10m-dataset (we used the ra split). The elements of the index set Ω are randomly selected such that \|Ω\| ≤ n/20. We select 25% of them to generate our index set Ω. |
| Hardware Specification | Yes | Experiments on the first four data sets are conducted on an Amazon AWS EC2 c6i.4xlarge instance with an 8-core Intel Xeon Ice Lake processor and 32GB memory. For the larger yahoo-music data set, an m6i.4xlarge instance that has the same processor but with 64GB memory is used. Experiments for this part are conducted on a Linux PC with an Intel Xeon E5-2650 processor and 96GB memory. |
| Software Dependencies | No | All algorithms are implemented in MATLAB and C/C++. We apply the polyMF-SS method of Wang et al. (2017) in the matrix completion problem and Manopt (Boumal et al., 2014) for manifold optimization in the nonlinear semidefinite programming problem. We hence only compare BM-Global with the efficient and robust QSDP solver QSDPNAL (Li et al., 2018). We borrow most parts of the implementation of Liu et al. (2013) but impose some minor modifications to adapt it for our purposes. First, as we shall see in the following subsection, instead of using a randomly generated initial point U_{t,0}, we use a more sophisticated initialization scheme. Second, the implementation of Liu et al. (2013) terminates by following a two-level strategy, but in our implementation, we simply terminate the algorithm once the difference between the eigenvalues of L_{t,i} and L_{t,i-1} is small. |
| Experiment Setup | Yes | For the value of λ on the real-world data, we follow the values provided by Hsieh and Olsen (2014) that were obtained through cross-validation, while the final k is the rank of the final output of our algorithm, obtained by running our algorithm with the given λ until the objective cannot be further improved. The value of λ for the toy example is from some simple tuning to make the final rank not too far away from that of the other data sets. We simply assign a fixed step size α_t ≡ α in our implementation. In particular, we test the setting of alternating between x consecutive inexact proximal gradient steps and y consecutive iterations of the BM-phase solver, with x ∈ {1, ..., 5} and y ∈ {1, ..., 8}, for our fixed-step variant with α = 1.99. More details and the results are shown in the supplementary materials. Our results indicate that there is no definite best performer in all cases, but in general, x = 1 and y = 3 seems to be a rather robust choice. We set w_ij = 1 for all (i, j) ∈ Ω and λ = n/10. In our tests, we set τ = 0.1. To measure the accuracy of the estimated positions, we record the root mean square deviation (RMSD): ... Moreover, we let λ = 10√n / Σ_{(i,j)∈Ω} d_ij². |
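The Experiment Setup row quotes a tuned alternation schedule: x consecutive inexact proximal gradient (global) steps followed by y consecutive Burer-Monteiro (local) iterations per outer round, with x = 1, y = 3 reported as a robust choice. The toy sketch below only enumerates that step ordering to make the schedule concrete; the function name and step labels are ours, not the authors' implementation.

```python
# Toy illustration of the x/y alternation schedule described in the
# Experiment Setup row. "PG" marks an inexact proximal gradient step
# (the global phase); "BM" marks a Burer-Monteiro iteration (the local
# phase). This is a hypothetical sketch, not the BM-Global code.

def alternation_schedule(x: int, y: int, rounds: int) -> list[str]:
    """Return the ordered step types for `rounds` outer rounds of
    x proximal-gradient steps followed by y BM iterations."""
    schedule: list[str] = []
    for _ in range(rounds):
        schedule.extend(["PG"] * x)  # global phase: x proximal gradient steps
        schedule.extend(["BM"] * y)  # local phase: y Burer-Monteiro iterations
    return schedule

# The reportedly robust choice x = 1, y = 3, shown for two outer rounds:
print(alternation_schedule(1, 3, 2))
# ['PG', 'BM', 'BM', 'BM', 'PG', 'BM', 'BM', 'BM']
```

Laying the schedule out this way makes clear that the tested grid x ∈ {1, ..., 5}, y ∈ {1, ..., 8} simply varies the lengths of the two phases within each round.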