Robust PCA by Manifold Optimization

Authors: Teng Zhang, Yi Yang

JMLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Simulations and real data examples confirm the competitive performance of our method. In this section, we test the performance of the proposed algorithms by simulated data sets and real data sets.
Researcher Affiliation | Academia | Teng Zhang (EMAIL), Department of Mathematics, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL 32816, USA; Yi Yang (EMAIL), Department of Mathematics and Statistics, McGill University, 805 Sherbrooke Street West, Montreal, QC H3A 0B9, Canada
Pseudocode | Yes | Algorithm 1: Gradient descent on the manifold under the fully observed setting. [...] Algorithm 2: Gradient descent on the manifold under the partially observed setting.
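The report quotes only the algorithm titles, not their bodies. For orientation, the general scheme they name (a gradient step followed by a retraction onto the rank-r manifold, with a γ-fraction hard-thresholding step to estimate the sparse corruption) can be sketched in NumPy. This is an illustrative sketch in the spirit of the quoted algorithm names, not the authors' Algorithm 1 or 2; the function names, the SVD-based retraction, and the thresholding details are my assumptions.

```python
import numpy as np

def svd_retract(M, r):
    """Retract onto the manifold of rank-r matrices via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def hard_threshold(R, gamma):
    """Keep only residual entries whose magnitude is among the largest
    gamma-fraction of both their row and their column (assumed corruptions)."""
    n1, n2 = R.shape
    k_row = max(int(gamma * n2), 1)
    k_col = max(int(gamma * n1), 1)
    A = np.abs(R)
    row_cut = -np.sort(-A, axis=1)[:, k_row - 1][:, None]  # per-row cutoff
    col_cut = -np.sort(-A, axis=0)[k_col - 1][None, :]     # per-column cutoff
    return R * ((A >= row_cut) & (A >= col_cut))

def robust_pca_gd(Y, r, gamma=0.1, eta=0.7, iters=100):
    """Gradient descent with rank-r retraction for Y ~ L + S (sketch)."""
    L = svd_retract(Y, r)                 # spectral initialization
    for _ in range(iters):
        S = hard_threshold(Y - L, gamma)  # estimated sparse corruption
        G = L + S - Y                     # gradient of 0.5*||L + S - Y||_F^2 in L
        L = svd_retract(L - eta * G, r)   # gradient step + retraction
    return L
```

On entries flagged as corrupted the residual is absorbed into S, so the gradient there vanishes and the low-rank estimate is driven by the remaining clean entries.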
Open Source Code | Yes | In addition, we implement our method in an efficient and user-friendly R package morpca, which is available at https://github.com/emeryyi/morpca. The MATLAB implementation of our algorithm used in this section is available at https://sciences.ucf.edu/math/tengz/.
Open Datasets | Yes | The data set is originally from http://perception.i2r.a-star.edu.sg/bk_model/bk_index.html, and is available at https://sciences.ucf.edu/math/tengz/.
Dataset Splits | No | For simulated data sets, we generate L = UΣV^T, where U ∈ R^{n1×r} and V ∈ R^{n2×r} are random matrices whose entries are i.i.d. samples from N(0, 1), and Σ ∈ R^{r×r} is a diagonal matrix. As for S ∈ R^{n1×n2}, each entry is sampled from N(0, 100) with probability q and is zero with probability 1 − q. That is, q represents the level of sparsity in the sparse matrix S. It measures the overall corruption level of Y and is associated with the corruption level γ (γ measures the row- and column-wise corruption level). For the partially observed case, we assume that each entry of Y is observed with probability p. [...] We adopt the public data set Shoppingmall studied in Yi et al. (2016). A few frames are visualized in the first column of Figure 7. There are 1000 frames in this video sequence, represented by a matrix of size 81920 × 1000, where each column corresponds to a frame of the video and each row corresponds to a pixel of the video. We apply our algorithms with r = 3 and γ = 0.1, p = 0.5 for the partially observed case, and the step size η = 0.7. We stop the algorithm after 100 iterations. Figure 7 shows that our algorithms obtain desirable low-rank approximations within 100 iterations. Explanation: The paper describes how the synthetic data are generated and the characteristics of the real-world dataset, but it does not specify explicit training, validation, or test splits, as the task is robust principal component analysis for signal recovery, not a traditional supervised learning problem requiring such splits to evaluate generalization.
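The simulated-data recipe quoted above is concrete enough to sketch directly. Below is a hypothetical NumPy reconstruction (the authors' actual implementations are in R and MATLAB; the function and variable names here are mine):

```python
import numpy as np

def make_rpca_data(n1=500, n2=600, r=3, q=0.02, p=0.2, seed=0):
    """Simulated data as described in the paper: L = U Sigma V^T with
    i.i.d. N(0, 1) factors, S with N(0, 100) entries appearing with
    probability q, and an observation mask with probability p."""
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((n1, r))
    V = rng.standard_normal((n2, r))
    Sigma = np.eye(r)                       # Sigma = I in the simulations
    L = U @ Sigma @ V.T
    S = np.where(rng.random((n1, n2)) < q,
                 rng.normal(0.0, 10.0, (n1, n2)),  # N(0, 100): std = 10
                 0.0)
    mask = rng.random((n1, n2)) < p         # observed entries (partial case)
    Y = L + S
    return Y, L, S, mask
```

Note that N(0, 100) denotes variance 100, hence standard deviation 10 in the sampler above.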
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instances) are mentioned in the paper for running the experiments.
Software Dependencies | No | In addition, we implement our method in an efficient and user-friendly R package morpca, which is available at https://github.com/emeryyi/morpca. The MATLAB implementation of our algorithm used in this section is available at https://sciences.ucf.edu/math/tengz/. Explanation: The paper mentions "R package morpca" and "MATLAB implementation" but does not provide specific version numbers for R, MATLAB, or any other software libraries or dependencies.
Experiment Setup | Yes | In simulations, we let [n1, n2] = [500, 600], r = 3, Σ = I, and q = 0.02. For the partially observed case, we let p = 0.2. The first simulation investigates the following questions: [...] As a result, we use the step size η = 0.7 for Algorithm 1 and 0.7/p for Algorithm 2 in the following simulations. The second simulation concerns the choice of γ. We test γ = cγ* for a few choices of c (γ* can be calculated from the zero pattern of S). [...] Following these observations, we use γ = 1.5γ* as the default choice in the following experiments, which is also used in Yi et al. (2016). [...] We apply our algorithms with r = 3 and γ = 0.1, p = 0.5 for the partially observed case, and the step size η = 0.7. We stop the algorithm after 100 iterations.
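For reference, the hyperparameters quoted in the setup can be collected in one place. The dictionary layout and variable names below are my own; the values are those reported above:

```python
# Hyperparameters quoted from the paper's experiments, gathered into
# plain dictionaries for reference (names are mine, values are quoted).

SIM = {
    "n1": 500, "n2": 600,    # simulated matrix dimensions
    "r": 3,                  # target rank (with Sigma = I)
    "q": 0.02,               # sparsity level of S
    "p": 0.2,                # observation probability (partial case)
    "eta_full": 0.7,         # step size for Algorithm 1 (fully observed)
    "gamma_factor": 1.5,     # gamma = 1.5x the level from S's zero pattern
}
SIM["eta_partial"] = SIM["eta_full"] / SIM["p"]  # 0.7 / p for Algorithm 2

VIDEO = {
    "dataset": "Shoppingmall",
    "n1": 81920, "n2": 1000,  # pixels x frames
    "r": 3, "gamma": 0.1,
    "p": 0.5,                 # partially observed case
    "eta": 0.7, "iterations": 100,
}
```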