Robust PCA by Manifold Optimization

Authors: Teng Zhang, Yi Yang

JMLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Simulations and real data examples confirm the competitive performance of our method. In this section, we test the performance of the proposed algorithms by simulated data sets and real data sets.
Researcher Affiliation | Academia | Teng Zhang (EMAIL), Department of Mathematics, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL 32816, USA; Yi Yang (EMAIL), Department of Mathematics and Statistics, McGill University, 805 Sherbrooke Street West, Montreal, QC H3A 0B9, Canada
Pseudocode | Yes | Algorithm 1: Gradient descent on the manifold under the fully observed setting. [...] Algorithm 2: Gradient descent on the manifold under the partially observed setting.
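The report quotes only the algorithm titles, not their bodies. For orientation, the general scheme they name (a gradient step followed by a retraction onto the rank-r manifold, with a γ-fraction hard-thresholding step to estimate the sparse corruption) can be sketched in NumPy. This is an illustrative sketch in the spirit of the quoted algorithm names, not the authors' Algorithm 1 or 2; the function names, the SVD-based retraction, and the thresholding details are my assumptions.

```python
import numpy as np

def svd_retract(M, r):
    """Retract onto the manifold of rank-r matrices via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def hard_threshold(R, gamma):
    """Keep only residual entries whose magnitude is among the largest
    gamma-fraction of both their row and their column (assumed corruptions)."""
    n1, n2 = R.shape
    k_row = max(int(gamma * n2), 1)
    k_col = max(int(gamma * n1), 1)
    A = np.abs(R)
    row_cut = -np.sort(-A, axis=1)[:, k_row - 1][:, None]  # per-row cutoff
    col_cut = -np.sort(-A, axis=0)[k_col - 1][None, :]     # per-column cutoff
    return R * ((A >= row_cut) & (A >= col_cut))

def robust_pca_gd(Y, r, gamma=0.1, eta=0.7, iters=100):
    """Gradient descent with rank-r retraction for Y ~ L + S (sketch)."""
    L = svd_retract(Y, r)                 # spectral initialization
    for _ in range(iters):
        S = hard_threshold(Y - L, gamma)  # estimated sparse corruption
        G = L + S - Y                     # gradient of 0.5*||L + S - Y||_F^2 in L
        L = svd_retract(L - eta * G, r)   # gradient step + retraction
    return L
```

On entries flagged as corrupted the residual is absorbed into S, so the gradient there vanishes and the low-rank estimate is driven by the remaining clean entries.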
Open Source Code | Yes | In addition, we implement our method in an efficient and user-friendly R package morpca, which is available at https://github.com/emeryyi/morpca. The MATLAB implementation of our algorithm used in this section is available at https://sciences.ucf.edu/math/tengz/.
Open Datasets | Yes | The data set is originally from http://perception.i2r.a-star.edu.sg/bk_model/bk_index.html, and is available at https://sciences.ucf.edu/math/tengz/.
Dataset Splits | No | For simulated data sets, we generate L = UΣV^T, where U ∈ R^{n1×r} and V ∈ R^{n2×r} are random matrices whose entries are i.i.d. samples from N(0, 1), and Σ ∈ R^{r×r} is a diagonal matrix. As for S ∈ R^{n1×n2}, each entry is sampled from N(0, 100) with probability q and is zero with probability 1 − q. That is, q represents the level of sparsity in the sparse matrix S. It measures the overall corruption level of Y and is associated with the corruption level γ (γ measures the row- and column-wise corruption level). For the partially observed case, we assume that each entry of Y is observed with probability p. [...] We adopt the public data set Shoppingmall studied in Yi et al. (2016). A few frames are visualized in the first column of Figure 7. There are 1000 frames in this video sequence, represented by a matrix of size 81920 × 1000, where each column corresponds to a frame of the video and each row corresponds to a pixel of the video. We apply our algorithms with r = 3 and γ = 0.1, p = 0.5 for the partially observed case, and the step size η = 0.7. We stop the algorithm after 100 iterations. Figure 7 shows that our algorithms obtain desirable low-rank approximations within 100 iterations. Explanation: The paper describes how the synthetic data are generated and the characteristics of the real-world dataset, but it does not specify explicit training, validation, or test splits, as the task is robust principal component analysis for signal recovery, not a traditional supervised learning problem requiring such splits to evaluate generalization.
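The simulated-data recipe quoted above is concrete enough to sketch directly. Below is a hypothetical NumPy reconstruction (the authors' actual implementations are in R and MATLAB; the function and variable names here are mine):

```python
import numpy as np

def make_rpca_data(n1=500, n2=600, r=3, q=0.02, p=0.2, seed=0):
    """Simulated data as described in the paper: L = U Sigma V^T with
    i.i.d. N(0, 1) factors, S with N(0, 100) entries appearing with
    probability q, and an observation mask with probability p."""
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((n1, r))
    V = rng.standard_normal((n2, r))
    Sigma = np.eye(r)                       # Sigma = I in the simulations
    L = U @ Sigma @ V.T
    S = np.where(rng.random((n1, n2)) < q,
                 rng.normal(0.0, 10.0, (n1, n2)),  # N(0, 100): std = 10
                 0.0)
    mask = rng.random((n1, n2)) < p         # observed entries (partial case)
    Y = L + S
    return Y, L, S, mask
```

Note that N(0, 100) denotes variance 100, hence standard deviation 10 in the sampler above.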
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instances) are mentioned in the paper for running the experiments.
Software Dependencies | No | In addition, we implement our method in an efficient and user-friendly R package morpca, which is available at https://github.com/emeryyi/morpca. The MATLAB implementation of our algorithm used in this section is available at https://sciences.ucf.edu/math/tengz/. Explanation: The paper mentions "R package morpca" and "MATLAB implementation" but does not provide specific version numbers for R, MATLAB, or any other software libraries or dependencies.
Experiment Setup | Yes | In simulations, we let [n1, n2] = [500, 600], r = 3, Σ = I, and q = 0.02. For the partially observed case, we let p = 0.2. The first simulation investigates the following questions: [...] As a result, we use the step size η = 0.7 for Algorithm 1 and 0.7/p for Algorithm 2 in the following simulations. The second simulation concerns the choice of γ. We test γ = cγ* for a few choices of c (γ* can be calculated from the zero pattern of S). [...] Following these observations, we use γ = 1.5γ* as the default choice in the following experiments, which is also used in Yi et al. (2016). [...] We apply our algorithms with r = 3 and γ = 0.1, p = 0.5 for the partially observed case, and the step size η = 0.7. We stop the algorithm after 100 iterations.
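For reference, the hyperparameters quoted in the setup can be collected in one place. The dictionary layout and variable names below are my own; the values are those reported above:

```python
# Hyperparameters quoted from the paper's experiments, gathered into
# plain dictionaries for reference (names are mine, values are quoted).

SIM = {
    "n1": 500, "n2": 600,    # simulated matrix dimensions
    "r": 3,                  # target rank (with Sigma = I)
    "q": 0.02,               # sparsity level of S
    "p": 0.2,                # observation probability (partial case)
    "eta_full": 0.7,         # step size for Algorithm 1 (fully observed)
    "gamma_factor": 1.5,     # gamma = 1.5x the level from S's zero pattern
}
SIM["eta_partial"] = SIM["eta_full"] / SIM["p"]  # 0.7 / p for Algorithm 2

VIDEO = {
    "dataset": "Shoppingmall",
    "n1": 81920, "n2": 1000,  # pixels x frames
    "r": 3, "gamma": 0.1,
    "p": 0.5,                 # partially observed case
    "eta": 0.7, "iterations": 100,
}
```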