reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Completing Any Low-rank Matrix, Provably

Authors: Yudong Chen, Srinadh Bhojanapalli, Sujay Sanghavi, Rachel Ward

JMLR 2015 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We now study the performance of the two-phase sampling procedure outlined in Algorithm 1 through numerical experiments. For this, we consider rank-5 matrices of size 500 500 of the form M = DUV D, where the elements of the matrices U and V are i.i.d. Gaussian N(0, 1) and D is a diagonal matrix with power-law decay, Dii = i α, 1 i 500. We refer to such constructions as power-law matrices. The parameter α adjusts the leverage scores (and hence the coherence level) of M with α = 0 being maximal incoherence µ0 = Θ(1) and α = 1 corresponding to maximal coherence µ0 = Θ(n). Figure 1 plots the number of samples required for successful recovery (y-axis) for different values of α (x-axis) and β = 2/3 using Algorithm 1 with the initial samples Ω
Researcher Affiliation	Academia	Yudong Chen EMAIL Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94704, USA Srinadh Bhojanapalli EMAIL Sujay Sanghavi EMAIL Department of Electrical and Computer Engineering The University of Texas at Austin Austin, TX 78712, USA Rachel Ward EMAIL Department of Mathematics and ICES The University of Texas at Austin Austin, TX 78712, USA
Pseudocode	Yes	Algorithm 1 Two-phase sampling for coherent matrix completion input Rank parameter r, sample budget m, and parameter β [0, 1] Step 1: Obtain the initial set Ωby sampling uniformly without replacement such that \|Ω\| = βm. Compute best rank-r approximation to PΩ(M), U Σ V , and its leverage scores { µi} and { νj}. Step 2: Generate set of (1 β)m new samples Ωby sampling without replacement with distribution (6). Set ˆ M = arg min X X s.t PΩ Ω(X) = PΩ Ω(M). output Completed matrix ˆ M.
Open Source Code	No	The paper does not provide an explicit statement about open-sourcing their code or a link to a code repository for the methodology described in this paper. It mentions using 'Augmented Lagrangian Method (ALM) based solver in Lin et al. (2009)' for the convex optimization program, which is a third-party tool.
Open Datasets	No	For this, we consider rank-5 matrices of size 500 500 of the form M = DUV D, where the elements of the matrices U and V are i.i.d. Gaussian N(0, 1) and D is a diagonal matrix with power-law decay, Dii = i α, 1 i 500. We refer to such constructions as power-law matrices.
Dataset Splits	No	The paper describes a two-phase sampling procedure where a proportion of samples are drawn uniformly and the remaining are drawn according to estimated leverage scores. This refers to the sampling of observed elements for matrix completion, not a standard train/test/validation split of a dataset for model evaluation.
Hardware Specification	No	The paper describes numerical experiments but does not provide any specific details about the hardware (e.g., GPU models, CPU models, memory) used to conduct these experiments.
Software Dependencies	No	We use the Augmented Lagrangian Method (ALM) based solver in Lin et al. (2009) to solve the convex optimization program (1).
Experiment Setup	Yes	We now study the performance of the two-phase sampling procedure outlined in Algorithm 1 through numerical experiments. For this, we consider rank-5 matrices of size 500 500 of the form M = DUV D, where the elements of the matrices U and V are i.i.d. Gaussian N(0, 1) and D is a diagonal matrix with power-law decay, Dii = i α, 1 i 500. We refer to such constructions as power-law matrices. The parameter α adjusts the leverage scores (and hence the coherence level) of M with α = 0 being maximal incoherence µ0 = Θ(1) and α = 1 corresponding to maximal coherence µ0 = Θ(n). Figure 1 plots the number of samples required for successful recovery (y-axis) for different values of α (x-axis) and β = 2/3 using Algorithm 1 with the initial samples Ω... Successful recovery is defined as when at least 95% of trials have relative errors in the Frobenius norm M ˆ M F / M F not exceeding 0.01.