On the Computational and Statistical Complexity of Over-parameterized Matrix Sensing
Authors: Jiacheng Zhuo, Jeongyeol Kwon, Nhat Ho, Constantine Caramanis
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that when the number of samples n is sufficiently large, ‖F_t F_tᵀ − X*‖²_F converges to a final statistical error of O(kdσ²/n) after O(√(n/(dσ²))) iterations, namely sub-linearly... Furthermore, O(√(n/(dσ²))) iterations are needed in the over-parameterized setting because the local curvature of the loss function (1) around the global minima is not quadratic, and therefore FGD converges only sub-linearly to the global minima; see the simulations in Figure 1 for an illustration. In the simulations, we consider the dimension d = 20, the true rank r = 3, and the number of samples n = 200. We first generate random orthonormal matrices U and V such that the union of their column spaces is ℝ^d. We set D_S to be a diagonal matrix with its (1, 1), (2, 2), (3, 3) entries equal to 1, 0.9, 0.8 respectively, and zero elsewhere. Hence X* = U D_S Uᵀ. The upper-triangular entries of the sensing matrices A_i are sampled from a standard Gaussian distribution, and the lower-triangular entries are filled in so that each A_i is symmetric. We further assume that there is no observation noise, so that we have a better understanding of the convergence behavior of the algorithm. Let {F_t}_t be the sequence generated by the FGD method as in equation (3) with η = 0.1. The simulation results are shown in Figure 1. |
| Researcher Affiliation | Academia | Jiacheng Zhuo, Department of Computer Science, University of Texas, Austin, TX 78712, USA; Jeongyeol Kwon, Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53705, USA; Nhat Ho, Department of Statistics and Data Sciences, University of Texas, Austin, TX 78712, USA; Constantine Caramanis, Department of Electrical and Computer Engineering, University of Texas, Austin, TX 78712, USA |
| Pseudocode | No | The paper does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. It describes the Factorized Gradient Descent (FGD) method using mathematical equations and textual descriptions, for example: 'F_{t+1} = F_t − η G_tⁿ, where G_tⁿ = ∇L(F_t) = (1/n) Σ_{i=1}^{n} (⟨A_i, F_t F_tᵀ⟩ − y_i) A_i F_t, (3) where η is the step size and G_tⁿ denotes the gradient evaluated at iteration t with n i.i.d. samples.' |
| Open Source Code | No | The paper does not provide any statement regarding the release of source code, nor does it include links to a code repository or mention code in supplementary materials. |
| Open Datasets | No | In the simulations, we consider the dimension d = 20, the true rank r = 3, and the number of samples n = 200. We first generate random orthonormal matrices U and V such that the union of their column spaces is ℝ^d. We set D_S to be a diagonal matrix, with its (1, 1), (2, 2), (3, 3) entries being 1, 0.9, 0.8 respectively, and zero elsewhere. Hence X* = U D_S Uᵀ. The upper-triangular entries of the sensing matrices A_i are sampled from a standard Gaussian distribution, and the lower-triangular entries are filled in so that each A_i is symmetric. We further assume that there is no observation noise... |
| Dataset Splits | No | The paper describes the generation of synthetic data for simulations: 'In the simulations, we consider the dimension d = 20, the true rank r = 3, and the number of samples n = 200. We first generate random orthonormal matrices U and V... The upper triangle entries of the sensing matrices Ai are sampled from standard Gaussian distribution...' However, it does not provide details on training, validation, or test splits, as the experiments are simulations to analyze convergence properties rather than model training on partitioned data. |
| Hardware Specification | No | The paper describes a simulation setup and presents results from these simulations, but it does not specify any particular hardware (e.g., GPU, CPU models, memory) used to conduct these simulations or experiments. |
| Software Dependencies | No | The paper focuses on theoretical analysis and simulation of the Factorized Gradient Descent (FGD) method. It does not provide any specific software names with version numbers, such as programming languages, libraries, or solvers used in its implementation or analysis. |
| Experiment Setup | Yes | In the simulations, we consider the dimension d = 20, the true rank r = 3, and the number of samples n = 200... Let {F_t}_t be the sequence generated by the FGD method as in equation (3) with η = 0.1. |
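The simulation described in the table (d = 20, r = 3, n = 200, symmetric Gaussian sensing matrices, noiseless observations, FGD with η = 0.1) can be sketched in NumPy as follows. This is a minimal reconstruction, not the authors' code: the over-specified rank `k = 5`, the small random initialization scale, the random seed, and the 500-iteration budget are assumptions not stated in the quoted excerpts.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 20, 3, 200   # dimension, true rank, number of samples (from the paper)
k = 5                  # over-specified rank k > r (hypothetical choice)
eta = 0.1              # step size from the paper's simulation

# Ground truth X* = U D_S U^T with nonzero singular values 1, 0.9, 0.8.
U = np.linalg.qr(rng.standard_normal((d, d)))[0]
D_S = np.zeros(d)
D_S[:r] = [1.0, 0.9, 0.8]
X_star = (U * D_S) @ U.T

# Symmetric sensing matrices: sample the upper triangle from a standard
# Gaussian and mirror it into the lower triangle.
A = []
for _ in range(n):
    G = rng.standard_normal((d, d))
    A.append(np.triu(G) + np.triu(G, 1).T)
A = np.stack(A)
y = np.einsum('nij,ij->n', A, X_star)   # noiseless observations y_i = <A_i, X*>

# Factorized gradient descent on F (d x k), over-parameterized since k > r:
# F_{t+1} = F_t - eta * (1/n) * sum_i (<A_i, F F^T> - y_i) A_i F_t.
F = 0.01 * rng.standard_normal((d, k))  # small random init (assumption)
errors = []
for t in range(500):
    resid = np.einsum('nij,ij->n', A, F @ F.T) - y
    G_t = np.einsum('n,nij->ij', resid, A) @ F / n
    F = F - eta * G_t
    errors.append(np.linalg.norm(F @ F.T - X_star, 'fro') ** 2)
```

Tracking `errors` over iterations reproduces the kind of sub-linear convergence curve the paper illustrates in Figure 1 for the over-parameterized regime.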