Analyzing Tensor Power Method Dynamics in Overcomplete Regime

Authors: Animashree Anandkumar, Rong Ge, Majid Janzamin

JMLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We present a novel analysis of the dynamics of tensor power iterations in the overcomplete regime, where the tensor CP rank is larger than the input dimension. Finding the CP decomposition of an overcomplete tensor is NP-hard in general. We consider the case where the tensor components are randomly drawn, and show that the simple power iteration recovers the components with bounded error under mild initialization conditions. We apply our analysis to unsupervised learning of latent variable models, such as multi-view mixture models and spherical Gaussian mixtures. Given the third-order moment tensor, we learn the parameters using tensor power iterations. We prove that it can correctly learn the model parameters when the number of hidden components k is much larger than the data dimension d, up to k = o(d^1.5). We initialize the power iterations with data samples and prove their success under mild conditions on the signal-to-noise ratio of the samples. Our analysis significantly expands the class of latent variable models where spectral methods are applicable.
Researcher Affiliation | Academia | Animashree Anandkumar (EMAIL), Department of Electrical Engineering and Computer Science, University of California, Irvine, Engineering Hall, Room 4408, Irvine, CA 92697, USA; Rong Ge (EMAIL), Department of Computer Science, Duke University, 308 Research Drive (LSRC Building), Room D226, Durham, NC 27708, USA; Majid Janzamin (EMAIL), Department of Electrical Engineering and Computer Science, University of California, Irvine, Engineering Hall, Room 4406, Irvine, CA 92697, USA
Pseudocode | Yes | Algorithm 1 (Learning multiview mixture model via tensor power iterations). Require: 1) third-order moment tensor T ∈ R^{d×d×d} in (8), 2) n samples of z_1 in the multiview mixture model as z_1^{(τ)}, τ ∈ [n], and 3) number of iterations N. 1: for τ = 1 to n do 2: initialize unit vector x_τ^{(1)} ← z_1^{(τ)} / ‖z_1^{(τ)}‖. 3: for t = 1 to N do 4: tensor power update (see (5) for the definition of the multilinear form): x_τ^{(t+1)} = T(I, x_τ^{(t)}, x_τ^{(t)}) / ‖T(I, x_τ^{(t)}, x_τ^{(t)})‖ (9). 5: end for. 6: end for. 7: return the output of Procedure 2 with input {x_τ^{(N+1)} : τ ∈ [n]} as estimates x̂_j.
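The power update in Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration of the normalized update x ← T(I, x, x) / ‖T(I, x, x)‖, not the paper's implementation: the function name and the toy rank-1 tensor are assumptions introduced here for demonstration.

```python
import numpy as np

def tensor_power_iteration(T, x0, num_iters):
    """Repeatedly apply the normalized tensor power update
    x <- T(I, x, x) / ||T(I, x, x)||, where
    T(I, x, x)_i = sum_{j,k} T[i, j, k] * x[j] * x[k]."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(num_iters):
        # Multilinear form T(I, x, x) via index contraction.
        y = np.einsum('ijk,j,k->i', T, x, x)
        x = y / np.linalg.norm(y)
    return x

# Toy check on a symmetric rank-1 tensor a ⊗ a ⊗ a (an assumption for
# illustration; the paper's setting is overcomplete with random components).
d = 5
rng = np.random.default_rng(0)
a = rng.standard_normal(d)
a /= np.linalg.norm(a)
T = np.einsum('i,j,k->ijk', a, a, a)
x = tensor_power_iteration(T, rng.standard_normal(d), 20)
# For a rank-1 tensor the iteration converges to a, so |x · a| ≈ 1.
```

In the algorithm above, each initializer x_τ^{(1)} is a normalized data sample rather than a random vector, which is what the paper's initialization analysis is about.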
Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to code repositories for the methodology described.
Open Datasets | No | The paper is theoretical and analyzes models such as multiview mixture models and spherical Gaussian mixtures, using randomly drawn tensor components and Gaussian priors for analysis. It does not describe experiments run on publicly available datasets or provide access information for any specific dataset.
Dataset Splits | No | The paper does not describe empirical experiments involving datasets; therefore, there is no mention of training/test/validation splits.
Hardware Specification | No | The paper focuses on theoretical analysis and does not describe any experimental setup that would require hardware specifications.
Software Dependencies | No | The paper focuses on theoretical analysis and does not mention any specific software or library versions used for implementation or experimentation.
Experiment Setup | No | The paper is a theoretical analysis of tensor power method dynamics and does not describe any specific experimental setup, hyperparameters, or training configurations for empirical evaluation.