Online Nonnegative CP-dictionary Learning for Markovian Data

Authors: Hanbaek Lyu, Christopher Strohmeier, Deanna Needell

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we show that our algorithm converges much faster than standard algorithms for nonnegative tensor factorization tasks on both synthetic and real-world data. Also, we demonstrate the utility of our algorithm on a diverse set of examples from image, video, and time-series data, illustrating how one may learn qualitatively different CP-dictionaries from the same tensor data by exploiting the tensor structure in multiple ways.
Researcher Affiliation | Academia | Hanbaek Lyu (EMAIL), Department of Mathematics, University of Wisconsin, Madison, WI 53709, USA; Christopher Strohmeier (EMAIL), Department of Mathematics, University of California, Los Angeles, CA 90095, USA; Deanna Needell (EMAIL), Department of Mathematics, University of California, Los Angeles, CA 90025, USA
Pseudocode | Yes | Algorithm 1: Online CP-Dictionary Learning (online CPDL)... Algorithm 2: Online CP-Dictionary Learning (Bounded Memory Implementation)... Algorithm 3: Intermediate Aggregation... Algorithm 4: Coding... Algorithm 5: Loading Matrix Update... Algorithm 6: Alternating Least Squares for NCPD... Algorithm 7: Multiplicative Update for NCPD
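As a concrete reference point for the baselines named above, the multiplicative-update scheme of Algorithm 7 can be sketched for a 3-way nonnegative CP decomposition as follows. This is a minimal NumPy sketch of the standard Lee–Seung-style update, not the authors' implementation; the function names and initialization are our assumptions.

```python
import numpy as np

def unfold(X, mode):
    # Mode-n unfolding: move the chosen axis to the front, flatten the rest.
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(A, B):
    # Column-wise Khatri-Rao product; rows indexed by (row of A, row of B),
    # with B's row index varying fastest, matching `unfold`'s column order.
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

def ncpd_mu(X, R, n_iter=300, eps=1e-12, seed=0):
    # Nonnegative CP decomposition of a 3-way tensor by multiplicative updates.
    rng = np.random.default_rng(seed)
    factors = [rng.random((n, R)) + 0.1 for n in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            A, B = [factors[m] for m in range(3) if m != mode]
            KR = khatri_rao(A, B)
            Xn = unfold(X, mode)
            num = Xn @ KR
            den = factors[mode] @ (KR.T @ KR) + eps
            factors[mode] *= num / den  # entrywise; preserves nonnegativity
    return factors
```

Because each update multiplies by a nonnegative ratio, nonnegative initial factors stay nonnegative throughout, which is the defining property of this family of NCPD solvers.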
Open Source Code | Yes | All codes are available at https://github.com/HanbaekLyu/OnlineCPDL
Open Datasets | Yes | X_20News ∈ R_{≥0}^{40×5000×26} (41.6 MB) is a tensor representing semi-synthetic text data based on the 20 Newsgroups dataset (Rennie, 2008)... X_Twitter ∈ R_{≥0}^{90×5000×1000} (3.6 GB) is an anonymized Twitter text data set related to the COVID-19 pandemic from Feb. 1 to May 1 of 2020; the original data was collected in Kassab et al. (2021)... X_Headlines ∈ R_{≥0}^{203×7000×700} (8.0 GB) is a tensor derived in Kassab et al. (2021) from news headlines published over a period of 17 years, sourced from the Australian news source ABC (Kulkarni, 2018)... The original video is due to Barson et al. (2020)... we apply our algorithm to a weather dataset obtained from Beniaguev (2017).
Dataset Splits | Yes | From the given data tensor X, we obtain a sequence of tensors X_1, ..., X_T by subsampling 1/5 of the coordinates from the last mode... To this end, we sample 50-frame (2 sec. long) clips uniformly at random 200 times... From this large data tensor, we sequentially extracted smaller (36 × 24 × 4) = (cities × time × measurements) tensors by dividing time into overlapping segments of length 24 hours, with overlap size 4 hours.
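The overlapping time-segment extraction described for the weather tensor can be sketched as follows. Segment length 24 and overlap 4 come from the quote; the helper name, the time axis convention, and the stride computation are our assumptions.

```python
import numpy as np

def overlapping_segments(X, seg_len=24, overlap=4, axis=1):
    """Slice a data tensor into overlapping windows along `axis`.

    With seg_len=24 and overlap=4, consecutive window starts are 20 steps
    apart, so adjacent segments share 4 entries along the time axis.
    """
    step = seg_len - overlap
    n = X.shape[axis]
    segments = []
    for start in range(0, n - seg_len + 1, step):
        idx = [slice(None)] * X.ndim
        idx[axis] = slice(start, start + seg_len)
        segments.append(X[tuple(idx)])
    return segments

# Example: a (cities x hours x measurements) tensor
X = np.random.rand(36, 100, 4)
segs = overlapping_segments(X)
```

Each element of `segs` then has shape (36, 24, 4), matching the (cities × time × measurements) tensors fed to the online algorithm.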
Hardware Specification | No | In order to make a fair comparison, we compare the reconstruction error against CPU times with the same machine, not against iteration counts, since a single iteration may have different computational costs across different algorithms. (No specific hardware model or details are provided for the "same machine".)
Software Dependencies | No | The coding step in (15) is a convex problem and can be easily solved by a number of known algorithms (e.g., LARS (Efron et al., 2004), LASSO (Tibshirani, 1996), and feature-sign search (Lee et al., 2007)). (This lists algorithms, not specific software libraries with version numbers.)
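To illustrate why the coding step is easy to solve, here is a minimal projected-gradient sketch for the nonnegative, l1-regularized least-squares subproblem. This is our own generic solver under assumed problem form min_{h ≥ 0} 0.5·||x − Dh||² + λ·||h||_1, not one of the cited algorithms (LARS, LASSO solvers, or feature-sign search).

```python
import numpy as np

def nonnegative_lasso(D, x, lam=1.0, n_iter=500, tol=1e-8):
    """Solve min_{h >= 0} 0.5*||x - D h||^2 + lam*||h||_1 by projected gradient.

    On the nonnegative orthant the l1 penalty is just lam * sum(h), so its
    gradient is the constant vector lam.
    """
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth part
    h = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ h - x) + lam
        h_new = np.maximum(h - grad / L, 0.0)  # gradient step, then project
        if np.linalg.norm(h_new - h) < tol:
            h = h_new
            break
        h = h_new
    return h
```

With step size 1/L this is a standard forward-backward scheme, so the objective decreases monotonically from the all-zeros start.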
Experiment Setup Yes We applied our online CPDL algorithm (Algorithm 1 for 400 iterations with λ = 1) to various reshapings of such patches to learn three separate dictionaries, each consisting of 24 atoms... In Figure 2, OCPDL (β) for for β {0.75, 1} denotes Algorithm 1 with weights wt = c t β/ log t (with w1 = c )... In all cases, the weights satisfy (A3) so the algorithm is guaranteed to converge to the stationary points of the objective function almost surely by Theorem 1. The constant c is chosen from {1, 10, 100, 1000} for β {0.75, 1}. Initial loading matrices for Algorithm 1 are chosen with i.i.d. entries drawn from the uniform distribution on [0, 1]... sequentially fed into the online CPDL algorithm with wt = c /t, λ = 2, and c = 105... to learn a single CP-dictionary atom (R = 1).
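Reading the weight formula quoted above as w_t = c′·t^(−β)/log t for t ≥ 2, with w_1 = c′ set separately (our reconstruction of the garbled text; the separate t = 1 case avoids division by log 1 = 0), the schedule can be computed as:

```python
import numpy as np

def ocpdl_weights(T, beta=0.75, c=1.0):
    """Weight sequence w_t = c * t**(-beta) / log(t) for t >= 2, with w_1 = c.

    `c` plays the role of the constant c' tuned over {1, 10, 100, 1000}
    in the quoted setup; beta is taken from {0.75, 1}.
    """
    t = np.arange(1, T + 1, dtype=float)
    w = np.empty(T)
    w[0] = c          # t = 1 handled separately, since log(1) = 0
    w[1:] = c * t[1:] ** (-beta) / np.log(t[1:])
    return w
```

Under this reading the sequence is strictly decreasing, consistent with the diminishing-weight assumption (A3) invoked for almost-sure convergence.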