Optimal Analysis of Subset-Selection Based L_p Low-Rank Approximation
Authors: Chen Dan, Hong Wang, Hongyang Zhang, Yuchen Zhou, Pradeep K. Ravikumar
NeurIPS 2019 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the low rank approximation problem of any given matrix A over Rn m and Cn m in entry-wise ℓp loss, that is, finding a rank-k matrix X such that A X p is minimized. Unlike the traditional ℓ2 setting, this particular variant is NP-Hard. We show that the algorithm of column subset selection, which was an algorithmic foundation of many existing algorithms, enjoys approximation ratio (k + 1)1/p for 1 p 2 and (k + 1)1 1/p for p 2. This improves upon the previous O(k + 1) bound for p 1 [1]. We complement our analysis with lower bounds; these bounds match our upper bounds up to constant 1 when p 2. At the core of our techniques is an application of Riesz-Thorin interpolation theorem from harmonic analysis, which might be of independent interest to other algorithmic designs and analysis more broadly. |
| Researcher Affiliation | Academia | Chen Dan Carnegie Mellon University EMAIL Hong Wang Princeton University EMAIL Hongyang Zhang Toyota Technological Institute at Chicago EMAIL Yuchen Zhou University of Wisconsin, Madison EMAIL Pradeep Ravikumar Carnegie Mellon University EMAIL |
| Pseudocode | Yes | Algorithm 1 A cp,k approximation to problem (1) by column subset selection. Algorithm 2 [1] SELECTCOLUMNS (k, A): Selecting O(k log m) columns of A. Algorithm 3 [1]An algorithm that transforms an O(k log m)-rank matrix factorization into a k-rank matrix factorization without inflating the error too much. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments with specific datasets. Therefore, no information about publicly available training datasets or access is provided. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments with dataset splits. Therefore, no information about validation splits is provided. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments. Therefore, no hardware specifications for running experiments are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe any experiments requiring specific software dependencies with version numbers for reproducibility. |
| Experiment Setup | No | The paper is theoretical and does not describe any experiments or their setup. Therefore, no specific experimental setup details like hyperparameters or training configurations are provided. |