Single Pass PCA of Matrix Products
Authors: Shanshan Wu, Srinadh Bhojanapalli, Sujay Sanghavi, Alexandros G. Dimakis
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | in addition we also provide results from an Apache Spark implementation that shows better computational and statistical performance on real-world and synthetic evaluation datasets. |
| Researcher Affiliation | Academia | Shanshan Wu (The University of Texas at Austin); Srinadh Bhojanapalli (Toyota Technological Institute at Chicago); Sujay Sanghavi (The University of Texas at Austin); Alexandros G. Dimakis (The University of Texas at Austin) |
| Pseudocode | Yes | Algorithm 1 SMP-PCA: Streaming Matrix Product PCA |
| Open Source Code | Yes | The source code is available at [18]. [18] S. Wu, S. Bhojanapalli, S. Sanghavi, and A. Dimakis. GitHub repository for "Single Pass PCA of Matrix Products". https://github.com/wushanshan/MatrixProductPCA, 2016. |
| Open Datasets | Yes | We test our algorithm on synthetic datasets and three real datasets: SIFT10K [9], NIPS-BW [11], and URL-reputation [12]. |
| Dataset Splits | No | The paper mentions using specific datasets (SIFT10K, NIPS-BW, URL-reputation) but does not provide explicit details on how these datasets were split into training, validation, or test sets, nor does it specify proportions or sample counts for each split. |
| Hardware Specification | Yes | using a 150GB synthetic dataset on m3.2xlarge Amazon EC2 instances. ... Each machine has 8 cores, 30GB memory, and 2×80GB SSD. |
| Software Dependencies | Yes | We implement our SMP-PCA in Apache Spark 1.6.2 [19]. |
| Experiment Setup | Yes | For all rest experiments, unless otherwise specified, we set r = 5, T = 10, and m as 4nr log n. |
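
To give a sense of scale for the quoted setting m = 4nr log n, the following is a minimal sketch that evaluates the formula. It assumes that n is the matrix dimension, r the target rank, and m the number of sampled entries, and that the logarithm is natural; the table does not state the base. The function name `sample_count` and the value of n are hypothetical, chosen only for illustration.

```python
import math


def sample_count(n: int, r: int = 5) -> int:
    """Evaluate m = 4 * n * r * log(n), rounded up to an integer.

    Assumption: natural logarithm; the quoted setup does not give the base.
    """
    return math.ceil(4 * n * r * math.log(n))


# Defaults quoted in the table: r = 5 and T = 10.
R, T = 5, 10

# Hypothetical dimension, not taken from the paper's experiments.
n = 10_000
m = sample_count(n, R)

# Compare m against the n * n entries of a square n-by-n matrix.
print(f"n={n}, r={R}, T={T}: m={m} of {n * n} entries ({m / n**2:.2%})")
```

Under these assumptions m grows as n log n while the number of entries grows as n², so the sampled fraction shrinks as n increases.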