Stochastic Approximation for Canonical Correlation Analysis
Authors: Raman Arora, Teodor Vanislavov Marinov, Poorya Mianjy, Nati Srebro
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide experimental results for our proposed methods, in particular we compare capped-MSG which is the practical variant of Algorithm 1 with capping as defined in equation (10), and MEG (Algorithm 2 in the Appendix), on a real dataset, Mediamill [19], consisting of paired observations of videos and corresponding commentary. We compare our algorithms against CCALin of [8], ALS CCA of [24], and SAA, which is denoted by batch in Figure 1. |
| Researcher Affiliation | Academia | Raman Arora, Dept. of Computer Science, Johns Hopkins University, Baltimore, MD 21204, EMAIL; Teodor V. Marinov, Dept. of Computer Science, Johns Hopkins University, Baltimore, MD 21204, EMAIL; Poorya Mianjy, Dept. of Computer Science, Johns Hopkins University, Baltimore, MD 21204, EMAIL; Nathan Srebro, TTI-Chicago, Chicago, Illinois 60637, EMAIL |
| Pseudocode | Yes | Algorithm 1: Matrix Stochastic Gradient for CCA (MSG-CCA). Input: training data {(x_t, y_t)}_{t=1}^{T}, step size η, auxiliary training data {(x′_i, …)}_{i=1}^{…}. Output: M |
| Open Source Code | Yes | We make our implementation of the proposed algorithms and existing competing techniques available online. https://www.dropbox.com/sh/dkz4zgkevfyzif3/AABK9JlUvIUYtHvLPCBXLlpha?dl=0 |
| Open Datasets | Yes | We provide experimental results for our proposed methods, in particular we compare capped-MSG which is the practical variant of Algorithm 1 with capping as defined in equation (10), and MEG (Algorithm 2 in the Appendix), on a real dataset, Mediamill [19], consisting of paired observations of videos and corresponding commentary. |
| Dataset Splits | No | The paper mentions 'Training data' for Algorithm 1 and 'training dataset' in the Problem Formulation section but does not specify any particular train/validation/test splits (e.g., percentages, sample counts, or citations to predefined splits) needed for reproduction. It only states the total number of samples n = 10,000 for Mediamill. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. It only mentions 'CPU runtime' as a metric. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | For both MSG and MEG we set the step size at iteration t to be η_t = 0.1/√t. The target dimensionality in our experiments is k ∈ {1, 2, 4}. To ensure that the problem is well-conditioned, we add λI for λ = 0.1 to the empirical estimates of the covariance matrices on Mediamill dataset. |
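
The experiment setup row above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `step_size` and `regularized_covariances` are hypothetical, the step size is read as η_t = 0.1/√t (an assumption about the extracted formula), and mean-centering plus adding λI only to the two auto-covariance estimates are assumptions, since the table does not say how the estimates are formed.

```python
import numpy as np


def step_size(t, eta0=0.1):
    """Step size at iteration t (1-indexed), assumed to be eta0 / sqrt(t)."""
    return eta0 / np.sqrt(t)


def regularized_covariances(X, Y, lam=0.1):
    """Empirical covariance estimates with lam * I added for conditioning.

    X has shape (n, dx) and Y has shape (n, dy); rows are paired samples.
    Centering the data is an assumption of this sketch.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Auto-covariances, shifted by lam * I so the smallest eigenvalue is >= lam.
    Cxx = Xc.T @ Xc / n + lam * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + lam * np.eye(Y.shape[1])
    # Cross-covariance is left unregularized.
    Cxy = Xc.T @ Yc / n
    return Cxx, Cyy, Cxy


# Usage on synthetic data (a stand-in, not Mediamill).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
Y = rng.standard_normal((50, 3))
Cxx, Cyy, Cxy = regularized_covariances(X, Y, lam=0.1)
etas = [step_size(t) for t in range(1, 6)]
```

Because the unregularized empirical covariance is positive semidefinite, adding λI bounds its smallest eigenvalue below by λ, which is what keeps the whitening step in CCA well-conditioned.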