Simultaneous Dimensionality Reduction: A Data Efficient Approach for Multimodal Representations Learning
Authors: Eslam Abdelaleem, Ahmed Roman, K. Michael Martini, Ilya Nemenman
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using numerical experiments, we demonstrate that linear SDR methods consistently outperform linear IDR methods and yield higher-quality, more succinct reduced-dimensional representations with smaller datasets. |
| Researcher Affiliation | Academia | Eslam Abdelaleem EMAIL Department of Physics Emory University |
| Pseudocode | No | The paper describes algorithms like PCA, PLS, CCA, and rCCA using mathematical formulations and optimization problems (e.g., Eq. 6, 7, 8, 15) but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks with structured, code-like steps. |
| Open Source Code | No | The text states: "We used Python and the scikit-learn (Pedregosa et al., 2011) library for performing PCA, PLS, and CCA, while the cca-zoo (Chapman & Wang, 2021) library was used for rCCA." This indicates the use of existing libraries, not the release of the authors' own implementation code. |
| Open Datasets | Yes | To analyze linear DR methods on nonlinear data, we followed the same procedure as in Fig. 6 for a dataset inspired by the noisy MNIST dataset (LeCun et al., 1998; Wang et al., 2015; 2016; Abdelaleem et al., 2023). |
| Dataset Splits | Yes | For every numerical experiment, we generate training and test data sets (Xtrain, Ytrain) and (Xtest, Ytest) according to Eqs. (1-2)3. ... This resulted in a total dataset size of 56k images for training and 7k images for testing. |
| Hardware Specification | No | The simulations were parallelized and run on Amazon Web Services (AWS) servers of various instance types. |
| Software Dependencies | No | We used Python and the scikit-learn (Pedregosa et al., 2011) library for performing PCA, PLS, and CCA, while the cca-zoo (Chapman & Wang, 2021) library was used for rCCA. For PCA, SVD was performed with default parameters. For PLS, the PLS Canonical method was used with the NIPALS algorithm. For both PLS and CCA, the tolerance was set to 10⁻⁴ with a maximum convergence limit of 5000 iterations. For rCCA, regularization parameters were set as c1 = c2 = 0.1. All other parameters not explicitly mentioned here were set to their default values. |
| Experiment Setup | Yes | For PLS, the PLS Canonical method was used with the NIPALS algorithm. For both PLS and CCA, the tolerance was set to 10⁻⁴ with a maximum convergence limit of 5000 iterations. For rCCA, regularization parameters were set as c1 = c2 = 0.1. |