Inverse Kernel Decomposition
Authors: Chengrui Li, Anqi Wu
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the experiment section, we compare IKD against four eigen-decomposition-based and four optimization-based dimensionality reduction methods using synthetic datasets and four real-world datasets, and we can summarize four contributions of IKD: As an eigen-decomposition-based method, IKD achieves more reasonable latent representations than other eigen-decomposition-based methods with better classification accuracy in downstream classification tasks. |
| Researcher Affiliation | Academia | Chengrui Li (EMAIL), School of Computational Science & Engineering, Georgia Institute of Technology; Anqi Wu (EMAIL), School of Computational Science & Engineering, Georgia Institute of Technology |
| Pseudocode | Yes | Algorithm 1 Inverse kernel decomposition |
| Open Source Code | Yes | Open-source IKD implementation in Python can be accessed at https://github.com/JerrySoybean/ikd. |
| Open Datasets | Yes | We compare IKD against alternatives on four real-world datasets: Single-cell qPCR (qPCR) (Guo et al., 2010): normalized measurements of 48 genes of a single cell at 10 different stages; there are 437 data points in total, resulting in X ∈ ℝ^(437×48). Handwritten digits (digits) (Dua & Graff, 2017): consists of 1797 grayscale images of handwritten digits, each an 8×8 image, resulting in X ∈ ℝ^(1797×64). COIL-20 (Nene et al., 1996): consists of 1440 grayscale photos; for each of the 20 objects in total, 72 photos were taken from different angles, each a 128×128 image, resulting in X ∈ ℝ^(1440×16384). Fashion MNIST (F-MNIST) (Xiao et al., 2017): consists of 70000 grayscale images of 10 fashion items (clothing, bags, etc.); we use a subset of it, resulting in X ∈ ℝ^(3000×784). |
| Dataset Splits | Yes | Specifically, we apply 5-fold cross-validation k-NN (k ∈ {5, 10, 20}) on the estimated {2, 3, 5, 10}-dimensional latents to evaluate the performance of each method on each dataset. |
| Hardware Specification | No | The paper discusses running times but does not specify any particular hardware (e.g., GPU/CPU models, memory) used for the experiments. |
| Software Dependencies | No | The paper mentions "IKD implementation in Python", "GPLVM module in the GPy package (GPy, since 2012)", "sklearn", and the "official UMAP package (McInnes et al., 2018)" but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | For each trial, we generate the true latent variables from Z_{m,1:T} ~ N(0, 6e^(−|i−j|)), m ∈ {1, ..., M} (Eq. 12), where M is the latent dimensionality, varying across different datasets. Then, we generate the noiseless data from GP, sinusoidal, and Gaussian bump mapping functions respectively. Afterward, i.i.d. Gaussian noise is added to form the final noisy observations X. ... In each trial, we generate a 3D latent Z ∈ ℝ^(1000×3) (i.e., M = 3) according to Eq. 12, and generate X ∈ ℝ^(1000×N) according to Eq. 1 with σ² = 1 and l = 3. Then Gaussian noise is added: x_{t,n} ← x_{t,n} + ε_{t,n}, (t, n) ∈ {1, ..., 1000} × {1, ..., N}, where the noise ε_{t,n} ~ N(0, 0.05²). |
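
The synthetic-data procedure quoted under "Experiment Setup" (latent draws from a covariance 6e^(−|i−j|), a GP mapping with σ² = 1 and l = 3, then i.i.d. noise N(0, 0.05²)) can be sketched in NumPy. This is a minimal illustration, not the authors' code: the squared-exponential form of the GP mapping kernel and the observed dimensionality `N = 50` are assumptions, since the paper's Eq. 1 is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
T, M, N = 1000, 3, 50  # time points, latent dims; N = 50 is illustrative

# Latent prior covariance from the quote: K_z[i, j] = 6 * exp(-|i - j|)
idx = np.arange(T)
K_z = 6.0 * np.exp(-np.abs(idx[:, None] - idx[None, :]))
Z = rng.multivariate_normal(np.zeros(T), K_z, size=M).T  # shape (T, M)

# GP mapping from latent to observations; a squared-exponential kernel
# with variance sigma^2 = 1 and length-scale l = 3 is assumed here
sq_dists = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
K_x = 1.0 * np.exp(-sq_dists / (2 * 3.0 ** 2))
K_x += 1e-6 * np.eye(T)  # jitter for numerical stability
X = rng.multivariate_normal(np.zeros(T), K_x, size=N).T  # shape (T, N)

# i.i.d. observation noise: eps_{t,n} ~ N(0, 0.05^2)
X_noisy = X + rng.normal(0.0, 0.05, size=X.shape)
```

Each observed dimension is drawn as an independent GP function of the shared latent Z, which matches the generative story quoted above.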
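
The downstream evaluation quoted under "Dataset Splits" (5-fold cross-validation k-NN on an estimated latent, for k ∈ {5, 10, 20}) can be sketched with scikit-learn, which the paper lists as a dependency. The latent matrix and labels below are random placeholders; in the paper they would come from each dimensionality-reduction method and dataset.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Placeholder stand-ins: a 2-D estimated latent and class labels,
# sized like the qPCR dataset (437 points, 10 stages)
Z_est = rng.normal(size=(437, 2))
labels = rng.integers(0, 10, size=437)

# 5-fold cross-validated k-NN accuracy for each k in {5, 10, 20}
for k in (5, 10, 20):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k),
                             Z_est, labels, cv=5)
    print(f"k={k}: mean accuracy {scores.mean():.3f}")
```

With random labels the accuracy hovers near chance (~0.1); a good latent embedding would score well above that.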