Inverse Kernel Decomposition

Authors: Chengrui Li, Anqi Wu

TMLR 2024

Reproducibility assessment (Variable: Result — supporting response):
Research Type: Experimental — In the experiment section, we compare IKD against four eigen-decomposition-based and four optimization-based dimensionality reduction methods using synthetic datasets and four real-world datasets. We can summarize four contributions of IKD; as an eigen-decomposition-based method, IKD achieves more reasonable latent representations than other eigen-decomposition-based methods, with better classification accuracy in downstream classification tasks.
Researcher Affiliation: Academia — Chengrui Li (EMAIL), School of Computational Science & Engineering, Georgia Institute of Technology; Anqi Wu (EMAIL), School of Computational Science & Engineering, Georgia Institute of Technology.
Pseudocode: Yes — Algorithm 1: Inverse kernel decomposition.
Open Source Code: Yes — An open-source IKD implementation in Python can be accessed at https://github.com/JerrySoybean/ikd.
Open Datasets: Yes — We compare IKD against alternatives on four real-world datasets:
- Single-cell qPCR (qPCR) (Guo et al., 2010): normalized measurements of 48 genes of a single cell at 10 different stages. There are 437 data points in total, resulting in X ∈ R^(437×48).
- Handwritten digits (digits) (Dua & Graff, 2017): consists of 1797 grayscale images of handwritten digits. Each is an 8×8 image, resulting in X ∈ R^(1797×64).
- COIL-20 (Nene et al., 1996): consists of 1440 grayscale photos; for each of the 20 objects, 72 photos were taken from different angles. Each is a 128×128 image, resulting in X ∈ R^(1440×16384).
- Fashion-MNIST (F-MNIST) (Xiao et al., 2017): consists of 70000 grayscale images of 10 fashion items (clothing, bags, etc.). We use a subset of it, resulting in X ∈ R^(3000×784).
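As a quick sanity check on the reported dimensions, the handwritten-digits dataset is also bundled with scikit-learn (a convenience assumed here; the report cites the UCI repository of Dua & Graff), so the X ∈ R^(1797×64) shape can be verified directly:

```python
from sklearn.datasets import load_digits

# Load the digits dataset: 1797 grayscale images, each an 8x8 image
# flattened to a 64-dimensional row, matching the shape quoted above.
X, y = load_digits(return_X_y=True)
print(X.shape)  # (1797, 64)
```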
Dataset Splits: Yes — Specifically, we apply 5-fold cross-validation k-NN (k ∈ {5, 10, 20}) on the estimated {2, 3, 5, 10}-dimensional latents to evaluate the performance of each method on each dataset.
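The evaluation protocol quoted above can be sketched in a few lines of scikit-learn. PCA stands in for IKD here (an assumption, since the point is the 5-fold cross-validated k-NN protocol, not the embedding method), and only a 2-dimensional latent is shown:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)
# A 2-dimensional "latent"; the paper would use the IKD embedding instead.
Z = PCA(n_components=2).fit_transform(X)

# 5-fold cross-validated k-NN accuracy for each k in {5, 10, 20}.
for k in (5, 10, 20):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), Z, y, cv=5)
    print(f"k={k}: mean accuracy {scores.mean():.3f}")
```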
Hardware Specification: No — The paper discusses running times but does not specify any particular hardware (e.g., GPU/CPU models, memory) used for the experiments.
Software Dependencies: No — The paper mentions the "IKD implementation in Python", the "GPLVM module in the GPy package (GPy, since 2012)", sklearn, and the "official UMAP package (McInnes et al., 2018)", but does not provide specific version numbers for these software components.
Experiment Setup: Yes — For each trial, we generate the true latent variables from
Z_{m,1:T} ~ N(0, K), with K_{ij} = 6 e^(−|i−j|), m ∈ {1, ..., M},   (12)
where M is the latent dimensionality, varying across different datasets. Then, we generate the noiseless data from GP, sinusoidal, and Gaussian-bump mapping functions respectively. Afterward, i.i.d. Gaussian noise is added to form the final noisy observations X. ... In each trial, we generate a 3D latent Z ∈ R^(1000×3) (i.e., M = 3) according to Eq. 12, and generate X ∈ R^(1000×N) according to Eq. 1 with σ² = 1 and l = 3. Then Gaussian noise is added: x_{t,n} ← x_{t,n} + ε_{t,n} for all (t, n) ∈ {1, ..., 1000} × {1, ..., N}, where ε_{t,n} ~ N(0, 0.05²).
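The synthetic-data pipeline described above can be sketched with NumPy. Both kernel forms are assumptions reconstructed from the quoted equations: the latent covariance is read as K_ij = 6·exp(−|i−j|), and Eq. 1's GP mapping is read as a squared-exponential kernel over latent points with σ² = 1 and l = 3. T and N are scaled down from the paper's 1000 points for speed:

```python
import numpy as np

rng = np.random.default_rng(0)
T, M, N = 200, 3, 50  # scaled down from T = 1000 in the paper

# Latent prior (Eq. 12, as reconstructed here): each of the M latent
# dimensions is a GP draw over time with covariance K_ij = 6*exp(-|i-j|).
idx = np.arange(T)
K_z = 6.0 * np.exp(-np.abs(idx[:, None] - idx[None, :]))
Z = rng.multivariate_normal(np.zeros(T), K_z, size=M).T  # shape (T, M)

# GP mapping from latent to observations (one plausible reading of Eq. 1):
# squared-exponential kernel over latent points, sigma^2 = 1, l = 3,
# with a small jitter term for numerical stability.
sq_dist = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
K_x = 1.0 * np.exp(-sq_dist / (2 * 3.0**2)) + 1e-6 * np.eye(T)
X = rng.multivariate_normal(np.zeros(T), K_x, size=N).T  # (T, N) noiseless

# Final noisy observations: x_{t,n} <- x_{t,n} + eps, eps ~ N(0, 0.05^2).
X_noisy = X + rng.normal(0.0, 0.05, size=X.shape)
```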