On Probabilistic Embeddings in Optimal Dimension Reduction

Authors: Ryan Murray, Adam Pickarski

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This work considers a generalized version of multidimensional scaling, which seeks to construct a map from high to low dimension that best preserves pairwise inner products or norms. We investigate the variational properties of this problem, leading to the following insights: 1) particle-wise descent methods implemented in standard libraries can produce non-deterministic embeddings; 2) a probabilistic formulation leads to solutions with interpretable necessary conditions; and 3) the globally optimal solutions to the relaxed, probabilistic problem are attained only by deterministic embeddings. This progression of results mirrors the classical development of optimal transportation, and in a case relating to the Gromov-Wasserstein distance it gives explicit insight into the structure of the optimal embeddings, which are parametrically determined and discontinuous on smooth surfaces. Our results also imply that a standard computational implementation of this problem learns sub-optimal mappings, and we discuss how the embeddings learned in that context have highly misleading clustering structure, underscoring the delicate nature of solving this problem computationally.
Researcher Affiliation | Academia | Ryan Murray (EMAIL), Department of Mathematics, North Carolina State University, Raleigh, NC 27695, USA; Adam Pickarski (EMAIL), Division of Mathematics, North Carolina State University, Raleigh, NC 27695, USA
Pseudocode | No | The paper presents mathematical proofs and theoretical concepts; it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "Code generating this figure is available on the first author's github."
Open Datasets | No | "We consider the problem of embedding a particular point cloud in R2 into R. The point cloud that we choose has 1,000 points placed at (0, .2), as well as 250 points placed randomly upon the unit circle." No concrete access information for a public dataset is provided.
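The synthetic point cloud quoted above is simple enough to reconstruct. A minimal sketch is below; the variable names and random seed are assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is an assumption for reproducibility

# 1,000 points all placed at (0, 0.2)
cluster = np.tile([0.0, 0.2], (1000, 1))

# 250 points placed uniformly at random on the unit circle
theta = rng.uniform(0.0, 2.0 * np.pi, size=250)
circle = np.column_stack([np.cos(theta), np.sin(theta)])

# Full point cloud in R^2, to be embedded into R
X = np.vstack([cluster, circle])  # shape (1250, 2)
```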
Dataset Splits | No | The paper presents a theoretical analysis and illustrates concepts with a synthetic point cloud example. There is no mention of dataset splits such as training, validation, or test sets.
Hardware Specification | No | The paper mentions using "the built-in algorithm for metric multidimensional scaling in Scikit-learn (Pedregosa et al., 2011)" for an example, but does not provide any specific hardware details such as CPU or GPU models or memory specifications.
Software Dependencies | No | "When we utilize the built-in algorithm for metric multidimensional scaling in Scikit-learn (Pedregosa et al., 2011)..." The paper mentions Scikit-learn but does not specify a version number or other software dependencies with their versions.
Experiment Setup | No | For the numerical example, the paper describes the point cloud generation and the use of Scikit-learn's MDS algorithm with an initial guess, but it does not provide specific experimental setup details such as hyperparameters, optimization settings, or other training configurations.
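For context on what such an experiment looks like, here is a hedged sketch of running Scikit-learn's built-in metric MDS on the paper's point cloud with a supplied initial guess. The seed, the choice of random initial embedding, and `n_init=1` are assumptions, not settings reported in the paper:

```python
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)  # seed is an assumption

# Point cloud from the paper's example: 1,000 points at (0, 0.2)
# plus 250 random points on the unit circle
cluster = np.tile([0.0, 0.2], (1000, 1))
theta = rng.uniform(0.0, 2.0 * np.pi, size=250)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
X = np.vstack([cluster, circle])

# Metric MDS into one dimension. Passing `init` to fit_transform fixes
# the initial guess; with an explicit init, a single restart is used.
init = rng.normal(size=(X.shape[0], 1))  # assumed random initial guess
mds = MDS(n_components=1, metric=True, n_init=1, random_state=0)
embedding = mds.fit_transform(X, init=init)  # shape (1250, 1)
```

This is only an illustration of the interface the paper refers to; the actual hyperparameters used by the authors are not reported.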