Generalized Dimension Reduction Using Semi-Relaxed Gromov-Wasserstein Distance

Authors: Ranthony A. Clark, Tom Needham, Thomas Weighill

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply our computational framework to analyze ensembles of political redistricting plans for states with two Congressional districts, achieving an effective visualization of the ensemble as a distribution on a circle which can be used to characterize typical neutral plans, and to flag outliers. Using experiments on the MNIST dataset, we show that SRGW+GD matches or outperforms SMACOF MDS for Euclidean target spaces. We demonstrate its effectiveness for embeddings into spheres on a dataset of rotated MNIST images, and a set of GPS coordinates of cities.
Researcher Affiliation | Academia | (1) Department of Mathematics, Duke University, Durham, NC; (2) Department of Mathematics, Florida State University, Tallahassee, FL; (3) Department of Mathematics & Statistics, University of North Carolina at Greensboro, Greensboro, NC. EMAIL, EMAIL, EMAIL
Pseudocode | No | The algorithm is straightforward: in short, we solve a srGW problem to embed X into a predefined finite subset of Y, then use this as initialization for gradient descent of the MDS functional (1), with Y as the target metric space, which could be more general than R^m. We now provide some details.
Open Source Code | Yes | github.com/thomasweighill/srgw-embedding
Open Datasets | Yes | We use the MNIST dataset consisting of 70,000 28x28 grayscale images of handwritten digits... We use a list of the 20 largest cities [2], with the geodesic distance on the Earth between every pair of cities as ground truth... [2] https://simplemaps.com/data/world-cities, CC-BY 4.0 license... For each of these states we obtained Census block groups from (Manson et al. 2023) and generated an ensemble of 1,000 redistricting plans using the ReCom algorithm (DeFord, Duchin, and Solomon 2021).
Dataset Splits | No | The paper mentions splitting the MNIST dataset into ten smaller datasets (MNIST0 to MNIST9) based on digits, but does not provide explicit train/test/validation splits for the purpose of model training or evaluation. The experiments described involve embedding these entire datasets and calculating distortion.
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory) used for running the experiments. It only mentions 'compute time' in relation to performance.
Software Dependencies | No | The computation of d_{srGW,2}(X, S) is approximated via the srGW implementation in the Python Optimal Transport package (Flamary et al. 2021)... In our examples below, we use the Adam optimizer (Kingma and Ba 2014)... SRGW+GD is between 3 and 14 times faster than SMACOF MDS (using scikit-learn).
Experiment Setup | Yes | Let X = {x_1, ..., x_n} and initialize a set of n points (y_i)_{1 <= i <= n} in Y via y_i = f(x_i). We then consider the distortion function Y^n -> R defined by

(y_1, \dots, y_n) \mapsto \sum_{i,j=1}^{n} \big( d_X(x_i, x_j) - d_Y(y_i, y_j) \big)^2   (8)

(cf. the MDS functional (1)) and use a gradient-based method on Y^n to find a local minimum (ŷ_1, ŷ_2, ..., ŷ_n). Our embedding is then given by f̂(x_i) = ŷ_i. We refer to this method as SRGW+GD. In our examples below, we use the Adam optimizer (Kingma and Ba 2014). In practice, the desired embedding space Y may only be known up to certain hyperparameters, such as scale. Our main example below will be when Y is a circle of unknown radius. If Y depends on a scale factor (or multiple scale factors), we add the scale factor as an additional variable in (8). ... GD denotes minimizing (8) with a random initialization and the Adam optimizer. We use 10 random initializations and report the min and max distortion over all trials.
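For the gradient-descent stage, here is a minimal self-contained sketch of minimizing the distortion functional (8) with a hand-rolled Adam update. This is an illustration under assumptions: a Euclidean target Y = R^2 is used for simplicity (the paper's main example is a circle, with the radius added as an extra variable), and the step size, step count, and toy data are arbitrary choices, not the paper's settings.

```python
import numpy as np

def stress(Y, DX):
    """Distortion functional (8): sum of squared metric distortions."""
    D = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
    return ((DX - D) ** 2).sum()

def stress_grad(Y, DX):
    """Analytic gradient of (8) for a Euclidean target R^2."""
    diff = Y[:, None] - Y[None, :]          # y_i - y_j, shape (n, n, 2)
    D = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(D, 1.0)                # guard against /0 on the diagonal
    coef = (DX - D) / D
    np.fill_diagonal(coef, 0.0)
    # grad_i = -4 * sum_j (DX_ij - D_ij) * (y_i - y_j) / D_ij
    return -4.0 * (coef[:, :, None] * diff).sum(axis=1)

def adam_minimize(DX, Y0, lr=0.05, steps=500,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam (Kingma and Ba 2014) on the embedding coordinates."""
    Y = Y0.copy()
    m = np.zeros_like(Y)
    v = np.zeros_like(Y)
    for t in range(1, steps + 1):
        g = stress_grad(Y, DX)
        m = beta1 * m + (1 - beta1) * g            # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g        # second-moment estimate
        m_hat = m / (1 - beta1 ** t)               # bias correction
        v_hat = v / (1 - beta2 ** t)
        Y -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return Y

# Illustrative run: recover a planar configuration from its distance matrix,
# starting from a random initialization (the paper's GD baseline; SRGW+GD
# would instead initialize from the srGW assignment).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
DX = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
Y0 = rng.normal(size=(30, 2))
Y = adam_minimize(DX, Y0)
```

Extending this to the paper's circle target would mean parametrizing each y_i by an angle, replacing the Euclidean distance with the arc-length distance scaled by a radius r, and treating r as one more optimization variable alongside the angles.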