Statistical and Computational Guarantees of Kernel Max-Sliced Wasserstein Distances

Authors: Jie Wang, March Boedihardjo, Yao Xie

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide numerical examples to demonstrate the good performance of our scheme for high-dimensional two-sample testing. ... This section presents experiment results for the KMS 2-Wasserstein distance solved via SDR with a first-order algorithm and rank reduction (denoted SDR-Efficient). Baseline approaches include the block coordinate descent (BCD) algorithm [75], which finds stationary points of the KMS 2-Wasserstein distance, and the interior point method (IPM) of the off-the-shelf solver cvxpy [19] applied to the SDR relaxation (denoted SDR-IPM). ... We first compare our approach to baseline methods in terms of running time and solution quality. ... Then we validate the performance of the KMS 2-Wasserstein distance for high-dimensional two-sample testing using both synthetic and real datasets. ... We evaluate the performance of the KMS Wasserstein distance in detecting human activity transitions... Finally, we examine the performance of various statistical divergences in generative modeling..."
Researcher Affiliation | Academia | "1 School of Artificial Intelligence, The Chinese University of Hong Kong, Shenzhen, Shenzhen, China; 2 School of Data Science, The Chinese University of Hong Kong, Shenzhen, Shenzhen, China; 3 Department of Mathematics, Michigan State University, East Lansing, USA; 4 School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, USA. Correspondence to: Yao Xie <EMAIL>."
Pseudocode | Yes | Algorithm 1: Inexact Mirror Ascent for solving (SDR); Algorithm 2: Stochastic Gradient-based Algorithm with Katyusha Momentum for solving OT [82]; Algorithm 3: Round to Γn ([1, Algorithm 2]); Algorithm 4: Rank reduction algorithm for (SDR).
Open Source Code | No | The text references third-party code the authors used, but does not provide their own implementation of the paper's core methodology. Specifically, it states: "we use the exact algorithm adopted from https://pythonot.github.io/ to solve the inner OT; whereas for large sample size, we use the approximation algorithm adopted from https://github.com/YilingXie27/PDASGD to solve this subproblem. For the baseline BCD approach, we implement it using the code from github.com/WalterBabyRudin/KPW_Test/tree/main."
Open Datasets | Yes | "MNIST [15] and CIFAR-10 [40] with changes in distribution abundance. ... The MSRC-12 Kinect gesture dataset [21] contains sequences of human body movements..."
Dataset Splits | Yes | "(I) We first do the 50%-50% training-testing data split such that x_n = x^Tr ∪ x^Te and y_n = y^Tr ∪ y^Te."
Hardware Specification | Yes | "All experiments were conducted on a MacBook Pro with an Intel Core i9 at 2.4 GHz and 16 GB of memory."
Software Dependencies | No | The paper mentions specific software packages such as cvxpy [19] and POT [20] and points to code repositories for implementations, but it does not provide version numbers for these or any other ancillary software components (e.g., "Python 3.8, PyTorch 1.9, and CUDA 11.1").
Experiment Setup | Yes | "Unless otherwise stated, error bars are reproduced using 20 independent trials. Throughout the experiments, we specify the kernel as Gaussian, with the bandwidth being the median of pairwise distances between data points. ... The type-I error is controlled within 0.05 for all methods. ... L = 500 times. ... We employ a sliding window approach [81] with a false alarm rate of 0.01... We specify fθ as a 4-layer feed-forward neural network with leaky ReLU activation, and θ denotes its weight parameters. We train the optimization algorithm for 30 epochs."
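The experiment setup quotes a Gaussian kernel whose bandwidth is the median of pairwise distances between data points (the "median heuristic"). A minimal sketch of that setup follows; note that the exact kernel convention (e.g., whether the exponent divides by 2h² or h²) is not stated in the excerpt and is an assumption here:

```python
import numpy as np

def median_bandwidth(X):
    """Median of pairwise Euclidean distances (median heuristic)."""
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(-1))
    # Use the strict upper triangle to exclude self-distances (all zero).
    return np.median(dists[np.triu_indices_from(dists, k=1)])

def gaussian_kernel(X, Y, h):
    """Gaussian kernel matrix K[i, j] = exp(-||x_i - y_j||^2 / (2 h^2)).
    The 2h^2 scaling is one common convention, assumed here."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * h ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
h = median_bandwidth(X)
K = gaussian_kernel(X, X, h)  # 50 x 50 kernel matrix with unit diagonal
```

With this choice the kernel matrix is symmetric with ones on the diagonal, and the bandwidth adapts to the scale of the data without tuning.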
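The Open Source Code row quotes the authors' use of an exact solver from POT (https://pythonot.github.io/) for the inner OT subproblem. As a self-contained illustration of what that exact solve computes, the sketch below substitutes SciPy's assignment solver for POT, valid only for the special case of two uniform empirical measures of equal size, where exact OT reduces to a linear assignment problem:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def exact_ot_cost(X, Y):
    """Exact squared 2-Wasserstein cost between two uniform empirical
    measures of equal size: OT reduces to a linear assignment problem."""
    # Squared Euclidean cost matrix M[i, j] = ||x_i - y_j||^2.
    M = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    row, col = linear_sum_assignment(M)  # optimal one-to-one matching
    return M[row, col].mean()

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 2))
# Translating a sample by a vector c shifts the squared cost by ||c||^2,
# so this should return exactly 9.0 for c = (3, 0).
cost = exact_ot_cost(X, X + np.array([3.0, 0.0]))
```

For unequal sample sizes or non-uniform weights the assignment reduction no longer applies, which is presumably why the authors use a general OT solver.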