Partial Optimal Transport for Support Subset Selection

Authors: Bilal Riaz, Yuksel Karahan, Austin J. Brockmeier

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the usefulness of this support subset selection in applications such as color transfer, partial point cloud alignment, and semi-supervised machine learning, where part of the data is curated to have reliable labels and another part is unlabeled or has unreliable labels. Our experiments show that optimal transport under the relaxed constraint can improve the performance of these applications by allowing for more flexible alignment between distributions.
Researcher Affiliation | Academia | Bilal Riaz (EMAIL), Department of Electrical & Computer Engineering, University of Delaware; Yüksel Karahan (EMAIL), Department of Electrical & Computer Engineering, University of Delaware; Austin J. Brockmeier (EMAIL), Department of Electrical & Computer Engineering and Department of Computer & Information Sciences, University of Delaware.
Pseudocode | Yes | Algorithm 1 (SS-Entropic): fast proximal gradient algorithm to solve the dual problem (17) of the entropically regularized support subset selection problem (13). Algorithm 2 (SS-Bregman): inexact Bregman proximal point algorithm to approximately solve (6) via (25).
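The paper's algorithms are not reproduced here, but the "fast proximal gradient" template that Algorithm 1 instantiates can be illustrated on a toy composite problem. The objective, step size, and variable names below are illustrative assumptions and not the paper's dual problem (17); this is a FISTA-style sketch of the general scheme.

```python
import numpy as np

def fista(grad_f, prox_g, x0, step, iters=500):
    """FISTA-style fast proximal gradient for min_x f(x) + g(x):
    grad_f is the gradient of the smooth part, prox_g the proximal
    operator of the nonsmooth part, and step is 1/L for L-Lipschitz grad_f."""
    x = x0.copy()
    y = x0.copy()
    t = 1.0
    for _ in range(iters):
        x_next = prox_g(y - step * grad_f(y))            # proximal gradient step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x

# Toy instance: f(x) = 0.5*||Ax - b||^2 with g the indicator of the
# nonnegative orthant, so prox_g is a clamp at zero (as in a dual with
# sign constraints). A, b, and the dimensions are arbitrary test data.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L, L = spectral norm squared
x_star = fista(lambda x: A.T @ (A @ x - b),
               lambda z: np.maximum(z, 0.0),
               np.zeros(5), step)
```

The accelerated extrapolation step is what distinguishes this "fast" variant from plain proximal gradient descent, giving an O(1/k²) rather than O(1/k) convergence rate on the smooth part.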
Open Source Code | Yes | Code Availability: Code for the paper is available at https://github.com/Bilal092/support-subset-selection.
Open Datasets | Yes | We applied subset selection to PU learning using the experimental settings adapted from the work of Chapel et al. (2020), who explored using the PU-Wasserstein (PUW) (9) and the partial Gromov-Wasserstein (PGW) distances on various UCI, MNIST, colored-MNIST, and Caltech-office data sets. Results for the Stanford bunny and armadillo (Turk & Levoy, 1994; Krishnamurthy & Levoy, 1996) are shown in Figure 5.
Dataset Splits | Yes | For the UCI, MNIST, and colored-MNIST data sets, we randomly draw m = 400 positive points and n = 800 unlabeled points. For the Caltech-office data sets, we randomly sampled m = 100 positive points from the first domain and n = 100 unlabeled points from the second domain. To evaluate our approach we used MNIST, Fashion-MNIST (FMNIST), and CIFAR-10. We split the training data sets into 80/20 proportions for training and validation. We further split the training part into reliably labeled and unreliably labeled parts. Labels for the unreliably labeled part are generated by uniformly corrupting the true labels to other classes depending on the noise level.
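The split-and-corrupt protocol quoted above can be sketched in a few lines. The 80/20 split and the uniform corruption to other classes follow the quote; the function name, the reliable/unreliable proportion, and the noise level are illustrative assumptions, since the quote does not fix them.

```python
import numpy as np

def split_and_corrupt(labels, num_classes, reliable_frac=0.5, noise_level=0.4, seed=0):
    """Split indices 80/20 into train/validation, split the training part into
    reliably and unreliably labeled parts (reliable_frac is an assumption),
    and corrupt each unreliable label to a uniformly chosen *other* class
    with probability noise_level."""
    rng = np.random.default_rng(seed)
    n = len(labels)
    idx = rng.permutation(n)
    n_train = int(0.8 * n)                      # 80/20 train/validation split
    train_idx, val_idx = idx[:n_train], idx[n_train:]
    n_rel = int(reliable_frac * n_train)        # reliable vs. unreliable split
    reliable_idx, unreliable_idx = train_idx[:n_rel], train_idx[n_rel:]
    noisy = labels.copy()
    flip = rng.random(len(unreliable_idx)) < noise_level
    for i in unreliable_idx[flip]:
        # corrupt uniformly to one of the *other* classes
        noisy[i] = rng.choice([c for c in range(num_classes) if c != labels[i]])
    return train_idx, val_idx, reliable_idx, unreliable_idx, noisy

# Example with 1000 synthetic 10-class labels:
labels = np.random.default_rng(1).integers(0, 10, size=1000)
tr, va, rel, unrel, noisy = split_and_corrupt(labels, num_classes=10)
```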
Hardware Specification | Yes | Model training for MNIST and Fashion-MNIST was done on a desktop system with an Intel Core i7-9700 CPU, 32 GB of memory, and an NVIDIA GeForce RTX 2070 GPU. ResNet-18-based models for CIFAR-10 were trained using Lambda Labs cloud resources with 30 vCPUs, 200 GB of memory, and NVIDIA A10 GPUs.
Software Dependencies | No | The paper mentions "PyTorch-based automatic differentiation for gradient evaluation (Paszke et al., 2017) and the Adam optimizer (Kingma & Ba, 2014)", "We used the PyTorch framework for our experiments.", and "We used the POT toolbox (Flamary et al., 2021) to solve the semi-relaxed optimal transport problems." It also mentions CVXPY (Diamond & Boyd, 2016; Agrawal et al., 2018). However, specific version numbers for these software packages are not provided in the text.
Experiment Setup | Yes | SS-Entropic is run for 10,000 iterations with γ = 0.1. SS-Bregman is run with max-outer-iter = 100, max-inner-iter = 100, and λ = 0.1. For subset selection, the c parameter is varied log-uniformly in 32 steps between 1 and 32. For both the squared ℓ2-norm and total-variation penalties, the regularization parameter ρ is varied in 32 log-uniform steps between 10⁻³ and 10³. For each of our experiments, the subset selection transport underlying the loss is found via SS-Bregman with λ = 0.1, max-outer-iter = max-inner-iter = 20, and a batch size of 512. We used the PyTorch framework for our experiments. A ResNet-18 model architecture is used on the CIFAR-10 data set. We trained the ResNet-18 for 180 epochs using the Adam optimizer with an initial learning rate of 0.001, which is scheduled to be halved after every 60 epochs. The model architectures containing two convolutional layers for MNIST and FMNIST are given in Appendix F. The neural-network classification models for MNIST are trained using stochastic gradient descent with a learning rate of 0.001, whereas models for FMNIST are trained using Adam with a learning rate of 0.001 and weight decay 1e-4.
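The hyperparameter sweeps and the step-wise learning-rate schedule quoted above reduce to a few lines. The grid ranges and the halve-every-60-epochs rule follow the quote; the helper name `cifar_lr` is illustrative.

```python
import numpy as np

# c parameter varied log-uniformly in 32 steps between 1 and 32
c_grid = np.logspace(np.log10(1), np.log10(32), num=32)

# regularization parameter rho varied in 32 log-uniform steps between 1e-3 and 1e3
rho_grid = np.logspace(-3, 3, num=32)

def cifar_lr(epoch, base_lr=1e-3, halve_every=60):
    """Step schedule for the ResNet-18 / CIFAR-10 run: initial learning
    rate 0.001, halved after every 60 epochs."""
    return base_lr * 0.5 ** (epoch // halve_every)
```

In PyTorch this schedule corresponds to wrapping Adam with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=60, gamma=0.5)` and stepping it once per epoch.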