Efficient Gradient Flows in Sliced-Wasserstein Space
Authors: Clément Bonet, Nicolas Courty, François Septier, Lucas Drumetz
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we show that by approximating sliced-Wasserstein gradient flows using the SW-JKO scheme (17), we are able to minimize functionals as well as Wasserstein gradient flows approximated by the JKO-ICNN scheme and with a better computational complexity. We first evaluate the ability to learn the stationary density for the Fokker-Planck equation (8) in the Gaussian case, and in the context of Bayesian Logistic Regression. Then, we evaluate it on an Aggregation equation. Finally, we use SW as a functional with image datasets as target, and compare the results with Sliced-Wasserstein flows introduced in (Liutkus et al., 2019). |
| Researcher Affiliation | Academia | Clément Bonet, Université Bretagne Sud, CNRS, LMBA, Vannes, France; Nicolas Courty, Université Bretagne Sud, CNRS, IRISA, Vannes, France; François Septier, Université Bretagne Sud, CNRS, LMBA, Vannes, France; Lucas Drumetz, IMT Atlantique, CNRS, Lab-STICC, Brest, France |
| Pseudocode | Yes | Algorithm 1 SW-JKO with Generative Models |
| Open Source Code | No | For the JKO-ICNN scheme, we use our own implementation. ... The datasets are loaded using the code of Mokrov et al. (2021) (https://github.com/PetrMokrov/Large-Scale-Wasserstein-Gradient-Flows). ... We choose the same AE as Liutkus et al. (2019) which is available at https://github.com/aliutkus/swf/blob/master/code/networks/autoencoder.py. ... For SWF in the latent space, we used our own implementation. ... We use the code of Dai & Seljak (2021) (available at https://github.com/biweidai/SINF) -- The paper describes its own implementation for some components and refers to external codebases, but it lacks an explicit statement or a direct link to the source code for the main methodology (SW-JKO scheme) described in the paper. |
| Open Datasets | Yes | Applied on images such as MNIST (LeCun & Cortes, 2010), Fashion MNIST (Xiao et al., 2017) or CelebA (Liu et al., 2015)... |
| Dataset Splits | Yes | We split the dataset between train set and test set with a 4:1 ratio. |
| Hardware Specification | Yes | Here, the training took around 5 hours on an RTX 2080 Ti (for 100 steps), versus 20 minutes for the FCNN and 10 minutes for 1000 particles (for 200 steps). |
| Software Dependencies | No | Our experiments were conducted using PyTorch (Paszke et al., 2019). We use the SciPy implementation (Virtanen et al., 2020) of `gaussian_kde`. -- Specific version numbers for PyTorch and SciPy are not provided. |
| Experiment Setup | Yes | We choose τ = 0.1 and performed 80 SW-JKO steps. ... We use an Adam optimizer (Kingma & Ba, 2014) with a learning rate of 10^-4 for RealNVP (except for the 1st iteration, where we take a learning rate of 5 * 10^-3) and of 5 * 10^-3 for JKO-ICNN. At each inner optimization step, we start from a deep copy of the last neural network, and optimize RealNVP for 200 epochs and ICNNs for 500 epochs, with a batch size of 1024. ... Table 4 details hyperparameters (e.g., nl, nh, lr, JKO steps, iterations per step, τ, batch size) for various datasets. |
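The SW-JKO scheme quoted in the Research Type row is built on the sliced-Wasserstein distance, which is typically estimated by Monte Carlo over random projection directions: project both point clouds onto each direction, sort, and average the 1D squared transport costs. A minimal sketch of that estimator (the function name, the number of projections, and the use of NumPy rather than the paper's PyTorch are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def sliced_wasserstein_sq(x, y, n_proj=50, rng=None):
    """Monte Carlo estimate of SW_2^2 between point clouds x, y of shape (n, d)."""
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    # random directions, normalized onto the unit sphere S^{d-1}
    theta = rng.standard_normal((n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # project both clouds onto each direction and sort: (n, n_proj)
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)
    # each 1D OT cost is the mean squared gap between sorted samples;
    # average over projections approximates the integral over directions
    return ((x_proj - y_proj) ** 2).mean()
```

Because the 1D optimal transport plan is given by sorting, each step is O(n log n), which is the source of the computational advantage over JKO-ICNN claimed in the quoted excerpt.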
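The Experiment Setup row describes the inner loop of Algorithm 1: each SW-JKO step minimizes SW_2^2(μ, μ_k)/(2τ) + F(μ) with Adam, warm-started from the previous measure (the paper deep-copies the last network). A hypothetical PyTorch sketch using a free-particle parametrization as a stand-in for the paper's RealNVP/FCNN generative models (all names, the functional, and the hyperparameters below are illustrative):

```python
import torch

def sw2_sq(x, y, n_proj=50):
    """Monte Carlo SW_2^2 between point clouds x, y of shape (n, d)."""
    theta = torch.randn(n_proj, x.shape[1])
    theta = theta / theta.norm(dim=1, keepdim=True)
    xp, _ = torch.sort(x @ theta.T, dim=0)
    yp, _ = torch.sort(y @ theta.T, dim=0)
    return ((xp - yp) ** 2).mean()

def sw_jko_step(prev_particles, functional, tau=0.1, n_iters=100, lr=0.05):
    """One SW-JKO step: argmin_mu SW_2^2(mu, mu_k)/(2*tau) + F(mu),
    with mu parametrized by free particles (illustrative stand-in for a
    generative model) and optimized with Adam, warm-started at mu_k."""
    prev = prev_particles.detach()
    x = prev.clone().requires_grad_(True)  # deep copy of the previous state
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(n_iters):
        opt.zero_grad()
        loss = sw2_sq(x, prev) / (2 * tau) + functional(x)
        loss.backward()
        opt.step()
    return x.detach()
```

Iterating this step with, e.g., a potential energy F(μ) = ∫ V dμ discretizes the corresponding gradient flow; the paper instead re-optimizes a normalizing flow at each outer step, which is what the quoted per-step epoch counts and learning rates refer to.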
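The SciPy dependency flagged in the Software Dependencies row, `gaussian_kde`, fits a kernel density estimate to samples so that a density (rather than just particles) can be evaluated, e.g. to compare against a known stationary target. Basic usage of the documented SciPy API (the sample sizes and query points below are illustrative):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
samples = rng.standard_normal((2, 2000))   # gaussian_kde expects shape (d, n)
kde = gaussian_kde(samples)                # bandwidth via Scott's rule by default
density_at_origin = kde(np.zeros((2, 1)))  # evaluate fitted density at points (d, m)
```

For a standard 2D Gaussian the true density at the origin is 1/(2π) ≈ 0.159, so the fitted value should land near that with enough samples.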