Cramer-Wold Auto-Encoder

Authors: Szymon Knop, Przemysław Spurek, Jacek Tabor, Igor Podolak, Marcin Mazur, Stanisław Jastrzębski

JMLR 2020

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "In this section we empirically validate the proposed CWAE model on standard benchmarks for generative models: CELEB A, CIFAR-10, MNIST, and Fashion MNIST. We compare the proposed CWAE model with WAE-MMD (Tolstikhin et al., 2017) and SWAE (Kolouri et al., 2018). As we shall see, our results match, or even exceed, those of WAE-MMD and SWAE, while using a closed-form cost function. The rest of this section is structured as follows. In Subsection 7.2 we report the results of the standard qualitative tests, as well as visual investigations of the latent space. In Subsection 7.3 we turn our attention to quantitative tests using the Fréchet Inception Distance and other metrics (Heusel et al., 2017)."
Researcher Affiliation | Academia | Faculty of Mathematics and Computer Science, Jagiellonian University, Kraków, Poland; Center of Data Science / Department of Radiology, New York University, New York, United States
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. It primarily uses mathematical derivations and textual descriptions for its methods.
Open Source Code | Yes | "The code is available at https://github.com/gmum/cwae."
Open Datasets | Yes | "In this section we empirically validate the proposed CWAE model on standard benchmarks for generative models: CELEB A, CIFAR-10, MNIST, and Fashion MNIST."
Dataset Splits | No | The paper mentions using a "test set" and "validation data-set" (e.g., in Section 7.2 and the Figure 8 caption) for standard datasets such as MNIST, Fashion-MNIST, CIFAR-10, and CELEB A, but it does not state the percentages or sample counts, nor cite the specific splits needed to reproduce the data partitioning.
Hardware Specification | No | The paper makes general statements such as "The times may differ for computer architectures with more/less memory on a GPU card" (Figure 10 caption) but does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for the experiments.
Software Dependencies | No | The paper mentions the use of the "Adam optimiser (Kingma and Ba, 2014)" but does not specify any software libraries (e.g., PyTorch, TensorFlow) or their version numbers that would be required to replicate the experiments.
Experiment Setup | Yes | "For the WAE, SWAE, and CWAE models and each data-set we performed a grid search over the parameter λ ∈ {1, 5, 10, 100} and learning rate values from {0.01, 0.005, 0.001, 0.0005, 0.0001}. All models were trained on 128-element mini-batches. For every model, we report results for the configuration that achieved the lowest FID score. All networks were trained with the Adam optimiser (Kingma and Ba, 2014). The hyper-parameters used were learning rate = 0.001, β1 = 0.9, β2 = 0.999, ϵ = 1e-8. MNIST and CIFAR-10 models were trained for 500 epochs, CELEB A for 55."
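The reported sweep (λ ∈ {1, 5, 10, 100} crossed with five learning rates, picking the configuration with the lowest FID score) can be sketched as below. This is a minimal illustration, not the authors' code: `train_and_score` is a hypothetical placeholder returning a dummy score, where a real run would train the model (500 epochs for MNIST/CIFAR-10, 55 for CELEB A, batch size 128) and compute FID; the Adam settings mirror the hyper-parameters quoted from the paper.

```python
import itertools

# Grid reported in the paper: lambda and learning-rate candidates.
LAMBDAS = [1, 5, 10, 100]
LEARNING_RATES = [0.01, 0.005, 0.001, 0.0005, 0.0001]

# Adam hyper-parameters quoted from the paper; batch size as reported.
ADAM_CONFIG = {"learning_rate": 0.001, "beta1": 0.9, "beta2": 0.999, "epsilon": 1e-8}
BATCH_SIZE = 128

def train_and_score(lam, lr):
    """Hypothetical stand-in for training one model configuration and
    returning its FID score. A real implementation would train the
    CWAE/WAE/SWAE model and evaluate FID on held-out data."""
    # Dummy deterministic score so the sketch runs end to end.
    return abs(lam - 5) + abs(lr - 0.001)

def grid_search():
    """Return the (lambda, learning-rate) pair with the lowest FID."""
    best_config, best_fid = None, float("inf")
    for lam, lr in itertools.product(LAMBDAS, LEARNING_RATES):
        fid = train_and_score(lam, lr)
        if fid < best_fid:
            best_config, best_fid = (lam, lr), fid
    return best_config, best_fid

if __name__ == "__main__":
    best, fid = grid_search()
    print("best config:", best, "fid:", fid)
```

The sweep evaluates 4 × 5 = 20 configurations per model and data-set; with the dummy scorer above the minimum falls at (5, 0.001).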