Cramer-Wold Auto-Encoder

Authors: Szymon Knop, Przemysław Spurek, Jacek Tabor, Igor Podolak, Marcin Mazur, Stanisław Jastrzębski

JMLR 2020

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "In this section we empirically validate the proposed CWAE model on standard benchmarks for generative models: CELEB A, CIFAR-10, MNIST, and Fashion MNIST. We compare the proposed CWAE model with WAE-MMD (Tolstikhin et al., 2017) and SWAE (Kolouri et al., 2018). As we shall see, our results match, or even exceed, those of WAE-MMD and SWAE, while using a closed-form cost function. The rest of this section is structured as follows. In Subsection 7.2 we report the results of the standard qualitative tests, as well as visual investigations of the latent space. In Subsection 7.3 we turn our attention to quantitative tests using the Fréchet Inception Distance and other metrics (Heusel et al., 2017)."
Researcher Affiliation | Academia | Faculty of Mathematics and Computer Science, Jagiellonian University, Kraków, Poland; Center of Data Science / Department of Radiology, New York University, New York, United States
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. It primarily uses mathematical derivations and textual descriptions for its methods.
Open Source Code | Yes | "The code is available at https://github.com/gmum/cwae."
Open Datasets | Yes | "In this section we empirically validate the proposed CWAE model on standard benchmarks for generative models: CELEB A, CIFAR-10, MNIST, and Fashion MNIST."
Dataset Splits | No | The paper mentions using a "test set" and "validation data-set" (e.g., in Section 7.2 and the Figure 8 caption) for standard datasets such as MNIST, Fashion-MNIST, CIFAR-10, and CELEB A, but it does not state the percentages or sample counts, nor cite the specific splits needed to reproduce the data partitioning.
Hardware Specification | No | The paper makes general statements such as "The times may differ for computer architectures with more/less memory on a GPU card" (Figure 10 caption) but does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for the experiments.
Software Dependencies | No | The paper mentions the use of the "Adam optimiser (Kingma and Ba, 2014)" but does not specify any software libraries (e.g., PyTorch, TensorFlow) or their version numbers that would be required to replicate the experiments.
Experiment Setup | Yes | "For the WAE, SWAE, and CWAE models and each data-set we performed a grid search over the parameter λ ∈ {1, 5, 10, 100} and learning rate values from {0.01, 0.005, 0.001, 0.0005, 0.0001}. All models were trained on 128-element mini-batches. For every model, we report results for the configuration that achieved the lowest FID score. All networks were trained with the Adam optimiser (Kingma and Ba, 2014). The hyper-parameters used were learning rate = 0.001, β1 = 0.9, β2 = 0.999, ϵ = 1e-8. MNIST and CIFAR-10 models were trained for 500 epochs, CELEB A for 55."
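The reported sweep (λ ∈ {1, 5, 10, 100} crossed with five learning rates, picking the configuration with the lowest FID score) can be sketched as below. This is a minimal illustration, not the authors' code: `train_and_score` is a hypothetical placeholder returning a dummy score, where a real run would train the model (500 epochs for MNIST/CIFAR-10, 55 for CELEB A, batch size 128) and compute FID; the Adam settings mirror the hyper-parameters quoted from the paper.

```python
import itertools

# Grid reported in the paper: lambda and learning-rate candidates.
LAMBDAS = [1, 5, 10, 100]
LEARNING_RATES = [0.01, 0.005, 0.001, 0.0005, 0.0001]

# Adam hyper-parameters quoted from the paper; batch size as reported.
ADAM_CONFIG = {"learning_rate": 0.001, "beta1": 0.9, "beta2": 0.999, "epsilon": 1e-8}
BATCH_SIZE = 128

def train_and_score(lam, lr):
    """Hypothetical stand-in for training one model configuration and
    returning its FID score. A real implementation would train the
    CWAE/WAE/SWAE model and evaluate FID on held-out data."""
    # Dummy deterministic score so the sketch runs end to end.
    return abs(lam - 5) + abs(lr - 0.001)

def grid_search():
    """Return the (lambda, learning-rate) pair with the lowest FID."""
    best_config, best_fid = None, float("inf")
    for lam, lr in itertools.product(LAMBDAS, LEARNING_RATES):
        fid = train_and_score(lam, lr)
        if fid < best_fid:
            best_config, best_fid = (lam, lr), fid
    return best_config, best_fid

if __name__ == "__main__":
    best, fid = grid_search()
    print("best config:", best, "fid:", fid)
```

The sweep evaluates 4 × 5 = 20 configurations per model and data-set; with the dummy scorer above the minimum falls at (5, 0.001).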