Probabilistic Autoencoder
Authors: Vanessa M. Böhm, Uroš Seljak
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We back this claim empirically through ablation studies. Specifically, we compare the performance of the PAE to that of equivalent VAEs in a number of tasks which we think are specifically relevant for practical applications: data compression (reconstruction quality), data generation, anomaly detection and probabilistic data denoising and imputation. |
| Researcher Affiliation | Academia | Vanessa Böhm (EMAIL), Berkeley Center for Cosmological Physics, Department of Physics, University of California, Berkeley, CA, USA; Lawrence Berkeley National Laboratory. Uroš Seljak (EMAIL), Berkeley Center for Cosmological Physics, Department of Physics, University of California, Berkeley, CA, USA; Lawrence Berkeley National Laboratory. |
| Pseudocode | No | The paper describes the PAE training process and various models using narrative text and mathematical equations, but it does not contain any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | Yes | We make all of our code publicly available: https://github.com/VMBoehm/PAE-ablation |
| Open Datasets | Yes | We perform our ablation studies on the Fashion-MNIST (Xiao et al., 2017) data set... As outlier data sets we use MNIST (Lecun et al., 1998) and Omniglot (Lake et al., 2015)... We further train a PAE model on the higher dimensional Celeb-A (Liu et al., 2015) data set (Appendix A). |
| Dataset Splits | Yes | We perform our ablation studies on the Fashion-MNIST (Xiao et al., 2017) data set, which we split into 50,000 training examples, 10,000 validation and 10,000 test samples. |
| Hardware Specification | No | The paper discusses various training parameters, model architectures, and optimization methods (e.g., ADAM optimizer), but it does not specify any particular hardware components like GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of the ADAM optimizer and refers to various flow architectures like Real NVP, Neural Spline Flow, and GLOW. However, it does not provide specific software dependencies or library versions (e.g., Python, PyTorch/TensorFlow, CUDA versions) used for the implementation. |
| Experiment Setup | Yes | Training of the encoder/decoder pair: We used the same encoder and decoder architecture for all of our experiments. We further fixed the latent space dimensionality to 40 and the number of training steps to 300,000, and used a learning rate schedule in which we keep the learning rate constant at the initial value up to training step 100,000, then reduce it linearly down to 1/10 of the initial rate over 50,000 steps. For the remaining 150,000 steps we restart the learning rate and repeat the annealing scheme. We used the ADAM optimizer (Kingma & Ba, 2015) with parameters β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁷ in all of our trainings... The batch size, initial learning rate, sample size in the stochastic evaluation of the ELBO as well as the dropout rate were optimized... The values we obtained through this procedure are outlined in table 2... |
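The learning-rate schedule quoted above (hold, linear anneal to 1/10, restart, repeat) is concrete enough to sketch as a step-indexed function. The sketch below is an illustration, not the authors' code; the base rate `base_lr=1e-3` is a placeholder, since the paper says the initial rate was tuned per model (their table 2).

```python
def pae_lr_schedule(step, base_lr=1e-3, hold=100_000, decay=50_000):
    """Learning rate at a given training step, following the schedule
    described in the paper: constant for `hold` steps, then linear
    annealing down to base_lr/10 over `decay` steps, then the whole
    cycle restarts for the remaining steps (300k total = 2 cycles).
    `base_lr` is a hypothetical value; the paper tunes it per model.
    """
    cycle = hold + decay          # one hold-plus-anneal cycle (150k steps)
    s = step % cycle              # restart the schedule every cycle
    if s < hold:
        return base_lr            # constant phase
    frac = (s - hold) / decay     # 0 -> 1 across the annealing window
    return base_lr * (1.0 - 0.9 * frac)  # linear: base_lr -> base_lr/10
```

With a TensorFlow or PyTorch ADAM optimizer this would be applied per step using the paper's stated parameters (β1 = 0.9, β2 = 0.999, ϵ = 1e-7), e.g. by setting the optimizer's learning rate to `pae_lr_schedule(step)` before each update.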