Probabilistic Autoencoder
Authors: Vanessa M. Böhm, Uroš Seljak
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We back this claim empirically through ablation studies. Specifically, we compare the performance of the PAE to that of equivalent VAEs in a number of tasks which we think are specifically relevant for practical applications: data compression (reconstruction quality), data generation, anomaly detection and probabilistic data denoising and imputation. |
| Researcher Affiliation | Academia | Vanessa Böhm (EMAIL), Berkeley Center for Cosmological Physics, Department of Physics, University of California, Berkeley, CA, USA; Lawrence Berkeley National Laboratory. Uroš Seljak (EMAIL), Berkeley Center for Cosmological Physics, Department of Physics, University of California, Berkeley, CA, USA; Lawrence Berkeley National Laboratory. |
| Pseudocode | No | The paper describes the PAE training process and various models using narrative text and mathematical equations, but it does not contain any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | Yes | We make all of our code publicly available: https://github.com/VMBoehm/PAE-ablation |
| Open Datasets | Yes | We perform our ablation studies on the Fashion-MNIST (Xiao et al., 2017) data set... As outlier data sets we use MNIST (Lecun et al., 1998) and Omniglot (Lake et al., 2015)... We further train a PAE model on the higher dimensional Celeb-A (Liu et al., 2015) data set (Appendix A). |
| Dataset Splits | Yes | We perform our ablation studies on the Fashion-MNIST (Xiao et al., 2017) data set, which we split into 50,000 training examples, 10,000 validation and 10,000 test samples. |
| Hardware Specification | No | The paper discusses various training parameters, model architectures, and optimization methods (e.g., ADAM optimizer), but it does not specify any particular hardware components like GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of the ADAM optimizer and refers to various flow architectures like Real NVP, Neural Spline Flow, and GLOW. However, it does not provide specific software dependencies or library versions (e.g., Python, PyTorch/TensorFlow, CUDA versions) used for the implementation. |
| Experiment Setup | Yes | Training of the encoder/decoder pair: We used the same encoder and decoder architecture for all of our experiments. We further fixed the latent space dimensionality to 40 and the number of training steps to 300,000, and used a learning rate schedule in which we keep the learning rate constant at the initial value up to training step 100,000, then reduce it linearly down to 1/10 of the initial rate over 50,000 steps. For the remaining 150,000 steps we restart the learning rate and repeat the annealing scheme. We used the ADAM optimizer (Kingma & Ba, 2015) with parameters β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁷ in all of our trainings... The batch size, initial learning rate, sample size in the stochastic evaluation of the ELBO as well as the dropout rate were optimized... The values we obtained through this procedure are outlined in table 2... |
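The learning-rate schedule quoted above (hold, linear anneal to 1/10, restart, repeat) is concrete enough to sketch as a step-indexed function. The sketch below is an illustration, not the authors' code; the base rate `base_lr=1e-3` is a placeholder, since the paper says the initial rate was tuned per model (their table 2).

```python
def pae_lr_schedule(step, base_lr=1e-3, hold=100_000, decay=50_000):
    """Learning rate at a given training step, following the schedule
    described in the paper: constant for `hold` steps, then linear
    annealing down to base_lr/10 over `decay` steps, then the whole
    cycle restarts for the remaining steps (300k total = 2 cycles).
    `base_lr` is a hypothetical value; the paper tunes it per model.
    """
    cycle = hold + decay          # one hold-plus-anneal cycle (150k steps)
    s = step % cycle              # restart the schedule every cycle
    if s < hold:
        return base_lr            # constant phase
    frac = (s - hold) / decay     # 0 -> 1 across the annealing window
    return base_lr * (1.0 - 0.9 * frac)  # linear: base_lr -> base_lr/10
```

With a TensorFlow or PyTorch ADAM optimizer this would be applied per step using the paper's stated parameters (β1 = 0.9, β2 = 0.999, ϵ = 1e-7), e.g. by setting the optimizer's learning rate to `pae_lr_schedule(step)` before each update.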