Neural Empirical Bayes

Authors: Saeed Saremi, Aapo Hyvärinen

JMLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We introduce two algorithmic frameworks based on this machinery: (i) a walk-jump sampling scheme that combines Langevin MCMC (walks) and empirical Bayes (jumps), and (ii) a probabilistic framework for associative memory, called NEBULA, defined à la Hopfield by the gradient flow of the learned energy to a set of attractors. We finish the paper by reporting the emergence of very rich creative memories as attractors of NEBULA for highly-overlapping spheres." "We tested the algorithm on the handwritten digit database for σ = 0.3 and σ = 0.15. The results are shown in Figures 5 and 6, respectively."
Researcher Affiliation | Academia | Saeed Saremi (EMAIL), Redwood Center for Theoretical Neuroscience, University of California, Berkeley, CA 94720-3198, USA; NNAISENSE Inc., Austin, TX. Aapo Hyvärinen (EMAIL), University College London, UK; Université Paris-Saclay, Inria, France; University of Helsinki, Finland.
Pseudocode | No | The paper describes algorithmic steps for DEEN ("DEEN: minimize L(θ) with stochastic gradient descent and return θ") and for walk-jump sampling as bullet points with equations on page 7, but these are not presented within a formally labeled "Pseudocode" or "Algorithm" block.
Open Source Code | No | The paper contains no explicit statement about releasing code, no link to a code repository, and no mention of code in supplementary materials for the described methodology.
Open Datasets | Yes | "estimated from 10^7 pairs from the handwritten digit database (LeCun et al., 1998)"; "Here, we report such experiments for MNIST"
Dataset Splits | No | The paper refers to the "MNIST test set" (e.g., Figure 7: "The top row are Xi from the MNIST test set.") and implies a training set for DEEN, but does not specify split percentages, sample counts, or how the training, validation, or test splits were performed.
Hardware Specification | No | The paper does not specify GPU models, CPU types, or other hardware used for the experiments.
Software Dependencies | No | "The automatic differentiation (Baydin et al., 2018) was implemented in PyTorch (Paszke et al., 2017)." "We used the Adam optimizer (Kingma and Ba, 2014)." While PyTorch and Adam are mentioned, no specific version numbers for PyTorch or other libraries are provided.
Experiment Setup | Yes | "The denoising results are a significant improvement over (Saremi et al., 2018). This was due to the use of ConvNets, instead of a fully connected network (more on that below) and the use of a smooth ReLU activation function σ(z, β) = z/(1 + exp(−βz)), where the default was β = 1." "All experiments reported in the paper were performed in a fixed wide ConvNet architecture with the expanding channels = (256, 512, 1024), without pooling, and with a bottleneck layer of size 10. All hidden layers were activated with the activation function σ(·, β = 1), and the readout layer was linear." "We used the Adam optimizer (Kingma and Ba, 2014)." "For walk-jump sampling, the step size was δ = σ/100."
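The smooth ReLU quoted in the experiment-setup row, σ(z, β) = z/(1 + exp(−βz)), is algebraically equivalent to z · sigmoid(βz) (elsewhere known as SiLU or Swish). A minimal scalar sketch:

```python
import math

def smooth_relu(z, beta=1.0):
    """Smooth ReLU from the paper's setup: sigma(z, beta) = z / (1 + exp(-beta*z)).
    Equivalent to z * sigmoid(beta * z); beta = 1 was the paper's default."""
    return z / (1.0 + math.exp(-beta * z))
```

For large positive z the function approaches the identity and for large negative z it approaches zero, matching ReLU in both limits while remaining smooth at the origin.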
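The training step quoted in the pseudocode row ("DEEN: minimize L(θ) with stochastic gradient descent and return θ") can be sketched in PyTorch, the framework the paper reports using. The interface below is hypothetical: `energy_net` stands in for the paper's ConvNet energy model, and the loss is the squared denoising error between the clean x and the empirical Bayes estimate x̂(y) = y − σ² ∇_y E_θ(y):

```python
import torch

def deen_loss(energy_net, x, sigma):
    """Sketch of the DEEN objective, assuming energy_net (hypothetical
    interface) maps a batch of noisy inputs y to scalar energies.
    The empirical Bayes estimator is x_hat = y - sigma**2 * grad_y E(y),
    and the loss is the mean squared denoising error ||x - x_hat||^2."""
    y = x + sigma * torch.randn_like(x)   # noisy observations y = x + N(0, sigma^2)
    y.requires_grad_(True)
    energy = energy_net(y).sum()
    grad_y, = torch.autograd.grad(energy, y, create_graph=True)
    x_hat = y - sigma ** 2 * grad_y       # empirical Bayes estimate of x
    return ((x - x_hat) ** 2).mean()
```

The `create_graph=True` flag keeps the gradient-of-gradient graph so the loss, which already contains ∇_y E_θ, can itself be backpropagated through θ by an optimizer such as Adam.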
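Walk-jump sampling, as summarized in the research-type row, alternates Langevin walks on the smoothed (noisy) density with a single empirical Bayes jump back toward clean data. The sketch below assumes a hypothetical `grad_energy(y)` returning ∇_y E_θ(y) for the learned energy, and uses the step size δ = σ/100 from the experiment-setup row:

```python
import numpy as np

rng = np.random.default_rng(0)

def walk_jump_sample(grad_energy, y0, sigma, n_walk=100):
    """Sketch of walk-jump sampling under the stated assumptions.
    Walks: unadjusted Langevin MCMC on the smoothed density,
        y <- y - delta * grad_energy(y) + sqrt(2 * delta) * noise.
    Jump: the empirical Bayes estimate x_hat = y - sigma**2 * grad_energy(y)."""
    delta = sigma / 100.0                 # step size reported in the paper
    y = np.asarray(y0, dtype=float)
    for _ in range(n_walk):
        noise = rng.standard_normal(y.shape)
        y = y - delta * grad_energy(y) + np.sqrt(2.0 * delta) * noise
    return y - sigma ** 2 * grad_energy(y)   # jump: denoise the last walk state
```

With a toy quadratic energy (`grad_energy = lambda y: y`, i.e. a standard Gaussian), the walk mixes around the origin and the jump nudges the final state toward the mode.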