Neural Implicit Manifold Learning for Topology-Aware Density Estimation

Authors: Brendan Leigh Ross, Gabriel Loaiza-Ganem, Anthony L. Caterini, Jesse C. Cresswell

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments on synthetic and natural data, we show that our model can learn manifold-supported distributions with complex topologies more accurately than pushforward models. In this section we demonstrate the efficacy of EBIMs on a diverse range of topologically non-trivial data. Our code is written in PyTorch (Paszke et al., 2019). We use GPyTorch (Gardner et al., 2018) for conjugate gradients and the marching cubes algorithm of Yatagawa (2021) to plot 2D implicit manifolds in 3D. We generate synthetic data with Pyro (Bingham et al., 2019). Network architectures, hyperparameter settings, and further experimental details can be found in Appendix B. Quantitative comparisons of density estimates are challenging when manifolds are unknown: likelihood values, the usual way to compare density estimators, are uninformative for different learned manifolds. Any test datapoint that is not on the model manifold would receive a model likelihood of 0, because model densities are strictly constrained to their manifolds. Instead, we grade model densities by their Wasserstein-1 distance to the ground truth (Table 1), which is a rigorous way to measure the distance between two distributions on possibly non-overlapping submanifolds (Arjovsky et al., 2017).
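The Wasserstein-1 comparison described above can be illustrated with a short sketch (this is not the paper's code, and the circle example is an illustrative assumption): for two equal-size point clouds with uniform weights, optimal transport reduces to a minimum-cost perfect matching, so W1 between the empirical distributions can be computed exactly with SciPy's assignment solver, even when the two supports do not overlap.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.optimize import linear_sum_assignment

def wasserstein_1(x, y):
    """W1 between two equal-size empirical distributions (uniform weights).

    With equally many points on each side, optimal transport reduces to a
    minimum-cost perfect matching over pairwise Euclidean distances.
    """
    cost = cdist(x, y)                        # pairwise ground distances
    rows, cols = linear_sum_assignment(cost)  # optimal matching (Hungarian)
    return cost[rows, cols].mean()

# Toy check: samples on two unit circles, the second translated by (1, 0).
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 500)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
shifted = circle + np.array([1.0, 0.0])
print(wasserstein_1(circle, shifted))  # ≈ 1.0: a pure translation costs 1 per point
```

Note the metric stays finite and informative here even though the two circles are disjoint sets, which is exactly why the authors prefer it over likelihoods for mismatched manifolds.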
Researcher Affiliation | Industry | Brendan Leigh Ross (EMAIL), Layer 6 AI; Gabriel Loaiza-Ganem (EMAIL), Layer 6 AI; Anthony L. Caterini (EMAIL), Layer 6 AI; Jesse C. Cresswell (EMAIL), Layer 6 AI
Pseudocode | Yes | Algorithm 1: Efficient Constrained Langevin Monte Carlo
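The paper's Algorithm 1 is not reprinted in this report. As a rough, assumption-laden illustration of what constrained Langevin Monte Carlo means in general, the NumPy toy below takes a Langevin step projected onto the tangent space of an implicit manifold {x : F(x) = 0} and then retracts back onto the manifold with a few Newton steps. The unit-circle constraint, toy energy, and retraction scheme are all hypothetical stand-ins, not the paper's method.

```python
import numpy as np

# Hypothetical toy setup: sample from exp(-E) restricted to the unit
# circle F(x) = ||x||^2 - 1 = 0, with a toy energy E(x) = -2 * x[0]
# that favours the right side of the circle.
def energy_grad(x):
    return np.array([-2.0, 0.0])

def constraint(x):
    return x @ x - 1.0

def constraint_grad(x):
    return 2.0 * x

def constrained_langevin_step(x, step=0.01, eps=0.1, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    n = constraint_grad(x)
    n_hat = n / np.linalg.norm(n)
    # Project both drift and noise onto the tangent space of {F = 0} ...
    drift = energy_grad(x)
    drift = drift - (drift @ n_hat) * n_hat
    noise = rng.normal(size=x.shape)
    noise = noise - (noise @ n_hat) * n_hat
    x_new = x - step * drift + np.sqrt(2 * step) * eps * noise
    # ... then retract back onto the manifold with Newton steps along
    # the constraint gradient.
    for _ in range(10):
        c = constraint(x_new)
        if abs(c) < 1e-10:
            break
        cg = constraint_grad(x_new)
        x_new = x_new - c * cg / (cg @ cg)
    return x_new

rng = np.random.default_rng(0)
x = np.array([0.0, 1.0])
for _ in range(2000):
    x = constrained_langevin_step(x, rng=rng)
print(abs(constraint(x)) < 1e-8)  # chain stays on the circle: True
```

The chain drifts toward the low-energy point (1, 0) while every iterate remains (numerically) on the constraint set, which is the defining property such samplers must preserve.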
Open Source Code | No | Our code is written in PyTorch (Paszke et al., 2019). We use GPyTorch (Gardner et al., 2018) for conjugate gradients and the marching cubes algorithm of Yatagawa (2021) to plot 2D implicit manifolds in 3D. We generate synthetic data with Pyro (Bingham et al., 2019). (Section 4) The paper mentions the tools and frameworks used but does not explicitly state that the source code for its methodology is openly available, nor does it provide a direct link to a repository.
Open Datasets | Yes | In experiments on synthetic and natural data, we show that our model can learn manifold-supported distributions with complex topologies more accurately than pushforward models. (Abstract) Figure 1: In the top row, our EBIM method is depicted on simulated circular data from a von Mises distribution. Following Mathieu & Nickel (2020), we model a dataset of global flood events from the Dartmouth Flood Observatory (Brakenridge, 2010), embedded on a sphere representing the Earth. In Figure 6, we compare an EBIM with a pushforward EBM using an open-source amino acid dataset available from the NumPyro software package (Phan et al., 2019). Image generation: In this section we show that EBIMs can be scaled to higher-dimensional data manifolds: MNIST (LeCun et al., 1998) and Fashion-MNIST (Xiao et al., 2017).
Dataset Splits | No | We sampled 1000 points from a von Mises distribution on a unit circle (B.1, Motivating example). We sampled 1000 points from a balanced mixture of two von Mises distributions (B.1, Von Mises mixture). For the image datasets (MNIST, Fashion-MNIST), the paper mentions using the datasets but does not specify how they were split into training, validation, or test sets; it refers to 'samples' (Figure 7) and a 'training set' (Table 3) without giving specific split information. The paper does not explicitly detail the training, validation, and test splits for the datasets used.
Hardware Specification | Yes | All low-dimensional experiments were performed on an Intel Xeon Silver 4114 CPU.
Software Dependencies | No | Our code is written in PyTorch (Paszke et al., 2019). We use GPyTorch (Gardner et al., 2018) for conjugate gradients and the marching cubes algorithm of Yatagawa (2021) to plot 2D implicit manifolds in 3D. We generate synthetic data with Pyro (Bingham et al., 2019). Network architectures, hyperparameter settings, and further experimental details can be found in Appendix B. The paper mentions several software packages (PyTorch, GPyTorch, the marching cubes implementation by Yatagawa, Pyro, the Adam optimizer, SiLU activations, labml.ai) and cites them by year. However, it does not provide specific version numbers (e.g., Python 3.8, PyTorch 1.9) for these components, which are required for a reproducible description.
Experiment Setup | Yes | Network architectures, hyperparameter settings, and further experimental details can be found in Appendix B. For all experiments, we use feedforward networks with SiLU activations (Hendrycks & Gimpel, 2016; Ramachandran et al., 2017). All models are trained with the Adam optimizer (Kingma & Ba, 2015) with the default PyTorch parameters, except for the learning rate, which is set as described below. Motivating example (Figure 1) ... The MDF was trained for 300 epochs with a batch size of 50, a learning rate of 0.01, η = 1, α = 0.3, and β = 10. Langevin dynamics were run with ε = 0.1 and a step size of 10. ... The energy function ... It was trained for 20 epochs with a batch size of 50, a learning rate of 0.01, gradients clipped to a norm of 1, and energy magnitudes regularized with a coefficient of 0.1. Langevin dynamics at each training step were run for 10 steps with ε = 0.3, a step size of 1, and energy gradients clamped to maximum values of 0.1 at each step. (Appendix B.1)
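The quoted Langevin inner loop (10 steps, ε = 0.3, step size 1, energy gradients clamped to 0.1 per step) might look roughly like the following NumPy sketch. The exact update rule, and how ε and the step size enter it, are assumptions made here for illustration, not details taken from the paper; the toy quadratic energy is likewise hypothetical.

```python
import numpy as np

def langevin_sample(grad_energy, x0, n_steps=10, step_size=1.0,
                    eps=0.3, grad_clamp=0.1, rng=None):
    """Generic Langevin inner loop mirroring the quoted hyperparameters:
    10 steps, eps = 0.3, step size 1, gradients clamped to 0.1 per step."""
    if rng is None:
        rng = np.random.default_rng()
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        # Clamp each energy-gradient component before the update, as in B.1.
        g = np.clip(grad_energy(x), -grad_clamp, grad_clamp)
        x = x - step_size * g + eps * np.sqrt(step_size) * rng.normal(size=x.shape)
    return x

# Toy quadratic energy E(x) = ||x||^2 / 2, so grad E(x) = x.
rng = np.random.default_rng(0)
x = langevin_sample(lambda z: z, np.zeros(2), rng=rng)
```

Per-step gradient clamping like this is a common stabilizer for EBM training, since early in training the energy landscape can produce very large gradients that would otherwise make the chain diverge.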