Neural Sampling from Boltzmann Densities: Fisher-Rao Curves in the Wasserstein Geometry
Authors: Jannis Chemseddine, Christian Wald, Richard Duong, Gabriele Steidl
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We demonstrate by numerical examples that our model provides a well-behaved flow field which successfully solves the above sampling task." From Section 4 (Experiments): "In this section we apply the different approaches to common sampling problems. We compare the performance of using the linear, learned or gradient flow interpolation." Table 1: comparison of effective sample size (ESS), negative log likelihood (NLL), and energy distance for different interpolations. Table 2: the same metrics evaluated for the 8-dimensional and 16-dimensional experiments with m = 4. |
| Researcher Affiliation | Academia | Jannis Chemseddine, Christian Wald, Richard Duong & Gabriele Steidl Institute of Mathematics TU Berlin Straße des 17. Juni 136 Berlin, Germany EMAIL |
| Pseudocode | Yes | C ALGORITHMS. Algorithm 1: Learning v_t^θ, C_t^θ in (17) for x_i sampled from trajectories. Algorithm 2: Learning f_t^θ1, v_t^θ2, C_t^θ3 in (18) for x_i sampled from trajectories. Algorithm 3: Learning ψ_t^θ1, C_t^θ2 in (19) as done in Section 4. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a direct link to a code repository. It mentions using PyTorch and torchdiffeq, but these are third-party tools, not the authors' own implementation code for the described methodology. |
| Open Datasets | No | The paper describes generating synthetic datasets for its experiments, such as a "mixture of 40 evenly weighted Gaussians in 2 dimensions" and an "8 and 16-dimensional many well distribution." No public dataset names, direct access links, DOIs, or formal citations for public datasets are provided. |
| Dataset Splits | No | The paper deals with sampling tasks and generating samples from distributions, rather than using pre-existing datasets with traditional training/validation/test splits. It describes how samples are drawn or generated for the learning process (e.g., "sampling from uniform domains", "sampling along the trajectory", "sample 4096 particles at random uniform time points"), but these are not dataset splits in the conventional sense for fixed datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory configurations. It mentions adjusting "the number of iterations such that all methods ran approximately the same time on the same hardware," but no hardware specifications are given. |
| Software Dependencies | No | The paper mentions several software components, including "Pytorch Paszke et al. (2019)", "torchdiffeq Chen (2018) package", and "Geom Loss library Feydy et al. (2019)". However, it does not specify version numbers for any of these libraries (e.g., PyTorch 1.9, torchdiffeq 0.2). |
| Experiment Setup | Yes | For the linear and learned interpolation we use 50 time steps along which the loss is computed and the gradients are accumulated with a batch size of 256. For the gradient flow interpolation we sample 4096 particles at random uniform time points and therefore do not accumulate gradients. We use a linear time schedule β(t) := β_min + t (β_max − β_min) and the associated SDE ... As done in Song et al. (2021), we choose β_min = 0.1 and β_max = 20. The target distribution consists of a mixture of 40 evenly weighted Gaussians in 2 dimensions; the means are distributed uniformly over [-40, 40]^2. We evaluate the methods by generating 5 × 10^4 samples with their log weights and computing the effective sample size, negative log likelihood, and energy distance. We report mean and standard deviation over 10 evaluation runs. |
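The setup row quotes two concrete computations: the linear noise schedule β(t) = β_min + t(β_max − β_min) with the values from Song et al. (2021), and an effective sample size evaluated from log weights. A minimal NumPy sketch of both (illustrative only, not the authors' implementation; the function names are our own):

```python
import numpy as np

# Schedule endpoints quoted in the paper, following Song et al. (2021).
BETA_MIN, BETA_MAX = 0.1, 20.0

def beta(t):
    """Linear time schedule beta(t) = beta_min + t * (beta_max - beta_min), t in [0, 1]."""
    return BETA_MIN + t * (BETA_MAX - BETA_MIN)

def effective_sample_size(log_w):
    """ESS = (sum w)^2 / sum w^2, computed stably from log weights."""
    log_w = np.asarray(log_w) - np.max(log_w)  # shift so the largest weight is 1
    w = np.exp(log_w)
    return float(w.sum() ** 2 / (w ** 2).sum())

# With equal log weights the ESS equals the sample count
# (5 * 10^4 samples per evaluation run, as in the paper):
print(effective_sample_size(np.zeros(50_000)))  # -> 50000.0
```

The max-shift inside `effective_sample_size` is the standard trick to avoid overflow when exponentiating large log weights; it cancels in the ratio.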