Denoising Lévy Probabilistic Models
Authors: Dario Shariatian, Umut Simsekli, Alain Oliviero Durmus
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that DLPM provides (i) better coverage of the tails of the data distribution, (ii) improved generation of unbalanced datasets, and (iii) faster computation times, requiring fewer backward steps. |
| Researcher Affiliation | Academia | ¹INRIA, Department of Computer Science, PSL Research University, Paris, France; ²École Polytechnique, CMAP, IP Paris, Palaiseau, France |
| Pseudocode | Yes | Appendix A (Algorithms for DLPM and DLIM): "In this section, we explicitly provide the algorithms needed to train and sample from the DLPM and DLIM generative methods." Algorithm 2: DLPM training (simplified loss); Algorithm 3: Stochastic sampling (DLPM); Algorithm 4: Deterministic sampling (DLIM). |
| Open Source Code | No | The paper mentions using a third-party implementation: "We use a U-Net following the implementation of Nichol & Dhariwal (2021) available in https://github.com/openai/improved-diffusion." However, there is no explicit statement or link provided for the authors' own source code for the methodology described in this paper. |
| Open Datasets | Yes | In our experiments on images, we make use of the dataset CIFAR10 LT (long tail), that has been introduced in Yoon et al. (2023) as an unbalanced modification of the CIFAR10 dataset. We work on the MNIST and the CIFAR10_LT dataset. |
| Dataset Splits | Yes | For our 2D datasets, we use 32000 datapoints for training, a batch size of 1024, and 25000 points for evaluation. CIFAR10_LT consists of the CIFAR10 images where artificial class imbalance has been introduced. The specific class counts we use are [5000, 2997, 1796, 1077, 645, 387, 232, 139, 83, 50]. |
| Hardware Specification | Yes | All the training and experiments are conducted on four NVIDIA RTX8000 GPU and four NVIDIA V100 GPU |
| Software Dependencies | No | The paper mentions "PyTorch" for experiments and "Adam" as the optimizer, but no specific version numbers for any software dependencies are provided. |
| Experiment Setup | Yes | For our 2D datasets, we use 32000 datapoints for training, a batch size of 1024, and 25000 points for evaluation. We train each model for 10000 steps. ... The optimizer is Adam (Kingma & Ba (2017)) with learning rate 5e-3. We train MNIST for 120000 steps with batch size 256 with a time horizon T = 1000, and CIFAR10_LT for 400000 steps with batch size 100 with a time horizon T = 4000. The optimizer is Adam (Kingma & Ba (2017)) with learning rate 1e-3 for MNIST and 2e-4 for CIFAR10_LT. We use the StepLR scheduler which scales the learning rate by γ = .99 every N = 1000 steps for CIFAR10_LT and N = 400 for MNIST. We use an exponential moving average with a rate of 0.99 for MNIST and 0.9999 for CIFAR10_LT. |
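The optimizer, scheduler, and EMA settings quoted in the Experiment Setup row translate directly into PyTorch. The sketch below reconstructs the CIFAR10_LT configuration under stated assumptions: a trivial placeholder model stands in for the paper's U-Net, and the parameter-space EMA update is a common formulation that the paper does not spell out.

```python
import torch

# Placeholder model; the paper actually uses a U-Net following
# Nichol & Dhariwal (2021)'s improved-diffusion implementation.
model = torch.nn.Linear(2, 2)

# CIFAR10_LT settings quoted in the table: Adam with lr = 2e-4,
# StepLR scaling the lr by gamma = 0.99 every N = 1000 steps.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
scheduler = torch.optim.lr_scheduler.StepLR(
    optimizer, step_size=1000, gamma=0.99
)

# Exponential moving average of parameters with rate 0.9999
# (CIFAR10_LT value); this update rule is an assumption, since
# the paper only reports the rate, not the implementation.
ema_rate = 0.9999
ema_params = [p.detach().clone() for p in model.parameters()]

def ema_update():
    with torch.no_grad():
        for ema_p, p in zip(ema_params, model.parameters()):
            ema_p.mul_(ema_rate).add_(p, alpha=1 - ema_rate)
```

In a training loop, `optimizer.step()`, `scheduler.step()`, and `ema_update()` would be called once per iteration; the MNIST run would swap in lr = 1e-3, step_size = 400, and an EMA rate of 0.99.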
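The abstract's claim of better tail coverage rests on DLPM replacing Gaussian noise with heavy-tailed α-stable noise. As background context only (this is not code from the paper), symmetric α-stable variates are commonly drawn with the Chambers–Mallows–Stuck transform, sketched here in NumPy; for α = 2 it reduces to a Gaussian with variance 2, and smaller α gives heavier tails.

```python
import numpy as np

def sample_sas(alpha, size, rng):
    """Symmetric alpha-stable samples (0 < alpha <= 2) via the
    Chambers-Mallows-Stuck method."""
    # U uniform on (-pi/2, pi/2), W standard exponential.
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    # CMS formula for the symmetric (beta = 0) case.
    return (
        np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)
        * (np.cos(u - alpha * u) / w) ** ((1.0 - alpha) / alpha)
    )
```

For example, `sample_sas(1.8, 10_000, np.random.default_rng(0))` produces noise whose occasional large excursions are what the paper credits for improved coverage of distribution tails.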