Soft Diffusion: Score Matching with General Corruptions
Authors: Giannis Daras, Mauricio Delbracio, Hossein Talebi, Alex Dimakis, Peyman Milanfar
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show experimentally that our framework works for general linear corruption processes, such as Gaussian blur and masking. Our method outperforms all linear diffusion models on CelebA-64, achieving FID score 1.85. We also show computational benefits compared to vanilla denoising diffusion. |
| Researcher Affiliation | Collaboration | Giannis Daras (UT Austin); Mauricio Delbracio (Google Research); Hossein Talebi (Google Research); Alexandros G. Dimakis (UT Austin); Peyman Milanfar (Google Research) |
| Pseudocode | Yes | Algorithm 1 Naive Sampler Algorithm 2 Momentum Sampler |
| Open Source Code | No | The paper does not provide an explicit statement of code release, a link to a code repository for the methodology described, or indicate that code is in supplementary materials. It does provide anonymous URLs for schedules of blur, masking, and noise parameters, but these are data files, not the full source code for the model and sampling algorithms. |
| Open Datasets | Yes | We evaluate our method in CelebA-64 and CIFAR-10. |
| Dataset Splits | No | The paper states: "We train our networks on CelebA-64 and CIFAR-10... We use 50000 samples to evaluate the FID, as it is typically done in prior work." While it mentions training and evaluation, it does not specify the exact split percentages or sample counts for training, validation, and test sets for either dataset, nor does it cite a standard split being used. |
| Hardware Specification | Yes | We train our models on 16 v2-TPUs. |
| Software Dependencies | No | The paper mentions using "Adam optimizer" and states learning rate, beta values, and epsilon for it. However, it does not specify the version of Adam, nor does it mention the version of the deep learning framework (e.g., TensorFlow, PyTorch) or any other libraries used. |
| Experiment Setup | Yes | Hyperparameters. For our trainings, we use Adam optimizer with learning rate 2e-4, β1 = 0.9, β2 = 0.999, ϵ = 1e-8. We additionally use gradient clipping for gradient norms bigger than 1. For the learning rate scheduling, we use 5000 steps of linear warmup. We use batch size 128 and we train for 1-2M iterations (based on observed FID performance). |
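The quoted setup implies a simple linear-warmup learning-rate schedule (ramp from near zero to the base rate over 5000 steps, then hold). A minimal, framework-agnostic sketch of that schedule, with function and argument names of our choosing (the paper does not release training code):

```python
def warmup_lr(step: int, base_lr: float = 2e-4, warmup_steps: int = 5000) -> float:
    """Linear warmup: ramp the learning rate from ~0 to base_lr over
    warmup_steps optimizer steps, then hold it constant at base_lr.

    base_lr and warmup_steps follow the values quoted from the paper;
    the function itself is an illustrative reconstruction.
    """
    return base_lr * min(1.0, (step + 1) / warmup_steps)
```

For example, `warmup_lr(0)` returns a tiny fraction of the base rate, `warmup_lr(4999)` reaches the full 2e-4, and later steps stay at 2e-4.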