Boomerang: Local sampling on image manifolds using diffusion models
Authors: Lorenzo Luzi, Paul M Mayer, Josue Casco-Rodriguez, Ali Siahkoohi, Richard Baraniuk
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present three applications for local sampling using Boomerang. First, we provide a framework for constructing privacy-preserving datasets having controllable degrees of anonymity. Second, we show that using Boomerang for data augmentation increases generalization performance and outperforms state-of-the-art synthetic data augmentation. Lastly, we introduce a perceptual image enhancement framework powered by Boomerang, which enables resolution enhancement. |
| Researcher Affiliation | Academia | Lorenzo Luzi, Paul M. Mayer, Josue Casco-Rodriguez, Ali Siahkoohi, Richard G. Baraniuk (all Rice University) |
| Pseudocode | Yes | Algorithm 1 (Boomerang local sampling, given a diffusion model f_φ(x, t)). Input: x_0, t_Boom, {ᾱ_t}_{t=1}^T, {β_t}_{t=1}^T. Output: x̃_0. Sample ε ∼ N(0, I) and set x_{t_Boom} ← √(ᾱ_{t_Boom}) x_0 + √(1 − ᾱ_{t_Boom}) ε. For t = t_Boom, …, 1: if t > 1, set β̃_t ← ((1 − ᾱ_{t−1})/(1 − ᾱ_t)) β_t and sample η ∼ N(0, β̃_t I); else η ← 0. Update x_{t−1} ← f_φ(x_t, t) + η. Return x̃_0. |
| Open Source Code | Yes | A Boomerang Colab demo is available at https://colab.research.google.com/drive/1PV5Z6b14HYZNx1lHCaEVhId-Y4baKXwt. |
| Open Datasets | Yes | To show the versatility of Boomerang anonymization, we apply it to several datasets such as the LFWPeople (Huang et al., 2007), CelebA-HQ (Karras et al., 2018), CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), FFHQ (Karras et al., 2019), and ILSVRC2012 (ImageNet) (Russakovsky et al., 2015) datasets. |
| Dataset Splits | No | The paper mentions using well-known datasets such as CIFAR-10, CIFAR-100, ImageNet-200, and ImageNet for classification tasks. It presents results like 'Top-1 Test Accuracy' and 'Top-5 Test Accuracy' (Table 1, Table 2), implying the use of test sets. However, it does not explicitly state the training/validation/test splits used in its own experimental setup, nor does it cite a reference for them; it implicitly assumes standard splits without providing the details required. |
| Hardware Specification | Yes | These times are reported for a single Nvidia GeForce GTX Titan X GPU |
| Software Dependencies | No | The paper mentions various models and frameworks like Stable Diffusion, Patched Diffusion, DLSM, FastDPM, StyleGAN-XL, ResNet-18, VGG-Face, FaceNet, and AlexNet. However, it does not provide specific version numbers for any of these software components, programming languages, or libraries, which is necessary for reproducibility. |
| Experiment Setup | Yes | When generating Boomerang samples for data anonymization or augmentation, we pick t_Boom so that the Boomerang samples look visually different than the original samples. With the FastDPM model we use t_Boom/T = 40/100 = 40%; with Patched Diffusion, we use t_Boom/T = 75/250 = 30%; and with DLSM, we use t_Boom/T = 250/1000 = 25%. We then randomly choose to use the training data or the Boomerang-generated data with probability 0.5 at each epoch. We use ResNet-18 (He et al., 2016) for our experiments. Empirical tests showed that setting t_Boom = 100 on the Patched Diffusion Model (out of T = 250) produced a good balance between sharpness and the features of the ground-truth image, as seen in Appendix A.2. |
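The Algorithm 1 excerpt above can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: `f_phi` stands in for the diffusion model's one-step reverse map, and `alpha_bar`/`beta` are the usual DDPM noise schedules.

```python
import numpy as np

def boomerang_sample(f_phi, x0, t_boom, alpha_bar, beta, rng=None):
    """Boomerang local sampling, per Algorithm 1 (sketch).

    f_phi(x, t): placeholder for the diffusion model's one-step reverse map.
    alpha_bar[t]: cumulative product of (1 - beta[s]) for s <= t.
    """
    rng = rng or np.random.default_rng()
    # Forward: jump part-way up the noise chain (t_boom < T), not all the way.
    eps = rng.standard_normal(x0.shape)
    x = np.sqrt(alpha_bar[t_boom]) * x0 + np.sqrt(1.0 - alpha_bar[t_boom]) * eps
    # Reverse: denoise back down, adding the DDPM posterior noise at each step.
    for t in range(t_boom, 0, -1):
        if t > 1:
            beta_tilde = (1.0 - alpha_bar[t - 1]) / (1.0 - alpha_bar[t]) * beta[t]
            eta = np.sqrt(beta_tilde) * rng.standard_normal(x.shape)
        else:
            eta = 0.0
        x = f_phi(x, t) + eta
    return x
```

Because the chain starts at t_Boom rather than T, the result stays in a neighborhood of x_0 on the image manifold; larger t_Boom yields more variation.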
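The per-epoch augmentation rule quoted above (train on either the real data or the Boomerang-generated data, chosen with probability 0.5 each epoch) can be sketched as follows; the loader arguments and function name are illustrative, not from the paper.

```python
import random

def pick_epoch_data(real_loader, boomerang_loader, p_boomerang=0.5, seed=None):
    """Choose this epoch's data source with a single coin flip (sketch).

    With probability p_boomerang, the whole epoch trains on
    Boomerang-generated samples; otherwise it trains on the real data.
    """
    coin = random.Random(seed)
    use_boomerang = coin.random() < p_boomerang
    return boomerang_loader if use_boomerang else real_loader
```

Flipping once per epoch (rather than per batch) matches the paper's description and keeps each epoch's gradient statistics tied to a single data source.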