Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance
Authors: Bahjat Kawar, Roy Ganz, Michael Elad
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments on the highly challenging and diverse ImageNet dataset, our scheme introduces significantly more intelligible intermediate gradients, better alignment with theoretical findings, as well as improved generation results under several evaluation metrics. Furthermore, we conduct an opinion survey whose findings indicate that human raters prefer our method's results. |
| Researcher Affiliation | Academia | Bahjat Kawar, Computer Science Department, Technion, Israel; Roy Ganz, Electrical Engineering Department, Technion, Israel; Michael Elad, Computer Science Department, Technion, Israel |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes methodologies in narrative text and mathematical equations, but no explicitly labeled 'Algorithm' or 'Pseudocode' sections. |
| Open Source Code | Yes | Our code and trained robust time-dependent classifier models are available at https://github.com/bahjat-kawar/enhancing-diffusion-robust. |
| Open Datasets | Yes | In experiments on the highly challenging and diverse ImageNet dataset... To that end, we train an adversarially robust time-dependent classifier on the diverse ImageNet dataset (Deng et al., 2009)... We conduct an ablation study on CIFAR-10 (Krizhevsky et al., 2009) and report the results and implementation details in Appendix E. |
| Dataset Splits | No | The paper mentions using the ImageNet training set for classifier training and performing evaluation on generated images, but it does not specify explicit training/validation/test splits with percentages or sample counts for the main ImageNet experiments or for the CIFAR-10 ablation study beyond mentioning a 'validation set'. |
| Hardware Specification | Yes | Training is performed on two NVIDIA A40 GPUs. |
| Software Dependencies | No | The paper states: 'We base our implementation on the publicly available code provided by (Dhariwal & Nichol, 2021). For implementing the adversarial training scheme, we adapt code from (Madry et al., 2018)...' However, it does not provide specific version numbers for any software libraries or dependencies. |
| Experiment Setup | Yes | We train the classifier for 240k iterations, using a batch size of 128, a weight decay of 0.05, and a linearly annealed learning rate starting with 3×10⁻⁴ and ending with 6×10⁻⁵... We use the gradient-based PGD attack to perturb the noisy images x_t. The attack is restricted to the threat model {x_t + δ : ‖δ‖₂ ≤ 0.5}, and performed using a step size of 0.083, and a maximum number of 7 iterations... we utilize 250 diffusion steps out of the trained 1000... For the classifier guidance scale, we sweep across values s ∈ {0.25, 0.5, 1, 2} and find that s = 1 produces better results... |
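The PGD configuration quoted in the Experiment Setup row (an L2 ball of radius 0.5, step size 0.083, at most 7 iterations on the noisy images x_t) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' released code: `classifier` stands in for any time-dependent classifier with signature `(x, t) -> logits`, and the normalized-gradient step with L2-ball projection is a standard PGD variant, not a claim about the exact implementation.

```python
import torch
import torch.nn.functional as F


def pgd_l2(classifier, x_t, t, labels, eps=0.5, step_size=0.083, n_iter=7):
    """L2-constrained PGD on noisy images x_t.

    Threat model: {x_t + delta : ||delta||_2 <= eps}, with the step size,
    radius, and iteration budget taken from the quoted setup. `classifier`
    is a hypothetical time-dependent classifier, (x, t) -> logits.
    """
    delta = torch.zeros_like(x_t, requires_grad=True)
    for _ in range(n_iter):
        logits = classifier(x_t + delta, t)
        loss = F.cross_entropy(logits, labels)
        (grad,) = torch.autograd.grad(loss, delta)
        # Gradient ascent step, normalized per sample in L2 norm.
        g_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
        delta = delta + step_size * grad / g_norm
        # Project each perturbation back onto the L2 ball of radius eps.
        d_norm = delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
        delta = (delta * (eps / d_norm).clamp(max=1.0)).detach().requires_grad_(True)
    return (x_t + delta).detach()
```

During adversarial training, the returned perturbed batch would replace the clean noisy images in the classifier's loss; the projection step guarantees every perturbation stays inside the stated radius-0.5 ball.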