Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance

Authors: Bahjat Kawar, Roy Ganz, Michael Elad

TMLR 2023

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "In experiments on the highly challenging and diverse ImageNet dataset, our scheme introduces significantly more intelligible intermediate gradients, better alignment with theoretical findings, as well as improved generation results under several evaluation metrics. Furthermore, we conduct an opinion survey whose findings indicate that human raters prefer our method's results."

Researcher Affiliation | Academia | Bahjat Kawar, Computer Science Department, Technion, Israel; Roy Ganz, Electrical Engineering Department, Technion, Israel; Michael Elad, Computer Science Department, Technion, Israel.

Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes its methodology in narrative text and mathematical equations, with no explicitly labeled 'Algorithm' or 'Pseudocode' sections.

Open Source Code | Yes | "Our code and trained robust time-dependent classifier models are available at https://github.com/bahjat-kawar/enhancing-diffusion-robust."

Open Datasets | Yes | "In experiments on the highly challenging and diverse ImageNet dataset... To that end, we train an adversarially robust time-dependent classifier on the diverse ImageNet dataset (Deng et al., 2009)... We conduct an ablation study on CIFAR-10 (Krizhevsky et al., 2009) and report the results and implementation details in Appendix E."

Dataset Splits | No | The paper mentions using the ImageNet training set for classifier training and performing evaluation on generated images, but it does not specify explicit training/validation/test splits (percentages or sample counts) for the main ImageNet experiments, nor for the CIFAR-10 ablation study beyond mentioning a 'validation set'.

Hardware Specification | Yes | "Training is performed on two NVIDIA A40 GPUs."

Software Dependencies | No | The paper states: "We base our implementation on the publicly available code provided by (Dhariwal & Nichol, 2021). For implementing the adversarial training scheme, we adapt code from (Madry et al., 2018)..." However, it does not provide specific version numbers for any software libraries or dependencies.

Experiment Setup | Yes | "We train the classifier for 240k iterations, using a batch size of 128, a weight decay of 0.05, and a linearly annealed learning rate starting at 3×10^-4 and ending at 6×10^-5... We use the gradient-based PGD attack to perturb the noisy images x_t. The attack is restricted to the threat model {x_t + δ : ||δ||_2 ≤ 0.5}, and performed using a step size of 0.083 and a maximum of 7 iterations... we utilize 250 diffusion steps out of the trained 1000... For the classifier guidance scale, we sweep across values s ∈ {0.25, 0.5, 1, 2} and find that s = 1 produces better results..."
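The adversarial training setup quoted above uses an L2-constrained PGD attack (step size 0.083, at most 7 iterations, perturbation budget ||δ||_2 ≤ 0.5). A minimal sketch of such an L2 PGD loop, in NumPy with a toy analytic loss gradient standing in for the classifier's loss gradient (the function names and the toy loss are illustrative, not the paper's implementation):

```python
import numpy as np

def pgd_l2(x, grad_fn, eps=0.5, step=0.083, n_iter=7):
    """L2-constrained PGD ascent: find a perturbation delta that increases
    the attacked loss while keeping ||delta||_2 <= eps (the threat model)."""
    delta = np.zeros_like(x)
    for _ in range(n_iter):
        g = grad_fn(x + delta)
        g_norm = np.linalg.norm(g) + 1e-12
        delta = delta + step * g / g_norm   # normalized gradient ascent step
        d_norm = np.linalg.norm(delta)
        if d_norm > eps:
            delta = delta * (eps / d_norm)  # project back onto the L2 ball
    return x + delta

# Toy example: "attack" the loss f(x) = 0.5 * ||x||^2, whose gradient is x.
x = np.array([1.0, 0.0])
x_adv = pgd_l2(x, grad_fn=lambda z: z)
```

In the paper's setting, `grad_fn` would instead be the gradient of the classification loss of the robust time-dependent classifier at noisy input x_t, typically obtained by backpropagation.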