DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models
Authors: XiMing Xing, Chuang Wang, Haitao Zhou, Jing Zhang, Qian Yu, Dong Xu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that DiffSketcher achieves greater quality than prior work. The code and demo of DiffSketcher can be found at https://ximinng.github.io/DiffSketcher-project/. |
| Researcher Affiliation | Academia | Ximing Xing (Beihang University); Chuang Wang (Beihang University); Haitao Zhou (Beihang University); Jing Zhang (Beihang University); Qian Yu (Beihang University); Dong Xu (The University of Hong Kong) |
| Pseudocode | No | The paper describes algorithms and processes but does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The code and demo of DiffSketcher can be found at https://ximinng.github.io/DiffSketcher-project/. |
| Open Datasets | Yes | Recent breakthroughs in text-to-image generation have been driven by diffusion models [23, 28, 30, 31] trained on billions of image-text pairs [34]. [34] Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, et al. LAION-5B: An open large-scale dataset for training next generation image-text models. arXiv preprint arXiv:2210.08402, 2022. |
| Dataset Splits | No | The paper leverages a pre-trained latent diffusion model and an optimization-based approach for sketch synthesis. It does not mention traditional dataset splits (e.g., train/validation/test percentages or counts) for its own training or evaluation data. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using a 'DDIM solver', 'torch.transforms', and 'Adam optimizers' but does not provide specific version numbers for these software components or the underlying programming language/frameworks. |
| Experiment Setup | Yes | Specifically, given a text prompt, we use a DDIM solver [38] to sample a raster image from the latent diffusion model in 100 steps with classifier-free guidance [12], using a scale of ω = 7.5. For classifier-free guidance, we set ω = 100... we set the learning rate of the control point optimizer to 1.0 and the color optimizer to 0.1. ...we use layers 3 and 4 of the ResNet101 CLIP model. ...we sample a noise level t from the uniform distribution U(0.05, 0.95). |
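
For readers who want the quoted settings in one place, the Experiment Setup row can be collected into a small configuration object. This is a minimal sketch, not the authors' code: the class name `SketchSetup`, every field name, and the helper `sample_noise_level` are hypothetical. The row quotes two guidance scales (7.5 for DDIM sampling and 100 in an elided context), so the second field is deliberately named neutrally rather than tied to a specific stage.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSetup:
    """Values quoted in the Experiment Setup row; all field names are hypothetical."""
    ddim_steps: int = 100                  # DDIM sampling steps
    sampling_guidance_scale: float = 7.5   # omega used when sampling the raster image
    second_guidance_scale: float = 100.0   # second quoted omega; its stage is elided in the row
    control_point_lr: float = 1.0          # learning rate of the control point optimizer
    color_lr: float = 0.1                  # learning rate of the color optimizer
    clip_backbone: str = "ResNet101"       # CLIP model named in the row
    clip_layers: tuple = (3, 4)            # CLIP layers named in the row
    t_min: float = 0.05                    # lower bound of U(0.05, 0.95)
    t_max: float = 0.95                    # upper bound of U(0.05, 0.95)


def sample_noise_level(cfg: SketchSetup, rng: random.Random) -> float:
    """Draw a noise level t uniformly from [cfg.t_min, cfg.t_max]."""
    return rng.uniform(cfg.t_min, cfg.t_max)


cfg = SketchSetup()
rng = random.Random(0)  # fixed seed so the draws are repeatable
levels = [sample_noise_level(cfg, rng) for _ in range(5)]
```

Keeping the two learning rates as separate fields mirrors the row's description of one optimizer for control points and another for colors; how those optimizers are constructed is not covered by this sketch.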