Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance
Authors: Cunzheng Wang, Ziyuan Guo, Yuxuan Duan, Huaxia Li, Nemo Chen, Xu Tang, Yao Hu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments; 4.1 Dataset; 4.2 Backbone; 4.3 Metrics; 4.4 Performance Comparison; 4.5 Ablation Study |
| Researcher Affiliation | Collaboration | Cunzheng Wang (1,2*), Ziyuan Guo (2), Yuxuan Duan (2,3), Huaxia Li (2), Nemo Chen (2), Xu Tang (2), Yao Hu (2); 1 School of Software Technology, Zhejiang University; 2 Xiaohongshu; 3 Shanghai Jiao Tong University |
| Pseudocode | Yes | Algorithm 1: Target-Driven Distillation |
| Open Source Code | No | The paper contains no explicit statement about code release and provides no link to a code repository. |
| Open Datasets | Yes | We use a subset of the Laion-5B (Schuhmann et al. 2022) dataset for training. All images in the dataset have an aesthetic score above 5.5, totaling approximately 260 million. Additionally, we evaluate the performance using the COCO2014 (Lin et al. 2014) validation set, split into 30k (COCO-30K) and 2k (COCO-2K) captions for assessing different metrics. We also benchmark performance using the Parti Prompts dataset (Yu et al. 2022) |
| Dataset Splits | Yes | Additionally, we evaluate the performance using the COCO2014 (Lin et al. 2014) validation set, split into 30k (COCO-30K) and 2k (COCO-2K) captions for assessing different metrics. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments (e.g., GPU models, CPU models, memory details). |
| Software Dependencies | No | We employ SDXL (Podell et al. 2023) as the backbone for our experiments; specifically, we train a LoRA (Hu et al. 2021) through distillation. While specific models are mentioned, no version numbers for general software dependencies (such as Python, PyTorch, or CUDA) are provided. |
| Experiment Setup | Yes | We maintained consistent settings with a batch size of 128, a learning rate of 5e-06, and trained for a total of 15,000 steps. ... we conducted experiments with a batch size of 128, Kmin = 4, Kmax = 8, and η = 0.3, training two models with empty prompt ratios of 0 and 0.2. After 15k steps, we performed inference using a CFG scale of 3 |
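For quick reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration sketch. The key names below are illustrative, not the authors' actual config keys:

```python
# Hyperparameters as reported in the paper's experiment setup.
# Key names are hypothetical; values come from the quoted text above.
config = {
    "backbone": "SDXL",                 # distilled via a LoRA adapter
    "batch_size": 128,
    "learning_rate": 5e-06,
    "train_steps": 15_000,
    "k_min": 4,                         # target timestep selection range
    "k_max": 8,
    "eta": 0.3,
    "empty_prompt_ratios": [0.0, 0.2],  # two models trained
    "cfg_scale_inference": 3.0,
}

# Basic sanity checks on the reported values.
assert config["k_min"] < config["k_max"]
assert 0.0 <= config["eta"] <= 1.0
print(config["train_steps"])
```

This is only a convenience summary of the reported settings; the paper does not publish a configuration file, so field names and structure here are an assumption.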