Jump Your Steps: Optimizing Sampling Schedule of Discrete Diffusion Models
Authors: Yong-Hyun Park, Chieh-Hsin Lai, Satoshi Hayakawa, Yuhta Takida, Yuki Mitsufuji
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Extensive experiments across image, piano note, and text generation show that JYS significantly improves sampling quality, establishing it as a versatile framework for enhancing DDM performance for fast sampling." (Section 4, Experiments): "In this section, we evaluate the Jump Your Steps (JYS) sampling schedule across various datasets and models. We compare the JYS schedule with the uniform sampling schedule, which sets all intervals to the same size." |
| Researcher Affiliation | Collaboration | Yong-Hyun Park, Sony AI, Tokyo, Japan; Chieh-Hsin Lai, Sony AI, Tokyo, Japan; Satoshi Hayakawa, Sony Group Corporation, Tokyo, Japan; Yuhta Takida, Sony AI, Tokyo, Japan; Yuki Mitsufuji, Sony AI and Sony Group Corporation, New York, USA. "Work done during an internship at Sony AI." |
| Pseudocode | Yes | Appendix C (Algorithm): "In this section, we present the main algorithm for Jump Your Steps (JYS)." C.1 (KLUB computation) refers to Algorithm 1: Computation of KLUB(Q_{s→t→u} ‖ Q_{s→u}); Algorithm 2: Jump Your Steps; Algorithm 3: k-Gillespie's Algorithm with Corrector Steps. |
| Open Source Code | Yes | The code is available at https://github.com/sony/jys. |
| Open Datasets | Yes | Countdown dataset (Section 4.1), CIFAR-10 (image), Lakh Pianoroll (piano note; Raffel, 2016; Dong et al., 2018), and OpenWebText (text). |
| Dataset Splits | No | The paper uses pretrained models for CIFAR-10, Piano Note, and Text Modeling, and for the synthetic Countdown dataset, it describes sample generation rules rather than training/test/validation splits. No explicit dataset split information is provided for reproducing the experiments. |
| Hardware Specification | Yes | We measured the time required for the JYS sampling schedule optimization on a practical setup using a single 24GB NVIDIA RTX 3090 GPU (Figure 11). |
| Software Dependencies | No | The paper mentions using 'GPT-2 tokenizer' and 'GPT-2 large' for evaluation, and refers to algorithms like 'τ-leaping', 'k-Gillespie', and specific models like 'SEDD'. However, it does not provide specific version numbers for any software libraries, frameworks, or dependencies used to implement the methodology. |
| Experiment Setup | Yes | Golden-section search: stopped once the difference between the newly optimized t and the previous t fell below T/2048; the maximum number of iterations was set to 32, but the search usually completed within 8 steps. KLUB computation used num_samples = 2048 by default; for CIFAR-10, num_samples = 1024; for monophonic music, 2048; for text, 256. |
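The golden-section search mentioned in the setup row is a standard bracketing minimizer; a minimal sketch of how it could be used to pick a timestep is below. The stopping rule (change below T/2048) and the 32-iteration cap follow the setup quoted above; the quadratic objective is a toy stand-in for the paper's KLUB estimate, and all function and variable names are illustrative, not from the paper's code.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # 1/phi ~ 0.618, the golden ratio reduction factor


def golden_section_min(f, lo, hi, tol, max_iter=32):
    """Minimize a unimodal f on [lo, hi] via golden-section search.

    Stops once the current estimate moves by less than `tol` between
    iterations, mirroring the paper's T/2048 stopping criterion.
    """
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    prev = (a + b) / 2
    for _ in range(max_iter):
        if fc < fd:
            # Minimum lies in [a, d]: shrink from the right, reuse f(c).
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]: shrink from the left, reuse f(d).
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
        mid = (a + b) / 2
        if abs(mid - prev) < tol:
            break
        prev = mid
    return (a + b) / 2


# Toy objective standing in for a KLUB estimate; true minimum at t = 0.3.
T = 1.0
t_star = golden_section_min(lambda t: (t - 0.3) ** 2, 0.0, T, tol=T / 2048)
```

Each iteration reuses one of the two interior evaluations, so the objective (in the paper, a Monte Carlo KLUB estimate) is evaluated only once per step.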
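Algorithm 2 (Jump Your Steps) refines the sampling schedule by splitting intervals at KLUB-minimizing timesteps. The sketch below is a rough, hypothetical rendering under simplifying assumptions: `klub(s, t, u)` is a user-supplied estimator of KLUB for splitting [s, u] at t (the paper estimates this quantity with Monte Carlo samples), and the inner minimization uses a simple grid search in place of the paper's golden-section search.

```python
def split_point(klub, s, u, num_grid=64):
    """Pick t in (s, u) minimizing the KLUB estimate.

    Grid search for simplicity; the paper uses golden-section search.
    """
    cands = [s + (u - s) * (i + 1) / (num_grid + 1) for i in range(num_grid)]
    return min(cands, key=lambda t: klub(s, t, u))


def jump_your_steps(klub, T=1.0, rounds=3):
    """Refine the schedule by repeatedly splitting every interval.

    Starts from the two endpoints [0, T]; each round inserts one
    KLUB-minimizing timestep per interval, so `rounds` rounds yield
    2**rounds + 1 timesteps.
    """
    schedule = [0.0, T]
    for _ in range(rounds):
        refined = [schedule[0]]
        for s, u in zip(schedule, schedule[1:]):
            refined.append(split_point(klub, s, u))
            refined.append(u)
        schedule = refined
    return schedule


# Toy KLUB: penalizes unequal sub-intervals, so the optimal split is the
# midpoint and the refined schedule comes out (approximately) uniform.
sched = jump_your_steps(lambda s, t, u: (2 * t - s - u) ** 2, T=1.0, rounds=3)
```

With a real KLUB estimator the splits land unevenly, concentrating steps where the discretization error of the reverse process is largest.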