Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis
Authors: Zikun Zhang, Zixiang Chen, Quanquan Gu
ICLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework. We introduce a discrete-time sampling algorithm in the general state space [S]d that utilizes score estimators at predefined time points. We derive convergence bounds for the Kullback Leibler (KL) divergence and total variation (TV) distance between the generated sample distribution and the data distribution, considering both scenarios with and without early stopping under reasonable assumptions. Notably, our KL divergence bounds are nearly linear in the dimension d, aligning with state-of-the-art results for diffusion models. Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function, which are essential for characterizing the discrete-time sampling process. |
| Researcher Affiliation | Academia | Zikun Zhang School of Mathematical Sciences Fudan University EMAIL Zixiang Chen Department of Computer Science University of California, Los Angeles EMAIL Quanquan Gu Department of Computer Science University of California, Los Angeles EMAIL |
| Pseudocode | Yes | For completeness, we formalize the practical sampling procedure in Algorithm 1, presented in Appendix D. ... Algorithm 1 Generative Reverse Process Simulation through Uniformization Input: Learned discrete score function ˆs T kh (k = 0, 1, , K 1), total time T, discretization step h > 0, and δ = T Kh 0 1: Draw z0 πd 2: for k = 0, 1, , K 1 do 3: Set λk = maxx X {ˆs T kh(x)x} 4: Draw M Poisson(λkh) 5: Set y0 = zk 6: for j = 0, 1, , M 1 do 7: Set yj+1 = y\i j ˆyi, w.p. ˆs T kh(yj)i,ˆyi λk , 1 i d, ˆyi = yi j, ˆyi [S] yj, w.p. 1 Pd i=1 P ˆs T kh(yj)i,ˆyi λk 8: end for 9: Set zk+1 = y M 10: end for Output: A sample z K from p T δ |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating the provision of open-source code for the described methodology. |
| Open Datasets | No | The paper is a theoretical work focusing on convergence analysis and does not involve experimental evaluation on any specific datasets. Therefore, it does not provide information about open datasets. |
| Dataset Splits | No | The paper is theoretical and does not include empirical experiments with datasets; therefore, no dataset splits are provided. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup or computations that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe any experimental setup or implementation details that would require specific software dependencies or versions. |
| Experiment Setup | No | The paper is a theoretical analysis of diffusion models and does not include an experimental section with specific setup details, hyperparameters, or training configurations. |