Unified Discrete Diffusion for Categorical Data
Authors: Lingxiao Zhao, Xueying Ding, Lijun Yu, Leman Akoglu
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments and ablations demonstrate the significant improvement, and we open-source our code at: https://github.com/LingxiaoShawn/USD3. ... We conduct an additional experiment on the trained USD3 to show that the derived MCMC further improves the image generation quality. We apply MCMC sampling with various hyper-parameters for 50,000 sampled images at the last 10% of the sampling phase (last 100 timesteps), for both discrete- and continuous-time USD3. The result is shown in Table 4. |
| Researcher Affiliation | Academia | Lingxiao Zhao, Heinz College, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Xueying Ding, Heinz College, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Lijun Yu, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Leman Akoglu, Heinz College, Carnegie Mellon University, Pittsburgh, PA 15213, USA |
| Pseudocode | Yes | Algorithm 1: USD3 Unified Training (red: discrete-time step; blue: continuous-time step); Algorithm 2: USD3 Unified Sampling; Algorithm 3: The MCMC correcting algorithm at time t |
| Open Source Code | Yes | Extensive experiments and ablations demonstrate the significant improvement, and we open-source our code at: https://github.com/LingxiaoShawn/USD3. |
| Open Datasets | Yes | Lakh Pianoroll Datasets. We evaluate monophonic music generation on Piano, the cleaned Lakh pianoroll dataset (Raffel, 2016; Dong et al., 2017), containing 6,000 training and 973 evaluation (or test) sequences of 256 notes each. ... VQCIFAR10 Dataset. For the image generation task, we train all the models on CIFAR10 images. ... We apply random flipping and cropping to the 64 × 64 ImageNet dataset (Deng et al., 2009). ... Text8. For unconditional text generation, we measure USD3's performance against a list of diffusion and autoregressive models, on the text8 dataset. |
| Dataset Splits | Yes | Lakh Pianoroll Datasets. We evaluate monophonic music generation on Piano, the cleaned Lakh pianoroll dataset (Raffel, 2016; Dong et al., 2017), containing 6,000 training and 973 evaluation (or test) sequences of 256 notes each. |
| Hardware Specification | Yes | We run our results with 1 A6000 GPU. ... In Table 8, we have calculated the time required for baselines and our models to sample 50,000 VQ-encoded images with 1000 timesteps on a single A6000 GPU. |
| Software Dependencies | No | The paper mentions 'Adam optimizer (β1 = 0, β2 = 0.99)' but does not name specific software libraries or versions (e.g., PyTorch, TensorFlow, or CUDA versions) needed to reproduce the implementation. |
| Experiment Setup | Yes | For training USD3 on the Piano dataset, we use a similar Diffusion Transformer architecture described in SEDD (12 layers, each with 12 heads, input dimension of 768 and max MLP dimension of 3072). We apply a batch size of 64, a learning rate of 2e-4, with a warmup of the first 5000 steps. We adopt a cosine learning rate scheduler with EMA = 0.999. The result is over 800,000 steps. ... In both continuous- and discrete-time diffusion, we use a cosine scheduler with α = 0.008. |
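The quoted setup mentions a cosine noise scheduler with α = 0.008, which matches the standard cosine schedule of Nichol & Dhariwal (2021), where 0.008 is the small offset s. As a minimal sketch of how such a schedule is typically computed (the function name and exact form here are assumptions, not taken from the paper's code):

```python
import numpy as np

def cosine_alpha_bar(T: int, s: float = 0.008) -> np.ndarray:
    """Cumulative noise-retention schedule alpha_bar_t for t = 0..T,
    following the cosine schedule of Nichol & Dhariwal (2021) with offset s."""
    t = np.arange(T + 1)
    f = np.cos(((t / T) + s) / (1 + s) * np.pi / 2) ** 2
    # Normalize so alpha_bar at t = 0 equals exactly 1.
    return f / f[0]

# Example with the 1000 timesteps used for sampling in the paper:
alpha_bar = cosine_alpha_bar(1000)
# alpha_bar decreases monotonically from 1 toward 0 across the timesteps.
```

The small offset s keeps the schedule from collapsing too quickly near t = 0, which is why it appears as the single scheduler hyper-parameter in the setup above.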