Discrete Copula Diffusion

Authors: Anji Liu, Oliver Broadrick, Mathias Niepert, Guy Van den Broeck

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically validate the proposed method, Discrete Copula Diffusion (DCD), on language modeling tasks (Sec. 6.1 and 6.2) and antibody sequence infilling tasks (Sec. 6.3). For all tasks, we evaluate whether DCD can effectively reduce the number of diffusion steps while maintaining strong performance. Specifically, since DCD combines two pretrained models, a discrete diffusion model and an autoregressive copula model, we examine whether DCD outperforms each individual model. Figure 3: Generative perplexity (↓) with different numbers of denoising steps. Table 1: Evaluation of text infilling performance using the MAUVE score (↑) with 5 prompt masks.
Researcher Affiliation | Academia | Anji Liu¹,², Oliver Broadrick¹, Mathias Niepert², Guy Van den Broeck¹. ¹Department of Computer Science, University of California, Los Angeles, USA; ²Institute for Artificial Intelligence, University of Stuttgart, Germany.
Pseudocode | Yes | Algorithm 1: Draw samples from a discrete diffusion model with the help of a copula model. Algorithm 2: DCD with autoregressive copula models, using autoregressive sampling.
Open Source Code | Yes | Code is available at https://github.com/liuanji/Copula-Diffusion.
Open Datasets | Yes | We first compare the quality of unconditional samples generated by models trained on either WebText (Radford et al., 2019) or OpenWebText (Gokaslan & Cohen, 2019), which contain web content extracted from URLs shared on Reddit with a minimum number of upvotes. ... We use the same set of 2,000 text sequences from the validation set of WikiText-103 (Merity et al., 2022). ... We adopt NOS-D (Gruver et al., 2023), which is a discrete diffusion model trained on 104K antibody sequences from the Observed Antibody Space dataset (Ruffolo et al., 2023).
Dataset Splits | Yes | For all methods, we use the same set of 2,000 text sequences from the validation set of WikiText-103 (Merity et al., 2022). After applying the prompt mask, we generate 5 samples for each prompt, yielding 10,000 samples in total.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions using specific models such as SEDD, GPT-2-small, and NOS-D but does not provide version numbers for the software libraries, programming languages, or other dependencies that would be required for replication.
Experiment Setup | Yes | We adopt the log-linear noise schedule suggested by the SEDD paper. See Appendix G.1 for more details. ... The model is trained for 50 epochs using the default settings (e.g., learning rate and its schedule). ... The GPT model has 6 layers, an embedding size of 512, and 16 attention heads. The model is trained for 10 epochs with the default settings in the nanoGPT repository. ... We set β = 0.1 for this task.
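The setup above references SEDD's log-linear noise schedule and a mixing weight β = 0.1 balancing the diffusion model against the autoregressive copula model. The sketch below is a hedged illustration only: it implements one common log-linear parameterization of the cumulative noise and a simple convex combination of the two models' per-token logits. The function names, the exact schedule form, and the logit-averaging rule are assumptions for illustration, not the paper's exact procedure.

```python
import math

def log_linear_total_noise(t: float, eps: float = 1e-3) -> float:
    """Cumulative noise sigma_bar(t), chosen so the probability that a
    token remains unmasked, exp(-sigma_bar(t)), decays linearly from
    1 at t=0 to eps at t=1 (a common SEDD-style parameterization)."""
    assert 0.0 <= t <= 1.0
    return -math.log(1.0 - (1.0 - eps) * t)

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def combine_logits(diffusion_logits, copula_logits, beta=0.1):
    """Hypothetical mixing rule: convex combination in logit space,
    with weight beta on the autoregressive (copula) model."""
    return [(1.0 - beta) * d + beta * c
            for d, c in zip(diffusion_logits, copula_logits)]

# Toy example: a 4-token vocabulary at one masked position.
diff_logits = [2.0, 0.5, -1.0, 0.0]   # from the diffusion model
cop_logits = [1.0, 1.5, 0.0, -2.0]    # from the copula (AR) model
probs = softmax(combine_logits(diff_logits, cop_logits, beta=0.1))

# Cumulative noise over an 11-point grid of diffusion times.
schedule = [log_linear_total_noise(k / 10) for k in range(11)]
```

With β = 0.1 most of the per-token evidence comes from the diffusion model's marginals, while the copula model contributes inter-token dependencies; at t = 1 the survival probability exp(-sigma_bar(1)) equals eps, i.e., almost every token is masked.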