Improved Sampling Algorithms for Lévy-Itô Diffusion Models

Authors: Vadim Popov, Assel Yermekova, Tasnima Sadekova, Artem Khrapov, Mikhail Kudinov

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the benefits of using these SDEs at inference in terms of generated samples quality on image generation task and verify that samples diversity does not suffer if we generate data with the proposed SDEs. We train a Lévy-Itô text-to-speech model on a highly imbalanced dataset and evaluate its performance for speakers with different amount of training data. Section 5 is titled "EXPERIMENTS" and includes tables with metrics such as FID, coverage, and speaker similarity.
Researcher Affiliation | Industry | Vadim Popov, Assel Yermekova, Tasnima Sadekova, Huawei Noah's Ark Lab, EMAIL; Artem Khrapov & Mikhail Kudinov, Huawei Noah's Ark Lab, EMAIL, EMAIL
Pseudocode | No | The paper describes methods and equations verbally and mathematically but does not include any clearly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code | No | The paper does not contain an explicit statement about releasing code, a link to a code repository, or a mention of code in supplementary materials for the described methodology.
Open Datasets | Yes | We train 3 Lévy-Itô models with α = 1.8, 1.5 and 1.2 on CIFAR10 with the same architecture as in the mentioned paper... We train text-to-speech models on extremely imbalanced dataset consisting of 16.6 hours (1000 minutes) of an English female speaker (Ito, 2017) and 10 minutes of an English male speaker with id 9017 from Bakhturina et al. (2021).
Dataset Splits | Yes | We train 3 Lévy-Itô models with α = 1.8, 1.5 and 1.2 on CIFAR10... The model we use for CIFAR10 experiments is NCSN++(deep) (Yoon et al., 2023; Song et al., 2021c) with 8 residual blocks... Imbalanced CIFAR10 contained 5000, 2997, 1796, 1077, 645, 387, 232, 139, 83 and 50 images belonging to classes "airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship" and "truck" correspondingly. It is the same setting as that used in Yoon et al. (2023). Figure 4 shows performance of different models and different solvers depending on η... FID on CIFAR10 test set containing 10k images.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, or other accelerators) used for running the experiments.
Software Dependencies | No | The paper mentions several software components and models (NCSN++, Montreal Forced Aligner, HiFi-GAN, CAM++ speaker verification model) but does not provide specific version numbers for these or other software dependencies (e.g., programming languages, libraries, frameworks).
Experiment Setup | Yes | The model we use for CIFAR10 experiments is NCSN++(deep) (Yoon et al., 2023; Song et al., 2021c) with 8 residual blocks. We train 3 models for α = 1.8, 1.5 and 1.2 with batch size 128 and learning rate 0.0001 for 250k iterations. Diffusion models tend to overfit on CIFAR10 so we choose the best checkpoint in terms of FID on the test set (100k, 150k and 180k iterations for α = 1.8, 1.5 and 1.2 respectively).