VCT: Training Consistency Models with Variational Noise Coupling

Authors: Gianluigi Silvestri, Luca Ambrogioni, Chieh-Hsin Lai, Yuhta Takida, Yuki Mitsufuji

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on multiple image datasets demonstrate significant improvements: our method surpasses baselines, achieves state-of-the-art FID among non-distillation CT approaches on CIFAR-10, and matches SoTA performance on ImageNet 64×64 with only two sampling steps.
Researcher Affiliation | Collaboration | 1OnePlanet Research Center, imec-the Netherlands, Wageningen, the Netherlands 2Donders Institute for Brain, Cognition and Behaviour, Nijmegen, the Netherlands 3Sony AI, Tokyo, Japan 4Sony Group Corporation. Correspondence to: Gianluigi Silvestri <EMAIL, EMAIL>.
Pseudocode | Yes | Algorithm 1 Variational Consistency Training (VCT) ... Algorithm 2 Multistep Variational Consistency Sampling
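The paper's Algorithm 2 is a variational variant of the standard multistep consistency sampling loop (Song et al., 2023): draw terminal noise, denoise in one call of the consistency function f(x, t), then alternately re-noise to a lower level and denoise again. The sketch below shows only that generic loop under assumed conventions (sigma-parameterized noise levels, a minimal sigma floor); the `toy_f` consistency function is a hypothetical placeholder, not the paper's trained model.

```python
import numpy as np

def multistep_consistency_sampling(f, shape, timesteps, sigma_min=0.002, rng=None):
    """Generic multistep consistency sampling loop.

    f(x, t): consistency function mapping a noisy sample at noise level t
    to an estimate of clean data. `timesteps` is a decreasing schedule
    whose first entry is the terminal noise level.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    t_max = timesteps[0]
    # One-step generation from the terminal noise distribution N(0, t_max^2 I).
    x = f(t_max * rng.standard_normal(shape), t_max)
    for t in timesteps[1:]:
        # Re-noise the current estimate up to level t, then denoise again.
        z = rng.standard_normal(shape)
        x_t = x + np.sqrt(max(t**2 - sigma_min**2, 0.0)) * z
        x = f(x_t, t)
    return x

# Hypothetical stand-in for a trained consistency model (shrinks toward 0).
toy_f = lambda x, t: x / (1.0 + t)

# Two-step sampling, matching the paper's two-NFE evaluation setting.
sample = multistep_consistency_sampling(toy_f, (4, 3), timesteps=[80.0, 0.5])
```

With a single entry in `timesteps` the loop body never runs and this reduces to one-step generation; adding intermediate levels trades extra network calls for sample quality.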
Open Source Code | Yes | Code is available at https://github.com/sony/vct.
Open Datasets | Yes | We evaluate the models on the image datasets Fashion-MNIST (Xiao et al., 2017), CIFAR-10 (Krizhevsky et al., 2009), FFHQ 64×64 (Karras et al., 2019) and (class-conditional) ImageNet 64×64 (Deng et al., 2009).
Dataset Splits | No | The paper mentions standard datasets like Fashion-MNIST, CIFAR-10, FFHQ, and ImageNet, but it does not explicitly state the training/validation/test splits (e.g., percentages or sample counts) used for these datasets in the main text or appendices. It refers to using existing baselines' settings but does not detail the splits within the paper.
Hardware Specification | Yes | GPU type: H100
Software Dependencies | No | The paper discusses model architectures like DDPM++ and EDM2-S but does not provide specific software dependencies with version numbers (e.g., PyTorch 1.9, CUDA 11.1).
Experiment Setup | Yes | We report the training details for our models in Tables 4 and 5. Note that the baselines are the ones from our reimplementation. The models have the same number of parameters and training hyperparameters regardless of the transition kernel used. ... Minibatch size, Iterations, Dropout probability, Optimizer, Learning rate, EMA rate
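The hyperparameter names listed above (from the paper's Tables 4 and 5) could be captured as a configuration record like the one below. All values here are hypothetical placeholders for illustration; the actual per-dataset settings are given only in the paper's tables.

```python
# Hypothetical example configuration mirroring the hyperparameter fields
# named in the paper's experiment-setup tables. Values are placeholders,
# NOT the paper's reported settings.
training_config = {
    "minibatch_size": 512,        # placeholder
    "iterations": 400_000,        # placeholder
    "dropout_probability": 0.0,   # placeholder
    "optimizer": "Adam",          # placeholder
    "learning_rate": 1e-4,        # placeholder
    "ema_rate": 0.9999,           # placeholder
}
```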