Robust Representation Consistency Model via Contrastive Denoising
Authors: Jiachen Lei, Julius Berner, Jiongxiao Wang, Zhongzhu Chen, Chaowei Xiao, Zhongjie Ba, Kui Ren, Jun Zhu, Anima Anandkumar
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on various datasets and achieve state-of-the-art performance with minimal computation budget during inference. For example, our method outperforms the certified accuracy of diffusion-based methods on ImageNet across all perturbation radii by 5.3% on average, with up to 11.6% at larger radii, while reducing inference costs by 85 on average. Codes are available at: https://github.com/jiachenlei/rRCM. |
| Researcher Affiliation | Collaboration | 1Zhejiang University, 2NVIDIA, 3UW Madison, 4Amazon, 5Shengshu, 6Tsinghua University, 7Caltech |
| Pseudocode | Yes | Algorithm 1: rRCM Pre-training Pseudocode |
| Open Source Code | Yes | Codes are available at: https://github.com/jiachenlei/rRCM. |
| Open Datasets | Yes | In this section, we evaluate our rRCM model on two datasets: ImageNet (Deng et al., 2009) and CIFAR10 (Krizhevsky et al., 2009). |
| Dataset Splits | Yes | Certification. We follow the settings of Carlini et al. (2022). Specifically, on both ImageNet and CIFAR10, we certify a subset that contains 500 images from their test set with confidence 99.9%. |
| Hardware Specification | Yes | We measure the inference latency of all methods on a single A800 GPU. |
| Software Dependencies | No | The paper mentions 'xFormers (https://github.com/facebookresearch/xformers)' and 'DPM-Solver (Lu et al., 2022)' but does not specify version numbers for these software components. It also implicitly uses a deep learning framework, likely PyTorch, but no version is stated. |
| Experiment Setup | Yes | Pre-training During pre-training, we adopt the definition of diffusion models proposed in EDM (Karras et al., 2022) and refer to the implementation of consistency models (Song et al., 2023), including noise schedule, input scaling, time embedding strategy, and time discretization strategy. As for data augmentation strategies, we adopt those utilized in MoCo-v3 (Chen et al., 2021). The temperature value τ in (9) is set to 0.2 for all experiments. By default, we pre-train rRCM-B and rRCM-B-Deep for 600k steps with a batch size of 4096 on the ImageNet dataset. We pre-train rRCM-B for 300k steps on the CIFAR10 dataset, with a batch size of 2048. Subsequently, we fine-tune our rRCM models separately at various noise levels σ ∈ {0.25, 0.5, 1.0}. Specifically, for both ImageNet and CIFAR10, we set η1 in (12) to 10 at the noise level of 0.25, and to 20 for noise levels 0.5 and 1.0. In all experiments, η2 in (12) is fixed as 0.5. To enhance training stability, we apply a dynamic EMA schedule for the target model utilized when computing the contrastive loss. Specifically, we gradually increase the EMA rate from 0.99 to 0.9999 following a pre-defined sigmoid schedule...We present hyper-parameters used in our pre-training experiments in Table 4 and the data augmentation strategies in Table 5. Fine-tuning We fine-tune the pre-trained model following the implementation in (Jeong & Shin, 2020) at three different noise levels σ ∈ {0.25, 0.5, 1.0}, and report the best results at each perturbation radius. We fine-tune the pre-trained model for 150 epochs on ImageNet and 100 epochs on CIFAR10. |
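The experiment-setup excerpt describes a dynamic EMA schedule that ramps the target model's EMA rate from 0.99 to 0.9999 along a "pre-defined sigmoid schedule". The paper excerpt does not give the ramp's midpoint or sharpness, so the following is only a minimal sketch of how such a schedule might look; `sharpness` and the midpoint of 0.5 are assumptions, not values from the paper.

```python
import math

def ema_rate(step: int, total_steps: int,
             start: float = 0.99, end: float = 0.9999,
             sharpness: float = 10.0) -> float:
    """Sigmoid ramp of the target-network EMA rate from `start` to `end`.

    `sharpness` and the midpoint (0.5 of training) are assumptions; the
    excerpt only states a "pre-defined sigmoid schedule" from 0.99 to 0.9999.
    """
    t = step / total_steps                      # training progress in [0, 1]
    s = 1.0 / (1.0 + math.exp(-sharpness * (t - 0.5)))  # logistic squash
    return start + (end - start) * s

def ema_update(target_params, online_params, rate: float):
    """One EMA step: target <- rate * target + (1 - rate) * online."""
    return [rate * tp + (1.0 - rate) * op
            for tp, op in zip(target_params, online_params)]
```

With these defaults the rate starts near 0.99, passes roughly midway between the endpoints at half of training, and saturates near 0.9999 by the final step (600k on ImageNet per the excerpt).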
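The excerpt also states that the temperature τ in the paper's contrastive loss (its Eq. 9) is 0.2. Equation (9) itself is not quoted in the table, so the sketch below assumes a standard InfoNCE-style formulation (cosine-similarity logits scaled by τ, cross-entropy against the positive key); only the value τ = 0.2 comes from the source.

```python
import math

def info_nce(query, keys, pos_index: int, tau: float = 0.2) -> float:
    """InfoNCE-style loss for one query vector.

    `keys[pos_index]` is the positive pair; all other keys act as negatives.
    The exact loss in the paper's Eq. (9) is not quoted here, so this
    standard formulation is an assumption; tau = 0.2 is from the excerpt.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    logits = [cos(query, k) / tau for k in keys]
    m = max(logits)  # subtract the max before exp for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[pos_index]  # -log softmax at the positive
```

A lower τ sharpens the softmax over key similarities, so τ = 0.2 penalizes hard negatives more strongly than the unscaled (τ = 1) case.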