Positive-Unlabeled Diffusion Models for Preventing Sensitive Data Generation

Authors: Hiroshi Takahashi, Tomoharu Iwata, Atsutoshi Kumagai, Yuuki Yamanaka, Tomoya Yamashita

ICLR 2025

Reproducibility assessment (Variable: Result, with supporting quotes from the paper):
Research Type: Experimental
"Through experiments across various datasets and settings, we demonstrated that our approach can prevent the generation of sensitive images without compromising image quality." (Section 5, Experiments) "We used the following image datasets: MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky et al., 2009), STL10 (Coates et al., 2011), and CelebA (Liu et al., 2015). ... We used a custom-defined metric called non-sensitive rate and the Fréchet Inception Distance (FID) score (Heusel et al., 2017) as evaluation metrics." "Table 2: Comparison of non-sensitive rates for diffusion models with from-scratch training."
Researcher Affiliation: Industry
Hiroshi Takahashi (NTT Corporation), Tomoharu Iwata (NTT Corporation), Atsutoshi Kumagai (NTT Corporation), Yuuki Yamanaka (NTT Corporation), Tomoya Yamashita (NTT Corporation). Corresponding author: EMAIL
Pseudocode: Yes
"Algorithm 1: Positive-Unlabeled Diffusion Model with Stochastic Gradient Descent"
Open Source Code: Yes
"The code is available at https://github.com/takahashihiroshi/pudm."
Open Datasets: Yes
"We used the following image datasets: MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky et al., 2009), STL10 (Coates et al., 2011), and CelebA (Liu et al., 2015)."
Dataset Splits: Yes
"The training data consist of unlabeled data U and labeled sensitive data S, where U include both normal and sensitive data. Meanwhile, the test data contain only normal data. For example, in MNIST, if we treat even numbers as normal, the unlabeled training data U include both even and odd numbers, the sensitive training data S include only odd numbers, and the test data include only even numbers. The number of data points in each dataset is shown in Table 1."
Hardware Specification: Yes
"We used two machines for the experiments: one with Intel Xeon Platinum 8360Y CPU, 512GB of memory, and NVIDIA A100 SXM4 GPU, and the other with Intel Xeon Gold 6148 CPU, 384GB of memory, and NVIDIA V100 SXM2 GPU."
Software Dependencies: No
"Our implementations are based on Diffusers (von Platen et al., 2022). We optimized these models using AdamW (Loshchilov, 2017) and a cosine scheduler with warmup."
Experiment Setup: Yes
"We optimized these models using AdamW (Loshchilov, 2017) and a cosine scheduler with warmup. We set the learning rate to 10^-4 and the warmup steps to 500. The batch size was 128 for MNIST and CIFAR10, 32 for STL10, and 16 for CelebA. The number of epochs was set to 100 for from-scratch training and 20 for fine-tuning. We set the number of steps T during training to 1,000. For sampling, we used the denoising diffusion probabilistic model (DDPM) scheduler (Ho et al., 2020) for from-scratch training, while we used the denoising diffusion implicit model (DDIM) scheduler (Song et al., 2020a) when fine-tuning the pre-trained models, as the pre-trained models are large and time-consuming to sample from. We set the sampling steps to 1,000 for the DDPM and 50 for the DDIM. For the proposed method, we set β = 0.1."
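The even-odd MNIST split described in the Dataset Splits row can be sketched in plain Python. The helper name `make_pu_splits` and the toy label lists below are illustrative only; the actual per-dataset counts are given in the paper's Table 1.

```python
def make_pu_splits(train, test, is_sensitive):
    """Build the positive-unlabeled setting from the paper's example:
    - U (unlabeled): all training data, normal and sensitive mixed
    - S (sensitive): the labeled sensitive subset of the training data
    - test: normal data only
    """
    U = list(train)  # the unlabeled pool keeps everything
    S = [x for x in train if is_sensitive(x)]
    test_normal = [x for x in test if not is_sensitive(x)]
    return U, S, test_normal

# Toy MNIST-style labels: even digits are normal, odd digits are sensitive.
train_labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
test_labels = [0, 1, 2, 3, 4]
U, S, test_normal = make_pu_splits(train_labels, test_labels,
                                   lambda y: y % 2 == 1)
print(S)            # [1, 3, 5, 7, 9]
print(test_normal)  # [0, 2, 4]
```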
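The reported optimization schedule (learning rate 10^-4, 500 warmup steps, cosine decay) can be sketched as a plain function. The total-step count below is an assumed placeholder, and the authors most likely relied on Diffusers' built-in scheduler utilities rather than a hand-rolled one; this is only a sketch of the shape of such a schedule.

```python
import math

def lr_at_step(step, base_lr=1e-4, warmup_steps=500, total_steps=50_000):
    """Cosine learning-rate schedule with linear warmup (illustrative).

    - Linear ramp from 0 to base_lr over the first `warmup_steps` steps.
    - Cosine decay from base_lr down to 0 over the remaining steps.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at_step(250))  # halfway through warmup: half the base rate
print(lr_at_step(500))  # end of warmup: the full base rate, 1e-4
```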
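Algorithm 1 itself is not reproduced in this report. As background only, positive-unlabeled methods commonly build on the non-negative PU risk estimator of Kiryo et al. (2017); the sketch below shows that generic estimator with scalar per-term losses. It is not the paper's exact objective, and in particular how the paper's β = 0.1 enters its loss is not shown here.

```python
def nn_pu_risk(loss_pos, loss_pos_as_neg, loss_unl_as_neg, pi):
    """Non-negative PU risk estimator (Kiryo et al., 2017), generic sketch.

    loss_pos        : mean loss of labeled positives treated as positive
    loss_pos_as_neg : mean loss of labeled positives treated as negative
    loss_unl_as_neg : mean loss of unlabeled data treated as negative
    pi              : class prior, the assumed fraction of positives in U
    """
    # Positive-class risk, weighted by the class prior.
    risk_pos = pi * loss_pos
    # Estimated negative-class risk; clamped at zero because the naive
    # estimate can go negative and cause overfitting.
    risk_neg = max(0.0, loss_unl_as_neg - pi * loss_pos_as_neg)
    return risk_pos + risk_neg

# 0.5*0.2 + max(0, 0.5 - 0.5*0.9) ≈ 0.15
print(nn_pu_risk(0.2, 0.9, 0.5, pi=0.5))
```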