Aligned Datasets Improve Detection of Latent Diffusion-Generated Images
Authors: Anirudh Sundara Rajan, Utkarsh Ojha, Jedidiah Schloesser, Yong Jae Lee
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to assess our method. We train on images generated by the original LDM model (Rombach et al., 2022), and test on images generated by later versions of Stable Diffusion as well as newer latent models such as Playground (Li et al., 2024), Kandinsky (Razzhigaev et al., 2023), PixArt-α (Chen et al., 2023) and Latent Consistency Models (Luo et al., 2023). |
| Researcher Affiliation | Academia | Anirudh Sundara Rajan* Utkarsh Ojha* Jedidiah Schloesser Yong Jae Lee University of Wisconsin-Madison EMAIL, EMAIL |
| Pseudocode | No | The paper describes the process mathematically with a formula F = { φ_dec(φ_enc(x)) \| x ∈ R } but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | For implementation details, visit our project page: anisundar18.github.io/AlignedForensics/. We also intend to release our pre-trained checkpoints, datasets, and code to ensure reproducibility, with all resources made publicly available on GitHub. |
| Open Datasets | Yes | Similar to Corvi et al. (2022), we use a combination of MS COCO (Lin et al., 2015) and LSUN (Yu et al., 2016) as our real dataset, totaling 179,257 images. ... For the real set, we randomly select 500 images from the RedCaps dataset (Desai et al., 2021). For fake images, we generate 500 images using SD 1.5 (prompts pertain to object categories from CIFAR (Krizhevsky et al., 2010)). ... The Real set contains real images from multiple sources; 1000 images from RedCaps (Desai et al., 2021), 800 images from LAION-Aesthetics (Schuhmann et al., 2022), 1000 images from whichfaceisreal (whi) and 200 images from WikiArt (wik). |
| Dataset Splits | Yes | Similar to Corvi et al. (2022), we use a combination of MS COCO (Lin et al., 2015) and LSUN (Yu et al., 2016) as our real dataset, totaling 179,257 images. We reconstruct them using the autoencoder of the LDM model proposed by Rombach et al. (2022) to get the same number of fake images. ... We create a test set of real and fake images of increasing/decreasing resolutions. For the real set, we randomly select 500 images from the RedCaps dataset (Desai et al., 2021). For fake images, we generate 500 images using SD 1.5... We use the validation set provided by Corvi et al. (2022) for our training. ... The dataset consists of 6000 real images and 6000 images for each of the respective categories. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models or processor types used for running its experiments. |
| Software Dependencies | No | The paper mentions using Adam optimizer and data augmentations like 'random resized crop' (referencing torchvision), but it does not specify version numbers for any software libraries, frameworks, or languages used. |
| Experiment Setup | Yes | We optimize using Adam (Kingma & Ba, 2015) with an initial learning rate set to 0.0001. The rest of the training details can be found in Appendix A.1.1. ADDITIONAL TRAINING DETAILS: We train on 96 × 96 crops of the whole image using a batch size of 128. The data augmentations include random JPEG compression and blur from the pipeline proposed by Wang et al. (2020). Following Gragnaniello et al. (2021), grayscale, cutout and random noise are also used as augmentations. Finally, in order to make the network invariant towards resizing, the random resized crop was added. For our method as well as Corvi, we train the model using two different random seeds and report the average reading. We use the validation set provided by Corvi et al. (2022) for our training. Just like our training set, the real images come from COCO/LSUN and the fake images are generated at 256 × 256 using LDM. During training, if the validation accuracy does not improve by 0.1% in 10 epochs the learning rate is dropped by 10x. The training is terminated at learning rate 10⁻⁶. |
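The quoted dataset construction (reconstructing each real image through the LDM autoencoder, F = { φ_dec(φ_enc(x)) | x ∈ R }, to obtain content-aligned fakes) can be sketched as follows. This is a minimal illustration, not the authors' released code: `build_aligned_dataset`, `encode`, and `decode` are hypothetical stand-ins for the actual LDM autoencoder components.

```python
from typing import Callable, List, Tuple

def build_aligned_dataset(
    real_images: List[list],
    encode: Callable,   # stand-in for the LDM encoder phi_enc
    decode: Callable,   # stand-in for the LDM decoder phi_dec
) -> List[Tuple[list, int]]:
    """Pair every real image (label 0) with its autoencoder
    reconstruction phi_dec(phi_enc(x)) (label 1), so the real and
    fake sets share content and differ only in decoder artifacts."""
    dataset = []
    for x in real_images:
        dataset.append((x, 0))                  # real image
        dataset.append((decode(encode(x)), 1))  # aligned "fake"
    return dataset
```

In practice the encoder/decoder would be the pretrained LDM (Rombach et al., 2022) autoencoder applied to the 179,257 COCO/LSUN images quoted above; any lossy reconstruction leaves the generator fingerprint the detector is trained on.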
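The learning-rate schedule quoted in the Experiment Setup row (start at 1e-4, drop 10x when validation accuracy fails to improve by 0.1 percentage points within 10 epochs, terminate at 1e-6) can be sketched as a small helper. The class name and exact stopping semantics (train at 1e-6, stop when it would drop further) are assumptions for illustration; the paper does not give code.

```python
class PlateauSchedule:
    """Sketch of the quoted schedule: lr starts at 1e-4, is divided
    by 10 after 10 epochs without a >= 0.1-point gain in validation
    accuracy, and training stops once lr falls below 1e-6."""

    def __init__(self, lr=1e-4, min_lr=1e-6, patience=10, min_delta=0.1):
        self.lr, self.min_lr = lr, min_lr
        self.patience, self.min_delta = patience, min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_acc: float) -> bool:
        """Record this epoch's validation accuracy (in %).
        Returns True while training should continue."""
        if val_acc > self.best + self.min_delta:
            self.best, self.bad_epochs = val_acc, 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr /= 10.0
                self.bad_epochs = 0
        # tolerant compare (min_lr / 2) avoids float round-off at 1e-6
        return self.lr > self.min_lr / 2
```

This mirrors the behavior of `torch.optim.lr_scheduler.ReduceLROnPlateau` with `mode='max'`, `factor=0.1`, `patience=10`, which is a plausible implementation choice given the description.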