InCoDe: Interpretable Compressed Descriptions For Image Generation
Authors: Armand Comas, Aditya Chattopadhyay, Feliu Formosa, Changyu Liu, Octavia Camps, René Vidal
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we demonstrate the efficacy of our proposed framework both qualitatively and quantitatively. Our work contributes to the ongoing quest to enhance both controllability and interpretability in the generation process. ... In this section we empirically evaluate the performance of InCoDe and provide analysis of its capabilities. In particular, we study (i) its effectiveness in capturing the semantic content of an image by evaluating the Querier's ability to select queries that maximize information gain, as well as the faithfulness of the generated image to the provided representations; and (ii) its editing and compositional capabilities by evaluating its ability to modify or generate an image consistent with a desired set of attributes. |
| Researcher Affiliation | Academia | 1Northeastern University 2Johns Hopkins University 3University of Pennsylvania |
| Pseudocode | No | The paper describes algorithms such as Information Pursuit and the InCoDe framework through mathematical formulations and descriptive text (e.g., Equation 1, and the detailed explanation of Encoder, Decoder, and Generator operations), but it does not present them in a structured pseudocode block or a clearly labeled algorithm section. |
| Open Source Code | Yes | Code available at github.com/ArmandCom/InCoDe. |
| Open Datasets | Yes | (iii) We collected two new datasets along with sets of binary queries and answers about their content. ... These datasets are a key contribution of this work, filling a gap where no existing datasets meet the specific requirements of our task, and have been made publicly available. ... Link to datasets provided in https://github.com/ArmandCom/InCoDe. |
| Dataset Splits | Yes | MNIST: Training corpus consists of 60k 1×32×32 greyscale images of handwritten single digits. ... CelebA: It consists of 50k 3×64×64 images of celebrity faces, divided into 34k-1k-15k for training, validation and testing. ... Clevr: It consists of 8k images (partitioned as 7k-1k-1k for training, validation and test). ... Churches dataset consists of 70k images, filtered to 11k and split 90%/10% for training and validation, reserving 2k images for test. |
| Hardware Specification | Yes | Hardware InCoDe has been trained on two NVIDIA GeForce RTX 2080 Ti GPUs. For images of resolution 64×64, it takes 1 day to train. ... The binary attribute image classifier has been trained on two NVIDIA RTX A6000 GPUs for 3 days. |
| Software Dependencies | Yes | Our method for LSUN Bedroom experiments has been trained as a wrapper around Stable Diffusion V1-4: huggingface.co/CompVis/stable-diffusion-v1-4. We use the same version for the results displayed in Fig. 2. When showing results for Stable Diffusion XL, we use the model in huggingface.co/stabilityai/stable-diffusion-xl-base-1.0. |
| Experiment Setup | Yes | Next, we describe the main hyperparameters used for Imagen's U-Net. Learning rate: LR = 1e-4 with a cosine decay; Base dimension: 32; Dimensionality multipliers: (1, 2, 4, 8); Self-attention at resolutions: (1/8); Query embedding size: 16×2; Condition size: 256; Number of steps for training and sampling: 256; Condition drop probability: p = 0.1. |
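For reference, the reported U-Net hyperparameters can be collected into a small configuration sketch. This is only an illustration of the values quoted above; the key names (`learning_rate`, `dim_mults`, etc.) are hypothetical, as the paper does not publish its exact configuration schema.

```python
# Hypothetical config capturing the Imagen U-Net hyperparameters reported
# in the paper. Field names are illustrative, not the authors' actual schema.
unet_config = {
    "learning_rate": 1e-4,        # LR = 1e-4 ...
    "lr_schedule": "cosine",      # ... with cosine decay
    "base_dim": 32,               # base channel dimension
    "dim_mults": (1, 2, 4, 8),    # dimensionality multipliers per level
    "attn_resolutions": (1 / 8,), # self-attention at 1/8 resolution
    "query_embed_size": (16, 2),  # query embedding size 16x2
    "cond_size": 256,             # conditioning vector size
    "num_steps": 256,             # steps for training and sampling
    "cond_drop_prob": 0.1,        # condition drop probability p = 0.1
}
```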