DisCo: Improving Compositional Generalization in Visual Reasoning through Distribution Coverage
Authors: Joy Hsu, Jiayuan Mao, Jiajun Wu
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply DisCo to visual question answering, with three backbone networks (FiLM, TbD-net, and the Neuro-Symbolic Concept Learner), and demonstrate that it consistently enhances performance on a variety of compositional generalization tasks with varying levels of train data bias. |
| Researcher Affiliation | Academia | Joy Hsu EMAIL Department of Computer Science, Stanford University Jiayuan Mao EMAIL Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology Jiajun Wu EMAIL Department of Computer Science, Stanford University |
| Pseudocode | Yes | Algorithm 1 The DisCo framework described in Section 3.2. |
| Open Source Code | Yes | Code for DisCo with the FiLM model can be found: https://github.com/joyhsu0504/disco, based on the FiLM codebase (https://github.com/ethanjperez/film). |
| Open Datasets | Yes | In addition to the original CLEVR compositional generalization (CoGenT) dataset (Johnson et al., 2017) (released under the CC BY 4.0 license), we also report results on multiple CoGen datasets based on CLEVR. |
| Dataset Splits | Yes | In our construction, the train set of CoGen split A consists of 8,000 images, and the validation set of CoGen split A and the test set of CoGen split B consist of 2,000 images each. The larger, unseen test set consists of 8,000 images. |
| Hardware Specification | Yes | All models are trained on a single Titan RTX GPU. |
| Software Dependencies | No | The paper mentions software components like StyleGAN2 and the Adam optimizer, and base implementations for VAE and SimCLR, but it does not specify version numbers for general software dependencies like Python, PyTorch, or TensorFlow, nor for the specific implementations or optimizers. |
| Experiment Setup | Yes | The GAN image proposal function is the unconditional StyleGAN2 (Karras et al., 2020), trained with the Adam optimizer with a learning rate of 0.002. We set our entropy threshold n to be at the 30th percentile. |
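The experiment setup above fixes the entropy threshold n at the 30th percentile of per-sample entropy scores. A minimal sketch of how such a percentile cutoff can be computed (the `entropy_threshold` helper and the score list are hypothetical illustrations, not from the paper's code):

```python
def entropy_threshold(entropies, percentile=30):
    """Cutoff value at the given percentile of a set of entropy scores,
    using linear interpolation between order statistics (the paper sets
    its threshold n at the 30th percentile)."""
    xs = sorted(entropies)
    # Fractional rank of the requested percentile within the sorted scores.
    rank = (len(xs) - 1) * percentile / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (rank - lo) * (xs[hi] - xs[lo])

# Toy example with hypothetical per-sample entropy scores.
scores = [0.1, 0.4, 0.2, 0.9, 0.5, 0.3, 0.7, 0.8, 0.6, 1.0]
threshold = entropy_threshold(scores)  # 30th-percentile cutoff over `scores`
```

Samples would then be compared against `threshold` when deciding which GAN proposals to keep, per the framework's filtering step.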