Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Align and Distill: Unifying and Improving Domain Adaptive Object Detection
Authors: Justin Kay, Timm Haucke, Suzanne Stathatos, Siqi Deng, Erik Young, Pietro Perona, Sara Beery, Grant Van Horn
TMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments on Cityscapes → Foggy Cityscapes, Sim10k → Cityscapes, and CFC Kenai → Channel. In addition to being consistent with prior work, these datasets represent three common adaptation scenarios capturing a range of real-world challenges: weather adaptation, Sim2Real, and environmental adaptation, respectively. We report the Pascal VOC metric of mean Average Precision with IoU ≥ 0.5 (AP50). |
| Researcher Affiliation | Collaboration | 1MIT, 2Caltech, 3AWS, 4Skagit Fisheries Enhancement Group, 5UMass Amherst |
| Pseudocode | No | The paper describes the methods through text descriptions, mathematical equations (e.g., $\mathcal{L}_{\text{sup}} = \frac{1}{B_{\text{src}}} \sum_{i=1}^{B_{\text{src}}} \mathcal{L}(\theta_{\text{stu}}(t(x_{\text{src},i})), y_{\text{src},i})$, Eq. 1), and diagrams (Figure 2), but does not present structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | github.com/justinkay/aldi; github.com/visipedia/caltech-fish-counting. We release ALDI as an open-source codebase built on a modern detector implementation. |
| Open Datasets | Yes | A new benchmark dataset, CFC-DAOD, sourced from a real-world adaptation challenge in environmental monitoring (Section 5). ... We make the dataset public. ... Cityscapes (CS) → Foggy Cityscapes (FCS) (Cordts et al., 2016; Sakaridis et al., 2018) is a popular DAOD benchmark... Sim10k → CS (Johnson-Roberson et al., 2016) poses a Sim2Real challenge... |
| Dataset Splits | Yes | Benchmarks consist of labeled data that is divided into two sets: a source and a target, each originating from different domains. DAOD methods are trained with source-domain images and labels, as in traditional supervised learning, and have access to unlabeled target domain images. The target-domain labels are not available for training. ... Our addition to CFC is crucial for DAOD as it adds an unsupervised training set for domain adaptation methods and a supervised training set to train oracle methods. We keep the original supervised Kenai training set from CFC (132k annotations in 70k images) and the original Channel test set (42k annotations in 13k images). |
| Hardware Specification | Yes | For training, we perform each experiment on 8 Nvidia V100 (32GB) GPUs distributed over four nodes. We use the MIT Supercloud (Reuther et al., 2018). |
| Software Dependencies | Yes | We build on top of a recent version of Detectron2 (Wu et al., 2019). ... we use a fixed version that we call v0.7ish based off of an unofficial pull request for v0.7, commit 7755101 dated August 30 2023. |
| Experiment Setup | Yes | All methods in our comparisons, including source-only and oracle models, utilize Faster R-CNN architectures with ResNet-50 (Ren et al., 2015) backbones with FPN (Lin et al., 2017), COCO (Lin et al., 2014) pre-training, and an image size of 1024px on the shortest side. ... We fix the total effective batch size at 48 samples for fair comparison across all experiments. ... Ldistill is hard pseudo-labeling with a confidence threshold of 0.8, and Lalign is disabled. |
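The hard pseudo-labeling step quoted in the Experiment Setup row (a confidence threshold of 0.8) can be sketched minimally. This is not the paper's implementation; the detection tuple layout and function name below are hypothetical stand-ins for a detector's raw outputs on unlabeled target-domain images.

```python
def filter_pseudo_labels(detections, conf_thresh=0.8):
    """Hard pseudo-labeling sketch: keep only predictions whose confidence
    score meets the threshold; each surviving prediction's class becomes a
    training label for the unlabeled target image.

    `detections` is a hypothetical list of (box, score, class_id) tuples,
    where `box` is an (x1, y1, x2, y2) tuple in pixel coordinates.
    """
    return [(box, class_id)
            for box, score, class_id in detections
            if score >= conf_thresh]


# Example: only the high-confidence detection survives as a pseudo-label.
dets = [((12, 8, 60, 40), 0.95, 1),   # confident -> kept as pseudo-label
        ((30, 5, 90, 55), 0.40, 1)]   # below 0.8 -> discarded
print(filter_pseudo_labels(dets))
```

In a mean-teacher style pipeline such as the one the paper builds on, the retained (box, class) pairs would supervise the student model on target-domain images, while low-confidence predictions are dropped to limit label noise.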