Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Align and Distill: Unifying and Improving Domain Adaptive Object Detection
Authors: Justin Kay, Timm Haucke, Suzanne Stathatos, Siqi Deng, Erik Young, Pietro Perona, Sara Beery, Grant Van Horn
TMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments on Cityscapes → Foggy Cityscapes, Sim10k → Cityscapes, and CFC Kenai → Channel. In addition to being consistent with prior work, these datasets represent three common adaptation scenarios capturing a range of real-world challenges: weather adaptation, Sim2Real, and environmental adaptation, respectively. We report the Pascal VOC metric of mean Average Precision with IoU ≥ 0.5 (AP50). |
| Researcher Affiliation | Collaboration | 1MIT, 2Caltech, 3AWS, 4Skagit Fisheries Enhancement Group, 5UMass Amherst |
| Pseudocode | No | The paper describes the methods through text descriptions, mathematical equations (e.g., $\mathcal{L}_{\text{sup}} = \frac{1}{B_{\text{src}}} \sum_{i=1}^{B_{\text{src}}} \mathcal{L}(\theta_{\text{stu}}(t(x_{\text{src},i})), y_{\text{src},i})$, Eq. 1), and diagrams (Figure 2), but does not present structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | github.com/justinkay/aldi; github.com/visipedia/caltech-fish-counting. We release ALDI as an open-source codebase built on a modern detector implementation. |
| Open Datasets | Yes | A new benchmark dataset, CFC-DAOD, sourced from a real-world adaptation challenge in environmental monitoring (Section 5). ... We make the dataset public. ... Cityscapes (CS) → Foggy Cityscapes (FCS) (Cordts et al., 2016; Sakaridis et al., 2018) is a popular DAOD benchmark... Sim10k → CS (Johnson-Roberson et al., 2016) poses a Sim2Real challenge... |
| Dataset Splits | Yes | Benchmarks consist of labeled data that is divided into two sets: a source and a target, each originating from different domains. DAOD methods are trained with source-domain images and labels, as in traditional supervised learning, and have access to unlabeled target domain images. The target-domain labels are not available for training. ... Our addition to CFC is crucial for DAOD as it adds an unsupervised training set for domain adaptation methods and a supervised training set to train oracle methods. We keep the original supervised Kenai training set from CFC (132k annotations in 70k images) and the original Channel test set (42k annotations in 13k images). |
| Hardware Specification | Yes | For training, we perform each experiment on 8 Nvidia V100 (32GB) GPUs distributed over four nodes. We use the MIT Supercloud (Reuther et al., 2018). |
| Software Dependencies | Yes | We build on top of a recent version of Detectron2 (Wu et al., 2019). ... we use a fixed version that we call v0.7ish based off of an unofficial pull request for v0.7, commit 7755101 dated August 30 2023. |
| Experiment Setup | Yes | All methods in our comparisons, including source-only and oracle models, utilize Faster R-CNN architectures with ResNet-50 (Ren et al., 2015) backbones with FPN (Lin et al., 2017), COCO (Lin et al., 2014) pre-training, and an image size of 1024px on the shortest side. ... We fix the total effective batch size at 48 samples for fair comparison across all experiments. ... Ldistill is hard pseudo-labeling with a confidence threshold of 0.8, and Lalign is disabled. |
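The hard pseudo-labeling step quoted in the Experiment Setup row (a confidence threshold of 0.8) can be sketched minimally. This is not the paper's implementation; the detection tuple layout and function name below are hypothetical stand-ins for a detector's raw outputs on unlabeled target-domain images.

```python
def filter_pseudo_labels(detections, conf_thresh=0.8):
    """Hard pseudo-labeling sketch: keep only predictions whose confidence
    score meets the threshold; each surviving prediction's class becomes a
    training label for the unlabeled target image.

    `detections` is a hypothetical list of (box, score, class_id) tuples,
    where `box` is an (x1, y1, x2, y2) tuple in pixel coordinates.
    """
    return [(box, class_id)
            for box, score, class_id in detections
            if score >= conf_thresh]


# Example: only the high-confidence detection survives as a pseudo-label.
dets = [((12, 8, 60, 40), 0.95, 1),   # confident -> kept as pseudo-label
        ((30, 5, 90, 55), 0.40, 1)]   # below 0.8 -> discarded
print(filter_pseudo_labels(dets))
```

In a mean-teacher style pipeline such as the one the paper builds on, the retained (box, class) pairs would supervise the student model on target-domain images, while low-confidence predictions are dropped to limit label noise.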