Decoupled Adaptation for Cross-Domain Object Detection

Authors: Junguang Jiang, Baixu Chen, Jianmin Wang, Mingsheng Long

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that D-adapt achieves state-of-the-art results on four cross-domain object detection tasks and yields 17% and 21% relative improvement on benchmark datasets Clipart1k and Comic2k in particular.
Researcher Affiliation | Academia | Junguang Jiang, Baixu Chen, Jianmin Wang, Mingsheng Long. School of Software, BNRist, Tsinghua University, China. EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: D-adapt Training Pipeline.
Open Source Code | Yes | Code is available at https://github.com/thuml/Decoupled-Adaptation-for-Cross-Domain-Object-Detection.
Open Datasets | Yes | The following six object detection datasets are used: Pascal VOC [11], Clipart [21], Comic [21], Sim10k [23], Cityscapes [9] and Foggy Cityscapes [44].
Dataset Splits | Yes | Comic2k contains 1k training images and 1k test images... Both Cityscapes and Foggy Cityscapes have 2975 training images and 500 validation images with 8 object categories.
Hardware Specification | Yes | We perform all experiments on public datasets using a 1080Ti GPU.
Software Dependencies | No | The paper mentions frameworks and models such as Faster R-CNN, ResNet-101, VGG-16, and the SGD optimizer, but does not provide version numbers for any software libraries or dependencies (e.g., PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | Stage 1: Source-domain pre-training... with a learning rate of 0.005 for 12k iterations. Stage 2: Category adaptation... trained for 10k iterations using the SGD optimizer with an initial learning rate of 0.01, momentum 0.9, and a batch size of 32 for each domain... λ is kept 1 for all experiments. Stage 3: Bounding box adaptation... The training hyper-parameters (learning rate, batch size, etc.) are the same as those of the category adaptor. η is kept 0.1 for all experiments. Stage 4: Target-domain pseudo-label training... for 4k iterations, with an initial learning rate of 2.5×10⁻⁴ reduced to 2.5×10⁻⁵ exponentially. The adaptors and the detector are trained in an alternating manner for T = 3 iterations.
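The four-stage pipeline and the T = 3 alternation between the adaptors and the detector can be sketched as a minimal loop. This is a hedged outline only: the Detector and Adaptor classes and every method name below are hypothetical placeholders, not the authors' code.

```python
from dataclasses import dataclass, field

# Minimal runnable outline of the alternating D-adapt training pipeline.
# All classes and method names are hypothetical placeholders.

@dataclass
class Detector:
    log: list = field(default_factory=list)

    def pretrain(self, source):            # Stage 1: source-domain pre-training
        self.log.append("pretrain")

    def propose(self, target):             # generate region proposals on the target domain
        return ["proposal"]

    def self_train(self, pseudo_labels):   # Stage 4: target-domain pseudo-label training
        self.log.append("self-train")

@dataclass
class Adaptor:
    name: str

    def fit(self, proposals):              # adapt on proposals and emit pseudo-labels
        return f"{self.name}-labels"

def d_adapt(detector, cat_adaptor, box_adaptor, source, target, T=3):
    detector.pretrain(source)
    for _ in range(T):                     # alternate adaptors and detector T times
        proposals = detector.propose(target)
        cls_labels = cat_adaptor.fit(proposals)   # Stage 2: category adaptation
        box_labels = box_adaptor.fit(proposals)   # Stage 3: bounding-box adaptation
        detector.self_train((cls_labels, box_labels))
    return detector
```

The point of the sketch is the decoupling: the detector only ever consumes pseudo-labels produced by the two adaptors, and the three components are updated in turn rather than jointly.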
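The Stage 4 schedule (initial learning rate 2.5×10⁻⁴ decaying exponentially to 2.5×10⁻⁵ over 4k iterations) can be written as a one-line rule. The exact functional form below is an assumption chosen to match the two endpoints; the paper does not specify it.

```python
def stage4_lr(step, total=4000, lr0=2.5e-4, lr1=2.5e-5):
    """Exponential interpolation from lr0 at step 0 to lr1 at step `total`.
    The precise decay form is an assumption, not taken from the paper."""
    return lr0 * (lr1 / lr0) ** (step / total)
```

For example, stage4_lr(0) gives 2.5e-4 and stage4_lr(4000) gives 2.5e-5, with a smooth multiplicative decay in between.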