DCA: Dividing and Conquering Amnesia in Incremental Object Detection
Authors: Aoting Zhang, Dongbao Yang, Chang Liu, Xiaopeng Hong, Miao Shang, Yu Zhou
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments validate that our approach achieves state-of-the-art performance, especially for long-term incremental scenarios. For example, under the four-step setting on MS-COCO, our DCA strategy significantly improves the final AP by 6.9%. [...] Experiments Experimental Setups Datasets and Metrics. We evaluate on two widely used datasets: PASCAL VOC (Everingham 2007) and MS COCO (Lin et al. 2014). AP50 and AP are reported as metrics. [...] Ablation Studies Component ablations. As shown in Table 5, fine-tuning with pseudo-labeling yields the lowest result, 62.5%. |
| Researcher Affiliation | Academia | 1) Institute of Information Engineering, Chinese Academy of Sciences; 2) VCIP & TMCC & DISSec, College of Computer Science, Nankai University; 3) School of Cyber Security, University of Chinese Academy of Sciences; 4) Harbin Institute of Technology; 5) Tsinghua University |
| Pseudocode | No | The paper describes the proposed methodology and decoding process in detailed paragraph text and uses diagrams (e.g., Figure 3: Pipeline of DCA) to illustrate the architecture and data flow. However, it does not contain any explicitly labeled pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | Our codes are available at https://github.com/InfLoop111/DCA. |
| Open Datasets | Yes | We evaluate on two widely used datasets: PASCAL VOC (Everingham 2007) and MS COCO (Lin et al. 2014). [...] The architecture of DCA detector is an adaptation of D-DETR, which leverages ResNet50 (He et al. 2016) pre-trained on ImageNet (Deng et al. 2009) as the backbone. |
| Dataset Splits | Yes | For VOC, we consider three different settings, where a group of classes (10, 5 and last class) are introduced incrementally to the detector. For COCO, we conduct experiments under 70+10, 60+20, 50+30 and 40+40 settings. To increase the task difficulty, multi-step settings are evaluated where base model is trained with 40 classes and 20 or 10 classes are added in each of the following phases. |
| Hardware Specification | No | The paper describes the model architecture and experimental settings, but it does not provide specific details about the hardware (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'D-DETR' as the baseline architecture and 'CLIP text encoder' as the language model. However, it does not specify version numbers for these or any other software dependencies such as programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries. |
| Experiment Setup | Yes | Following CL-DETR, we use the standard configurations without iterative bounding box refinement and the two-stage mechanism. For the shared decoder, the number of layers is set to L = 6 and the number of location queries N = 100. During inference, top-50 high-scoring detections per image are used for evaluation. [...] Here, we directly employ a weighted approach (weight β = 0.5) to fuse these probabilities to get the overall classification probabilities P = {p_1, ..., p_N} ∈ R^{N×K} for training and inference: [...] The balance weight β controls the relative importance of the standard linear head and our introduced semantic head in the duplex classifier. β is set to {0.1, 0.2, . . . , 1.0}. |
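The quoted setup describes fusing the probabilities of a standard linear head and a semantic head with a single weight β = 0.5. A minimal sketch of that convex fusion is below; the paper specifies only the weighted combination, so the function name `fuse_duplex_probs`, the use of softmax normalization on each head's logits, and the toy shapes (N = 100 queries, K classes) are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_duplex_probs(linear_logits, semantic_logits, beta=0.5):
    """Weighted fusion of the two classification heads:
    P = beta * P_linear + (1 - beta) * P_semantic."""
    p_linear = softmax(linear_logits)
    p_semantic = softmax(semantic_logits)
    return beta * p_linear + (1.0 - beta) * p_semantic

# Toy example: N = 100 location queries, K = 20 classes.
N, K = 100, 20
rng = np.random.default_rng(0)
P = fuse_duplex_probs(rng.normal(size=(N, K)),
                      rng.normal(size=(N, K)),
                      beta=0.5)
print(P.shape)                          # (100, 20), i.e. P in R^{N x K}
print(np.allclose(P.sum(axis=1), 1.0))  # True: rows remain valid distributions
```

Since both heads emit normalized distributions, any β in [0, 1] keeps each row of P a valid probability distribution, which matches the paper's sweep of β over {0.1, ..., 1.0}.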