DCSF-KD: Dynamic Channel-wise Spatial Feature Knowledge Distillation for Object Detection

Authors: Tao Dai, Yang Lin, Hang Guo, Jinbao Wang, Zexuan Zhu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that our DCSF-KD outperforms existing detection methods on both homogeneous and heterogeneous teacher-student network pairs. For example, when using the Mask RCNN-Swin detector as the teacher, and based on RetinaNet and FCOS with ResNet-50 on MS COCO, our DCSF-KD can achieve 41.9% and 44.1% mAP, respectively. In this paper, we conduct comprehensive experiments on the MS COCO dataset.
Researcher Affiliation | Academia | (1) College of Computer Science and Software Engineering, Shenzhen University; (2) National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University; (3) Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China; (4) Shenzhen City Key Laboratory of Embedded System Design, Shenzhen, China; (5) Guangdong Provincial Key Laboratory of Intelligent Information Processing, Shenzhen, China
Pseudocode | No | The paper describes methods using mathematical equations and textual descriptions, but no clearly labeled 'Pseudocode' or 'Algorithm' blocks are present.
Open Source Code | Yes | Code: https://github.com/LinY-ct/DCSF-KD
Open Datasets | Yes | To verify the effectiveness and robustness of our DCSF-KD method for object detection, we conducted experiments on various detection frameworks using the MS COCO dataset (Lin et al. 2014), which includes 120,000 training images and 5,000 test images.
Dataset Splits | Yes | We conducted experiments on various detection frameworks using the MS COCO dataset (Lin et al. 2014), which includes 120,000 training images and 5,000 test images.
Hardware Specification | Yes | All experiments were conducted on 4 NVIDIA 3090Ti GPUs, processing two images per GPU, using mmdetection (Chen et al. 2019b) and mmrazor (Contributors 2021) frameworks based on PyTorch (Paszke et al. 2017).
Software Dependencies | No | All experiments were conducted on 4 NVIDIA 3090Ti GPUs, processing two images per GPU, using mmdetection (Chen et al. 2019b) and mmrazor (Contributors 2021) frameworks based on PyTorch (Paszke et al. 2017). The paper mentions software frameworks like mmdetection, mmrazor, and PyTorch, but does not provide specific version numbers for these components.
Experiment Setup | Yes | We set α = 2 for all two-stage models and α = 1 for single-stage models. For training across all detectors, we used an SGD optimizer with an initial learning rate of 0.01, momentum of 0.9, and weight decay of 0.0001 for 24 epochs. For 12-epoch experiments, a compound strategy with LinearLR and MultiStepLR optimizers was used, starting with a linear learning rate change from iterations 0-500 with a starting factor of 0.001, and reducing the learning rate by a factor of 0.1 during epochs 8-11.
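
The 12-epoch learning-rate schedule quoted in the Experiment Setup entry can be sketched in plain Python. This is an illustrative sketch, not the authors' code: the function name `learning_rate`, the 0-based epoch numbering, and the choice of decay milestones at epochs 8 and 11 (one reading of "during epochs 8-11") are assumptions. Only the base rate of 0.01, the 500-iteration linear warmup, the starting factor of 0.001, and the decay factor of 0.1 come from the quoted text.

```python
def learning_rate(iteration, epoch, base_lr=0.01, warmup_iters=500,
                  start_factor=0.001, milestones=(8, 11), gamma=0.1):
    """Approximate LR for a global iteration inside a 0-based epoch.

    Hypothetical helper: combines a linear warmup with step decay.
    The milestones (8, 11) are an assumption, not stated exactly in the source.
    """
    # Linear warmup: the multiplier ramps from start_factor to 1.0
    # over the first warmup_iters iterations, then stays at 1.0.
    if iteration < warmup_iters:
        progress = iteration / warmup_iters
        warmup = start_factor + (1.0 - start_factor) * progress
    else:
        warmup = 1.0

    # Step decay: multiply by gamma once for every milestone already reached.
    reached = sum(1 for m in milestones if epoch >= m)
    return base_lr * warmup * (gamma ** reached)


if __name__ == "__main__":
    # Print the rate at a few points of a 12-epoch run.
    for it, ep in [(0, 0), (250, 0), (500, 0), (1000, 8), (1000, 11)]:
        print(f"iter={it:5d} epoch={ep:2d} lr={learning_rate(it, ep):.6g}")
```

With 4 GPUs and two images per GPU, each iteration in this setup processes 8 images, so the 500-iteration warmup covers roughly 4,000 training images.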