DCSF-KD: Dynamic Channel-wise Spatial Feature Knowledge Distillation for Object Detection
Authors: Tao Dai, Yang Lin, Hang Guo, Jinbao Wang, Zexuan Zhu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that our DCSF-KD outperforms existing detection methods on both homogeneous and heterogeneous teacher-student network pairs. For example, when using the Mask R-CNN-Swin detector as the teacher, and based on RetinaNet and FCOS with ResNet-50 on MS COCO, our DCSF-KD can achieve 41.9% and 44.1% mAP, respectively. In this paper, we conduct comprehensive experiments on the MS COCO dataset. |
| Researcher Affiliation | Academia | 1College of Computer Science and Software Engineering, Shenzhen University; 2National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University; 3Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China; 4Shenzhen City Key Laboratory of Embedded System Design, Shenzhen, China; 5Guangdong Provincial Key Laboratory of Intelligent Information Processing, Shenzhen, China; EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods using mathematical equations and textual descriptions, but no clearly labeled 'Pseudocode' or 'Algorithm' blocks are present. |
| Open Source Code | Yes | Code: https://github.com/LinY-ct/DCSF-KD |
| Open Datasets | Yes | To verify the effectiveness and robustness of our DCSF-KD method for object detection, we conducted experiments on various detection frameworks using the MS COCO dataset (Lin et al. 2014), which includes 120,000 training images and 5,000 test images. |
| Dataset Splits | Yes | We conducted experiments on various detection frameworks using the MS COCO dataset (Lin et al. 2014), which includes 120,000 training images and 5,000 test images. |
| Hardware Specification | Yes | All experiments were conducted on 4 NVIDIA 3090Ti GPUs, processing two images per GPU, using the mmdetection (Chen et al. 2019b) and mmrazor (Contributors 2021) frameworks based on PyTorch (Paszke et al. 2017). |
| Software Dependencies | No | All experiments were conducted on 4 NVIDIA 3090Ti GPUs, processing two images per GPU, using the mmdetection (Chen et al. 2019b) and mmrazor (Contributors 2021) frameworks based on PyTorch (Paszke et al. 2017). The paper mentions software frameworks such as mmdetection, mmrazor, and PyTorch, but does not provide specific version numbers for these components. |
| Experiment Setup | Yes | We set α = 2 for all two-stage models and α = 1 for single-stage models. For training across all detectors, we used an SGD optimizer with an initial learning rate of 0.01, momentum of 0.9, and weight decay of 0.0001 for 24 epochs. For 12-epoch experiments, a compound schedule combining LinearLR and MultiStepLR schedulers was used: the learning rate warms up linearly over iterations 0-500 with a starting factor of 0.001, and is then reduced by a factor of 0.1 at epoch milestones 8 and 11. |
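The 12-epoch learning-rate schedule in the Experiment Setup row (LinearLR warmup followed by MultiStepLR decay) can be sketched as a small function. This is a minimal illustration of the schedule as described, not the authors' actual configuration; the function and constant names are our own.

```python
# Sketch of the reported 12-epoch LR schedule: linear warmup over the first
# 500 iterations starting from a factor of 0.001, then step decay by 0.1 at
# epoch milestones 8 and 11. Values come from the paper's stated setup;
# the code structure itself is illustrative.

BASE_LR = 0.01          # SGD initial learning rate
WARMUP_ITERS = 500      # linear warmup length in iterations
START_FACTOR = 0.001    # warmup starting factor
MILESTONES = (8, 11)    # epochs at which the LR is multiplied by GAMMA
GAMMA = 0.1

def learning_rate(epoch: int, iteration: int) -> float:
    """Return the learning rate for a given epoch and global iteration."""
    # LinearLR warmup: factor rises from START_FACTOR to 1.0 over WARMUP_ITERS.
    if iteration < WARMUP_ITERS:
        factor = START_FACTOR + (1.0 - START_FACTOR) * iteration / WARMUP_ITERS
    else:
        factor = 1.0
    # MultiStepLR decay: multiply by GAMMA once per milestone already reached.
    decay = GAMMA ** sum(epoch >= m for m in MILESTONES)
    return BASE_LR * factor * decay
```

For example, the schedule starts at 0.01 × 0.001 at iteration 0, reaches the full 0.01 after 500 iterations, and drops to 0.001 and 0.0001 at epochs 8 and 11 respectively.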