Towards Region-Adaptive Feature Disentanglement and Enhancement for Small Object Detection
Authors: Yanchao Bi, Yang Ning, Xiushan Nie, Xiankai Lu, Yongshun Gong, Leida Li
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on several public datasets demonstrate that the RAFDE strategy is highly effective and outperforms state-of-the-art methods. The code is available at https://github.com/b-yanchao/RAFDE.git. From Section 4 (Experiments): We have integrated our RAFDE module with the latest YOLO model and conducted experiments on two widely used drone image benchmarks: the VisDrone dataset [Du et al., 2019] and the Drone-vs-Bird dataset [Coluccia et al., 2021]. |
| Researcher Affiliation | Academia | 1. School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China; 2. School of Software, Shandong University, Jinan, China; 3. School of Artificial Intelligence, Xidian University, Xi'an, China |
| Pseudocode | No | The paper describes the methods through mathematical definitions and textual explanations (e.g., Sections 3.1, 3.2, 3.3) and does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/b-yanchao/RAFDE.git. |
| Open Datasets | Yes | We have integrated our RAFDE module with the latest YOLO model and conducted experiments on two widely used drone image benchmarks: the VisDrone dataset [Du et al., 2019] and the Drone-vs-Bird dataset [Coluccia et al., 2021]. |
| Dataset Splits | Yes | The VisDrone dataset consists of 7,019 high-resolution images (2000×1500) containing 10 classes of small, densely packed objects. Of these, 6,471 images are used for training, 548 for validation, and 1,610 for testing. The Drone-vs-Bird dataset includes 1,387 training images and 434 test images, featuring both UAV and environmental data. |
| Hardware Specification | Yes | Training and testing were conducted on a single RTX A6000 GPU, with batch sizes of 8 and 2 for input resolutions of 640×640 and 1280×1280, respectively. |
| Software Dependencies | No | We implemented our RAFDE strategy using PyTorch [Paszke et al., 2019]. The paper mentions using PyTorch but does not specify its version number or any other software dependencies with version numbers. |
| Experiment Setup | Yes | All models were trained for 150 epochs, with YOLOv11m serving as the baseline. Our approach employs the same loss function as YOLOv11 [Khanam R, 2024], which includes both object classification loss and bounding box regression loss. For the classification loss, we combine BCELoss [Zheng et al., 2020] and Focal Loss [Li et al., 2020], while for the regression loss, we use CIoULoss [Wang et al., 2023]. The input resolutions were set to 640×640 and 1280×1280 for the VisDrone dataset, and 640×640 for the Drone-vs-Bird dataset. All models were trained using the Adam optimizer with an initial learning rate of 0.01 and a decay rate of 1e-5. Training and testing were conducted on a single RTX A6000 GPU, with batch sizes of 8 and 2 for input resolutions of 640×640 and 1280×1280, respectively. |
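For readers attempting reproduction, the reported setup can be collected into a single configuration sketch. This is a minimal illustration only: the dataclass and field names below are hypothetical and are not taken from the authors' repository; the values are those quoted in the Experiment Setup row above.

```python
# Hypothetical configuration mirroring the reported training setup.
# Field names are illustrative; consult https://github.com/b-yanchao/RAFDE.git
# for the authors' actual code and configuration.
from dataclasses import dataclass, field


@dataclass
class TrainConfig:
    baseline: str = "YOLOv11m"        # baseline detector reported in the paper
    epochs: int = 150                 # training length
    optimizer: str = "Adam"           # reported optimizer
    lr: float = 0.01                  # initial learning rate
    decay: float = 1e-5               # reported decay rate
    # Reported (input resolution -> batch size) pairs on a single RTX A6000.
    resolution_batch: dict = field(
        default_factory=lambda: {640: 8, 1280: 2}
    )


cfg = TrainConfig()
print(cfg.baseline, cfg.epochs, cfg.resolution_batch[1280])
```

Note that the paper does not state the PyTorch version, so a faithful reproduction would still require pinning dependencies from the released repository.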