GSDet: Gaussian Splatting for Oriented Object Detection

Authors: Zeyu Ding, Jiaqi Zhao, Yong Zhou, Wen-liang Du, Hancheng Zhu, Rui Yao

IJCAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments on three datasets indicate that GSDet, evaluated with adaptive control, achieves AP50 gains of 0.7% on DIOR-R, 0.3% on DOTA-v1.0, and 0.55% on DOTA-v1.5, outperforming mainstream detectors.
Researcher Affiliation Academia School of Computer Science and Technology, China University of Mining and Technology; Mine Digitization Engineering Research Center of the Ministry of Education.
Pseudocode Yes Algorithm 1 GSDet Training
Open Source Code Yes Code link https://github.com/wokaikaixinxin/GSDet.
Open Datasets Yes We conduct extensive experiments on three datasets: DOTA-v1.0 [Xia et al., 2018], DOTA-v1.5 [Xia et al., 2018], and DIOR-R [Cheng et al., 2022a].
Dataset Splits Yes DOTA-v1.0 [Xia et al., 2018] comprises 1,869 images in the trainval set and 937 images in the test set, annotated with 188,282 instances across 15 categories. DOTA-v1.5 [Xia et al., 2018] extends DOTA-v1.0 by adding a new category, Container Crane, while keeping the same images; the total number of instances increases to 403,318. DIOR-R [Cheng et al., 2022a] consists of 11,725 images in the trainval set, 11,738 images in the test set, and 192,512 instances belonging to 20 categories.
Hardware Specification Yes All models are trained with a batch size of 4 on two NVIDIA RTX 2080 Ti GPUs (2 images per GPU).
Software Dependencies No The code is built on MMRotate with PyTorch, but no version numbers for MMRotate or PyTorch are provided, which prevents full reproducibility.
Experiment Setup Yes The AdamW optimizer [Loshchilov and Hutter, 2018] is used with a learning rate of 2.5 × 10^-5 and a weight decay of 10^-4. All models are trained with a batch size of 4 on two NVIDIA RTX 2080 Ti GPUs (2 images per GPU). The training schedule is 24 epochs, with the learning rate divided at epochs 16 and 22. Data augmentation consists only of random flips.
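The 24-epoch step-decay schedule in the setup above can be sketched in plain Python. Note the decay factor of 0.1 is an assumption; the excerpt says only that the learning rate is divided at epochs 16 and 22, not by how much.

```python
def lr_at_epoch(epoch, base_lr=2.5e-5, milestones=(16, 22), gamma=0.1):
    """Step-decay learning rate for the reported 24-epoch schedule.

    `gamma` (the per-milestone decay factor) is an assumption; the paper
    excerpt states only that the rate is divided at epochs 16 and 22.
    """
    lr = base_lr
    for milestone in milestones:
        if epoch >= milestone:
            lr *= gamma
    return lr

# Learning rate at each of the 24 training epochs.
schedule = [lr_at_epoch(e) for e in range(24)]
```

In a training loop this would typically be realized with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[16, 22])` on top of `torch.optim.AdamW(params, lr=2.5e-5, weight_decay=1e-4)`.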