Multispectral Pedestrian Detection with Sparsely Annotated Label

Authors: Chan Lee, Seungho Shin, Gyeong-Moon Park, Jung Uk Kim

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experimental results demonstrate that our SAMPD significantly enhances performance in sparsely annotated environments within the multispectral domain. The Experiments section covers the dataset and evaluation metric, comparison results on the KAIST dataset, and an ablation study conducted to explore the effect of the proposed modules.
Researcher Affiliation Academia Kyung Hee University, Yong-in, South Korea
Pseudocode No The paper includes diagrams (Figure 2) and mathematical equations (1-8), but no explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code No The paper does not provide any explicit statement about releasing source code, nor does it include links to a code repository or mention of code in supplementary materials for the methodology described in this paper.
Open Datasets Yes KAIST Dataset. The KAIST dataset (Hwang et al. 2015) comprises 95,328 pairs of visible and thermal images, enriched with 103,128 bounding box annotations to identify pedestrians. LLVIP Dataset. The LLVIP dataset (Jia et al. 2021) contains a visible-thermal paired dataset for low-light vision; it consists of 15,488 visible-thermal image pairs.
Dataset Splits Yes We use a test set of 2,252 images to evaluate performance. Following (Suri et al. 2023), we simulated a sparsely annotated scenario by increasing the probability of removing bounding-box annotations with smaller widths from the training set. More details are in the supplementary document. Finally, we removed 30%, 50%, and 70% of the bounding-box annotations among the total annotations, consistent with the ratios described in (Niitani et al. 2019). Following the same protocol as the KAIST dataset, we simulated a sparsely annotated scenario for the LLVIP dataset by removing 30%, 50%, and 70% of the bounding-box annotations from the total annotations.
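The width-biased annotation removal described above can be sketched as follows. This is a minimal, hypothetical reconstruction: the inverse-width drop weighting and the Efraimidis–Spirakis sampling scheme are assumptions, since the paper follows Suri et al. 2023 and defers the exact procedure to its supplementary document.

```python
import random

def sparsify_annotations(boxes, drop_ratio, seed=0):
    """Remove ~drop_ratio of bounding boxes, biased toward smaller widths.

    boxes: list of (x1, y1, x2, y2) tuples.
    Returns (kept, dropped). The weighting scheme is an assumption,
    not the paper's exact protocol.
    """
    rng = random.Random(seed)
    n_drop = round(len(boxes) * drop_ratio)
    # Efraimidis-Spirakis weighted sampling without replacement:
    # with drop-weight proportional to 1/width, the sort key u ** width
    # is larger for narrower boxes, so they are dropped first.
    order = sorted(
        range(len(boxes)),
        key=lambda i: rng.random() ** max(boxes[i][2] - boxes[i][0], 1e-6),
        reverse=True,
    )
    drop = set(order[:n_drop])
    kept = [b for i, b in enumerate(boxes) if i not in drop]
    dropped = [b for i, b in enumerate(boxes) if i in drop]
    return kept, dropped
```

Running this with drop_ratio values of 0.3, 0.5, and 0.7 reproduces the three sparsity levels used for both KAIST and LLVIP.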
Hardware Specification Yes coordinating the process across two GTX 3090 GPUs and processing 6 images in each mini-batch.
Software Dependencies No All experimental procedures are performed utilizing the Pytorch framework (Paszke et al. 2017). SSD (Liu et al. 2016) structure combined with a VGG16 (Simonyan and Zisserman 2015) backbone. While PyTorch is mentioned, a specific version number is not provided. SSD and VGG16 refer to model architectures rather than specific software dependencies with versions.
Experiment Setup Yes We optimize our framework using Stochastic Gradient Descent (SGD), coordinating the process across two GTX 3090 GPUs and processing 6 images in each mini-batch. Parameters are set with m = 1, τ = 0.1 and λ1 = λ2 = 1. We train our detector for 80 epochs with 0.0001 learning rate.
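The quoted setup maps onto a short PyTorch configuration fragment. This is a hypothetical sketch: the stand-in module and variable names are assumptions, and only the hyperparameter values (SGD, learning rate 0.0001, 80 epochs, batch size 6, m = 1, τ = 0.1, λ1 = λ2 = 1) come from the paper.

```python
import torch

# Stand-in module; the paper trains an SSD detector with a VGG16
# backbone, for which no code is released.
model = torch.nn.Conv2d(3, 8, kernel_size=3)

# Values quoted from the paper: SGD optimizer with a 0.0001 learning
# rate, 80 epochs, mini-batches of 6 images split across two GPUs.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
NUM_EPOCHS = 80
BATCH_SIZE = 6           # images per mini-batch (across 2 GPUs)
M, TAU = 1, 0.1          # margin m and temperature tau
LAMBDA1 = LAMBDA2 = 1.0  # loss-balancing weights lambda_1, lambda_2
```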