Multispectral Pedestrian Detection with Sparsely Annotated Label
Authors: Chan Lee, Seungho Shin, Gyeong-Moon Park, Jung Uk Kim
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Extensive experimental results demonstrate that our SAMPD significantly enhances performance in sparsely annotated environments within the multispectral domain." The paper contains dedicated Experiments (Dataset and Evaluation Metric), Comparisons (results on the KAIST dataset), and Ablation Study sections; the ablation study explores the effect of each proposed module. |
| Researcher Affiliation | Academia | Kyung Hee University, Yongin, South Korea |
| Pseudocode | No | The paper includes diagrams (Figure 2) and mathematical equations (1-8), but no explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor does it include links to a code repository or mention of code in supplementary materials for the methodology described in this paper. |
| Open Datasets | Yes | KAIST Dataset: the KAIST dataset (Hwang et al. 2015) comprises 95,328 pairs of visible and thermal images, enriched with 103,128 bounding-box annotations identifying pedestrians. LLVIP Dataset: the LLVIP dataset (Jia et al. 2021) is a visible-thermal paired dataset for low-light vision, consisting of 15,488 visible-thermal image pairs. |
| Dataset Splits | Yes | We use a test set of 2,252 images to evaluate performance. Following (Suri et al. 2023), we simulated a sparsely annotated scenario by increasing the probability of removing bounding-box annotations with smaller widths from the training set. More details are in the supplementary document. Finally, we removed 30%, 50%, and 70% of the bounding-box annotations among the total annotations, consistent with the ratios described in (Niitani et al. 2019). Following the same protocol as the KAIST dataset, we simulated a sparsely annotated scenario for the LLVIP dataset by removing 30%, 50%, and 70% of the bounding-box annotations from the total annotations. |
| Hardware Specification | Yes | coordinating the process across two GTX 3090 GPUs and processing 6 images in each mini-batch. |
| Software Dependencies | No | All experimental procedures are performed utilizing the Pytorch framework (Paszke et al. 2017). SSD (Liu et al. 2016) structure combined with a VGG16 (Simonyan and Zisserman 2015) backbone. While PyTorch is mentioned, a specific version number is not provided. SSD and VGG16 refer to model architectures rather than specific software dependencies with versions. |
| Experiment Setup | Yes | We optimize our framework using Stochastic Gradient Descent (SGD), coordinating the process across two GTX 3090 GPUs and processing 6 images in each mini-batch. Parameters are set with m = 1, τ = 0.1 and λ1 = λ2 = 1. We train our detector for 80 epochs with 0.0001 learning rate. |
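The annotation-removal protocol quoted in the Dataset Splits row (higher removal probability for smaller-width boxes, at 30%/50%/70% overall ratios) can be sketched as below. The exact weighting function is only given in the paper's supplementary document, so the inverse-width weighting and the `sparsify_annotations` helper here are assumptions for illustration; the weighted sampling uses the standard Efraimidis–Spirakis key trick.

```python
import random

def sparsify_annotations(boxes, drop_ratio, seed=0):
    """Remove a fraction of bounding-box annotations, biasing removal
    toward smaller-width boxes (sketch only; the paper's exact
    probability function is in its supplementary document).

    boxes: list of (x, y, w, h) tuples for the whole training set.
    drop_ratio: fraction of annotations to remove (0.3, 0.5, or 0.7).
    """
    rng = random.Random(seed)
    n_drop = int(len(boxes) * drop_ratio)
    # Assumed weighting: removal probability roughly inversely
    # proportional to box width, so narrow (small) boxes go first.
    weights = [1.0 / max(w, 1e-6) for (_, _, w, _) in boxes]
    # Efraimidis-Spirakis keys: u ** (1 / weight); the largest keys
    # form a weighted sample without replacement -- those are dropped.
    keys = [rng.random() ** (1.0 / wt) for wt in weights]
    order = sorted(range(len(boxes)), key=lambda i: keys[i], reverse=True)
    dropped = set(order[:n_drop])
    return [b for i, b in enumerate(boxes) if i not in dropped]
```

For example, `sparsify_annotations(train_boxes, 0.5)` would return a training set with half the annotations removed, preferentially the narrow ones.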