FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection
Authors: Ke Li, Di Wang, Zhangyuan Hu, Shaofeng Li, Weiping Ni, Lin Zhao, Quan Wang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that FD2-Net outperforms state-of-the-art (SOTA) models across various IVOD benchmarks, i.e. LLVIP (96.2% mAP), FLIR (82.9% mAP), and M3FD (83.5% mAP). |
| Researcher Affiliation | Academia | Ke Li¹, Di Wang¹*, Zhangyuan Hu¹, Shaofeng Li¹*, Weiping Ni², Lin Zhao³, Quan Wang¹. ¹Xidian University, ²Northwest Institute of Nuclear Technology, ³Nanjing University of Science and Technology |
| Pseudocode | No | The paper describes methods and mathematical formulations but does not include any explicit pseudocode blocks or algorithms labeled as such. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | Datasets. The proposed model is evaluated against SOTA methods using three IVOD benchmark datasets: (1) LLVIP dataset (Jia et al. 2021) is a prominent large-scale pedestrian dataset... (2) FLIR dataset offers a highly challenging multispectral object detection benchmark... (F.A. Group 2018). (3) M3FD dataset (Liu et al. 2022a) comprises 4,200 pairs of RGB and thermal images. |
| Dataset Splits | Yes | FLIR... It comprises 5,142 precisely aligned infrared-visible image pairs, with 4,129 pairs allocated for training and 1,013 pairs reserved for testing. The dataset encompasses three primary object categories: People, Cars, and Bicycles. (3) M3FD dataset (Liu et al. 2022a)... Specifically, 80% of the images are allocated to the training set, with the remaining images assigned to the validation set. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | FD2-Net is built upon the SOTA detector YOLOv5 (Jocher 2020). For evaluation, we report F1-Score, Precision, Recall, and Average Precision, consistent with prior research. Xavier initialization (Glorot and Bengio 2010) is used to initialize parameters, and the model is trained for 150 epochs using SGD (Robbins and Monro 1951) with an initial learning rate of 0.01, weight decay of 10⁻⁴, and momentum of 0.9. Although YOLOv5 and SGD are mentioned, no version numbers are given for software libraries or environments (e.g., Python, PyTorch/TensorFlow). |
| Experiment Setup | Yes | Xavier initialization (Glorot and Bengio 2010) is used to initialize parameters, and the model is trained for 150 epochs using SGD (Robbins and Monro 1951) with an initial learning rate of 0.01, weight decay of 10⁻⁴, and momentum of 0.9. |
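
The hyperparameters quoted in the Experiment Setup row can be made concrete with a small standard-library sketch. It is not the authors' code, which the table notes is not released. The uniform variant of Xavier initialization, the coupled form of weight decay (decay added to the gradient before the momentum update), the scalar parameter, and the names `TrainConfig`, `xavier_uniform_bound`, `xavier_uniform`, and `sgd_step` are all assumptions made for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values quoted in the Experiment Setup row of the table.
    epochs: int = 150
    lr: float = 0.01
    weight_decay: float = 1e-4
    momentum: float = 0.9


def xavier_uniform_bound(fan_in: int, fan_out: int, gain: float = 1.0) -> float:
    """Bound a of U(-a, a) for Xavier (Glorot) uniform initialization."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


def xavier_uniform(fan_in: int, fan_out: int, rng: random.Random) -> list:
    """Sample a fan_out x fan_in weight matrix (assumed uniform variant)."""
    a = xavier_uniform_bound(fan_in, fan_out)
    return [[rng.uniform(-a, a) for _ in range(fan_in)] for _ in range(fan_out)]


def sgd_step(w: float, grad: float, velocity: float, cfg: TrainConfig):
    """One SGD update with momentum and L2 weight decay.

    Assumed convention: the decay term is added to the gradient first,
    then the momentum buffer is updated, then the parameter is moved.
    """
    g = grad + cfg.weight_decay * w
    velocity = cfg.momentum * velocity + g
    return w - cfg.lr * velocity, velocity


cfg = TrainConfig()
weights = xavier_uniform(fan_in=4, fan_out=2, rng=random.Random(0))

# Single update on a hypothetical scalar parameter with gradient 0.5.
w, v = sgd_step(w=1.0, grad=0.5, velocity=0.0, cfg=cfg)
```

With these values the first step moves the parameter from 1.0 to 1 - 0.01 × 0.5001, which shows how small the weight-decay contribution is relative to the gradient term at this setting.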