Video Object Detection with Locally-Weighted Deformable Neighbors
Authors: Zhengkai Jiang, Peng Gao, Chaoxu Guo, Qian Zhang, Shiming Xiang, Chunhong Pan
AAAI 2019, pp. 8529-8536
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on VID dataset demonstrate that our method achieves superior performance in a speed and accuracy trade-off, i.e., 76.3% on the challenging VID dataset while maintaining 20fps in speed on Titan X GPU. |
| Researcher Affiliation | Collaboration | 1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; 2. School of Artificial Intelligence, University of Chinese Academy of Sciences; 3. The Chinese University of Hong Kong; 4. Horizon Robotics |
| Pseudocode | Yes | Algorithm 1: Inference algorithm of Memory-Guided Propagation Networks |
| Open Source Code | No | No explicit statement or link providing access to open-source code for the described methodology was found. |
| Open Datasets | Yes | We evaluate the proposed method on the ImageNet VID dataset which has been treated as a benchmark for video object detection (Russakovsky et al. 2015). [...] Thus we follow previous approaches and train our model on an intersection of ImageNet VID and DET dataset. |
| Dataset Splits | Yes | VID dataset is split into 3862 training videos and 555 validation videos. |
| Hardware Specification | Yes | Extensive experiments on VID dataset demonstrate that our method achieves superior performance in a speed and accuracy trade-off, i.e., 76.3% on the challenging VID dataset while maintaining 20fps in speed on Titan X GPU. [...] For training, 4 epochs with SGD optimization method are performed on 8 GPUs with each GPU holding one mini-batch. |
| Software Dependencies | No | No specific version numbers for software dependencies (e.g., libraries, frameworks) were mentioned. |
| Experiment Setup | Yes | For training, 4 epochs with SGD optimization method are performed on 8 GPUs with each GPU holding one mini-batch. Learning rate begins with 2.5e-4 and divides by 10 after 2.5 epochs. We also employ standard left-right flipping augmentation. |
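The training schedule quoted above (SGD, base learning rate 2.5e-4, divided by 10 after 2.5 of 4 total epochs) can be sketched as a piecewise-constant function. This is an illustrative reconstruction from the reported numbers, not the authors' code; the function name `learning_rate` is our own.

```python
def learning_rate(epoch: float, base_lr: float = 2.5e-4) -> float:
    """Piecewise-constant SGD learning rate, as described in the paper:
    base_lr for the first 2.5 epochs, then base_lr / 10 until epoch 4."""
    return base_lr if epoch < 2.5 else base_lr / 10.0

# Learning rate at the start of each of the 4 training epochs.
schedule = [(e, learning_rate(e)) for e in (0.0, 1.0, 2.0, 3.0)]
```

With 8 GPUs each holding one mini-batch, this corresponds to an effective batch size of 8 per SGD step.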