Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Adversarial Attacks on Event-Based Pedestrian Detectors: A Physical Approach
Authors: Guixu Lin, Muyao Niu, Qingtian Zhu, Zhengwei Yin, Zhuoxiao Li, Shengfeng He, Yinqiang Zheng
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our results demonstrate that the textures identified in the digital domain possess strong adversarial properties. Furthermore, we translated these digitally optimized textures into physical clothing and tested them in real-world scenarios, successfully demonstrating that the designed textures significantly degrade the performance of event-based pedestrian detection models. This work highlights the vulnerability of such models to physical adversarial attacks. |
| Researcher Affiliation | Academia | 1The University of Tokyo, Japan 2Singapore Management University, Singapore |
| Pseudocode | No | The paper describes the method and its components (e.g., texture map generation network, 3D human model rendering, adversarial event attack) using figures and explanatory text, but it does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statement about making the source code available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | In recent years, research on event-based pedestrian detection using deep learning methods has garnered significant attention, particularly with the introduction of large-scale datasets like 1Mpx (Perot et al. 2020) and Gen1 (De Tournemire et al. 2020). Notably, RVT (Gehrig and Scaramuzza 2023) has highlighted the advantages of event cameras in pedestrian detection within traffic scenarios, achieving impressive results by processing event data alone. Further advancements in detection accuracy have been realized through innovative network architectures in models such as HMNet (Hamaguchi et al. 2023), GET (Peng et al. 2023), and SAST (Peng et al. 2024)... For instance, Marchisio et al. (Marchisio et al. 2021) explored the creation of adversarial examples by projecting event-based data onto 2D images, generating 2D adversarial images. However, this approach indirectly targets event data and does not directly manipulate the raw input of event cameras, thereby limiting its effectiveness and applicability in real-world scenarios. Building on this, Lee et al. (Lee and Myung 2022) developed an adversarial attack algorithm specifically designed for event-based models. Their approach involved shifting the timing of events and generating additional adversarial events to deceive the model. This method was tested successfully on the N-Caltech101 dataset (Orchard et al. 2015)... We utilize the CMU motion capture dataset provided by AMASS (Mahmood et al. 2019), which includes various subjects in different poses. Each subject comprises several trials, and we select a subset of these trials for our experiments. |
| Dataset Splits | Yes | Our training dataset consists of 47 trials, encompassing 114,173 poses, while the test dataset includes 13 trials, with a total of 10,323 poses. |
| Hardware Specification | Yes | Our framework is trained and validated with a batch size of 1 on an NVIDIA 3090 GPU. For the physical attack experiments, we used an INIVATION DAVIS346 MONO event camera to capture real-world event data, with a spatial resolution of 346 × 260 pixels. |
| Software Dependencies | No | Specifically, we employ the implementation of PyTorch3D (Ravi et al. 2020) for differentiable rendering for its native compatibility with PyTorch. ... we used YOLOv7 (Wang, Bochkovskiy, and Liao 2023) to annotate bounding boxes and classify these grayscale images, providing ground truth labels. The paper mentions software tools like PyTorch3D, PyTorch, and YOLOv7, but it does not specify their version numbers. |
| Experiment Setup | Yes | The time bins B are set to 10. We train the network for 11,500 iterations using the Adam optimizer with an initial learning rate of 10^-4. Our framework is trained and validated with a batch size of 1 on an NVIDIA 3090 GPU. ...we selected four thresholds between 0.001 and 0.25, common values in detection tasks, to determine the optimal pattern and evaluate attack performance at different confidence levels. As shown in Table 1, these thresholds are 0.001, 0.01, 0.1, and 0.25, and the sequence length of the input event is 3. ... The confidence threshold was set at 0.25 during training and testing. The input event sequence length for the detector was set to 10. |
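The experiment-setup row mentions discretizing events into B = 10 time bins, a common preprocessing step for event-based detectors such as RVT. The paper does not provide code, so the sketch below is a generic, hypothetical illustration of this kind of binning (function name and details are assumptions, not the authors' implementation): events, each carrying a timestamp, pixel coordinates, and a polarity, are accumulated into a (B, H, W) grid by normalizing timestamps to the bin range.

```python
import numpy as np

def events_to_time_bins(t, x, y, p, H, W, B=10):
    """Accumulate an event stream into B temporal bins (a voxel-grid tensor).

    t, x, y, p: per-event timestamps, pixel coordinates, and polarities (+1/-1).
    Returns a float32 array of shape (B, H, W); each event adds its polarity
    to the cell for its time bin and pixel location.
    """
    t = np.asarray(t, dtype=np.float64)
    # Normalize timestamps to [0, B); guard against a zero-length time span.
    span = max(t.max() - t.min(), 1e-9)
    b = np.clip(((t - t.min()) / span * B).astype(int), 0, B - 1)
    grid = np.zeros((B, H, W), dtype=np.float32)
    # np.add.at handles repeated (bin, y, x) indices correctly.
    np.add.at(grid, (b, np.asarray(y), np.asarray(x)),
              np.asarray(p, dtype=np.float32))
    return grid
```

With the DAVIS346 camera from the hardware row, H and W would be 260 and 346; detectors then consume a sequence of such grids (sequence length 10 in the paper's setup).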