Point Cluster: A Compact Message Unit for Communication-Efficient Collaborative Perception
Authors: Zihan Ding, Jiahui Fu, Si Liu, Hongyu Li, Siheng Chen, Hongsheng Li, Shifeng Zhang, Xu Zhou
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on several widely recognized collaborative perception benchmarks showcase the superior performance of our method compared to the previous state-of-the-art approaches. (Section 4 EXPERIMENTS: 4.1 DATASETS AND EVALUATION METRICS; 4.2 IMPLEMENTATION DETAILS; 4.3 COMPARISON WITH STATE-OF-THE-ART METHODS; 4.4 ABLATION STUDIES) |
| Researcher Affiliation | Collaboration | Zihan Ding (1), Jiahui Fu (1), Si Liu (1), Hongyu Li (1), Siheng Chen (2), Hongsheng Li (3,4), Shifeng Zhang (5), Xu Zhou (5). (1) Institute of Artificial Intelligence, Beihang University; (2) School of Artificial Intelligence, Shanghai Jiao Tong University; (3) MMLab, CUHK; (4) Centre for Perceptual and Interactive Intelligence; (5) Sangfor Technologies |
| Pseudocode | Yes | Algorithm 1: Semantic and Distribution guided Farthest Point Sampling. N_point is the number of input points and N_sample = N_point · ζ is the number of sampled points, controlled by a predefined sampling rate ζ. Input: coordinates P = {p_1, ..., p_{N_point}} ∈ R^{N_point × 3}; semantic scores S_f = {s_f^1, ..., s_f^{N_point}} ∈ R^{N_point}; distribution scores S_d = {s_d^1, ..., s_d^{N_point}} ∈ R^{N_point}. Output: sampled key point set P̃ = {p̃_1, ..., p̃_{N_sample}} |
| Open Source Code | No | The paper does not contain an explicit statement about releasing their code or a link to a code repository. |
| Open Datasets | Yes | We conducted experiments on three widely used benchmarks for collaborative perception, i.e., V2XSet Xu et al. (2022a), OPV2V Xu et al. (2022b), and DAIR-V2X-C Yu et al. (2022). DAIR-V2X-C Yu et al. (2022) is the first to provide a large-scale collection of real-world scenarios for vehicle-infrastructure collaborative autonomous driving. V2XSet Xu et al. (2022a) is a large-scale V2X perception dataset founded on CARLA Dosovitskiy et al. (2017) and Open CDA Xu et al. (2021). OPV2V Xu et al. (2022b) is a vehicle-to-vehicle collaborative perception dataset, cosimulated by Open CDA Xu et al. (2021) and Carla Dosovitskiy et al. (2017). |
| Dataset Splits | Yes | V2XSet has 11,447 frames (6,694/1,920/2,833 for train/validation/test, respectively) captured in 55 representative simulation scenes that cover the most common driving scenarios in real life. |
| Hardware Specification | Yes | Adam (Kingma & Ba, 2014) is employed as the optimizer for training our model end-to-end on NVIDIA Tesla V100 GPUs, with a total of 35 epochs. |
| Software Dependencies | No | Our method is implemented with PyTorch. (Only PyTorch is mentioned; no version number is given.) |
| Experiment Setup | Yes | We set the perception range along the x, y, and z-axis to [−140.8m, 140.8m] × [−40m, 40m] × [−3m, 1m] for V2XSet and [−100.8m, 100.8m] × [−40m, 40m] × [−3m, 1m] for DAIR-V2X-C, respectively. The thresholds ϵ_agg, ϵ_pose, ϵ_latency, and ϵ_latency for cluster matching are set as 0.6, 1.5, 0.5, and 2.0, respectively. The number of SIR layers is L1 = 6 in PCE and L2 = 3 during message decoding. The channel number of cluster features is D = 128. Adam (Kingma & Ba, 2014) is employed as the optimizer for training our model end-to-end on NVIDIA Tesla V100 GPUs, with a total of 35 epochs. The initial learning rate is set as 0.001 and we reduce it by a factor of 10 after 20 and 30 epochs, respectively. |
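The sampling procedure quoted in the Pseudocode row (Algorithm 1) can be sketched as a score-weighted farthest point sampling. This is a minimal illustration, not the paper's implementation: the exact way the semantic and distribution scores enter the selection rule is not reproduced in the excerpt, so the multiplicative weighting and the function name `sd_fps` below are assumptions.

```python
import numpy as np

def sd_fps(points, sem_scores, dist_scores, ratio=0.25):
    """Semantic- and Distribution-guided FPS (illustrative sketch).

    Standard incremental farthest point sampling, with each point's
    distance-to-sampled-set weighted by its semantic and distribution
    scores so that high-scoring points are preferred. The multiplicative
    weighting is an assumption, not the paper's exact rule.
    """
    n = points.shape[0]
    n_sample = max(1, int(n * ratio))  # N_sample = N_point * zeta
    selected = np.zeros(n_sample, dtype=np.int64)
    # distance from every point to the current sampled set
    min_dist = np.full(n, np.inf)
    # seed with the highest-scoring point rather than a random one
    selected[0] = int(np.argmax(sem_scores * dist_scores))
    for i in range(1, n_sample):
        # only distances to the most recently added point can shrink min_dist
        d = np.linalg.norm(points - points[selected[i - 1]], axis=1)
        min_dist = np.minimum(min_dist, d)
        # score-weighted farthest point selection
        selected[i] = int(np.argmax(min_dist * sem_scores * dist_scores))
    return selected
```

With all scores set to 1 this reduces to plain farthest point sampling, which makes the score weighting easy to ablate.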
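The learning-rate schedule stated in the setup row (initial rate 0.001, divided by 10 after epochs 20 and 30, over 35 epochs) can be written as a small step-decay helper; `lr_at_epoch` is an illustrative name, not from the paper.

```python
def lr_at_epoch(epoch, base_lr=0.001, milestones=(20, 30), gamma=0.1):
    """Step-decay schedule (sketch): start at base_lr and multiply by
    gamma at each milestone epoch that has been reached."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

In PyTorch this corresponds to `torch.optim.lr_scheduler.MultiStepLR` with `milestones=[20, 30]` and `gamma=0.1` on an Adam optimizer.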