Point Cluster: A Compact Message Unit for Communication-Efficient Collaborative Perception

Authors: Zihan Ding, Jiahui Fu, Si Liu, Hongyu Li, Siheng Chen, Hongsheng Li, Shifeng Zhang, Xu Zhou

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on several widely recognized collaborative perception benchmarks showcase the superior performance of our method compared to previous state-of-the-art approaches." (Sections: 4 Experiments; 4.1 Datasets and Evaluation Metrics; 4.2 Implementation Details; 4.3 Comparison with State-of-the-Art Methods; 4.4 Ablation Studies)
Researcher Affiliation | Collaboration | Zihan Ding (1), Jiahui Fu (1), Si Liu (1), Hongyu Li (1), Siheng Chen (2), Hongsheng Li (3,4), Shifeng Zhang (5), Xu Zhou (5). Affiliations: (1) Institute of Artificial Intelligence, Beihang University; (2) School of Artificial Intelligence, Shanghai Jiao Tong University; (3) MMLab, CUHK; (4) Centre for Perceptual and Interactive Intelligence; (5) Sangfor Technologies.
Pseudocode | Yes | "Algorithm 1: Semantic and Distribution guided Farthest Point Sampling. N_point is the number of input points and N_sample = N_point · ζ is the number of sampled points, controlled by a predefined sampling rate ζ. Input: coordinates P = {p_1, ..., p_{N_point}} ∈ R^{N_point×3}; semantic scores S_f = {s_f^1, ..., s_f^{N_point}} ∈ R^{N_point}; distribution scores S_d = {s_d^1, ..., s_d^{N_point}} ∈ R^{N_point}. Output: sampled key point set P̃ = {p̃_1, ..., p̃_{N_sample}}."
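The quoted algorithm header gives only the inputs and outputs, not the selection rule. A minimal sketch of one plausible reading, assuming the semantic and distribution scores jointly weight the standard farthest-point-sampling distance term (the combined weighting `sem + dist` and the function name `guided_fps` are illustrative assumptions, not the paper's definition):

```python
import numpy as np

def guided_fps(points, sem_scores, dist_scores, ratio=0.25):
    """Score-weighted farthest point sampling (illustrative sketch).

    points:      (N, 3) coordinates
    sem_scores:  (N,) semantic scores S_f
    dist_scores: (N,) distribution scores S_d
    ratio:       sampling rate zeta, so N_sample = N * zeta
    """
    n = points.shape[0]
    n_sample = max(1, int(n * ratio))
    weights = sem_scores + dist_scores          # combined guidance score (assumption)
    selected = np.zeros(n_sample, dtype=int)
    selected[0] = int(np.argmax(weights))       # seed with the highest-scored point
    min_d = np.full(n, np.inf)                  # running min-distance to the chosen set
    for i in range(1, n_sample):
        d = np.linalg.norm(points - points[selected[i - 1]], axis=1)
        min_d = np.minimum(min_d, d)
        # pick the point that is both far from the chosen set and highly scored
        selected[i] = int(np.argmax(min_d * weights))
    return points[selected], selected
```

Plain FPS maximizes `min_d` alone; multiplying by the scores biases the sample toward semantically salient, well-distributed points while keeping spatial coverage.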
Open Source Code No The paper does not contain an explicit statement about releasing their code or a link to a code repository.
Open Datasets | Yes | "We conducted experiments on three widely used benchmarks for collaborative perception, i.e., V2XSet (Xu et al., 2022a), OPV2V (Xu et al., 2022b), and DAIR-V2X-C (Yu et al., 2022). DAIR-V2X-C (Yu et al., 2022) is the first to provide a large-scale collection of real-world scenarios for vehicle-infrastructure collaborative autonomous driving. V2XSet (Xu et al., 2022a) is a large-scale V2X perception dataset built on CARLA (Dosovitskiy et al., 2017) and OpenCDA (Xu et al., 2021). OPV2V (Xu et al., 2022b) is a vehicle-to-vehicle collaborative perception dataset, co-simulated by OpenCDA (Xu et al., 2021) and CARLA (Dosovitskiy et al., 2017)."
Dataset Splits | Yes | "V2XSet has 11,447 frames (6,694/1,920/2,833 for train/validation/test, respectively) captured in 55 representative simulation scenes that cover the most common driving scenarios in real life."
Hardware Specification | Yes | "Adam (Kingma & Ba, 2014) is employed as the optimizer for training our model end-to-end on NVIDIA Tesla V100 GPUs, with a total of 35 epochs."
Software Dependencies | No | "Our method is implemented with PyTorch." (Only PyTorch is mentioned; no version number is given.)
Experiment Setup | Yes | "We set the perception range along the x, y, and z-axis to [−140.8m, 140.8m] × [−40m, 40m] × [−3m, 1m] for V2XSet and [−100.8m, 100.8m] × [−40m, 40m] × [−3m, 1m] for DAIR-V2X-C, respectively. The thresholds ϵ_agg, ϵ_pose, ϵ_latency, and ϵ_latency for cluster matching are set as 0.6, 1.5, 0.5, and 2.0, respectively. The number of SIR layers is L1 = 6 in PCE and L2 = 3 during message decoding. The channel number of cluster features is D = 128. Adam (Kingma & Ba, 2014) is employed as the optimizer for training the model end-to-end on NVIDIA Tesla V100 GPUs, with a total of 35 epochs. The initial learning rate is set to 0.001 and reduced by a factor of 10 after 20 and 30 epochs, respectively."
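The learning-rate schedule quoted above (base LR 0.001, divided by 10 after epochs 20 and 30, over 35 total epochs) is a standard multi-step decay. A minimal sketch, assuming a ×0.1 multiplicative step at each milestone (the function name and interface are illustrative, not from the paper):

```python
def learning_rate(epoch, base_lr=1e-3, milestones=(20, 30), gamma=0.1):
    """Step-decay schedule: multiply the LR by gamma at each milestone epoch.

    Reproduces the quoted setup: 1e-3 for epochs 0-19, 1e-4 for 20-29,
    1e-5 for 30-34 (assuming 'reduce by 10' means a factor of 0.1).
    """
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

In PyTorch this corresponds to `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[20, 30], gamma=0.1)` wrapped around an Adam optimizer with `lr=1e-3`.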