STAMP: Scalable Task- And Model-agnostic Collaborative Perception

Authors: Xiangbo Gao, Runsheng Xu, Jiachen Li, Ziran Wang, Zhiwen Fan, Zhengzhong Tu

ICLR 2025

Reproducibility assessment (Variable — Result. LLM response):
Research Type — Experimental. Extensive experiments on both simulated (OPV2V) and real-world (V2V4Real) datasets demonstrate that STAMP achieves comparable or superior accuracy to state-of-the-art models with significantly reduced computational costs.
Researcher Affiliation — Academia. Xiangbo Gao (Texas A&M University), Runsheng Xu (UCLA), Jiachen Li (UC Riverside), Ziran Wang (Purdue University), Zhiwen Fan (UT Austin), Zhengzhong Tu (Texas A&M University).
Pseudocode — No. The paper describes its methodology using text, equations (Equations 1-12), and architectural diagrams (Figure 1, Figure A1), but it does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code — Yes. "Our project page is at https://xiangbogaobarry.github.io/STAMP and the code is available at https://github.com/taco-group/STAMP."
Open Datasets — Yes. "Using the simulated OPV2V dataset (Xu et al., 2022b) and the real-world V2V4Real dataset (Xu et al., 2023d)."
Dataset Splits — No. The paper uses the OPV2V and V2V4Real datasets and states, "We use different setups for 3D object detection (Section 4.2) and task-agnostic settings (Section 4.3), detailed within each section," but it does not provide explicit training, validation, or test split percentages or sample counts for these datasets.
Hardware Specification — Yes. "Training GPU hours" refers to the time required to complete model training on the OPV2V dataset using an RTX A6000 GPU. "We utilize a single NVIDIA RTX A6000 GPU for both model training and inference."
Software Dependencies — No. The paper mentions the Adam optimizer ("local and protocol models are trained for 30 epochs using Adam optimizer (Kingma & Ba, 2014)") but does not specify versions for any software libraries or frameworks (e.g., PyTorch, Python).
Experiment Setup — Yes. Unless using end-to-end training, local and protocol models are trained for 30 epochs with the Adam optimizer (Kingma & Ba, 2014). For end-to-end training, 30N epochs are used, where N is the number of heterogeneous models, so that all models receive the same amount of supervision. Local adapters ϕ and reverters ψ are trained for 5 epochs. The loss scaling factors λ_ϕ^f, λ_ψ^f, λ_ϕ^d, and λ_ψ^d are all set to 0.5 empirically. The learning rate is initialized at 0.001 and reduced by a factor of 0.1 at 50% and 83% of the total epochs; for adapters and reverters, it starts at 0.01 and is reduced by a factor of 0.1 after the first epoch. For additional details, refer to Section 4.2, Section 4.3, and Appendix A.1.
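The step schedule quoted above (base learning rate 0.001, decayed by a factor of 0.1 at 50% and 83% of the total epochs) can be sketched as a small helper. This is a minimal illustration, not code from the STAMP repository; the function name, default arguments, and milestone rounding are all assumptions:

```python
def lr_at_epoch(epoch, total_epochs=30, base_lr=1e-3, gamma=0.1,
                milestone_fracs=(0.50, 0.83)):
    """Step learning-rate schedule: start at `base_lr` and multiply by
    `gamma` once at each milestone (here 50% and 83% of total epochs).
    Illustrative sketch only; rounding of fractional milestones is an
    assumption, not taken from the paper."""
    milestones = [round(f * total_epochs) for f in milestone_fracs]
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops

# For a 30-epoch run the milestones land at epochs 15 and 25,
# giving LRs of 1e-3, then 1e-4, then 1e-5.
print(lr_at_epoch(0), lr_at_epoch(15), lr_at_epoch(25))
```

In a PyTorch training loop this would typically correspond to `torch.optim.lr_scheduler.MultiStepLR` with `milestones=[15, 25]` and `gamma=0.1` for a 30-epoch run.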