Towards Generalizable Multi-Camera 3D Object Detection via Perspective Rendering
Authors: Hao Lu, Yunpeng Zhang, Guoqing Wang, Qing Lian, Dalong Du, Ying-Cong Chen
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on both Domain Generalization (DG) and Unsupervised Domain Adaptation (UDA) demonstrate its effectiveness. We explore the training on a virtual engine without the real scene annotations to achieve real-world MC3D-Det tasks for the first time. To verify the effectiveness, we elaborately use both DG and UDA protocols for MC3D-Det. The paper also includes detailed tables (Table 1, Table 2, Table 3) showing performance metrics and ablation studies on various datasets. |
| Researcher Affiliation | Collaboration | 1 The Hong Kong University of Science and Technology (Guangzhou), 2 The Hong Kong University of Science and Technology, 3 PhiGent Robotics, 4 Shanghai Jiao Tong University. This shows affiliations with academic institutions (The Hong Kong University of Science and Technology, Shanghai Jiao Tong University) and an industry entity (PhiGent Robotics). |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are present in the provided text. The methodology is described narratively with mathematical formulas and figures. |
| Open Source Code | Yes | Code: https://github.com/EnVision-Research/Generalizable-BEV |
| Open Datasets | Yes | Experimental results on both Domain Generalization (DG) and Unsupervised Domain Adaptation (UDA) demonstrate its effectiveness. The details of datasets, evaluation metrics, and implementation details are elaborated in the supplementary materials. The paper explicitly mentions using the 'nuScenes' and 'Lyft' datasets in Table 1, which are well-known public benchmarks. It also states 'DeepAccident (Wang et al. 2023b) was collected from the CARLA virtual engine', citing the source of this dataset. |
| Dataset Splits | No | The paper states: 'The details of datasets, evaluation metrics, and implementation details are elaborated in the supplementary materials.' While it mentions using source and target domains for the DG and UDA protocols, specific training/test/validation split percentages or sample counts for the datasets (nuScenes, Lyft, DeepAccident) are not provided in the main text. |
| Hardware Specification | No | No specific hardware details (like GPU models, CPU types, or cloud configurations) are mentioned in the main text of the paper for running experiments. |
| Software Dependencies | No | No specific software dependencies with version numbers are mentioned in the main text of the paper. |
| Experiment Setup | No | The paper provides general information about the loss function: 'L = λ_s L_det + λ_s L_render + λ_s L_pg + λ_s L_ps + λ_t L_con, (8) where λ_s sets to 1 for the source domain and sets to 0 for the target domain, and it is the opposite for λ_t.' It also mentions 'β is a hyperparameter, iternum denotes the current iteration count, and maxiter represents the maximum number of iterations.' However, concrete hyperparameter values such as learning rates, batch sizes, number of epochs, or specific optimizer settings are not provided in the main text. It states: 'The details of datasets, evaluation metrics, and implementation details are elaborated in the supplementary materials.' |
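The domain-conditional weighting in the paper's Eq. (8) can be sketched as a small helper. This is a minimal illustration, not the authors' implementation: the function name and the scalar placeholders for the individual loss terms (detection, rendering, pseudo-label, perspective, consistency) are hypothetical, and only the λ_s/λ_t switching described in the quoted text is modeled.

```python
def combined_loss(l_det: float, l_render: float, l_pg: float,
                  l_ps: float, l_con: float, is_source: bool) -> float:
    """Weighted sum following Eq. (8) of the paper.

    On the source domain, lambda_s = 1 and lambda_t = 0, so only the
    supervised terms (detection, rendering, L_pg, L_ps) contribute.
    On the target domain the weights flip, leaving only the
    consistency term L_con.
    """
    lam_s = 1.0 if is_source else 0.0
    lam_t = 1.0 - lam_s
    return lam_s * (l_det + l_render + l_pg + l_ps) + lam_t * l_con
```

In this reading, a source-domain batch is optimized purely with annotation-driven losses, while a target-domain batch (no real-scene annotations) is optimized only through the consistency loss.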