TRIDE: A Text-assisted Radar-Image weather-aware fusion network for Depth Estimation

Authors: Huawei Sun, Zixu Wang, Hao Feng, Julius Ott, Lorenzo Servadei, Robert Wille

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our method, benchmarked on the nuScenes dataset, demonstrates performance gains over the state-of-the-art, achieving a 12.87% improvement in MAE and a 9.08% improvement in RMSE. Code: https://github.com/harborsarah/TRIDE
Researcher Affiliation | Collaboration | Huawei Sun, EMAIL, Technical University of Munich / Infineon Technologies AG; Zixu Wang, EMAIL, Technical University of Munich / Infineon Technologies AG; Hao Feng, EMAIL, Technical University of Munich; Julius Ott, EMAIL, Technical University of Munich / Infineon Technologies AG; Lorenzo Servadei, EMAIL, Technical University of Munich; Robert Wille, EMAIL, Technical University of Munich
Pseudocode | No | The paper describes methods using prose and diagrams (e.g., Figure 2, Figure 3, Figure 4, Figure 5) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/harborsarah/TRIDE
Open Datasets | Yes | We train and evaluate our approach on the nuScenes Caesar et al. (2020) dataset, a widely recognized benchmark for radar-camera-based solutions, where it achieves state-of-the-art performance compared to existing radar-camera depth estimation algorithms. ... The KITTI dataset Geiger et al. (2012) is the most widely used benchmark for outdoor depth estimation in driving scenarios.
Dataset Splits | Yes | nuScenes Dataset. To validate our TRIDE, we use the nuScenes dataset Caesar et al. (2020), a comprehensive multi-sensor dataset. This dataset includes 700, 150, and 150 scenes in its training, validation, and test subsets, respectively, covering diverse driving conditions such as normal, rainy, and night-time scenarios. ... KITTI Dataset. The KITTI dataset Geiger et al. (2012) is the most widely used benchmark for outdoor depth estimation in driving scenarios. We evaluate the efficacy of our proposed text branch by integrating it into existing depth estimation algorithms. The models are retrained and evaluated using the KITTI Eigen split Eigen et al. (2014), with all implementation details consistent with the original publications.
Hardware Specification | Yes | Our model is implemented in PyTorch Paszke et al. (2019) and trained on an Nvidia Tesla A30 GPU with a batch size of 10.
Software Dependencies | No | Our model is implemented in PyTorch Paszke et al. (2019)... We use the Adam optimizer Kingma & Ba (2017)...
Experiment Setup | Yes | Our model is implemented in PyTorch Paszke et al. (2019) and trained on an Nvidia Tesla A30 GPU with a batch size of 10. We use the Adam optimizer Kingma & Ba (2017) with an initial learning rate of 1e-4, adjusting it based on a polynomial decay schedule with a power of p = 0.9. To prevent overfitting, we apply data augmentation techniques, including random flips and adjustments to contrast, brightness, and color. Additionally, random cropping to 448 × 800 pixels is performed during training to further enhance model robustness. ... The final loss is then obtained by summing the two individual losses L = w_D * L_Depth + w_C * L_Cls, where w_D and w_C are the weighting factors. ... In the final loss function, we set w_C = 1 and w_D = 1 during training.
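The training recipe quoted above (polynomial learning-rate decay with power p = 0.9 from a base rate of 1e-4, and a weighted sum of the depth and classification losses) can be sketched in plain Python. This is a minimal illustration of the stated hyperparameters, not code from the TRIDE repository; the function names `poly_lr` and `total_loss` and the step counts are illustrative assumptions.

```python
def poly_lr(step, total_steps, base_lr=1e-4, power=0.9):
    """Polynomial-decay schedule: lr = base_lr * (1 - step/total_steps)^p.

    With p = 0.9 and base_lr = 1e-4 as stated in the paper's setup.
    """
    return base_lr * (1.0 - step / total_steps) ** power

def total_loss(l_depth, l_cls, w_d=1.0, w_c=1.0):
    """Combined objective L = w_D * L_Depth + w_C * L_Cls.

    The paper sets both weighting factors to 1 during training.
    """
    return w_d * l_depth + w_c * l_cls

# The schedule starts at the base rate and decays smoothly toward zero
# over training (total step count here is an arbitrary example).
print(poly_lr(0, 100_000))        # 1e-4 at the first step
print(poly_lr(100_000, 100_000))  # 0.0 at the final step
print(total_loss(0.5, 0.25))      # 0.75 with unit weights
```

In PyTorch itself, the same schedule could be attached to `torch.optim.Adam` via `torch.optim.lr_scheduler.PolynomialLR(optimizer, total_iters=..., power=0.9)`.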