TRIDE: A Text-assisted Radar-Image weather-aware fusion network for Depth Estimation
Authors: Huawei Sun, Zixu Wang, Hao Feng, Julius Ott, Lorenzo Servadei, Robert Wille
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method, benchmarked on the nuScenes dataset, demonstrates performance gains over the state-of-the-art, achieving a 12.87% improvement in MAE and a 9.08% improvement in RMSE. Code: https://github.com/harborsarah/TRIDE |
| Researcher Affiliation | Collaboration | Huawei Sun (Technical University of Munich; Infineon Technologies AG), Zixu Wang (Technical University of Munich; Infineon Technologies AG), Hao Feng (Technical University of Munich), Julius Ott (Technical University of Munich; Infineon Technologies AG), Lorenzo Servadei (Technical University of Munich), Robert Wille (Technical University of Munich) |
| Pseudocode | No | The paper describes methods using prose and diagrams (e.g., Figure 2, Figure 3, Figure 4, Figure 5) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/harborsarah/TRIDE |
| Open Datasets | Yes | We train and evaluate our approach on the nuScenes (Caesar et al., 2020) dataset, a widely recognized benchmark for radar-camera-based solutions, where it achieves state-of-the-art performance compared to existing radar-camera depth estimation algorithms. ... The KITTI dataset (Geiger et al., 2012) is the most widely used benchmark for outdoor depth estimation in driving scenarios. |
| Dataset Splits | Yes | nuScenes Dataset. To validate our TRIDE, we use the nuScenes dataset (Caesar et al., 2020), a comprehensive multi-sensor dataset. This dataset includes 700, 150, and 150 scenes in its training, validation, and test subsets, respectively, covering diverse driving conditions such as normal, rainy, and night-time scenarios. ... KITTI Dataset. The KITTI dataset (Geiger et al., 2012) is the most widely used benchmark for outdoor depth estimation in driving scenarios. We evaluate the efficacy of our proposed text branch by integrating it into existing depth estimation algorithms. The models are retrained and evaluated using the KITTI Eigen split (Eigen et al., 2014), with all implementation details consistent with the original publications. |
| Hardware Specification | Yes | Our model is implemented in PyTorch (Paszke et al., 2019) and trained on an Nvidia Tesla A30 GPU with a batch size of 10. |
| Software Dependencies | No | Our model is implemented in PyTorch (Paszke et al., 2019)... We use the Adam optimizer (Kingma & Ba, 2017)... No version numbers are given for these dependencies. |
| Experiment Setup | Yes | Our model is implemented in PyTorch (Paszke et al., 2019) and trained on an Nvidia Tesla A30 GPU with a batch size of 10. We use the Adam optimizer (Kingma & Ba, 2017) with an initial learning rate of 1e-4, adjusting it based on a polynomial decay schedule with a power of p = 0.9. To prevent overfitting, we apply data augmentation techniques, including random flips and adjustments to contrast, brightness, and color. Additionally, random cropping to 448 × 800 pixels is performed during training to further enhance model robustness. ... The final loss is then obtained by summing the two individual losses L = w_D · L_Depth + w_C · L_Cls, where w_D and w_C are the weighting factors. ... in the final loss function, we set w_C = 1 and w_D = 1 during training. |
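The training recipe quoted in the Experiment Setup row can be sketched in a few lines. This is an illustrative reconstruction, not the authors' released code: the helper names (`poly_lr`, `total_loss`) are hypothetical, and the two loss terms stand in for the paper's depth and weather-classification losses.

```python
def poly_lr(step, total_steps, base_lr=1e-4, power=0.9):
    """Polynomial learning-rate decay as reported in the paper:
    lr = base_lr * (1 - step/total_steps)^power, with power p = 0.9."""
    return base_lr * (1.0 - step / total_steps) ** power


def total_loss(depth_loss, cls_loss, w_d=1.0, w_c=1.0):
    """Final loss L = w_D * L_Depth + w_C * L_Cls.
    The paper sets both weighting factors to 1 during training."""
    return w_d * depth_loss + w_c * cls_loss


# At step 0 the learning rate equals the initial 1e-4 and decays toward 0
# as training progresses; losses are combined with equal weight.
lr_start = poly_lr(step=0, total_steps=100_000)   # 1e-4
loss = total_loss(depth_loss=0.3, cls_loss=0.2)   # 0.5
```

Note that the decay horizon (`total_steps`) is not specified in the quoted text, so it is left as a free parameter here.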