Deep Evidential Hashing for Trustworthy Cross-Modal Retrieval

Authors: Yuan Li, Liangli Zhen, Yuan Sun, Dezhong Peng, Xi Peng, Peng Hu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate the efficacy of our DECH through extensive experimentation on four benchmark datasets. The experimental results demonstrate our superior performance compared to 12 state-of-the-art methods.
Researcher Affiliation | Collaboration | (1) College of Computer Science, Sichuan University; (2) Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore; (3) Sichuan National Innovation New Vision UHD Video Technology Co., Ltd., Chengdu 610095, China
Pseudocode | Yes | Algorithm 1: The Optimization Procedure of DECH
Open Source Code | Yes | Code: https://github.com/blackant-dev/DECH
Open Datasets | Yes | We conduct our experiments on four benchmark datasets: MIRFLICKR25K (Huiskes and Lew 2008), IAPR TC-12 (Escalante et al. 2010), NUS-WIDE (Rasiwasia et al. 2010), and MS-COCO (Lin et al. 2014).
Dataset Splits | Yes | MIRFLICKR25K contains 20,500 image-text pairs from 24 classes, with 2,000 pairs reserved for querying, 10,000 for training, and the rest for retrieval. IAPR TC-12 comprises 20,000 pairs across 255 categories, with 2,000 pairs used for querying, 10,000 for training, and the remainder for retrieval. NUS-WIDE includes 195,834 pairs in 21 categories, with 2,000 pairs for querying, 10,500 for training, and the rest for retrieval. MS-COCO consists of 122,218 pairs in 80 classes, with 5,000 pairs for querying, 10,000 for training, and the remaining pairs forming the retrieval database.
Hardware Specification | Yes | Our method is implemented with PyTorch (Paszke et al. 2019) on a single NVIDIA GeForce RTX 3090 Ti GPU.
Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al. 2019)' but does not specify a version number for PyTorch or any other software library.
Experiment Setup | Yes | For our DECH, we set τ to 0.2 and γ to 1. The parameter λ is empirically determined as per (Sensoy, Kaplan, and Kandemir 2018). Additionally, we also employ Lnzce RM as the near-zero correct-evidence loss during training. ... The optimal retrieval performance is achieved when τ is around 0.1.
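The split sizes quoted in the Dataset Splits row can be checked with a short sketch. It takes a literal reading of the text, treating the query, training, and retrieval subsets as three disjoint parts of a shuffled index list. That reading is an assumption: some cross-modal hashing protocols draw the training pairs from the retrieval database instead, and the quoted passage does not say which applies. The function name `split_indices`, the `SPLITS` table, and the seed are illustrative and do not come from the DECH repository.

```python
import random

# (total pairs, query pairs, training pairs) as quoted in the Dataset Splits row.
SPLITS = {
    "MIRFLICKR25K": (20_500, 2_000, 10_000),
    "IAPR TC-12": (20_000, 2_000, 10_000),
    "NUS-WIDE": (195_834, 2_000, 10_500),
    "MS-COCO": (122_218, 5_000, 10_000),
}


def split_indices(total, n_query, n_train, seed=0):
    """Shuffle pair indices and cut them into query, train, and retrieval lists.

    Assumption: the three subsets are disjoint (hypothetical protocol).
    """
    if n_query + n_train > total:
        raise ValueError("query + train exceeds the number of pairs")
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    indices = list(range(total))
    rng.shuffle(indices)
    query = indices[:n_query]
    train = indices[n_query:n_query + n_train]
    retrieval = indices[n_query + n_train:]
    return query, train, retrieval


if __name__ == "__main__":
    for name, (total, n_query, n_train) in SPLITS.items():
        query, train, retrieval = split_indices(total, n_query, n_train)
        print(f"{name}: query={len(query)} train={len(train)} "
              f"retrieval={len(retrieval)}")
```

Under this disjoint reading, the retrieval set is whatever remains after the query and training pairs are removed, so its size follows directly from the three quoted numbers.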
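Because the Software Dependencies row notes that no library versions are given, anyone re-running the code may want to record the versions actually installed. The following is a minimal sketch using only the Python standard library. The helper name `installed_versions` is hypothetical, and `torch` is the distribution name under which PyTorch is published. The paper and the summary above do not prescribe any particular version.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    # "torch" is the only library named in the summary.
    for name, version in installed_versions(["torch"]).items():
        if version is None:
            print(f"# {name}: not installed")
        else:
            print(f"{name}=={version}")
```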
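The hyperparameters in the Experiment Setup row can be collected in a small configuration sketch. The values τ = 0.2 and γ = 1 are the ones quoted above. For λ, the row only says it follows Sensoy, Kaplan, and Kandemir (2018). That work anneals the coefficient of its regularization term as min(1, t/10), where t is the epoch index. Whether DECH uses exactly this schedule is an assumption here, and the class name `DECHConfig`, its field names, and `annealing_coefficient` are illustrative rather than taken from the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DECHConfig:
    """Hypothetical container for the hyperparameters quoted in the summary."""
    tau: float = 0.2         # value stated in the Experiment Setup row
    gamma: float = 1.0       # value stated in the Experiment Setup row
    anneal_epochs: int = 10  # assumption: schedule length from Sensoy et al. (2018)


def annealing_coefficient(epoch, anneal_epochs=10):
    """Linear warm-up min(1, t / anneal_epochs), as in Sensoy et al. (2018).

    Assumption: DECH reuses this schedule for lambda unchanged.
    """
    if anneal_epochs <= 0:
        raise ValueError("anneal_epochs must be positive")
    return min(1.0, epoch / anneal_epochs)


if __name__ == "__main__":
    cfg = DECHConfig()
    for epoch in (0, 5, 10, 20):
        lam = annealing_coefficient(epoch, cfg.anneal_epochs)
        print(f"epoch={epoch} lambda={lam}")
```

The default `tau` follows the stated training setting of 0.2. The same row also reports that retrieval performance peaks when τ is around 0.1, so anyone reproducing the results may want to try both values.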