Lightweight Contrastive Distilled Hashing for Online Cross-modal Retrieval

Authors: Jiaxing Li, Lin Jiang, Zeqi Ma, Kaihang Jiang, Xiaozhao Fang, Jie Wen

AAAI 2025

Reproducibility Variable — Result — LLM Response
Research Type — Experimental — Experimental results on three widely used datasets demonstrate that LCDH outperforms some state-of-the-art methods.
Researcher Affiliation — Academia — 1 School of Artificial Intelligence, Guangzhou University, Guangzhou 510006, China; 2 School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou 510665, China; 3 School of Fashion and Textiles, The Hong Kong Polytechnic University, Hong Kong SAR, China; 4 School of Automation, Guangdong University of Technology, Guangzhou 510006, China; 5 Bio-Computing Research Center, Harbin Institute of Technology, Shenzhen 518067, China
Pseudocode — No — The paper describes the proposed method using equations and textual descriptions but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code — No — The paper does not contain any explicit statement about releasing source code, nor does it provide any links to a code repository.
Open Datasets — Yes — MIRFlickr-25K consists of 25,000 image-text pairs collected from the Flickr website, each annotated with one or more of 24 categories (Huiskes and Lew 2008). IAPR TC-12 consists of 20,000 image-text pairs annotated with 255 categories (Rasiwasia et al. 2010). NUS-WIDE consists of 269,648 image-text pairs annotated with 81 categories (Chua et al. 2009).
Dataset Splits — Yes — MIRFlickr-25K: 2,000 pairs were selected as the query set, and the rest were used for the training and retrieval sets. IAPR TC-12: 2,000 pairs were randomly selected as the query set, and the rest were used for the training and retrieval sets. NUS-WIDE: 2,100 pairs were used as the query set, and the rest were used for the training and retrieval sets.
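The query/retrieval splits above can be sketched as a simple random hold-out. This is a minimal illustration, assuming a uniform random split; the paper does not specify a random seed or selection procedure, so `seed` and `make_splits` are assumptions for reproducibility of the sketch itself.

```python
import random

def make_splits(n_pairs, n_query, seed=0):
    """Randomly hold out n_query pairs as the query set; the remaining
    pairs form the retrieval database (training samples are drawn from it).
    The seed is an assumption -- the paper does not report one."""
    rng = random.Random(seed)
    idx = list(range(n_pairs))
    rng.shuffle(idx)
    return idx[:n_query], idx[n_query:]

# Split sizes reported for each dataset:
query, retrieval = make_splits(25000, 2000)    # MIRFlickr-25K
# make_splits(20000, 2000)                     # IAPR TC-12
# make_splits(269648, 2100)                    # NUS-WIDE
```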
Hardware Specification — Yes — All experiments were conducted on a workstation running the Windows 10 Professional operating system with an AMD Ryzen 9 5900X CPU and an NVIDIA GeForce RTX 3090 GPU.
Software Dependencies — Yes — CUDA 11.3, Python 3.9.8, and PyTorch 1.10.1.
Experiment Setup — Yes — In the objective function of LCDH, parameter λ1 controls the impact of similarity preservation in the teacher network, while parameter λ2 controls the impact of similarity approximation between the offline and online similarities in knowledge distillation. The ranges of λ1 and λ2 are set to {1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1e0, 1e1, 1e2, 1e3, 1e4, 1e5}. As shown in Fig. 6(a) and Fig. 6(b), LCDH achieved the best performance on MIRFlickr-25K for both I→T and T→I retrieval tasks when λ1 = 1e4 and λ2 = 1e0.
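The parameter sweep described above is a standard grid search over log-spaced values. A minimal sketch, assuming an exhaustive search over the 11×11 grid; the `evaluate(l1, l2) -> mAP` callback is hypothetical and stands in for training and evaluating LCDH at one (λ1, λ2) setting.

```python
import itertools

# Log-spaced grid from the reported setup: 1e-5 ... 1e5.
grid = [float(f"1e{p}") for p in range(-5, 6)]

def grid_search(evaluate):
    """Exhaustively try every (lambda1, lambda2) pair on the grid and
    return the pair with the highest score. `evaluate` is a hypothetical
    callback returning, e.g., mAP for one hyperparameter setting."""
    return max(itertools.product(grid, grid), key=lambda p: evaluate(*p))
```

With a callback whose optimum sits at the paper's reported best setting, the search returns (1e4, 1e0) as expected; in practice `evaluate` would run a full train/test cycle per grid point.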