Learning Attribute-Aware Hash Codes for Fine-Grained Image Retrieval via Query Optimization
Authors: Peng Wang, Yong Li, Lin Zhao, Xiu-Shen Wei
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on benchmark datasets demonstrate that our method generates attribute-aware hash codes and consistently outperforms state-of-the-art techniques in retrieval accuracy and robustness, especially for low-bit hash codes, underscoring its potential in fine-grained image hashing tasks. Our code is available at https://github.com/SEU-VIPGroup/QueryOpt |
| Researcher Affiliation | Academia | 1School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China. 2School of Computer Science and Engineering, and Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Southeast University, Nanjing, China. This work was supported by National Key R&D Program of China (2021YFA1001100), National Natural Science Foundation of China under Grant (62272231, 62172222), CIE-Tencent Robotics X Rhino-Bird Focused Research Program, the Fundamental Research Funds for the Central Universities (4009002401, 4009002509), and the Big Data Computing Center of Southeast University. Correspondence to: Xiu-Shen Wei <EMAIL>. |
| Pseudocode | No | The paper describes the methodology in detail using mathematical formulations and descriptive text, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/SEU-VIPGroup/QueryOpt |
| Open Datasets | Yes | We conducted a series of ablation studies and comparisons with the latest methods on five fine-grained datasets, i.e., CUB200 (Wah et al., 2011), Aircraft (Maji et al., 2013), Food101 (Bossard et al., 2014), VegFru (Hou et al., 2017) and NABirds (Van Horn et al., 2015). |
| Dataset Splits | Yes | Specifically, CUB200 contains 11,788 bird images from 200 bird species and is officially split into 5,994 images for training and 5,794 images for testing. Aircraft includes 10,000 images of 100 aircraft variants, with 6,667 images designated for training and 3,333 for testing. For large-scale datasets, Food101 features 101 kinds of foods with 101,000 images; each class has 250 test images that are manually checked for correctness, while the 750 training images may still contain a certain amount of noise. NABirds comprises 48,562 images of North American birds across 555 sub-categories, with 23,929 images for training and 24,633 for testing. VegFru is another large-scale fine-grained dataset covering 200 kinds of vegetables and 92 kinds of fruits, with 29,200 images for training, 14,600 for validation, and 116,931 for testing. |
| Hardware Specification | Yes | All experiments are conducted with one GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions using ResNet-50 as a backbone and SGD for training but does not provide specific version numbers for software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages. |
| Experiment Setup | Yes | For all datasets, we preprocess all images to 224×224. The initial learning rate is 0.0003. SGD with a mini-batch size of 16 is used for training. We set the weight decay to 0.0001 and momentum to 0.90. We set L = 2 to leverage multi-scale features. d is set to 384, and N is set to 96k. |
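The reported setup can be collected into a minimal, framework-agnostic sketch. The paper names ResNet-50 and SGD but no library versions, so the `CONFIG` dict and `sgd_step` helper below are hypothetical: the update function only illustrates the common SGD convention with momentum 0.9 and L2 weight decay folded into the gradient, not the authors' actual training code.

```python
# Hyperparameters as reported in the paper's experiment setup.
CONFIG = {
    "image_size": (224, 224),   # input preprocessing size
    "backbone": "ResNet-50",
    "learning_rate": 3e-4,
    "batch_size": 16,
    "weight_decay": 1e-4,
    "momentum": 0.90,
    "num_scales": 2,            # L = 2 multi-scale features
    "embed_dim": 384,           # d
}

def sgd_step(w, grad, velocity,
             lr=CONFIG["learning_rate"],
             momentum=CONFIG["momentum"],
             weight_decay=CONFIG["weight_decay"]):
    """One SGD-with-momentum update on a scalar weight.

    Weight decay is applied as an L2 term added to the gradient,
    the usual convention in common deep-learning optimizers.
    """
    g = grad + weight_decay * w            # decayed gradient
    velocity = momentum * velocity - lr * g  # momentum accumulation
    return w + velocity, velocity

# One illustrative update from w = 1.0 with gradient 0.5:
w, v = sgd_step(w=1.0, grad=0.5, velocity=0.0)
```

With these settings a single step moves the weight by `lr * (grad + weight_decay * w)`, i.e. 0.0003 × 0.5001 ≈ 0.00015, which shows how small the per-step change is at this learning rate.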