Text and Image Are Mutually Beneficial: Enhancing Training-Free Few-Shot Classification with CLIP

Authors: Yayuan Li, Jintao Guo, Lei Qi, Wenbin Li, Yinghuan Shi

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that TIMO significantly outperforms the state-of-the-art (SOTA) training-free method. Additionally, by exploring the extent of mutual guidance, the authors propose an enhanced variant, TIMO-S, which even surpasses the best training-required methods by 0.33% with approximately 100x less time cost. The paper also includes sections such as 'Experiments Setting and Implementation', 'Main Result', and 'Ablation Study'.
Researcher Affiliation | Academia | 1National Key Laboratory for Novel Software Technology, Nanjing University, China; 2School of Computer Science and Engineering, Southeast University, China; EMAIL, EMAIL, EMAIL. All listed affiliations are universities and the email domains are academic (.edu.cn).
Pseudocode | No | The paper describes its methods using mathematical formulations and textual descriptions (e.g., equations 1-14) but does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Code: https://github.com/lyymuwu/TIMO
Open Datasets | Yes | Following previous works (Wang et al. 2024b; Zhang et al. 2022), the authors use 11 image classification benchmarks to evaluate few-shot image classification, with the specific datasets and their licenses detailed in the Appendix. Datasets mentioned include Oxford Pets, UCF101, FGVCAircraft, Stanford Cars, ImageNet-V2, ImageNet-Sketch, ImageNet-A, and ImageNet-R.
Dataset Splits | Yes | The paper evaluates few-shot learning by comparing 1-, 2-, 4-, 8-, and 16-shot training sets, and also mentions using '16-shot ImageNet as training data'.
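The k-shot splits described above amount to sampling k labeled training images per class. As a minimal illustrative sketch (not the paper's actual data pipeline; `make_k_shot_split` and its inputs are hypothetical names), the construction could look like:

```python
import random
from collections import defaultdict

def make_k_shot_split(samples, k, seed=0):
    """Sample k training examples per class to form a k-shot split.

    `samples` is a list of (item, label) pairs. A fixed seed keeps the
    split reproducible across runs.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in samples:
        by_class[label].append(item)
    # Draw k examples per class without replacement.
    return {label: rng.sample(items, k) for label, items in by_class.items()}
```

Varying k over {1, 2, 4, 8, 16} then yields the five few-shot settings compared in the paper.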
Hardware Specification | Yes | All experiments use PyTorch (Paszke et al. 2019) and are conducted on a single NVIDIA GeForce RTX 3090Ti GPU.
Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al. 2019)' but does not provide a specific version number for PyTorch or any other software dependency.
Experiment Setup | Yes | For TIMO, the hyperparameter γ is set to 50 for all datasets, except 1 for ImageNet and 100 for Flowers102. A grid search over α ∈ {10^-4, 10^-3, ..., 10^4} is performed on the validation set to determine its optimal value. The paper also mentions adjusting hyperparameters β and γ.