QuRe: Query-Relevant Retrieval through Hard Negative Sampling in Composed Image Retrieval

Authors: Jaehyun Kwak, Ramahdani Muhammad Izaaz Inhar, Se-Young Yun, Sung-Ju Lee

ICML 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments demonstrate that QURE achieves state-of-the-art performance on Fashion IQ and CIRR datasets while exhibiting the strongest alignment with human preferences on the HP-Fashion IQ dataset. |
| Researcher Affiliation | Academia | KAIST. Correspondence to: Sung-Ju Lee <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Training Flow of QURE |
| Open Source Code | Yes | The source code is available at https://github.com/jackwaky/QuRe. |
| Open Datasets | Yes | We evaluate the models on widely used CIR datasets, Fashion IQ (Wu et al., 2021) and CIRR (Suhr et al., 2018), to assess their ability to retrieve the target image. |
| Dataset Splits | Yes | We evaluate the models on widely used CIR datasets, Fashion IQ (Wu et al., 2021) and CIRR (Suhr et al., 2018), to assess their ability to retrieve the target image. Additionally, we evaluate them on the HP-Fashion IQ dataset to assess their alignment with human preferences. ... We selected the Fashion IQ dataset for its high relevance and broad applicability, mirroring the search functionalities of e-commerce platforms. |
| Hardware Specification | Yes | All experiments were conducted using a single Nvidia RTX 3090 GPU. |
| Software Dependencies | No | No specific software dependencies with version numbers are mentioned in the paper, beyond the use of BLIP-2 as a backbone model and AdamW optimizer. |
| Experiment Setup | Yes | QURE is trained using the AdamW optimizer (Loshchilov, 2017) for 50 epochs on CIRR and 30 epochs on Fashion IQ. The hard negative set H was defined n_def times, starting with a warm-up phase where H initially included the entire corpus except for the target during the first n_epoch/n_def epochs. The hard negative set H is updated every n_epoch/n_def epochs. We set n_def to six for both Fashion IQ and CIRR. ... We resized images to 224 × 224 with a 1.25 padding ratio. |
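
The update schedule for the hard negative set H described under Experiment Setup can be illustrated with a short sketch. It is not taken from the released code: the function names `hard_negative_schedule` and `phase` are hypothetical, and flooring n_epoch/n_def to an integer is an assumption, since the quoted text does not say how a non-integer interval (such as 50/6 on CIRR) is rounded.

```python
from typing import List


def hard_negative_schedule(n_epoch: int, n_def: int) -> List[int]:
    """Return the 0-indexed epochs at which the hard negative set H is (re)defined.

    Assumption: the interval n_epoch / n_def is floored to an integer when the
    division is not exact; the source does not state the rounding rule.
    """
    if n_def <= 0 or n_epoch < n_def:
        raise ValueError("need 0 < n_def <= n_epoch")
    interval = n_epoch // n_def
    # The first definition (epoch 0) starts the warm-up phase, where H is the
    # whole corpus except the target; each later entry is an update of H.
    return [k * interval for k in range(n_def)]


def phase(epoch: int, n_epoch: int, n_def: int) -> str:
    """Label an epoch as warm-up or hard-negative under the same assumption."""
    interval = n_epoch // n_def
    return "warm-up" if epoch < interval else "hard-negative"


if __name__ == "__main__":
    # Settings quoted above: n_def = 6, 30 epochs (Fashion IQ), 50 epochs (CIRR).
    print("Fashion IQ:", hard_negative_schedule(30, 6))
    print("CIRR:", hard_negative_schedule(50, 6))
```

Under these assumptions, Fashion IQ has a five-epoch warm-up followed by an update of H every five epochs. On CIRR the floored interval is eight epochs, so the final definition of H stays in use for the last ten epochs; a different rounding rule would shift these boundaries.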