ENCODER: Entity Mining and Modification Relation Binding for Composed Image Retrieval

Authors: Zixu Li, Zhiwei Chen, Haokun Wen, Zhiheng Fu, Yupeng Hu, Weili Guan

AAAI 2025

Reproducibility Variable | Result | LLM Response

Research Type: Experimental
"Extensive experiments on four benchmark datasets demonstrate the superiority of our proposed method."

Researcher Affiliation: Academia
1 School of Software, Shandong University; 2 School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen); 3 School of Data Science, City University of Hong Kong. EMAIL, EMAIL, EMAIL

Pseudocode: No
The paper describes its methodology through textual explanations and mathematical formulations, but it contains no explicitly labeled pseudocode or algorithm blocks.

Open Source Code: Yes
"Moreover, we have released our codes to facilitate other researchers." Code: https://sdu-l.github.io/ENCODER.github.io/

Open Datasets: Yes
"Following previous works, we chose four benchmark datasets for evaluation, including three fashion-domain datasets, Fashion IQ (Wu et al. 2021), Shoes (Guo et al. 2018), and Fashion200K (Han et al. 2017), and an open-domain dataset, CIRR (Liu et al. 2021b)."

Dataset Splits: No
The paper reports evaluation metrics for the datasets (e.g., R@k for Shoes and Fashion200K; R@10 and R@50 for Fashion IQ), but it never states the training, validation, or test splits. For example, it gives a batch size but no split ratios or counts.

Hardware Specification: Yes
"All experiments were conducted on a single NVIDIA Tesla T4 GPU with 16GB memory and trained 10 epochs."

Software Dependencies: No
The paper names the pretrained CLIP backbone (Radford et al. 2021) (ViT-B/32 version) and the AdamW optimizer, but it does not specify versions for ancillary software such as Python, PyTorch, or CUDA.

Experiment Setup: Yes
"ENCODER is built upon the pretrained CLIP (Radford et al. 2021) (ViT-B/32 version). We trained ENCODER using the AdamW optimizer with the initial learning rate of 5e-5, while the batch size is set to 128 and the learning rate for CLIP is 1e-6. Empirically, we maintained a consistent embedding dimension D of 512 throughout the network. We set the latent factor number P to 4 and the query number E of LRQ to 3. We also adopt the temperature factor τ to 0.1 for Eqn. (9, 13, 14). Through a comprehensive grid search, we set κ = 0.8, γ = 0.5, and µ = 0.5 for all four datasets. All experiments were conducted on a single NVIDIA Tesla T4 GPU with 16GB memory and trained 10 epochs."
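The reported hyperparameters can be gathered into a single reproducibility checklist. The sketch below is a minimal illustration, not code from the released repository: the `EncoderConfig` class and `param_groups` helper are our own names, assuming a PyTorch/AdamW-style setup in which the CLIP backbone is fine-tuned at a smaller learning rate than the newly added modules, as the paper describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    """Hyperparameters as reported in the paper's experiment setup."""
    lr: float = 5e-5            # initial learning rate for new modules (AdamW)
    clip_lr: float = 1e-6       # separate, smaller learning rate for CLIP
    batch_size: int = 128
    embed_dim: int = 512        # embedding dimension D
    latent_factors: int = 4     # latent factor number P
    lrq_queries: int = 3        # query number E of LRQ
    temperature: float = 0.1    # tau in Eqn. (9), (13), (14)
    kappa: float = 0.8          # grid-searched, shared across all four datasets
    gamma: float = 0.5
    mu: float = 0.5
    epochs: int = 10


def param_groups(cfg, clip_params, other_params):
    """Build AdamW-style parameter groups: the pretrained CLIP backbone
    gets cfg.clip_lr, while the remaining parameters get cfg.lr."""
    return [
        {"params": clip_params, "lr": cfg.clip_lr},
        {"params": other_params, "lr": cfg.lr},
    ]


cfg = EncoderConfig()
groups = param_groups(cfg, clip_params=["clip.weight"], other_params=["head.weight"])
```

In a real training script, `groups` would be passed directly to `torch.optim.AdamW`; keeping the two learning rates in one frozen config object makes the grid-searched values (κ, γ, µ) easy to audit against the paper.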