MSCI: Addressing CLIP's Inherent Limitations for Compositional Zero-Shot Learning
Authors: Yue Wang, Shuai Xu, Xuelin Zhu, Yicong Li
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on three widely used datasets fully validate the effectiveness and superiority of the proposed model. Data and code are available at https://github.com/ltpwy/MSCI. |
| Researcher Affiliation | Academia | ¹Nanjing University of Aeronautics and Astronautics, Nanjing, China; ²Key Laboratory of Social Computing and Cognitive Intelligence (Dalian University of Technology), Ministry of Education, China; ³The Hong Kong Polytechnic University, Hong Kong, China |
| Pseudocode | No | The paper describes methods using equations and prose, but no explicit pseudocode or algorithm blocks are provided. |
| Open Source Code | Yes | Data and code are available at https://github.com/ltpwy/MSCI. |
| Open Datasets | Yes | We evaluate the performance of the proposed MSCI on three widely-used compositional zero-shot learning datasets: MITStates [Isola et al., 2015], UT-Zappos [Yu and Grauman, 2014], and C-GQA [Naeem et al., 2021]. |
| Dataset Splits | Yes | Consistent with previous research, we adopt the dataset partitioning method proposed by Purushwalkam et al. [Purushwalkam et al., 2019], with specific details presented in Table 1. |
| Hardware Specification | Yes | All experiments are conducted on an Nvidia H20 GPU. |
| Software Dependencies | No | The paper mentions "PyTorch" and "CLIP's backbone with the ViT-L/14 architecture", but specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | During training, we use the Adam optimizer, combined with learning rate decay and weight decay strategies. To simplify the model complexity, we use only one cross-attention layer for both local feature interaction and global feature fusion across the three datasets, with 12 attention heads and a dropout rate set to 0.1. The parameter β, used to control the inference weights of each branch, is set to 0.1, 1.0 and 0.1 for MITStates, UT-Zappos and C-GQA in the closed-world setting, and set to 0.3, 1.0 and 0.3 in the open-world setting. |
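The setup row above pins down a few concrete hyperparameters (one cross-attention layer, 12 heads, dropout 0.1, Adam with weight and learning-rate decay, a branch-mixing weight β). A minimal PyTorch sketch of such a configuration is below; the embedding dimension, learning rate, decay schedule, and the exact rule for combining branch logits with β are assumptions not stated in the quoted text, so treat them as placeholders rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    """One cross-attention layer: queries attend to a separate key/value source,
    matching the reported config (12 heads, dropout 0.1, a single layer)."""
    def __init__(self, dim: int = 768, num_heads: int = 12, dropout: float = 0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads,
                                          dropout=dropout, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(query, context, context)  # cross-attention
        return self.norm(query + out)                # residual + layer norm

block = CrossAttentionBlock()

# Adam with weight decay plus a step learning-rate decay; the specific
# lr / decay values here are illustrative, not taken from the paper.
optimizer = torch.optim.Adam(block.parameters(), lr=5e-5, weight_decay=1e-5)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)

# Toy inputs: 8 text tokens attending to 16 visual patch features.
text_feats = torch.randn(2, 8, 768)
visual_feats = torch.randn(2, 16, 768)
fused = block(text_feats, visual_feats)  # shape (2, 8, 768)

# β-weighted branch mixing at inference (hypothetical combination rule;
# the paper only states that β controls each branch's inference weight).
logits_local = torch.randn(2, 100)
logits_global = torch.randn(2, 100)
beta = 0.1  # MIT-States, closed-world value from the table
logits = beta * logits_local + (1 - beta) * logits_global
```

Note that the cross-attended output keeps the query sequence's shape, which is what allows the same block to serve both local interaction and global fusion with different query/context pairings.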