Leveraging MLLM Embeddings and Attribute Smoothing for Compositional Zero-Shot Learning

Authors: Xudong Yan, Songhe Feng, Yang Zhang, Jian Yang, Yueguan Lin, Haojun Fei

IJCAI 2025

Reproducibility assessment — each entry lists the variable, the result, and the LLM's response:
Research Type: Experimental — "Extensive experiments demonstrate that our method achieves state-of-the-art performance on three challenging datasets. The supplementary material and source code will be available at https://github.com/xud-yan/Trident." [...] Evidence sections: 4 Experiment; 4.1 Experiment Setup; 4.2 Results and Discussion; 4.3 Ablation Study.
Researcher Affiliation: Collaboration — 1. School of Computer Science and Technology, Beijing Jiaotong University; 2. Qifu Technology (contact: EMAIL, {yangjian1, linyueguan, feihaojun}-EMAIL).
Pseudocode: No — The paper describes the methodology in prose and mathematical formulations within the 'Approach' section (Section 3), but it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code: Yes — "The supplementary material and source code will be available at https://github.com/xud-yan/Trident."
Open Datasets: Yes — "We evaluate our model on three challenging CZSL datasets: MIT-States [Isola et al., 2015], C-GQA [Naeem et al., 2021], and VAW-CZSL [Saini et al., 2022]. The common data splits are presented in Table 1."
Dataset Splits: Yes — "We evaluate our model on three challenging CZSL datasets: MIT-States [Isola et al., 2015], C-GQA [Naeem et al., 2021], and VAW-CZSL [Saini et al., 2022]. The common data splits are presented in Table 1."

Table 1: Summary statistics of the datasets used in our experiments.

                          Train         Validation           Test
            |A|   |O|   |Cs|   |X|   |Cs|   |Cu|   |X|   |Cs|   |Cu|   |X|
MIT-States  115   245   1262   30k   300    300    10k   400    400    13k
C-GQA       413   674   5592   27k   1252   1040   7k    888    923    5k
VAW-CZSL    440   541   1252   72k   2121   2322   10k   2449   2470   11k
Hardware Specification: No — The paper states: "We use the visual encoder of LLaVA v1.5, ViT-Large-Patch14-336px, as our frozen visual backbone." and "TRIDENT and all baseline models are trained with the batch size of 128 for 50 epochs under the PyTorch framework [Paszke et al., 2019]". This specifies the model architecture and training parameters, but no concrete hardware details (e.g., GPU model, CPU, memory) are provided.
Software Dependencies: No — The paper mentions training "under the PyTorch framework [Paszke et al., 2019]" and uses "LLaVA v1.5" and "GPT-3.5 [OpenAI, 2023]". While these name the software and models used, specific version numbers for PyTorch or other libraries are not provided.
Experiment Setup: Yes — "We use the visual encoder of LLaVA v1.5, ViT-Large-Patch14-336px, as our frozen visual backbone. TRIDENT and all baseline models are trained with a batch size of 128 for 50 epochs under the PyTorch framework [Paszke et al., 2019]. The number of global features is set to 6, 2, and 4 for the three datasets, respectively, and the number of local features is twice that of the global features. The label smoothing factor is set to 0.09, 0.03, and 0.03 for the three datasets, respectively. The number of auxiliary attributes generated for each composition is set to 3. We train TRIDENT with the Adam optimizer, a weight decay of 5e-5, and learning rates of 1.5e-6 for the word embedding and 2e-4 for the other modules. We decay the learning rate by a factor of 0.1 at epochs 30 and 40. The temperature variable of the cosine similarity δ is set to 0.05. The weighting coefficients γ_ortho, γ_comp, and γ_pri are set to 0.1, 1, and 0.25, respectively."
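The reported optimization settings can be sketched in PyTorch. This is a minimal illustration only: `word_embedding` and `other_modules` are hypothetical placeholders standing in for TRIDENT's actual components, which the paper does not specify at this level of detail.

```python
import torch
from torch import nn

# Illustrative placeholder modules; TRIDENT's real architecture is not
# reproduced here, only the reported optimization settings.
word_embedding = nn.Embedding(1000, 300)
other_modules = nn.Linear(300, 300)

# Adam with weight decay 5e-5 and per-module learning rates:
# 1.5e-6 for the word embedding, 2e-4 for the other modules.
optimizer = torch.optim.Adam(
    [
        {"params": word_embedding.parameters(), "lr": 1.5e-6},
        {"params": other_modules.parameters(), "lr": 2e-4},
    ],
    weight_decay=5e-5,
)

# Decay all learning rates by a factor of 0.1 at epochs 30 and 40 (of 50 total).
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30, 40], gamma=0.1
)

def cosine_logits(img_feats: torch.Tensor, txt_feats: torch.Tensor,
                  delta: float = 0.05) -> torch.Tensor:
    """Cosine similarity between feature sets, scaled by temperature delta = 0.05."""
    img = nn.functional.normalize(img_feats, dim=-1)
    txt = nn.functional.normalize(txt_feats, dim=-1)
    return img @ txt.T / delta
```

Per-parameter-group learning rates are the standard PyTorch way to train a word embedding more gently than the rest of the model, and `MultiStepLR` reproduces the "decay by 0.1 at epochs 30 and 40" schedule when stepped once per epoch.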