Exploring Transferable Homogenous Groups for Compositional Zero-Shot Learning
Authors: Zhijie Rao, Jingcai Guo, Miaoge Li, Yang Chen, Mengzhu Wang
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three benchmark datasets validate the effectiveness of our method. Code is available at: https://github.com/zjrao/HGRL. [...] We conduct experiments on three major benchmark datasets, and the results show that the proposed method achieves state-of-the-art performance. |
| Researcher Affiliation | Academia | Zhijie Rao, Jingcai Guo, Miaoge Li, Yang Chen and Mengzhu Wang, Department of COMP/LSGI, The Hong Kong Polytechnic University, Hong Kong SAR |
| Pseudocode | No | The paper describes the methodology in prose and mathematical equations, and includes a figure illustrating the overview of the proposed method (Figure 2), but does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at: https://github.com/zjrao/HGRL. |
| Open Datasets | Yes | We perform experiments on three commonly used datasets including MIT-States [Isola et al., 2015], UT-Zappos [Yu and Grauman, 2014] and C-GQA [Naeem et al., 2021]. |
| Dataset Splits | Yes | MIT-States has 115 states and 245 objects. The number of seen compositions is 1262 and unseen compositions is 400. UT-Zappos is a small footwear dataset with 16 states and 12 objects. There are 83 seen compositions for training and 18 unseen compositions for testing. C-GQA is a challenging dataset containing 413 states and 674 objects. There are 5592 seen compositions and 923 unseen combinations. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. It only mentions the use of a pre-trained CLIP model as a backbone. |
| Software Dependencies | No | The paper mentions using the 'CLIP [Radford et al., 2021] ViT-L/14 model as the backbone' but does not specify any other software dependencies, libraries, or their version numbers, which are essential for reproducibility. |
| Experiment Setup | Yes | The learning rate is 5e-4 for UT-Zappos and 5e-5 for MIT-States and C-GQA. The batch size is 180 for UT-Zappos and 32 for MIT-States and C-GQA. We use Adam optimizer to train the model. The group number of state ks and object ko are set to 3 for UT-Zappos and 5 for MIT-States and C-GQA. The hyper-parameter λ is set to 1.0 for UT-Zappos and 0.1 for MIT-States and C-GQA. |
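For anyone scripting a reproduction, the per-dataset settings reported in the Experiment Setup row can be collected into a small helper. This is only a sketch: `get_config` and its key names are illustrative and do not come from the authors' HGRL repository, though the values match those quoted above.

```python
def get_config(dataset: str) -> dict:
    """Return the training hyperparameters reported for each benchmark.

    Keys: learning rate, batch size, state/object group counts (ks, ko),
    and the loss weight lambda. Names are our own, not from the HGRL code.
    """
    if dataset == "UT-Zappos":
        return {"lr": 5e-4, "batch_size": 180,
                "k_state": 3, "k_object": 3, "lambda": 1.0}
    if dataset in ("MIT-States", "C-GQA"):
        return {"lr": 5e-5, "batch_size": 32,
                "k_state": 5, "k_object": 5, "lambda": 0.1}
    raise ValueError(f"Unknown dataset: {dataset!r}")


# The paper trains with Adam; in PyTorch this would look roughly like:
#   cfg = get_config("UT-Zappos")
#   optimizer = torch.optim.Adam(model.parameters(), lr=cfg["lr"])
```

Note that the paper does not report weight decay, epoch counts, or learning-rate schedules, so those would still have to be recovered from the released code.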