Open-World Semi-Supervised Learning with Class Semantic Correlations
Authors: Yuxin Fan, Junbiao Cui, Jiye Liang, Jianqing Liang
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct a comprehensive evaluation of our method. The experimental results and detailed analysis demonstrate the superiority of our method. Datasets We conduct experiments on four fine-grained datasets: CUB [Wah et al., 2011], Stanford Cars [Krause et al., 2013], Flowers102 [Nilsback and Zisserman, 2008], and ImageNet-100 [Deng et al., 2009], which contain 200, 196, 102, and 100 classes, respectively. To ensure the fairness of the experiment, we conduct our experiments using the data partitioning method described in [Zheng et al., 2024]. We adopt the same approach to divide the classes into known and unknown, considering 50% of the classes as known and the remaining 50% as unknown. Consequently, we construct the datasets Dl and Du accordingly. We train our method on Dl and Du, and subsequently evaluate its performance on Du. This procedure is applied consistently across all compared methods. Detailed dataset information is provided in Table 1. |
| Researcher Affiliation | Academia | Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the proposed method using mathematical formulations and descriptive text, but it does not include a clearly labeled pseudocode block or algorithm section. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide any links to a code repository. |
| Open Datasets | Yes | Datasets We conduct experiments on four fine-grained datasets: CUB [Wah et al., 2011], Stanford Cars [Krause et al., 2013], Flowers102 [Nilsback and Zisserman, 2008], and ImageNet-100 [Deng et al., 2009], which contain 200, 196, 102, and 100 classes, respectively. |
| Dataset Splits | Yes | We adopt the same approach to divide the classes into known and unknown, considering 50% of the classes as known and the remaining 50% as unknown. Consequently, we construct the datasets Dl and Du accordingly. We train our method on Dl and Du, and subsequently evaluate its performance on Du. |
| Hardware Specification | Yes | All our experiments are conducted on a single NVIDIA 3090 GPU. |
| Software Dependencies | No | The paper mentions using CLIP as a pre-trained backbone and GPT-3 as an LLM, but it does not provide specific version numbers for these or any other software components (e.g., Python, PyTorch versions). |
| Experiment Setup | Yes | We use a batch size of 128 and train for 200 epochs with an initial learning rate of 0.1. We adjust the learning rate using a cosine annealing schedule. The parameters are set as follows: α to 0.35, λ1 to 1, τs to 0.1, and τk to 0.01. τt is initialized to 0.07 and gradually annealed to 0.04 using a cosine schedule during the first 30 epochs of training. We choose GPT-3 [Brown et al., 2020] as our LLM and utilize the prompt templates proposed by [Pratt et al., 2023] to generate textual descriptions, and choose the ViT-H-based CLIP model as the auxiliary VLM. To validate the generalization capability of our method across different datasets, we pretrain the model using [Vaze et al., 2022], uniformly setting λ2 to 0.2 and β to 0.3. |
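The reported split protocol (following [Zheng et al., 2024]) treats 50% of each dataset's classes as known and the rest as unknown, yielding the labeled set Dl and unlabeled set Du. A minimal sketch of such a class partition is below; the function name `split_known_unknown` and the seeded random shuffle are illustrative assumptions — the actual protocol in [Zheng et al., 2024] may select classes deterministically rather than at random.

```python
import random

def split_known_unknown(num_classes: int, known_ratio: float = 0.5, seed: int = 0):
    """Partition class indices into known/unknown sets (50/50 by default).

    Illustrative only: the paper follows the partitioning of
    [Zheng et al., 2024], whose exact selection rule may differ.
    """
    rng = random.Random(seed)
    classes = list(range(num_classes))
    rng.shuffle(classes)
    n_known = int(num_classes * known_ratio)
    return sorted(classes[:n_known]), sorted(classes[n_known:])

# e.g. CUB has 200 classes -> 100 known, 100 unknown
known, unknown = split_known_unknown(200)
```

Samples from known classes would then populate Dl (labeled) and the remainder Du (unlabeled), with evaluation on Du as the report describes.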
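The setup row describes two cosine schedules: the learning rate annealed from 0.1 over 200 epochs, and the temperature τt moved from 0.07 to 0.04 over the first 30 epochs (note this is a decrease, despite the paper's wording). A minimal sketch of both schedules, assuming a standard cosine formula and a minimum learning rate of 0 (the paper does not state one):

```python
import math

def cosine_lr(epoch: int, total_epochs: int = 200,
              base_lr: float = 0.1, min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate: base_lr at epoch 0, min_lr at the end."""
    return min_lr + 0.5 * (base_lr - min_lr) * (
        1 + math.cos(math.pi * epoch / total_epochs))

def tau_t_schedule(epoch: int, warmup_epochs: int = 30,
                   start: float = 0.07, end: float = 0.04) -> float:
    """Temperature tau_t: cosine ramp from 0.07 to 0.04 over the
    first 30 epochs, then held constant (the hold is an assumption)."""
    if epoch >= warmup_epochs:
        return end
    return end + 0.5 * (start - end) * (
        1 + math.cos(math.pi * epoch / warmup_epochs))
```

In practice the learning-rate schedule would likely be realized via a framework utility such as PyTorch's `CosineAnnealingLR` rather than hand-rolled, but the closed form above matches what the setup describes.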