Open-World Semi-Supervised Learning with Class Semantic Correlations

Authors: Yuxin Fan, Junbiao Cui, Jiye Liang, Jianqing Liang

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct a comprehensive evaluation of our method. The experimental results and detailed analysis demonstrate the superiority of our method. Datasets: We conduct experiments on four fine-grained datasets: CUB [Wah et al., 2011], Stanford Cars [Krause et al., 2013], Flowers102 [Nilsback and Zisserman, 2008], and ImageNet-100 [Deng et al., 2009], which contain 200, 196, 102, and 100 classes, respectively. To ensure the fairness of the experiment, we conduct our experiments using the data partitioning method described in [Zheng et al., 2024]. We adopt the same approach to divide the classes into known and unknown, considering 50% of the classes as known and the remaining 50% as unknown. Consequently, we construct the datasets Dl and Du accordingly. We train our method on Dl and Du, and subsequently evaluate its performance on Du. This procedure is applied consistently across all compared methods. Detailed dataset information is provided in Table 1.
Researcher Affiliation | Academia | Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China. EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the proposed method using mathematical formulations and descriptive text, but it does not include a clearly labeled pseudocode block or algorithm section.
Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide any links to a code repository.
Open Datasets | Yes | Datasets: We conduct experiments on four fine-grained datasets: CUB [Wah et al., 2011], Stanford Cars [Krause et al., 2013], Flowers102 [Nilsback and Zisserman, 2008], and ImageNet-100 [Deng et al., 2009], which contain 200, 196, 102, and 100 classes, respectively.
Dataset Splits | Yes | We adopt the same approach to divide the classes into known and unknown, considering 50% of the classes as known and the remaining 50% as unknown. Consequently, we construct the datasets Dl and Du accordingly. We train our method on Dl and Du, and subsequently evaluate its performance on Du.
Hardware Specification | Yes | All our experiments are conducted on a single NVIDIA 3090 GPU.
Software Dependencies | No | The paper mentions using CLIP as a pre-trained backbone and GPT-3 as an LLM, but it does not provide specific version numbers for these or any other software components (e.g., Python or PyTorch versions).
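The 50/50 known/unknown class split quoted above can be sketched as follows. This is a minimal illustration only: the seeded shuffle and the `split_classes` helper are assumptions, since the exact partitioning protocol follows [Zheng et al., 2024].

```python
import random

def split_classes(num_classes, known_ratio=0.5, seed=0):
    """Randomly partition class indices into known and unknown sets,
    as in open-world SSL setups (hypothetical helper, not the paper's code)."""
    rng = random.Random(seed)
    classes = list(range(num_classes))
    rng.shuffle(classes)
    num_known = int(num_classes * known_ratio)
    return sorted(classes[:num_known]), sorted(classes[num_known:])

# e.g. CUB has 200 classes -> 100 known, 100 unknown
known, unknown = split_classes(200)
```

Labeled data from the known classes would then form Dl, while Du holds the unlabeled samples drawn from both known and unknown classes.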
Experiment Setup | Yes | We use a batch size of 128 and train for 200 epochs with an initial learning rate of 0.1. We adjust the learning rate using a cosine annealing schedule. The parameters are set as follows: α to 0.35, λ1 to 1, τs to 0.1, and τk to 0.01. τt is initialized to 0.07 and gradually decreased to 0.04 using a cosine annealing schedule during the first 30 epochs of training. We choose GPT-3 [Brown et al., 2020] as our LLM and utilize the prompt templates proposed by [Pratt et al., 2023] to generate textual descriptions, and choose the ViT-H based CLIP model as the auxiliary VLM. To validate the generalization capability of our method across different datasets, we pretrain the model following [Vaze et al., 2022], uniformly setting λ2 to 0.2 and β to 0.3.
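The two cosine schedules in this setup (learning rate annealed from 0.1 over 200 epochs; τt moving from 0.07 to 0.04 during the first 30 epochs and then held fixed) can be sketched with a generic cosine interpolation. The per-epoch granularity and the final learning-rate value of 0 are assumptions; the paper only specifies the initial values and schedule type.

```python
import math

def cosine_anneal(start, end, step, total_steps):
    """Cosine interpolation from `start` to `end` over `total_steps`;
    values are clamped to `end` once `total_steps` is reached."""
    step = min(step, total_steps)
    return end + 0.5 * (start - end) * (1 + math.cos(math.pi * step / total_steps))

# Learning rate: 0.1 annealed over 200 epochs (final value of 0 is assumed).
lrs = [cosine_anneal(0.1, 0.0, epoch, 200) for epoch in range(200)]

# Temperature tau_t: 0.07 -> 0.04 over the first 30 epochs, then fixed.
taus = [cosine_anneal(0.07, 0.04, epoch, 30) for epoch in range(200)]
```

In a real training loop these values would be applied per epoch, e.g. by setting the optimizer's learning rate before each epoch.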