Info-Coevolution: An Efficient Framework for Data Model Coevolution

Authors: Ziheng Qin, Hailun Xu, Wei Chee Yew, Qi Jia, Yang Luo, Kanchan Sarkar, Danhui Guan, Kai Wang, Yang You

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method on ImageNet-1K (Deng et al., 2009), CIFAR-10/100 (Krizhevsky et al., a;b), Stanford Cars, Food-101 (Bossard et al., 2014), and SVHN (Netzer et al., 2011) under different annotation ratios and settings. ImageNet-1K results: we show our improvement in data/annotation efficiency in Fig. 5 and Tab. 1. Ablation of components: as our algorithm fuses model predictions with KNN predictions under dynamic rechecking, we ablate their corresponding influence on performance in Tab. 4.
Researcher Affiliation | Collaboration | 1 National University of Singapore, School of Computing, 11 Research Link, Singapore; 2 ByteDance Ltd., Singapore.
Pseudocode | No | The paper describes the algorithm steps in Section 3.5 ("The Algorithm") under "Selective Annotation" and "Dynamic Rechecking" in natural language, but does not present them in a structured pseudocode block or a clearly labeled algorithm figure.
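Since the paper presents these steps only in prose, a minimal pure-Python sketch of the two mechanisms might look as follows. The mixing weight `alpha`, the confidence `threshold`, and the rechecking `margin` are hypothetical illustration parameters, not values from the paper, and this is not the authors' implementation:

```python
def fuse_predictions(model_probs, knn_probs, alpha=0.5):
    """Convex combination of model softmax outputs and KNN class
    estimates. alpha is a hypothetical mixing weight; the paper's
    actual fusion rule may differ."""
    return [[alpha * m + (1 - alpha) * k for m, k in zip(mr, kr)]
            for mr, kr in zip(model_probs, knn_probs)]

def select_for_annotation(fused, threshold=0.7):
    """Selective annotation (sketch): request human labels only for
    samples whose fused confidence falls below a threshold."""
    return [i for i, row in enumerate(fused) if max(row) < threshold]

def dynamic_recheck(fused, labels, margin=0.8):
    """Dynamic rechecking (sketch): flag already-labeled samples that
    the fused predictor confidently disagrees with, for re-inspection."""
    flagged = []
    for i, (row, y) in enumerate(zip(fused, labels)):
        pred = max(range(len(row)), key=row.__getitem__)
        if pred != y and row[pred] > margin:
            flagged.append(i)
    return flagged

# Toy run: 4 samples, 3 classes.
model_probs = [[0.90, 0.05, 0.05], [0.40, 0.35, 0.25],
               [0.20, 0.70, 0.10], [0.34, 0.33, 0.33]]
knn_probs = [[0.80, 0.10, 0.10], [0.50, 0.30, 0.20],
             [0.10, 0.80, 0.10], [0.40, 0.30, 0.30]]
fused = fuse_predictions(model_probs, knn_probs)
to_annotate = select_for_annotation(fused)   # low-confidence samples
suspect = dynamic_recheck(fused, [1, 0, 1, 0])
```

In this toy run, the two ambiguous samples (indices 1 and 3) are routed to annotation, while sample 0, whose fused prediction confidently contradicts its given label, is flagged for rechecking.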
Open Source Code | Yes | Code is available at https://github.com/NUS-HPC-AILab/Info-Coevolution/.
Open Datasets | Yes | We evaluate our method on ImageNet-1K (Deng et al., 2009), CIFAR-10/100 (Krizhevsky et al., a;b), Stanford Cars, Food-101 (Bossard et al., 2014), and SVHN (Netzer et al., 2011) under different annotation ratios and settings. We further extend the training data with data from CC3M, CC12M, SBU, Visual Genome, COCO, and LAION-400M to study the effect of scaling unlabeled data.
Dataset Splits | Yes | On the 10% data setting of ImageNet, our selective annotation can increase accuracy by 1.3% compared to the random baseline. With only 68% annotated samples from ImageNet-1K, we can achieve lossless performance (85.6% Acc), surpassing the 85.5% Acc of Semi-ViT with 80% labeled data and 20% unlabeled data. ... Semi-ViT trained with 50% ImageNet-1K annotations selected by Info-Coevolution can achieve an almost lossless result (85.5%).
Hardware Specification | Yes | Our results are trained on a single node of 8 NVIDIA A100-SXM4-80G.
Software Dependencies | No | The paper states, "The experiments on ImageNet-1K follow Semi-ViT (Cai et al., 2022), using a ViT (Dosovitskiy et al., 2021) model with an MAE (He et al., 2021) pretrained backbone to conduct supervised and semi-supervised training. All other details can be found in the Appendix." However, it does not provide specific software versions for the libraries or frameworks used in its own implementation, referring instead to the settings of a cited work.
Experiment Setup | Yes | For supervised finetuning, we train the model with batch size 512 and learning rate 0.001 for 50 epochs, with all augmentations the same as in Semi-ViT.
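For concreteness, the reported hyperparameters can be combined with the standard ImageNet-1K train-split size (1,281,167 images, which the excerpt does not state) to estimate the optimizer-step budget at a given annotation ratio. This is a back-of-the-envelope sketch, not a figure from the paper:

```python
import math

# Fine-tuning hyperparameters reported in the paper's text.
BATCH_SIZE = 512
LEARNING_RATE = 1e-3
EPOCHS = 50

# Standard ImageNet-1K training-set size (assumed; not in the excerpt).
IMAGENET_TRAIN = 1_281_167

def total_steps(annotation_ratio):
    """Optimizer steps when fine-tuning only on the annotated subset,
    assuming one pass over the labeled data per epoch."""
    n_labeled = int(IMAGENET_TRAIN * annotation_ratio)
    return EPOCHS * math.ceil(n_labeled / BATCH_SIZE)
```

At the 68% annotation ratio the paper highlights, this works out to roughly 85k optimizer steps, versus about 125k for full supervision, which is where the annotation-efficiency savings show up in wall-clock terms.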