Enabling Visual Foundation Models to Teach Compact Students via Mixture of Distillation

Authors: Xinye Yang, Shang Wang, Li Luking, Yipeng Chen

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on various classification, detection, and medical segmentation tasks validate the effectiveness of our approach with other models."
Researcher Affiliation | Academia | "Xinye Yang (1), Shang Wang (2), Li Luking and Yipeng Chen (3). (1) Newcastle University, (2) Independent Researcher, (3) University of Science and Technology Beijing"
Pseudocode | No | The paper describes the methodology in prose and with a diagram in Figure 1, but does not include a structured pseudocode or algorithm block.
Open Source Code | No | The paper neither states that source code will be released nor links to a code repository.
Open Datasets | Yes | "We assess our framework on fine-grained classification datasets (e.g., Stanford Cars [Krause et al., 2013], Oxford Pets [Parkhi et al., 2012], CIFAR-100 [Alex, 2009] and Food-101 [Bossard et al., 2014]), large-scale dataset ImageNet-1K [Deng et al., 2009]. For detection on MS-COCO dataset [Caesar et al., 2018]... This ongoing evaluation includes three 2D medical datasets [Fang et al., 2022], ISIC2018, CVC-ClinicDB, and Kvasir, and two 3D medical datasets [Wang et al., 2021], Synapse and ACDC."
Dataset Splits | Yes | "The results on the ImageNet-1K [Deng et al., 2009] dataset, as shown in Table ??. In the annotation-free setting... Table 3 and Table 4 demonstrate the remarkable gains achieved by our method with Grounding DINO-L as teacher across various object detection settings... on COCO val set."
Hardware Specification | No | The paper does not provide specific hardware details (GPU models, CPU types, or memory amounts) used for its experiments. It mentions that "Detailed implementations are in the Appendix," but the appendix is not included.
Software Dependencies | No | The paper does not list software dependencies with version numbers. It states that "Detailed implementations are in the Appendix," but this information is not available in the main body of the paper.
Experiment Setup | Yes | Table 8 presents an ablation study on KD configurations: (1) the ViT-L/14 teacher achieves the highest accuracy at 79.82%, followed by ViT-B/32 at 79.47% and ResNet50 at 78.95%; (2) a loss weight of 1 produces the best result at 79.82%, while lower or higher weights slightly reduce performance; (3) a temperature of 4 yields the highest accuracy of 79.82%, indicating this level of logit scaling is most effective for knowledge distillation.
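The paper's exact loss is not reproduced here, but the ablated hyperparameters (a loss weight and a distillation temperature applied to logits) map onto standard Hinton-style logit distillation. A minimal pure-Python sketch, assuming that formulation; the function names and the specific logit values are illustrative only:

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over temperature-scaled logits (numerically stabilized)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=4.0, weight=1.0):
    """Temperature-scaled KL divergence KL(teacher || student),
    multiplied by T^2 (to keep gradient magnitudes comparable across
    temperatures) and by the ablated loss weight."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    return weight * temperature ** 2 * kl

# Identical logits give zero distillation loss; mismatched logits are penalized.
print(kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # ~0.0
print(kd_loss([3.0, 0.0, 0.0], [0.0, 0.0, 3.0]))  # > 0
```

Under this reading, the ablation in Table 8 sweeps `temperature` (best at 4) and `weight` (best at 1) while this KD term is added to the usual cross-entropy objective on ground-truth labels.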