Hybrid Data-Free Knowledge Distillation

Authors: Jialiang Tang, Shuo Chen, Chen Gong

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Intensive experiments across multiple benchmarks demonstrate that our HiDFD can achieve state-of-the-art performance using 120 times less collected data than existing methods.
Researcher Affiliation | Academia | 1. School of Computer Science and Engineering, Nanjing University of Science and Technology, China; 2. Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, China; 3. Jiangsu Key Laboratory of Image and Video Understanding for Social Security, China; 4. Center for Advanced Intelligence Project, RIKEN, Japan; 5. Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China
Pseudocode | Yes | The whole algorithm of our proposed HiDFD is given in Appendix.
Open Source Code | Yes | https://github.com/tangjialiang97/HiDFD
Open Datasets | Yes | Original Datasets. We evaluate the effectiveness of our HiDFD on popular datasets, including CIFAR (Krizhevsky 2009), CINIC (Darlow et al. 2018), and Tiny-ImageNet (Le and Yang 2015), which are widely used by existing DFKD methods (Chen et al. 2019, 2021b). Additionally, we also conduct experiments on the large-scale ImageNet (Deng et al. 2009) and the practical medical image dataset HAM (Tschandl, Rosendahl, and Kittler 2018), which are challenging for existing DFKD methods. Collected Datasets. When using CIFAR and CINIC as the original datasets, we search for examples from ImageNet. With Tiny-ImageNet and ImageNet as the original datasets, we utilize WebVision (Li et al. 2017) as our source of collected data. Moreover, we collect examples from ISIC (Codella et al. 2018) when using HAM as the original dataset.
Dataset Splits | Yes | Here, we define the ratio between the collected data Dc and original data Do as ρ = |Dc| / |Do|. We construct small (ρ = 0.1) and moderate (ρ = 1.0) collected data for the experiments of collection-based DFKD methods. We adopt a moderate inflation factor of N = |Ds| / |Dc|, and further details are available in Extended Experiments. We follow (Chen et al. 2021b) and sample a part of examples from the corresponding dataset as collected data Dc.
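The split construction above can be sketched in a few lines: given the original dataset Do and a large web pool, sample a collected set Dc whose size satisfies the target ratio ρ = |Dc| / |Do|. This is a minimal illustration with toy data; the function name and the uniform-sampling strategy are assumptions, not the paper's exact procedure.

```python
import random

def build_collected_data(original, pool, rho, seed=0):
    """Sample a collected set D_c from a web-data pool so that
    rho = |D_c| / |D_o| hits the target ratio.
    (Hypothetical helper; the paper draws from ImageNet, WebVision,
    or ISIC depending on the original dataset.)"""
    n_collected = int(rho * len(original))
    rng = random.Random(seed)  # fixed seed for a reproducible subset
    return rng.sample(pool, n_collected)

# Toy illustration: a 50k-example "original" set and a large web pool.
original = list(range(50_000))
pool = list(range(1_000_000))
small = build_collected_data(original, pool, rho=0.1)     # |D_c| = 5,000
moderate = build_collected_data(original, pool, rho=1.0)  # |D_c| = 50,000
```

With CIFAR-scale data (|Do| = 50,000), the small and moderate settings would yield 5,000 and 50,000 collected examples, respectively.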
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | The paper mentions optimizers like SGD and Adam, but does not provide specific software dependencies with version numbers, such as a Python version or deep learning framework versions (e.g., PyTorch, TensorFlow).
Experiment Setup | Yes | All student networks in our HiDFD employ SGD with a weight decay of 5×10^-4 and momentum of 0.9 as the optimizer. The student networks are trained over 240 epochs with a learning rate of 0.05, which is sequentially divided by 10 at the 150th, 180th, and 210th epochs. Meanwhile, the generator and discriminator in the GAN utilize Adam for optimization with learning rates of 1×10^-4 and 4×10^-4, respectively, and both of them are trained over 500 epochs. Additionally, the hyper-parameters in Eq. (13) are configured as λd = 0.1 and λg = 0.1.
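The student's step-decay schedule (start at 0.05, divide by 10 at epochs 150, 180, and 210) can be written as a small pure-Python function. This is a sketch of the schedule only; in a framework such as PyTorch the equivalent would be a multi-step LR scheduler attached to the SGD optimizer, but no framework is named in the paper, so the function below assumes nothing beyond the stated numbers.

```python
def student_lr(epoch, base_lr=0.05, milestones=(150, 180, 210), gamma=0.1):
    """Learning rate for the student at a given epoch (0-indexed):
    start at base_lr and multiply by gamma at each milestone epoch,
    matching the schedule quoted above."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# Epochs 0-149 train at 0.05, 150-179 at 0.005,
# 180-209 at 0.0005, and 210-239 at 0.00005.
```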