Hybrid Data-Free Knowledge Distillation
Authors: Jialiang Tang, Shuo Chen, Chen Gong
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Intensive experiments across multiple benchmarks demonstrate that our HiDFD can achieve state-of-the-art performance using 120 times less collected data than existing methods. |
| Researcher Affiliation | Academia | 1School of Computer Science and Engineering, Nanjing University of Science and Technology, China 2Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, China 3Jiangsu Key Laboratory of Image and Video Understanding for Social Security, China 4Center for Advanced Intelligence Project, RIKEN, Japan 5Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China |
| Pseudocode | Yes | The whole algorithm of our proposed HiDFD is given in Appendix. |
| Open Source Code | Yes | Code https://github.com/tangjialiang97/HiDFD |
| Open Datasets | Yes | Original Datasets. We evaluate the effectiveness of our HiDFD on popular datasets, including CIFAR (Krizhevsky 2009), CINIC (Darlow et al. 2018), and Tiny ImageNet (Le and Yang 2015), which are widely used by existing DFKD methods (Chen et al. 2019, 2021b). Additionally, we also conduct experiments on the large-scale ImageNet (Deng et al. 2009) and the practical medical image dataset HAM (Tschandl, Rosendahl, and Kittler 2018), which are challenging for existing DFKD methods. Collected Datasets. When using CIFAR and CINIC as the original datasets, we search for examples from ImageNet. With Tiny ImageNet and ImageNet as the original datasets, we utilize WebVision (Li et al. 2017) as our source of collected data. Moreover, we collect examples from ISIC (Codella et al. 2018) when using HAM as the original dataset. |
| Dataset Splits | Yes | Here, we define the ratio between the collected data Dc and original data Do as ρ = |Dc|/|Do|. We construct small (ρ=0.1) and moderate (ρ=1.0) collected data for the experiments of collection-based DFKD methods. We adopt a moderate inflation factor of N = |Ds|/|Dc| and further details are available in Extended Experiments. We follow (Chen et al. 2021b) and sample a part of examples from the corresponding dataset as collected data Dc. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions optimizers like SGD and Adam, but does not provide specific software dependencies with version numbers, such as Python version, or specific deep learning framework versions (e.g., PyTorch, TensorFlow). |
| Experiment Setup | Yes | All student networks in our HiDFD employ SGD with weight decay as 5×10⁻⁴ and momentum as 0.9 as the optimizer. The student networks are trained over 240 epochs with a learning rate of 0.05, which is sequentially divided by 10 at the 150th, 180th, and 210th epochs. Meanwhile, the generator and discriminator in GAN utilize Adam for optimization with learning rates 1×10⁻⁴ and 4×10⁻⁴, respectively, and both of them are trained over 500 epochs. Additionally, the hyper-parameters in Eq. (13) are configured as λd = 0.1 and λg = 0.1. |
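The Dataset Splits row defines two bookkeeping quantities: the ratio ρ = |Dc|/|Do| between collected and original data, and the inflation factor N = |Ds|/|Dc|. A minimal sketch of how the small (ρ = 0.1) and moderate (ρ = 1.0) collected sets could be sized; the function name is illustrative and not from the HiDFD codebase:

```python
def collected_size(original_size: int, rho: float) -> int:
    """Number of collected examples |Dc| for a given ratio rho = |Dc|/|Do|.

    Illustrative helper, not part of the paper's released code.
    """
    return int(rho * original_size)

# CIFAR-10 has 50,000 training images; the paper's small and moderate
# settings use rho = 0.1 and rho = 1.0 respectively.
small = collected_size(50_000, 0.1)     # 5,000 collected examples
moderate = collected_size(50_000, 1.0)  # 50,000 collected examples
```

The inflation factor N would then scale |Dc| up to the synthesized set |Ds| in the same way; the paper defers the exact value of N to its Extended Experiments.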
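The Experiment Setup row fully specifies the training schedules, which makes them easy to encode. A framework-agnostic Python sketch of the reported hyper-parameters and the student's step learning-rate schedule; the dictionary and function names are assumptions for illustration, not identifiers from the HiDFD code:

```python
# Hyper-parameters as reported in the paper's experiment setup.
STUDENT_CFG = {
    "optimizer": "SGD",
    "lr": 0.05,
    "weight_decay": 5e-4,
    "momentum": 0.9,
    "epochs": 240,
    "lr_milestones": (150, 180, 210),  # lr divided by 10 at each milestone
}

GAN_CFG = {
    "optimizer": "Adam",
    "generator_lr": 1e-4,
    "discriminator_lr": 4e-4,
    "epochs": 500,
}

LOSS_WEIGHTS = {"lambda_d": 0.1, "lambda_g": 0.1}  # Eq. (13) weights


def student_lr(epoch: int) -> float:
    """Student learning rate at a given epoch under the step schedule:
    start at 0.05, divide by 10 at epochs 150, 180, and 210."""
    drops = sum(1 for m in STUDENT_CFG["lr_milestones"] if epoch >= m)
    return STUDENT_CFG["lr"] * 0.1 ** drops
```

For example, `student_lr(0)` is 0.05, `student_lr(160)` has dropped once to 0.005, and by the final epochs the rate has been divided by 10 three times.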