Can Students Beyond the Teacher? Distilling Knowledge from Teacher’s Bias
Authors: Jianhua Zhang, Yi Gao, Ruyu Liu, Xu Cheng, Houxiang Zhang, Shengyong Chen
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that our strategy, as a plug-and-play module, is versatile across various mainstream KD frameworks. We conducted experiments on three classification datasets, CIFAR-10 (Krizhevsky, Hinton et al. 2009), CIFAR-100 (Krizhevsky, Hinton et al. 2009), and ImageNet-1K (Russakovsky et al. 2015), as well as on an object detection dataset, MS-COCO (Lin et al. 2014). |
| Researcher Affiliation | Academia | 1School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, 300384, China; 2School of Information Science and Technology, Hangzhou Normal University, Hangzhou, 311121, China; 3Department of Technology, Management and Economics, Technical University of Denmark, Lyngby, Denmark; 4Norwegian University of Science and Technology |
| Pseudocode | No | The paper describes the methodology using natural language, mathematical equations, and diagrams (Figure 1, 2, 3), but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code https://github.com/smartyige/BTKD |
| Open Datasets | Yes | We conducted experiments on three classification datasets, CIFAR-10 (Krizhevsky, Hinton et al. 2009), CIFAR-100 (Krizhevsky, Hinton et al. 2009), and ImageNet-1K (Russakovsky et al. 2015), as well as on an object detection dataset, MS-COCO (Lin et al. 2014). |
| Dataset Splits | Yes | We conducted experiments on three classification datasets, CIFAR-10 (Krizhevsky, Hinton et al. 2009), CIFAR-100 (Krizhevsky, Hinton et al. 2009), and ImageNet-1K (Russakovsky et al. 2015), as well as on an object detection dataset, MS-COCO (Lin et al. 2014). Top-1 and top-5 are standard metrics for classification accuracy on the validation set. Results on MS-COCO are based on Faster R-CNN (Ren et al. 2015) with FPN (Lin et al. 2017); AP is evaluated on val2017. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU, CPU models, memory) used to conduct the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) that would be needed to replicate the experiments. |
| Experiment Setup | No | We define the dynamic adjustment coefficient as γ = e / E, where e represents the current training iteration and E denotes the total number of training iterations. To shorten the training cycles, we employed a method for dynamically adjusting the student's learning focus. |
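The only setup detail the paper quotes is the linearly increasing coefficient γ = e / E. A minimal sketch of how such a schedule could be wired into a distillation loss is shown below; the `blended_loss` weighting is a hypothetical illustration (the paper does not specify how γ enters its objective), and the function names are our own.

```python
def dynamic_gamma(e: int, E: int) -> float:
    """Dynamic adjustment coefficient from the paper: gamma = e / E, in [0, 1].

    e: current training iteration; E: total number of training iterations.
    """
    return e / E


def blended_loss(task_loss: float, distill_loss: float, e: int, E: int) -> float:
    """Hypothetical use of gamma: shift the student's focus from the
    teacher's distillation signal toward the task loss as training progresses.
    """
    gamma = dynamic_gamma(e, E)
    return gamma * task_loss + (1.0 - gamma) * distill_loss
```

At e = 0 the student is driven entirely by the distillation term; by e = E it is driven entirely by the task loss, matching the "dynamically adjusting the student's learning focus" description.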