Strengthen Out-of-Distribution Detection Capability with Progressive Self-Knowledge Distillation
Authors: Yang Yang, Haonan Xu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results from multiple OOD scenarios verify the effectiveness and general applicability of PSKD. |
| Researcher Affiliation | Academia | Nanjing University of Science and Technology. Correspondence to: Yang Yang <EMAIL>. |
| Pseudocode | Yes | The pseudo code of PSKD is available in Appendix A. |
| Open Source Code | Yes | The code for this work is available at https://github.com/njustkmg/ICML25-PSKD. |
| Open Datasets | Yes | Datasets. For small-scale experimental setups, CIFAR-10/100 (Krizhevsky, 2009) is adopted as the ID dataset. ... Large-scale experiments are conducted on ImageNet-200, a 200-class subset of ImageNet-1K (Deng et al., 2009), as the ID dataset. |
| Dataset Splits | Yes | For validation, 1000 samples from the official test set are used, while the remainder are for testing. The model's self-selection mechanism is applied after the validation step at the end of each epoch. |
| Hardware Specification | Yes | All experiments are conducted using Python 3.8.19 and PyTorch 2.0.1 on a workstation equipped with dual 2.20 GHz CPUs, 384 GB of RAM, and six NVIDIA RTX 4090 GPUs. |
| Software Dependencies | Yes | All experiments are conducted using Python 3.8.19 and PyTorch 2.0.1 on a workstation equipped with dual 2.20 GHz CPUs, 384 GB of RAM, and six NVIDIA RTX 4090 GPUs. |
| Experiment Setup | Yes | Models are trained with stochastic gradient descent (SGD) for 100 epochs, using a learning rate of 0.1 with cosine annealing decay schedule (Loshchilov & Hutter, 2017), momentum of 0.9, and weight decay of 5 × 10⁻⁴. The batch size is set to 128 for CIFAR-10/100 and 256 for ImageNet-200. |
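The learning-rate schedule quoted in the last row can be sketched in plain Python. This is a minimal sketch, not the authors' code: a minimum learning rate of 0 and per-epoch stepping are assumptions (the paper excerpt only states a base rate of 0.1 with cosine annealing over 100 epochs; PyTorch's `CosineAnnealingLR` likewise defaults its floor, `eta_min`, to 0).

```python
import math

# Values quoted in the experiment setup above.
BASE_LR = 0.1
EPOCHS = 100
# Assumption: the schedule decays to 0 (not stated in the excerpt).
LR_MIN = 0.0

def cosine_annealed_lr(epoch: int,
                       base_lr: float = BASE_LR,
                       lr_min: float = LR_MIN,
                       total_epochs: int = EPOCHS) -> float:
    """Learning rate at a given epoch under cosine annealing
    (Loshchilov & Hutter, 2017, without warm restarts)."""
    return lr_min + 0.5 * (base_lr - lr_min) * (
        1.0 + math.cos(math.pi * epoch / total_epochs))

# The rate starts at 0.1, reaches half the base rate at the midpoint,
# and approaches 0 by the final epoch.
schedule = [cosine_annealed_lr(e) for e in range(EPOCHS)]
```

With SGD this would typically be paired with momentum 0.9 and weight decay 5 × 10⁻⁴, as the setup row specifies.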