Strengthen Out-of-Distribution Detection Capability with Progressive Self-Knowledge Distillation

Authors: Yang Yang, Haonan Xu

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results from multiple OOD scenarios verify the effectiveness and general applicability of PSKD."
Researcher Affiliation | Academia | "Nanjing University of Science and Technology. Correspondence to: Yang Yang <EMAIL>."
Pseudocode | Yes | "The pseudo code of PSKD is available in Appendix A."
Open Source Code | Yes | "The code for this work is available at https://github.com/njustkmg/ICML25-PSKD"
Open Datasets | Yes | "Datasets. For small-scale experimental setups, CIFAR10/100 (Krizhevsky, 2009) is adopted as the ID dataset. ... Large-scale experiments are conducted on ImageNet-200, a 200-class subset of ImageNet-1K (Deng et al., 2009), as the ID dataset."
Dataset Splits | Yes | "For validation, 1000 samples from the official test set are used, while the remainder are for testing. The model's self-selection mechanism is applied after the validation step at the end of each epoch."
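The reported split (1000 validation samples held out from the official test set, the remainder kept for testing) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the random seed and the use of shuffling are assumptions, and `official_test_set` is a hypothetical stand-in for e.g. the 10,000-sample CIFAR-10 test split.

```python
import random

# Hypothetical stand-in for the official test set (CIFAR-10 has 10,000 test samples).
official_test_set = list(range(10_000))

rng = random.Random(0)  # fixed seed for reproducibility; the paper's seed is not stated
indices = list(range(len(official_test_set)))
rng.shuffle(indices)

# 1000 samples for validation, the remainder for testing, per the reported split.
val_indices = indices[:1000]
test_indices = indices[1000:]

val_set = [official_test_set[i] for i in val_indices]
test_set = [official_test_set[i] for i in test_indices]
```

The two index lists are disjoint and together cover the full official test set, so no sample is used for both validation and testing.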
Hardware Specification | Yes | "All experiments are conducted using Python 3.8.19 and PyTorch 2.0.1 on a workstation equipped with dual 2.20 GHz CPUs, 384 GB of RAM, and six NVIDIA RTX 4090 GPUs."
Software Dependencies | Yes | "All experiments are conducted using Python 3.8.19 and PyTorch 2.0.1 on a workstation equipped with dual 2.20 GHz CPUs, 384 GB of RAM, and six NVIDIA RTX 4090 GPUs."
Experiment Setup | Yes | "Models are trained with stochastic gradient descent (SGD) for 100 epochs, using a learning rate of 0.1 with cosine annealing decay schedule (Loshchilov & Hutter, 2017), momentum of 0.9, and weight decay of 5 × 10⁻⁴. The batch size is set to 128 for CIFAR-10/100 and 256 for ImageNet-200."
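The optimizer and schedule above can be sketched in PyTorch (the paper's reported framework). This is a minimal reconstruction from the stated hyperparameters only; the `nn.Linear` model is a hypothetical placeholder for the paper's backbone, and the empty epoch body stands in for the actual training loop.

```python
import torch
from torch import nn
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

# Placeholder model; the paper's backbone is not specified in this excerpt.
model = nn.Linear(32, 10)

EPOCHS = 100  # per the reported setup

optimizer = SGD(
    model.parameters(),
    lr=0.1,            # initial learning rate from the paper
    momentum=0.9,
    weight_decay=5e-4,
)
# Cosine annealing decay over the full run (Loshchilov & Hutter, 2017).
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    # ... one training epoch over the ID data would go here, at batch
    # size 128 (CIFAR-10/100) or 256 (ImageNet-200) ...
    optimizer.step()   # placeholder step so the scheduler can advance
    scheduler.step()   # decay the learning rate once per epoch

final_lr = optimizer.param_groups[0]["lr"]
```

With `T_max=EPOCHS` and the default `eta_min=0`, the learning rate follows a half cosine from 0.1 down to (approximately) zero by the final epoch.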