Instance-dependent Early Stopping

Authors: Suqin Yuan, Runqi Lin, Lei Feng, Bo Han, Tongliang Liu

ICLR 2025

Reproducibility checklist (variable, result, and LLM response for each item):
Research Type: Experimental. "Extensive experiments on benchmarks demonstrate that the IES method can reduce backpropagation instances by 10%-50% while maintaining or even slightly improving the test accuracy and transfer learning performance of a model."
Researcher Affiliation: Academia. 1 Sydney AI Centre, The University of Sydney; 2 Southeast University; 3 Hong Kong Baptist University.
Pseudocode: Yes. Algorithm 1: Instance-dependent Early Stopping (IES).
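To illustrate the general idea behind Algorithm 1, here is a minimal, hypothetical sketch of instance-level selection: an instance is dropped from backpropagation once its loss has stayed below a threshold for several consecutive epochs. The exact mastery criterion, thresholds, and bookkeeping in the paper's Algorithm 1 may differ; `select_backprop_instances`, `threshold`, and `patience` are names introduced here for illustration only.

```python
def select_backprop_instances(loss_history, threshold=0.05, patience=2):
    """Return sorted indices of instances that should still be backpropagated.

    loss_history: dict mapping instance index -> list of its per-epoch losses.
    An instance is considered "mastered" (and skipped) once it has at least
    `patience` recorded losses and all of its last `patience` losses fall
    below `threshold`.
    """
    active = []
    for idx, losses in loss_history.items():
        recent = losses[-patience:]
        # Keep the instance if we lack enough history, or if any recent
        # loss is still at or above the threshold.
        if len(losses) < patience or any(l >= threshold for l in recent):
            active.append(idx)
    return sorted(active)


# Example: instance 0 has been "easy" for two epochs and is skipped;
# instances 1 and 2 remain in the backward pass.
history = {0: [0.9, 0.04, 0.03], 1: [0.8, 0.5, 0.4], 2: [0.02]}
print(select_backprop_instances(history))  # -> [1, 2]
```

In a training loop, the returned indices would determine which examples of the current batch contribute to the loss that is actually backpropagated, which is how the 10%-50% reduction in backpropagation instances reported above would be realized.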
Open Source Code: Yes. "Our implementation can be found at https://github.com/tmllab/2025_ICLR_IES."
Open Datasets: Yes. "Our evaluations confirmed the effectiveness of the IES method across multiple datasets. These datasets comprise CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and ImageNet-1k (Deng et al., 2009). ... Caltech-101 (Li et al., 2022) datasets. ... For both tasks, we use the PASCAL VOC datasets (Everingham et al., a;b)."
Dataset Splits: Yes. CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009): 10 classes (CIFAR-10) and 100 classes (CIFAR-100); 50,000 training and 10,000 test images each.
Hardware Specification: No. "This research was undertaken with the assistance of resources from the National Computational Infrastructure (NCI Australia), an NCRIS enabled capability supported by the Australian Government." However, specific hardware details such as GPU/CPU models are not provided.
Software Dependencies: Yes. Framework: PyTorch, version 1.11.0.
Experiment Setup: Yes. Batch size: 64 for CIFAR, 128 for ImageNet-1k. Training epochs: 200 for CIFAR, 150 for ImageNet. Loss function: cross-entropy loss from the nn module. Optimizers: for SGD, momentum=0.9, weight_decay=5e-4; SGD(F) lr=0.001; SGD(L) lr=0.1, scheduler LinearLR(_, start_factor=1, end_factor=0.01, total_iters=150); SGD(M) lr=0.1, scheduler MultiStepLR(_, milestones=[50, 100], gamma=0.1); SGD(E) lr=0.1, scheduler ExponentialLR(_, gamma=0.96); Adam (Kingma, 2014) lr=0.001; AdamW (Loshchilov & Hutter, 2017) lr=0.001, weight_decay=0.01.
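The optimizer and scheduler settings listed above map onto PyTorch (1.11) as in the following configuration sketch. This is an illustration, not the authors' training script: `model` is a placeholder module, and the data loading, epoch loop, and IES selection logic are omitted.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 10)  # placeholder for the actual network
criterion = nn.CrossEntropyLoss()  # loss function from the nn module

# SGD variants: momentum=0.9, weight_decay=5e-4 throughout.
sgd_l = torch.optim.SGD(model.parameters(), lr=0.1,
                        momentum=0.9, weight_decay=5e-4)
sched_l = torch.optim.lr_scheduler.LinearLR(
    sgd_l, start_factor=1.0, end_factor=0.01, total_iters=150)

sgd_m = torch.optim.SGD(model.parameters(), lr=0.1,
                        momentum=0.9, weight_decay=5e-4)
sched_m = torch.optim.lr_scheduler.MultiStepLR(
    sgd_m, milestones=[50, 100], gamma=0.1)

sgd_e = torch.optim.SGD(model.parameters(), lr=0.1,
                        momentum=0.9, weight_decay=5e-4)
sched_e = torch.optim.lr_scheduler.ExponentialLR(sgd_e, gamma=0.96)

# Adaptive optimizers.
adam = torch.optim.Adam(model.parameters(), lr=0.001)
adamw = torch.optim.AdamW(model.parameters(), lr=0.001, weight_decay=0.01)
```

Each scheduler's `step()` would be called once per epoch after the optimizer updates, so e.g. the MultiStepLR variant decays the learning rate by 10x at epochs 50 and 100.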