Towards Unbiased Calibration using Meta-Regularization

Authors: Cheng Wang, Jacek Golebiowski

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate the effectiveness of the proposed approach in regularizing neural networks towards improved and unbiased calibration on three computer vision datasets. We empirically demonstrate that: (a) learning sample-wise γ as continuous variables can effectively improve calibration; (b) SECE smoothly optimizes γ-Net towards unbiased and robust calibration with respect to the binning schemes; and (c) the combination of γ-Net and SECE achieves the best calibration performance across various calibration metrics while retaining very competitive predictive performance as compared to multiple recently proposed methods.
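The paper's SECE objective is not reproduced here, but the idea of a smooth, binning-free calibration error can be illustrated with a generic kernel-smoothed estimate: accuracy at each confidence level is estimated with a Gaussian kernel (Nadaraya-Watson style) instead of hard bins, and the mean gap between confidence and smoothed accuracy is reported. This is a minimal sketch under that assumption, not the authors' implementation; the function name and the toy inputs are illustrative.

```python
import math

def kernel_smoothed_ece(confidences, correct, bandwidth=0.01):
    """Binning-free calibration error sketch (NOT the paper's exact SECE):
    for each sample, estimate accuracy at its confidence level via a
    Gaussian-kernel weighted average of correctness, then average the
    absolute confidence/accuracy gaps."""
    def k(u):
        # Gaussian kernel; the normalizing constant cancels in the ratio below.
        return math.exp(-0.5 * (u / bandwidth) ** 2)

    gaps = []
    for c in confidences:
        weights = [k(c - ci) for ci in confidences]
        # Kernel-weighted accuracy estimate at confidence level c.
        acc = sum(w * y for w, y in zip(weights, correct)) / sum(weights)
        gaps.append(abs(c - acc))
    return sum(gaps) / len(gaps)
```

Unlike binned ECE, this estimate varies smoothly with the predictions, which is what makes it usable as a differentiable training signal and robust to the choice of binning scheme.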
Researcher Affiliation Industry Cheng Wang, Amazon, Berlin, Germany; Jacek Golebiowski, Amazon, Berlin, Germany
Pseudocode Yes Algorithm 1 in Appendix describes the learning procedures.
Open Source Code No We implemented our methods by adapting and extending the code from (Bohdal et al., 2021) with PyTorch (Paszke et al., 2019).
Open Datasets Yes We conducted our experiments on CIFAR-10 and CIFAR-100 (in (Bohdal et al., 2021)) as well as Tiny-ImageNet (Ya Le, 2015).
Dataset Splits Yes For meta-learning, we split the training set into 8:1:1 as training/validation/meta-validation, keeping the original test sets untouched.
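The 8:1:1 split can be sketched with a seeded shuffle over training indices; the function name and the CIFAR-sized example (50,000 training samples) are illustrative, not taken from the authors' code.

```python
import random

def split_train(indices, seed=0):
    """Split training indices 8:1:1 into train / val / meta-val subsets.
    The original test set is never touched by this split."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    idx = list(indices)
    rng.shuffle(idx)
    n = len(idx)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

# Example: CIFAR-style training set of 50,000 samples.
train, val, meta_val = split_train(range(50000))
```

In practice the three index lists would be wrapped in dataset subsets (e.g. `torch.utils.data.Subset`) before building the data loaders.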
Hardware Specification No The paper mentions using ResNet18 as the base model, which is an architecture, but does not specify any hardware details such as GPU model, CPU type, or memory.
Software Dependencies No We implemented our methods by adapting and extending the code from (Bohdal et al., 2021) with PyTorch (Paszke et al., 2019). While PyTorch is mentioned, a specific version number is not provided.
Experiment Setup Yes For all experiments we used their default settings (using ResNet18 as the base model, batch size 128, data augmented with random crop and horizontal flip) unless otherwise stated. Each experiment was run 5 times with different random seeds, and results were averaged. ... The models were trained with SGD (learning rate 0.1, momentum 0.9, weight decay 0.0005) for up to 350 epochs. The learning rate was decreased at 150 and 250 epochs by a factor of 10. ... The hidden dimension is set to 512, the temperature τ is fixed at 0.01. For SECE, we used the Gaussian kernel with a bandwidth of 0.01 (selected via grid search) for both datasets. We initialized γ = 1.0.
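The stated learning-rate schedule (0.1, divided by 10 at epochs 150 and 250, i.e. standard multi-step decay as in PyTorch's `MultiStepLR`) can be expressed as a small dependency-free sketch; the function name is illustrative.

```python
def lr_at_epoch(epoch, base_lr=0.1, milestones=(150, 250), gamma=0.1):
    """Multi-step decay: multiply the base learning rate by `gamma`
    once for every milestone epoch that has been reached."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

With the paper's settings this yields 0.1 for epochs 0-149, 0.01 for 150-249, and 0.001 from epoch 250 until training ends at epoch 350.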