Truthfulness of Calibration Measures

Authors: Nika Haghtalab, Mingda Qiao, Kunhe Yang, Eric Zhao

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study calibration measures in a sequential prediction setup. We introduce a new calibration measure termed the Subsampled Smooth Calibration Error (SSCE), which is complete and sound, and under which truthful prediction is optimal up to a constant multiplicative factor. We answer this question in three parts: Part I: We show that existing calibration measures do not simultaneously meet these criteria. ... Part II: We introduce a new calibration measure, called SSCE, that is sound, complete, and approximately truthful. ... Part III: There is a forecasting algorithm that achieves O(√T) SSCE even in the adversarial setting.
Researcher Affiliation | Academia | Nika Haghtalab, Mingda Qiao, Kunhe Yang, and Eric Zhao; University of California, Berkeley; EMAIL
Pseudocode | Yes | Algorithm 1: Forecaster for Product Distributions
Open Source Code | No | The paper does not contain any statement about releasing source code or links to a code repository.
Open Datasets | No | This is a theoretical paper and does not involve the use of datasets for training or evaluation.
Dataset Splits | No | This is a theoretical paper and does not describe any experimental data splits (training, validation, test).
Hardware Specification | No | This is a theoretical paper and does not describe any hardware used for experiments.
Software Dependencies | No | This is a theoretical paper and does not list any specific software dependencies with version numbers.
Experiment Setup | No | This is a theoretical paper and does not describe any experimental setup details such as hyperparameters or training configurations.