Training Uncertainty-Aware Classifiers with Conformalized Deep Learning
Authors: Bat-Sheva Einbinder, Yaniv Romano, Matteo Sesia, Yanfei Zhou
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments with synthetic and real data demonstrate this method can lead to smaller conformal prediction sets with higher conditional coverage, after exact calibration with hold-out data, compared to state-of-the-art alternatives. |
| Researcher Affiliation | Academia | Bat-Sheva Einbinder, Faculty of Electrical & Computer Engineering (ECE), Technion, Israel; Yaniv Romano, Faculty of ECE and of Computer Science, Technion, Israel; Matteo Sesia, Department of Data Sciences and Operations, University of Southern California, Los Angeles, California, USA; Yanfei Zhou, Department of Data Sciences and Operations, University of Southern California, Los Angeles, California, USA |
| Pseudocode | Yes | Algorithm 1: Conformalized uncertainty-aware training of deep multi-class classifiers |
| Open Source Code | Yes | A more technically detailed version of Algorithm 1 is provided in Appendix A1.2, and an open-source software implementation of this method is available online at https://github.com/bat-sheva/conformal-learning. |
| Open Datasets | Yes | Convolutional neural networks guided by the conformal loss are trained on the publicly available CIFAR-10 image classification data set [81] (10 classes)... |
| Dataset Splits | Yes | For this purpose, we generate an additional validation set of 2000 independent data points and use it to preview the out-of-sample accuracy and loss value at each epoch. |
| Hardware Specification | Yes | For example, training a conformal loss model on 45000 images in the CIFAR-10 data set took us approximately 20 hours on an Nvidia P100 GPU. |
| Software Dependencies | No | The paper mentions PyTorch [79] but does not specify a version number for it or any other software dependency. |
| Experiment Setup | Yes | Input: Data {(Xi, Yi)}, i = 1, ..., n; hyper-parameter λ ∈ [0, 1], learning rate γ > 0, batch size M; Randomly initialize the model parameters θ(0); Randomly split the data into two disjoint subsets, I1, I2, such that I1 ∪ I2 = [n]; Set the number of batches to B = (n/2)/M (assuming for simplicity that \|I1\| = \|I2\|); for t = 1, ..., T do |
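The hybrid objective quoted above (cross-entropy blended with a conformal uncertainty penalty, weighted by λ ∈ [0, 1]) can be illustrated with a minimal numpy sketch. This is not the authors' implementation (see their repository for that); the function names `conformity_scores`, `uniform_matching_loss`, and `hybrid_loss` are illustrative, and the uniformity penalty here is a simple Cramér–von Mises-style statistic standing in for the paper's differentiable quantile-based loss.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def conformity_scores(probs, labels):
    # Generalized-inverse-quantile-style score: cumulative probability
    # mass of all classes ranked at least as likely as the true label.
    order = np.argsort(-probs, axis=1)           # classes, most likely first
    ranks = np.empty_like(order)
    rows = np.arange(probs.shape[0])[:, None]
    ranks[rows, order] = np.arange(probs.shape[1])[None, :]
    cumsum = np.cumsum(np.take_along_axis(probs, order, axis=1), axis=1)
    idx = np.arange(len(labels))
    return cumsum[idx, ranks[idx, labels]]

def uniform_matching_loss(scores):
    # Penalize deviation of the empirical score distribution from
    # Uniform(0, 1): mean squared gap between the sorted scores and a
    # uniform grid (a smooth stand-in for the paper's conformal loss).
    s = np.sort(scores)
    n = len(s)
    grid = (np.arange(1, n + 1) - 0.5) / n
    return np.mean((s - grid) ** 2)

def hybrid_loss(logits, labels, lam=0.5):
    # λ interpolates between standard cross-entropy (accuracy) and the
    # conformal uniformity penalty (calibrated uncertainty), mirroring
    # the hyper-parameter λ ∈ [0, 1] in Algorithm 1.
    probs = softmax(logits)
    idx = np.arange(len(labels))
    ce = -np.mean(np.log(probs[idx, labels] + 1e-12))
    unif = uniform_matching_loss(conformity_scores(probs, labels))
    return (1 - lam) * ce + lam * unif
```

In the actual algorithm, each training batch is split between the two disjoint subsets I1 and I2, the cross-entropy term is computed on one half, and the uniformity term on hold-out scores from the other half, so that the penalty mimics post-hoc conformal calibration during training.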