Simple Calibration via Geodesic Kernels
Authors: Jayanta Dey, Haoyin Xu, Ashwin De Silva, Joshua T Vogelstein
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on both tabular and vision benchmarks show that the proposed approaches, namely Kernel Density Forest (KDF) and Kernel Density Network (KDN), obtain well-calibrated posteriors for both ID and OOD samples, while mostly preserving the classification accuracy and extrapolating beyond the training data to handle OOD inputs appropriately. |
| Researcher Affiliation | Academia | Jayanta Dey, Haoyin Xu, Ashwin De Silva, and Joshua T. Vogelstein: Department of Biomedical Engineering, Johns Hopkins University |
| Pseudocode | Yes | Algorithm 1: Computing Geodesic Kernel |
| Open Source Code | Yes | Our code, including the package and the approach proposed in this manuscript, is available from https://github.com/neurodata/kdg. |
| Open Datasets | Yes | Our experiments on both tabular and vision benchmarks show that the proposed approaches, namely Kernel Density Forest (KDF) and Kernel Density Network (KDN), obtain well-calibrated posteriors for both ID and OOD samples, while mostly preserving the classification accuracy and extrapolating beyond the training data to handle OOD inputs appropriately. [...] We conduct experiments on a two-dimensional Gaussian XOR simulation (described in Appendix D) and the 784-dimensional Fashion-MNIST dataset (from OpenML-CC18 (Bischl et al., 2017)) using fully-connected networks of varying depth and width. [...] We experiment with the popular benchmark datasets CIFAR-10, CIFAR-100, and SVHN. |
| Dataset Splits | Yes | For each approach, 70% of the training data was used to fit the model and the rest of the data was used to calibrate the model. [...] For the simulation study on tabular data, we use 6 simulation datasets. Three of the simulations are visualized in Figure 3A; see Appendix D for additional simulations and details. We sample 10,000 training samples with half of the samples from each class. |
| Hardware Specification | Yes | All computations for producing the results in Table 1 were performed using a MacBook Pro with an Apple M1 Max chip and 64 GB of RAM. |
| Software Dependencies | Yes | Software: Python 3.8, scikit-learn 0.22.0, tensorflow-macos 2.9, tensorflow-metal 0.5.0. |
| Experiment Setup | Yes | Table 3 (hyperparameters for RF and KDF): n_estimators = 500; max_depth (value not recovered); min_samples_leaf = 1; λ = 1 × 10⁻⁶; b = exp(−10⁻⁷). Table 4 (hyperparameters for ReLU-net and KDN on tabular data): number of hidden layers = 4; nodes per hidden layer = 1000; optimizer = Adam; learning rate = 3 × 10⁻⁴; b = exp(−10⁻⁷). |
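The Dataset Splits and Experiment Setup rows above can be combined into a runnable sketch: 70% of the training data fits the model and the remaining 30% is held out for calibration, with the RF hyperparameters from Table 3 (n_estimators = 500, min_samples_leaf = 1). This is only an illustration of the reported protocol, not the authors' code; the synthetic two-class dataset and all variable names here are assumptions, standing in for the paper's simulation data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative two-class dataset: 10,000 samples, balanced classes,
# mirroring the sample count quoted in the Dataset Splits row.
X, y = make_classification(n_samples=10_000, n_features=2,
                           n_informative=2, n_redundant=0,
                           random_state=0)

# 70% to fit the model, 30% held out to calibrate it.
X_fit, X_cal, y_fit, y_cal = train_test_split(
    X, y, train_size=0.7, random_state=0)

# RF hyperparameters per Table 3 (max_depth left at its default,
# since its value was not recovered from the extraction).
rf = RandomForestClassifier(n_estimators=500, min_samples_leaf=1,
                            random_state=0)
rf.fit(X_fit, y_fit)

# Posteriors on the held-out split, the inputs a calibration step
# such as KDF would consume.
posteriors = rf.predict_proba(X_cal)
print(posteriors.shape)  # (3000, 2)
```

The split sizes (7,000 fit / 3,000 calibration) follow directly from `train_size=0.7` on 10,000 samples.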