Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning to Predict Trustworthiness with Steep Slope Loss
Authors: Yan Luo, Yongkang Wong, Mohan S. Kankanhalli, Qi Zhao
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5, Experiment & Analysis: "In this section, we first introduce the experimental set-up. Then, we report the performances of baselines and the proposed steep slope loss on ImageNet, followed by comprehensive analyses." |
| Researcher Affiliation | Academia | Department of Computer Science & Engineering, University of Minnesota; School of Computing, National University of Singapore |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and pre-trained trustworthiness predictors for reproducibility are available at https://github.com/luoyan407/predict_trustworthiness. |
| Open Datasets | Yes | The experiment is conducted on ImageNet [11], which consists of 1.2 million labeled training images and 50,000 labeled validation images. |
| Dataset Splits | Yes | The experiment is conducted on ImageNet [11], which consists of 1.2 million labeled training images and 50,000 labeled validation images. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running experiments. |
| Software Dependencies | No | The paper mentions using 'Python with PyTorch' in Appendix B.1, but does not specify version numbers for Python, PyTorch, or any other software dependencies. |
| Experiment Setup | Yes | Training the oracles with all the loss functions uses the same hyperparameters, such as learning rate, weight decay, momentum, batch size, etc. The details for the training process and the implementation are provided in Appendix B. We used Adam [39] as the optimizer. The initial learning rate is 1e-4 with a cosine decay schedule. The batch size is 128. For the focal loss, we follow [18] to use γ = 2... For the proposed loss, we use α+ = 1 and α− = 3 for the oracle that is based on the ViT backbone, while we use α+ = 2 and α− = 5 for the oracle that is based on the ResNet backbone. |
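The Experiment Setup row quotes a cosine-decayed learning rate starting at 1e-4. As a minimal sketch of what such a schedule computes (the total step count is a placeholder; it is not stated in this excerpt, and the authors' actual implementation is in their released repository):

```python
import math

def cosine_lr(step: int, total_steps: int, base_lr: float = 1e-4) -> float:
    """Cosine-decayed learning rate, from base_lr at step 0 down to 0
    at total_steps. base_lr = 1e-4 matches the quoted setup."""
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))

# At the start of training the rate equals the base rate; halfway
# through it has decayed to half the base rate.
```

PyTorch users would typically get the same behavior from `torch.optim.lr_scheduler.CosineAnnealingLR` wrapped around an `Adam` optimizer, consistent with the optimizer named in the quote.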