Learning to Abstain From Uninformative Data

Authors: Yikai Zhang, Songzhu Zheng, Mina Dalirrooyfard, Pengxiang Wu, Anderson Schneider, Anant Raj, Yuriy Nevmyvaka, Chao Chen

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We build upon the strength of our theoretical guarantees by describing an iterative algorithm, which jointly optimizes both a predictor and a selector, and evaluates its empirical performance in a variety of settings. ... In this section, we test the efficacy of our practical algorithm (Algorithm 1) on both publicly-available and semi-synthetic datasets.
Researcher Affiliation Collaboration Yikai Zhang (Morgan Stanley); Songzhu Zheng (Morgan Stanley); Mina Dalirrooyfard (Morgan Stanley); Pengxiang Wu (Snap Inc.); Anderson Schneider (Morgan Stanley); Anant Raj (Morgan Stanley); Yuriy Nevmyvaka (Morgan Stanley); Chao Chen (Stony Brook University)
Pseudocode Yes Algorithm 1 Iterative Soft Abstain (ISA)
1: Input: dataset S_n = {(x_1, y_1), ..., (x_n, y_n)}, weight parameter β, random initial classifier f̂^0 and selector ĝ^0, number of iterations T
2: for t = 1, ..., T do
3:   Optimize loss to update predictor f̂^t: -(1/n) Σ_{i=1}^n ĝ^t(x_i) {y_i log(f̂^t(x_i)) + (1 - y_i) log(1 - f̂^t(x_i))}
4:   Approximate the pseudo-informative label: z_i^t = 1{1{f̂^t(x_i) > 1/2} = y_i}
5:   Optimize loss to update selector ĝ^t: -Σ_{i=1}^n {z_i^t log(ĝ^t(x_i)) + β (1 - z_i^t) log(1 - ĝ^t(x_i))}
6: end for
7: Output: f̂^T, ĝ^T
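The ISA loop above can be sketched in plain NumPy. This is a minimal illustration, not the paper's implementation: the toy data, the logistic-regression fitter `fit_weighted_logreg`, and the shifted uninformative cluster are all assumptions made so the example is self-contained; the paper trains neural networks on image/tabular data instead.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit_weighted_logreg(X, t, sample_w, neg_w=1.0, lr=0.5, steps=400):
    """Minimise -sum_i w_i [ t_i log p_i + neg_w (1 - t_i) log(1 - p_i) ]
    by gradient descent on a logistic model p_i = sigmoid(X w + b)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        g = sample_w * (p * (t + neg_w * (1.0 - t)) - t)  # dLoss/dlogit
        w -= lr * (X.T @ g) / n
        b -= lr * g.mean()
    return lambda Xq: sigmoid(Xq @ w + b)

# Toy semi-synthetic data (assumption): first half is informative
# (label = sign of feature 0), second half is uninformative (random
# labels) and sits in a shifted region so the selector can find it.
n = 2000
X = rng.normal(size=(n, 2))
y = (X[:, 0] > 0).astype(float)
X[n // 2:, 1] += 3.0                                     # uninformative region
y[n // 2:] = rng.integers(0, 2, size=n // 2)             # labels carry no signal
informative = np.zeros(n, bool)
informative[: n // 2] = True

beta, T = 2.0, 3
g_hat = np.full(n, 0.5)                                  # initial selector scores
for _ in range(T):
    f = fit_weighted_logreg(X, y, g_hat)                 # step 3: predictor
    z = ((f(X) > 0.5).astype(float) == y).astype(float)  # step 4: pseudo labels
    g = fit_weighted_logreg(X, z, np.ones(n), neg_w=beta)  # step 5: selector
    g_hat = g(X)

# Selector scores should be higher on the informative half.
print(g_hat[informative].mean(), g_hat[~informative].mean())
```

The β > 1 weight on the (1 - z) term makes the selector more eager to abstain on points the predictor keeps getting wrong, matching line 5 of Algorithm 1.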
Open Source Code Yes The code for reproducing the results could be found in https://github.com/morganstanley/MSML/tree/main/paper/Learn_to_Abstain.
Open Datasets Yes We test the efficacy of our practical algorithm (Algorithm 1) on both publicly-available and semi-synthetic datasets. ... For MNIST+Fashion-MNIST dataset, images from MNIST are defined to be uninformative, while images from Fashion-MNIST are set to be informative. For SVHN (Netzer et al., 2011) dataset, class 5-9 are set to be uninformative and class 0-4 are set to be informative. ... In this section, we report our empirical study on 3 publicly-available datasets: (1) Oxford realized volatility (Volatility) dataset (Heber et al., 2009), (2) breast ultrasound images (BUS) (Al-Dhabyani et al., 2020), and (3) lending club dataset (LC) (Lending Club, 2007). The data is retrieved from Kaggle, https://www.kaggle.com/datasets/wordsforthewise/lending-club.
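The SVHN-style semi-synthetic construction quoted above can be sketched as follows. This is a hedged reading, not the paper's code: it assumes "uninformative" means the sample's label is replaced with a uniformly random one, and the helper name `make_semi_synthetic` is invented for illustration; the paper's exact relabeling protocol may differ.

```python
import numpy as np

def make_semi_synthetic(labels, uninformative_classes, num_classes, rng):
    """Return a copy of `labels` where samples whose true class is in
    `uninformative_classes` get a uniform random label (no signal),
    plus the boolean mask of which samples were made uninformative."""
    labels = np.asarray(labels).copy()
    mask = np.isin(labels, list(uninformative_classes))
    labels[mask] = rng.integers(0, num_classes, size=mask.sum())
    return labels, mask

rng = np.random.default_rng(0)
# Stand-in for SVHN labels: classes 5-9 uninformative, 0-4 informative.
y = rng.integers(0, 10, size=1000)
y_new, uninf = make_semi_synthetic(y, range(5, 10), 10, rng)
```

Under this construction the informative samples keep their original labels, while the uninformative subset contributes only label noise, which is exactly the regime the selector is meant to detect.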
Dataset Splits Yes The data description and respective train-test splits are presented in Table 11 in section D.1. ... Table 11: Real-world Dataset Description — Dataset | Type | Feature Dim | #Class | #Train | #Test: Volatility | Time Series | 2 | 2 | 143784 | 9525; BUS | Image | 3x324x324 | 3 | 624 | 156; LC | Tabular | 1805 | 3 | 1058093 | 264524
Hardware Specification Yes We execute our program on Red Hat Enterprise Linux Server 7.9 (Maipo) and use NVIDIA V100 GPU with cuda version 12.1.
Software Dependencies Yes We conduct all our experiments using PyTorch 3.10 (Paszke et al., 2019). We execute our program on Red Hat Enterprise Linux Server 7.9 (Maipo) and use NVIDIA V100 GPU with cuda version 12.1.
Experiment Setup Yes For all of our synthetic experiments, we use Adam optimizer with learning rate 1e-3 and weight decay rate 1e-4. We use batch size 256 and train 60 epochs for MNIST+Fashion and 120 epochs for SVHN. The learning rate is reduced by 0.5 at epochs 15, 35 and 55 for MNIST+Fashion and is reduced by 0.5 at epochs 40, 60 and 80 for SVHN. ... For Volatility and LC, we use the same Adam optimizer with learning rate (1e-3), weight decay rate (1e-4) and batch size 256. For BUS, due to its limited sample size, we use smaller batch size (16) and reduce learning rate (1e-4) accordingly. For all three datasets, we train each algorithm for 50 epochs and reduce the learning rate by half at epochs 15 and 35.
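The step schedule described above (base rate halved at fixed milestone epochs, as PyTorch's MultiStepLR does) can be written as a small pure-Python helper. The function name and the convention that the drop applies from the milestone epoch onward are assumptions for illustration:

```python
def lr_at_epoch(epoch, base_lr=1e-3, milestones=(15, 35, 55), gamma=0.5):
    """Learning rate in effect at `epoch`, multiplied by `gamma`
    once for every milestone already reached."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops

# MNIST+Fashion schedule from the quoted setup: 1e-3 halved at 15, 35, 55.
# The SVHN schedule would use milestones=(40, 60, 80) with 120 epochs.
```

For BUS, the same helper would be called with base_lr=1e-4 and milestones=(15, 35), matching the smaller batch size (16) and 50-epoch budget quoted above.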