Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning
Authors: Yong Lin, Chen Liu, Chenlu Ye, Qing Lian, Yuan Yao, Tong Zhang
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To assess the effectiveness of our proposed method, we conducted extensive empirical experiments using deep neural networks on benchmark datasets. The results consistently showcase the superior performance of COPS compared to baseline methods, reaffirming its efficacy. Keywords: Subset Selection, Uncertainty Estimation, Model Misspecification |
| Researcher Affiliation | Academia | Yong Lin* (Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong, China); Chen Liu* (Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong, China); Chenlu Ye* (Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign, Illinois, USA); Qing Lian (Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong, China); Yuan Yao (Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong, China); Tong Zhang (Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign, Illinois, USA) |
| Pseudocode | Yes | Algorithm 1: Uncertainty estimation in linear softmax regression; Algorithm 2: COPS for sampling with labels on linear models; Algorithm 3: COPS for sampling without labels on linear models; Algorithm 4: COPS for sampling with labels on DNNs; Algorithm 5: COPS for sampling without labels on DNNs; Algorithm 6: Uncertainty estimation for DNNs; Algorithm 7: COPS with uncertainty clipping for sampling with labels on DNNs; Algorithm 8: COPS with uncertainty clipping for sampling without labels on DNNs; Algorithm 9: COPS with full details for sampling with labels on DNNs; Algorithm 10: COPS with full details for sampling without labels on DNNs. |
| Open Source Code | Yes | Our code can be found at https://github.com/corwinliu9669/COPS. |
| Open Datasets | Yes | CIFAR10 Krizhevsky et al. (2009): We utilize the original CIFAR10 dataset Krizhevsky et al. (2009). CIFAR10-N: We use CIFAR10-N, a corrupted version of CIFAR10 introduced by Wei et al. (2021). CIFAR100: From the CIFAR100 dataset Krizhevsky et al. (2009), we randomly select 200 samples for each class. IMDB: The IMDB dataset Maas et al. (2011) consists of positive and negative movie comments. SVHN: The SVHN dataset contains images of house numbers. Places365 (subset): We select ten classes from the Places365 dataset Zhou et al. (2017). |
| Dataset Splits | Yes | For all settings, we split the training set into two subsets: the probe set and the sampling set (both defined in Algorithms 4-5). We train 10 probe neural networks on the probe set and use them to estimate the uncertainty of each sample in the sampling set. Sampling with replacement, we select a subset of 300 samples per class from the sampling set according to Algorithms 4-5, on which we train a ResNet20 from scratch. CIFAR10 Krizhevsky et al. (2009): ... To construct the probe set, we randomly select 1000 samples from each class, while the remaining training samples form the sampling set. IMDB: ... We split 5000 samples from the training set for uncertainty estimation and run our scheme on the remaining 20000 samples. Table 3: The table describes the datasets used in our study. The Probe Set / Sampling Set column gives the number of samples in each set for every dataset, and the Target Size of Sub-sampling column gives the number of samples selected from the Sampling Set. |
| Hardware Specification | No | The paper mentions 'GPU hours' in Table 7 but does not specify the type or model of GPU, CPU, or any other specific hardware component used for the experiments. |
| Software Dependencies | No | The paper mentions the AdamW optimizer Loshchilov and Hutter (2019), SGD, and a cosine learning-rate decay schedule, but it does not specify version numbers for these or for any other key software libraries, frameworks (such as PyTorch or TensorFlow), or programming languages. |
| Experiment Setup | Yes | We use the AdamW optimizer Loshchilov and Hutter (2019) with cosine learning-rate decay for 150 epochs; the batch size is 256. We put a limit on the maximum weight when solving Eqn (9) to avoid large variance. Specifically, let u_i denote the uncertainty of the i-th sample. In Eqn (9), we use 1/u_i to reweight the selected data; to avoid large variance, we replace it with 1/max{β, u_i}. We simply set β = 0.1 for all experiments, following Citovsky et al. (2023). Table 4: This table lists the training details. We set the weight decay to 5e-4 for all experiments. "No schedule" means using the starting learning rate without modification during training; Schedule 1 decays the learning rate by 0.1 every 30 epochs; Schedule 2 uses the cosine learning schedule with T_max = 50 and η_min = 0. |
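The selection-and-reweighting step quoted above (sampling from the sampling set with replacement, then reweighting each selected sample by the clipped inverse uncertainty 1/max{β, u_i}) can be sketched as follows. This is a minimal illustrative sketch, not the authors' released code: the function name `cops_select`, its signature, and the assumption that selection probabilities are proportional to estimated uncertainty are ours; β = 0.1 matches the value reported in the paper.

```python
import numpy as np

def cops_select(uncertainties, target_size, beta=0.1, seed=0):
    """Illustrative COPS-style subset selection with uncertainty clipping.

    Draws `target_size` indices with replacement, with probability
    proportional to each sample's estimated uncertainty, then assigns
    each selected sample the clipped inverse-uncertainty weight
    1 / max(beta, u_i) used when solving the reweighted objective.
    """
    rng = np.random.default_rng(seed)
    u = np.asarray(uncertainties, dtype=float)
    probs = u / u.sum()  # sampling probability proportional to uncertainty (assumption)
    idx = rng.choice(len(u), size=target_size, replace=True, p=probs)
    # Clipping bounds the weights by 1/beta, preventing low-uncertainty
    # samples from receiving arbitrarily large weights (variance control).
    weights = 1.0 / np.maximum(beta, u[idx])
    return idx, weights
```

With β = 0.1, no selected sample's weight can exceed 1/β = 10, which is the variance-control effect the paper attributes to clipping.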