Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Pareto Optimization for Active Learning under Out-of-Distribution Data Scenarios

Authors: Xueying Zhan, Zeyu Dai, Qingzhong Wang, Qing Li, Haoyi Xiong, Dejing Dou, Antoni B. Chan

TMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results demonstrate the effectiveness of our POAL approach on classical Machine Learning (ML) and Deep Learning (DL) tasks.
Researcher Affiliation Collaboration Xueying Zhan (City University of Hong Kong); Zeyu Dai (The Hong Kong Polytechnic University); Qingzhong Wang (Baidu Research); Qing Li (The Hong Kong Polytechnic University); Haoyi Xiong (Baidu Research); Dejing Dou (BCG X); Antoni B. Chan (City University of Hong Kong)
Pseudocode Yes Algorithm 1 Monte-Carlo POAL with early stopping under OOD data scenarios. Algorithm 2 Pre-selecting technique on large-scale datasets.
Open Source Code No The paper does not contain an explicit statement about releasing its source code for the described methodology, nor does it provide a direct link to a repository for its implementation. References to code are for baselines or general implementations (e.g., DeepAL+, CCAL, scikit-learn, PyTorch).
Open Datasets Yes For classical ML tasks, we use pre-processed data from LIBSVM (Chang & Lin, 2011): Synthetic data: EX8 uses EX8a as ID data and EX8b as OOD data (Ng, 2008). Real-life data: Vowel (Asuncion & Newman, 2007; Aggarwal & Sathe, 2015)... Letter (Frey & Slate, 1991; Asuncion & Newman, 2007)... For DL tasks, we adopt the following image datasets: CIFAR10 (Krizhevsky et al., 2009)... CIFAR100 (Krizhevsky et al., 2009)
Dataset Splits Yes The training/test split of the datasets is fixed in the experiments, while the initial label set and the unlabeled set are randomly generated from the training set. ...Table 2: Datasets used in the experiments. Dataset ... # of initial labeled set ... # of unlabeled pool ... # of test set ... e.g., CIFAR10-04: # of initial labeled set: 1000, # of unlabeled pool: 49000, # of test set: 6000
Hardware Specification Yes We run all our experiments on a single Tesla V100-SXM2 GPU with 16GB memory, except for the SIMILAR (FLCMI)-related experiments. Since SIMILAR (FLCMI) requires much more memory, we run those experiments (SIMILAR on down-sampled CIFAR10) on another workstation with a Tesla V100-SXM2 GPU with 94GB memory in total.
Software Dependencies No The paper mentions using specific software components like 'scikit-learn library (Pedregosa et al., 2011)' and 'PyTorch' with the 'Adam optimizer', but it does not provide specific version numbers for these software dependencies, which are required for full reproducibility.
Experiment Setup Yes We employed ResNet18 (He et al., 2016) based on PyTorch with the Adam optimizer (learning rate: 1e-3) as the basic learner in DL tasks. In the CIFAR10 and CIFAR100 tasks, we set the number of training epochs to 30; the kernel size of the first convolutional layer in ResNet18 is 3x3 (consistent with the PyTorch CIFAR implementation). Input pre-processing steps include random crop (pad = 4), random horizontal flip (p = 0.5) and normalization.
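The pre-processing steps quoted above (random crop with 4-pixel padding, random horizontal flip with p = 0.5) can be sketched as a minimal, stdlib-only illustration. This is an assumed approximation for clarity, not the authors' implementation; in practice these augmentations would typically be applied via torchvision's RandomCrop and RandomHorizontalFlip transforms.

```python
import random

def random_crop(img, pad=4):
    """Zero-pad an HxW image by `pad` on each side, then crop back to HxW at a random offset."""
    h, w = len(img), len(img[0])
    padded = [[0] * (w + 2 * pad) for _ in range(pad)]
    padded += [[0] * pad + list(row) + [0] * pad for row in img]
    padded += [[0] * (w + 2 * pad) for _ in range(pad)]
    top = random.randint(0, 2 * pad)
    left = random.randint(0, 2 * pad)
    return [row[left:left + w] for row in padded[top:top + h]]

def random_hflip(img, p=0.5):
    """Flip the image left-right with probability p."""
    return [row[::-1] for row in img] if random.random() < p else img

# Tiny 4x4 "image" standing in for a 32x32 CIFAR input
img = [[r * 4 + c for c in range(4)] for r in range(4)]
out = random_hflip(random_crop(img, pad=4), p=0.5)
```

The output retains the input's spatial dimensions, mirroring how these augmentations leave the 32x32 CIFAR resolution unchanged while varying content.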