Active Sequential Two-Sample Testing
Authors: Weizhi Li, Prad Kadambi, Pouria Saidi, Karthikeyan Natesan Ramamurthy, Gautam Dasarathy, Visar Berisha
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In practice, we introduce an instantiation of our framework and evaluate it using several experiments; the experiments on the synthetic, MNIST, and application-specific datasets demonstrate that the testing power of the instantiated active sequential test significantly increases while the Type I error is under control. |
| Researcher Affiliation | Collaboration | Weizhi Li (Arizona State University; Los Alamos National Laboratory); Prad Kadambi (Arizona State University); Pouria Saidi (Arizona State University); Karthikeyan Natesan Ramamurthy (IBM Research); Gautam Dasarathy (Arizona State University); Visar Berisha (Arizona State University) |
| Pseudocode | Yes | Algorithm 1 Bimodal Query Based Active Sequential Two-Sample Testing (BQ-AST) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing code, nor does it include a link to a code repository for the methodology described. |
| Open Datasets | Yes | The experiments on the synthetic, MNIST, and application-specific datasets demonstrate that the testing power of the instantiated active sequential test significantly increases while the Type I error is under control. We demonstrate the utility of the proposed test in a clinical application using data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (Jack Jr et al., 2008). |
| Dataset Splits | No | The paper describes the sizes of unlabeled sets (e.g., "Each case of data is of size 2000 with labels masked, resulting in an unlabeled set $S_u$ with $\lvert S_u \rvert = 2000$"), the number of initial labeled samples ($N_0 = 10$), and the total label budget ($N_q$), but it does not specify fixed training/validation/test splits in the traditional sense, as the experiment involves sequential active labeling. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU/CPU models or memory. |
| Software Dependencies | No | In this section, we compare the BQ-AST with a sequential testing baseline (Lhéritier & Cazals, 2018) that uses the same statistic in equation 2, but the baseline labels features randomly sampled from the unlabeled set $S_u$. In addition, we build $Q(z \mid s)$ for the test statistic in equation 2 using logistic regression, SVM, or KNN classifiers; we set $N_0 = 10$ for the number of label queries used to initialize $Q(z \mid s)$, and set significance level $\alpha = 0.05$. The paper mentions classifiers but does not specify software package versions. |
| Experiment Setup | Yes | In addition, we build $Q(z \mid s)$ for the test statistic in equation 2 using logistic regression, SVM, or KNN classifiers; we set $N_0 = 10$ for the number of label queries used to initialize $Q(z \mid s)$, and set significance level $\alpha = 0.05$. |
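
To make the reported setup concrete, here is a minimal sketch of how a sequential test at significance level 0.05 might consume labels one at a time. Equation 2 is not reproduced in this report, so the statistic below is an assumed stand-in rather than the paper's definition: a running product of the label's prior probability over the probability a classifier assigned to it, with rejection once the product falls to the significance level or below. The function name and the input format are hypothetical. Fitting the classifier itself (logistic regression, SVM, or KNN, initialized with 10 labels) is left outside the sketch.

```python
from typing import Iterable, Optional, Tuple

ALPHA = 0.05  # significance level reported in the experiment setup


def sequential_two_sample_test(
    steps: Iterable[Tuple[float, float]],
    alpha: float = ALPHA,
) -> Tuple[Optional[int], float]:
    """Run an assumed likelihood-ratio style sequential test.

    Each step is (prior_prob, predicted_prob) for the label that was actually
    observed: prior_prob ignores the feature, while predicted_prob comes from
    a classifier fit on earlier labels only. Returns (stopping_step, statistic);
    stopping_step is None if the null hypothesis was never rejected.
    """
    w = 1.0  # running statistic, starts at 1 before any label is seen
    for n, (prior_prob, predicted_prob) in enumerate(steps, start=1):
        if not (0.0 < prior_prob <= 1.0 and 0.0 < predicted_prob <= 1.0):
            raise ValueError("probabilities must lie in (0, 1]")
        # A classifier that beats the prior shrinks the statistic.
        w *= prior_prob / predicted_prob
        if w <= alpha:
            return n, w  # reject: features carry information about labels
    return None, w


# Hypothetical inputs: balanced classes, classifier gives 0.9 to the true label.
informative = [(0.5, 0.9)] * 20
stop, w = sequential_two_sample_test(informative)
print(stop, w)
```

With these illustrative inputs each step multiplies the statistic by 0.5/0.9, so it first drops below 0.05 at the sixth label. A classifier that does no better than the prior leaves the statistic at 1 and the test never stops, which is the behaviour one would want when the two samples share a distribution.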