Undersampling is a Minimax Optimal Robustness Intervention in Nonparametric Classification
Authors: Niladri S. Chatterji, Saminul Haque, Tatsunori Hashimoto
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also perform an experimental case study on a label shift dataset and find that in line with our theory, the test accuracy of robust neural network classifiers is constrained by the number of minority samples. (...) Figure 2: Convolutional neural network classifiers trained on the Imbalanced Binary CIFAR10 dataset with a 5:1 label imbalance. |
| Researcher Affiliation | Academia | Niladri S. Chatterji EMAIL Department of Computer Science Stanford University Saminul Haque EMAIL Department of Computer Science Stanford University Tatsunori B. Hashimoto EMAIL Department of Computer Science Stanford University |
| Pseudocode | Yes | Undersampled binning estimator: The undersampled binning estimator A_USB takes as input a dataset S and a positive integer K corresponding to the number of bins, and returns a classifier A^{S,K}_{USB} : [0, 1] → {−1, 1}. This estimator is defined as follows: 1. First, we compute the undersampled dataset S_US. 2. Given this dataset S_US, let n_{1,j} be the number of points with label +1 that lie in the interval I_j = [(j−1)/K, j/K]. Also, define n_{−1,j} analogously. Then set A_j = 1 if n_{1,j} > n_{−1,j}, and −1 otherwise. 3. Define the classifier A^{S,K}_{USB} such that if x ∈ I_j then A^{S,K}_{USB}(x) = A_j. (5). |
| Open Source Code | No | The paper does not provide an explicit statement about releasing code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | To explore this question, we conduct a small case study using the imbalanced binary CIFAR10 dataset (Byrd & Lipton, 2019; Wang et al., 2022) that is constructed using the cat and dog classes. |
| Dataset Splits | Yes | The test set consists of all of the 1000 cat and 1000 dog test examples. To form our initial train and validation sets, we take 2500 cat examples but only 500 dog examples from the official train set, corresponding to a 5:1 label imbalance. We then use 80% of those examples for training and the rest for validation. |
| Hardware Specification | No | We note that all of the experiments were performed on an internal cluster on 8 GPUs. This statement is too general and does not provide specific hardware models or specifications. |
| Software Dependencies | No | The paper mentions using SGD and various loss functions (importance weighted cross entropy loss, importance weighted VS loss, tilted loss, group-DRO) but does not specify any software libraries or frameworks with their version numbers. |
| Experiment Setup | Yes | We train this model using SGD for 800 epochs with batch size 64, a constant learning rate of 0.001, and momentum 0.9. The importance weights used to upweight the minority class samples in the training and validation losses are calculated as #Cat Train Examples / #Dog Train Examples. We set τ = 3 and γ = 0.3, the best hyperparameters identified by Wang et al. (2022) on this dataset for this neural network architecture. |
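The undersampled binning estimator quoted in the Pseudocode row above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name, the NumPy-based undersampling step, and the bin-assignment convention (half-open bins, ties broken toward −1) are our assumptions from the quoted description.

```python
import numpy as np

def undersampled_binning_estimator(x, y, K, rng=None):
    """Sketch of the undersampled binning estimator A_USB (hypothetical
    implementation of the three steps quoted from the paper).

    x : 1-D array of covariates in [0, 1]
    y : 1-D array of labels in {-1, +1}
    K : number of bins
    Returns a classifier mapping points in [0, 1] to {-1, +1}.
    """
    rng = np.random.default_rng(rng)

    # Step 1: form the undersampled dataset S_US by subsampling the
    # majority class down to the size of the minority class.
    pos, neg = np.where(y == 1)[0], np.where(y == -1)[0]
    m = min(len(pos), len(neg))
    keep = np.concatenate([rng.choice(pos, m, replace=False),
                           rng.choice(neg, m, replace=False)])
    x_us, y_us = x[keep], y[keep]

    # Step 2: count labels per bin I_j = [(j-1)/K, j/K); n_plus[j] and
    # n_minus[j] are the counts of +1 and -1 points landing in bin j.
    bins = np.clip((x_us * K).astype(int), 0, K - 1)
    n_plus = np.bincount(bins[y_us == 1], minlength=K)
    n_minus = np.bincount(bins[y_us == -1], minlength=K)

    # A_j = +1 if n_plus[j] > n_minus[j], else -1 (majority vote per bin).
    A = np.where(n_plus > n_minus, 1, -1)

    # Step 3: the classifier assigns every x in bin I_j the label A_j.
    def classify(x_new):
        j = np.clip((np.asarray(x_new) * K).astype(int), 0, K - 1)
        return A[j]
    return classify
```

With a 5:1 imbalanced toy dataset whose classes separate at x = 0.5, the fitted classifier predicts +1 on the left half and −1 on the right half regardless of which majority-class subsample is kept, since every bin remains label-pure after undersampling.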