Distributionally Robust Classification on a Data Budget
Authors: Benjamin Feuer, Ameya Joshi, Minh Pham, Chinmay Hegde
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To rigorously address this question, we introduce JANuS (Joint Annotations and Names Set), a collection of four new training datasets with images, labels, and corresponding captions, and perform a series of carefully controlled investigations of factors contributing to robustness in image classification, then compare those results to findings derived from a large-scale meta-analysis. Using this approach, we show that a standard ResNet-50 trained with the cross-entropy loss on 2.4 million image samples can attain comparable robustness to a CLIP ResNet-50 trained on 400 million samples. |
| Researcher Affiliation | Academia | Benjamin Feuer EMAIL Department of Computer Science and Engineering New York University 370 Jay St., 11 Fl., Brooklyn, NY, 11201 Ameya Joshi EMAIL Department of Electrical and Computer Engineering New York University Minh Pham EMAIL Department of Computer Science and Engineering New York University Chinmay Hegde EMAIL Department of Computer Science and Engineering New York University |
| Pseudocode | No | The paper includes diagrams (e.g., Figure 14 'Subset matching; an overview.') that illustrate processes, but it does not contain structured pseudocode blocks or sections explicitly labeled 'Algorithm' or 'Pseudocode'. |
| Open Source Code | Yes | In order to enable future research and reproducibility, we release our code, our dataset, and a complete enumeration of our results for all models in the study (see supplemental attachments). |
| Open Datasets | Yes | To rigorously address this question, we introduce JANuS (Joint Annotations and Names Set), a collection of four new training datasets with images, labels, and corresponding captions... In order to enable future research and reproducibility, we release our code, our dataset, and a complete enumeration of our results for all models in the study (see supplemental attachments). |
| Dataset Splits | Yes | Shifts. Following Radford et al. (2021), we focus on the following four distribution shifts: ImageNet-Sketch, ImageNet-R, ImageNet-A, and ImageNet-V2, for our evaluation. For ImageNet-R and ImageNet-A, which are subsets of ImageNet, we evaluate only the 35 shared classes. We reference the validation sets for these shifts on IN100 as IN100-V2, IN100-S, IN100-R, and IN100-A. |
| Hardware Specification | Yes | Models are typically distributed across a single node with 4 NVIDIA A100 GPUs; our largest models were trained on 16 NVIDIA GPUs. |
| Software Dependencies | No | The paper mentions using the 'AMP library' and refers to 'pytorch-image-models (timm) repository (Wightman, 2019)' and 'open-clip repository (Ilharco et al., 2021)'. However, it does not provide specific version numbers for these software components, which is required for a reproducible description of ancillary software. |
| Experiment Setup | Yes | In all our model training experiments, we train with mixed precision, at a batch size of 256, and do not use gradient clipping. We use the AMP library to implement the training process. Model hyperparameters are chosen via grid search. All JANuS models were trained for 256 epochs. Following Santurkar et al. (2022), we use SimCLR augmentations (resize, crop, flip, jitter, blur, grayscale) rather than CLIP augmentations (resize and crop) for model training. |
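The shared-class evaluation quoted above (scoring ImageNet-R and ImageNet-A only on the 35 classes they share with IN100) can be sketched as follows. This is a minimal illustration, not the authors' released code; the function name and toy class indices are hypothetical.

```python
def eval_on_shared_classes(predictions, labels, shared_classes):
    """Accuracy computed only over examples whose label lies in the shared set.

    predictions, labels: parallel sequences of integer class indices.
    shared_classes: class indices present in both the shift and the base dataset.
    """
    shared = set(shared_classes)
    # Keep only examples belonging to the shared classes.
    pairs = [(p, y) for p, y in zip(predictions, labels) if y in shared]
    if not pairs:
        return 0.0
    correct = sum(1 for p, y in pairs if p == y)
    return correct / len(pairs)

# Toy example: four predictions, but only labels {0, 1} are shared,
# so the example with label 2 is excluded from scoring.
acc = eval_on_shared_classes([0, 1, 2, 1], [0, 2, 2, 1], shared_classes={0, 1})
```

Note that restricting by the *label* (rather than the prediction) matches the usual convention for subset evaluation: an example is in scope if its ground-truth class exists in both datasets.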
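The setup row states that hyperparameters were chosen via grid search but does not enumerate the grid. The sketch below shows the generic procedure with a placeholder search space and a stand-in scoring function; the ranges and the `validation_accuracy` stub are assumptions for illustration only, not the authors' actual grid.

```python
import itertools

# Hypothetical search space; the paper does not specify its grid.
grid = {
    "lr": [0.1, 0.01, 0.001],
    "weight_decay": [1e-4, 5e-5],
}

def validation_accuracy(lr, weight_decay):
    # Stand-in for training a model with these settings and
    # scoring it on held-out data.
    return -abs(lr - 0.01) - weight_decay

# Exhaustively evaluate every combination and keep the best.
best_cfg, best_score = None, float("-inf")
for values in itertools.product(*grid.values()):
    cfg = dict(zip(grid.keys(), values))
    score = validation_accuracy(**cfg)
    if score > best_score:
        best_cfg, best_score = cfg, score
```

With the toy scorer above, the search selects `lr=0.01` with the smaller weight decay; in practice each grid point would correspond to a full training run evaluated on a validation split.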