Distributionally Robust Classification on a Data Budget

Authors: Benjamin Feuer, Ameya Joshi, Minh Pham, Chinmay Hegde

TMLR 2023

Each reproducibility variable below is listed with its assessed result and the LLM response that supports it.
Research Type: Experimental
LLM Response: "To rigorously address this question, we introduce JANuS (Joint Annotations and Names Set), a collection of four new training datasets with images, labels, and corresponding captions, and perform a series of carefully controlled investigations of factors contributing to robustness in image classification, then compare those results to findings derived from a large-scale meta-analysis. Using this approach, we show that a standard ResNet-50 trained with the cross-entropy loss on 2.4 million image samples can attain comparable robustness to a CLIP ResNet-50 trained on 400 million samples."
Researcher Affiliation: Academia
LLM Response: "Benjamin Feuer (EMAIL), Department of Computer Science and Engineering, New York University, 370 Jay St., 11 Fl., Brooklyn, NY 11201; Ameya Joshi (EMAIL), Department of Electrical and Computer Engineering, New York University; Minh Pham (EMAIL), Department of Computer Science and Engineering, New York University; Chinmay Hegde (EMAIL), Department of Computer Science and Engineering, New York University"
Pseudocode: No
LLM Response: "The paper includes diagrams (e.g., Figure 14, 'Subset matching; an overview.') that illustrate processes, but it does not contain structured pseudocode blocks or sections explicitly labeled 'Algorithm' or 'Pseudocode'."
Open Source Code: Yes
LLM Response: "In order to enable future research and reproducibility, we release our code, our dataset, and a complete enumeration of our results for all models in the study (see supplemental attachments)."
Open Datasets: Yes
LLM Response: "To rigorously address this question, we introduce JANuS (Joint Annotations and Names Set), a collection of four new training datasets with images, labels, and corresponding captions... In order to enable future research and reproducibility, we release our code, our dataset, and a complete enumeration of our results for all models in the study (see supplemental attachments)."
Dataset Splits: Yes
LLM Response: "Shifts. Following Radford et al. (2021), we focus on the following four distribution shifts: ImageNet-Sketch, ImageNet-R, ImageNet-A, and ImageNet-V2, for our evaluation. For ImageNet-R and ImageNet-A, which are subsets of ImageNet, we evaluate only the 35 shared classes. We reference the validation sets for these shifts on IN100 as IN100-V2, IN100-S, IN100-R, and IN100-A."
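The shared-class evaluation quoted above (scoring ImageNet-R and ImageNet-A only on the 35 classes they share with the training label space) can be sketched by masking out logits for non-shared classes before taking the argmax. This is a minimal NumPy sketch, not the authors' released code; the function name and signature are hypothetical.

```python
import numpy as np

def shared_class_accuracy(logits, labels, shared_classes):
    """Accuracy when predictions are restricted to a shared class subset.

    logits: (N, C) array of scores over all C classes.
    labels: (N,) ground-truth indices, assumed to lie within shared_classes.
    shared_classes: indices of classes present in both label spaces.
    """
    shared = np.asarray(shared_classes)
    masked = np.full_like(logits, -np.inf)
    masked[:, shared] = logits[:, shared]  # only shared classes can be predicted
    preds = masked.argmax(axis=1)
    return float((preds == labels).mean())
```

Masking rather than slicing keeps the original class indexing intact, so ground-truth labels do not need to be remapped.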
Hardware Specification: Yes
LLM Response: "Models are typically distributed across a single node with 4 NVIDIA A100 GPUs; our largest models were trained on 16 NVIDIA GPUs."
Software Dependencies: No
LLM Response: "The paper mentions using the 'AMP library' and refers to the 'pytorch-image-models (timm) repository (Wightman, 2019)' and the 'open-clip repository (Ilharco et al., 2021)'. However, it does not provide specific version numbers for these software components, which is required for a reproducible description of ancillary software."
Experiment Setup: Yes
LLM Response: "In all our model training experiments, we train with mixed precision, at a batch size of 256, and do not use gradient clipping. We use the AMP library to implement the training process. Model hyperparameters are chosen via grid search. All JANuS models were trained for 256 epochs. Following Santurkar et al. (2022), we use SimCLR augmentations (resize, crop, flip, jitter, blur, grayscale) rather than CLIP augmentations (resize and crop) for model training."