Object-aware Cropping for Self-Supervised Learning
Authors: Shlok Kumar Mishra, Anshul Shah, Ankan Bansal, Janit K Anjaria, Abhyuday Narayan Jagannatha, Abhishek Sharma, David Jacobs, Dilip Krishnan
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a number of experiments incorporating object-aware cropping, finding consistent improvements on state-of-the-art self-supervised methods such as MoCo-v2 Chen et al. (2020b), BYOL Grill et al. (2020) and DenseCL Wang et al. (2021) across varied datasets and tasks. For example, for pre-training on Open Images, our approach achieves an improvement of 8.8% mAP over random scene cropping (both methods using MoCo-v2). We also show significant improvements on COCO and PASCAL-VOC object detection and segmentation tasks over the state-of-the-art self-supervised learning approaches. |
| Researcher Affiliation | Collaboration | 1University of Maryland, College Park, 2Johns Hopkins University, 3University of Massachusetts Amherst, 4Google Research |
| Pseudocode | No | The paper describes the proposed approach and loss functions in detail using descriptive text and mathematical formulas, but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its own source code or a link to a code repository. It mentions using 'Detectron2 Wu et al. (2019)' for evaluation protocols, but this is a third-party tool. |
| Open Datasets | Yes | We conduct a number of experiments... across varied datasets and tasks. For example, for pre-training on Open Images... We also show significant improvements on COCO and PASCAL-VOC object detection and segmentation tasks... We also propose the Open Images Hard Multi-object Subset (OHMS), which is a balanced subset of Open Images... We also experiment with the complete Open Images (1.9 million images). In addition, we perform pre-training on ImageNet Deng et al. (2009) and MS-COCO Lin et al. (2014). |
| Dataset Splits | Yes | For VOC object detection, we evaluate the Faster R-CNN (C4-backbone) Ren et al. (2015) detector on the VOC trainval07+12 dataset using the standard 1× protocol. For COCO object detection and semantic segmentation, we fine-tune the Mask R-CNN detector (FPN-backbone) He et al. (2018) on the COCO train2017 split (118k images) with the standard 1× schedule, evaluating on the COCO 5k val2017 split. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'Detectron2 Wu et al. (2019)' for evaluation protocols, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For pre-training with MoCo-v2, we closely follow the standard protocol described in Chen et al. (2020b). ImageNet baselines have been trained for 100 epochs Shah et al. (2021), and Open Images models have been trained for the same 100 ImageNet-equivalent epochs. All other methods are run for 90K fine-tuning iterations. All SSL models have been pre-trained on the complete Open Images dataset (1.9 million images) for 75 epochs and then fine-tuned on the COCO and VOC datasets. For the full dataset, pre-training has been done for 100 epochs, and for the random subset, pre-training has been done for 200 epochs. |
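As the Pseudocode row notes, the paper describes object-aware cropping only in prose. The sketch below is a minimal, hypothetical illustration of the core idea: instead of a random scene crop, sample the SSL crop window around a detected object box so the object stays in view. The function name, box format `(x0, y0, x1, y1)`, and jitter/scale heuristics are assumptions for illustration, not the paper's exact procedure.

```python
import random

def object_aware_crop(img_w, img_h, box, scale=(0.8, 1.2), seed=None):
    """Sample a crop window around an object box (x0, y0, x1, y1).

    Hypothetical sketch: the object box is randomly rescaled and its
    centre jittered, then the window is clamped to the image bounds,
    so the sampled crop always overlaps the object.
    """
    rng = random.Random(seed)
    x0, y0, x1, y1 = box
    bw, bh = x1 - x0, y1 - y0
    # Rescale the box to vary how much scene context surrounds the object.
    s = rng.uniform(*scale)
    cw, ch = bw * s, bh * s
    # Jitter the crop centre by up to a quarter of the box size.
    cx = (x0 + x1) / 2 + rng.uniform(-bw / 4, bw / 4)
    cy = (y0 + y1) / 2 + rng.uniform(-bh / 4, bh / 4)
    # Clamp the window so it stays inside the image.
    nx0 = max(0.0, min(cx - cw / 2, img_w - cw))
    ny0 = max(0.0, min(cy - ch / 2, img_h - ch))
    return (nx0, ny0, min(nx0 + cw, img_w), min(ny0 + ch, img_h))
```

In an actual pipeline, two such crops of the same object would be fed to the two branches of MoCo-v2/BYOL in place of two random scene crops.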