Self-Supervised Visual Representation Learning for Medical Image Analysis: A Comprehensive Survey

Authors: Siladittya Manna, Saumik Bhattacharya, Umapada Pal

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Additionally, we also present an exhaustive review of the self-supervised methods applied to medical image analysis. Furthermore, we also present an extensive compilation of the details of the datasets used in the different works and provide performance metrics of some notable works on image and video datasets."
Researcher Affiliation | Academia | Siladittya Manna (EMAIL), Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata; Saumik Bhattacharya (EMAIL), Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology Kharagpur; Umapada Pal (EMAIL), Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata
Pseudocode | No | The paper describes various algorithms and frameworks conceptually and through visual taxonomies and illustrations (e.g., Figures 1-5). However, it does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks, nor does it present structured steps in a code-like format.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the methodology described, nor does it include a link to a code repository. The OpenReview link provided is for the paper's review process, not its code.
Open Datasets | Yes | "In Table 1, we document the different datasets used for pre-training and downstream performance evaluation for each domain covered in DABS." Table 1 (Summary of datasets used in DABSv1 benchmarking for SSL) lists, for the Natural Images domain: pre-training on ImageNet (Deng et al., 2009); downstream evaluation on the FGVC-Aircraft dataset (Maji et al., 2013), the Caltech-UCSD Birds dataset (Wah et al., 2011), the German Traffic Sign Recognition Benchmark dataset (Houben et al., 2013), the Describable Textures Dataset (Cimpoi et al., 2014), the VGG Flower Dataset (Nilsback & Zisserman, 2008), and the CIFAR-10 dataset (Krizhevsky, 2009).
Dataset Splits | Yes | "In Table 1, we document the different datasets used for pre-training and downstream performance evaluation for each domain covered in DABS. All the frameworks mentioned in Table 14 are pre-trained on the ImageNet-1K (Deng et al., 2009) dataset for a varying number of epochs as per their respective needs. We have attempted to present a comparative analysis using both linear classification (using a frozen encoder) and fine-tuning Top-1 accuracy on the ImageNet-1K dataset. We also present our findings on the PASCAL VOC dataset for object classification (mAP), detection (AP50) and segmentation (mIoU) tasks. Unless otherwise mentioned, the default dataset for PASCAL VOC tasks is VOC2007. There are a few frameworks that opt for the VOC2012 and VOC07+12 versions of the dataset for fine-tuning."
Hardware Specification | No | The paper is a comprehensive survey that analyzes existing research. It does not describe any new experiments or computational work performed by the authors that would require specific hardware; accordingly, no hardware specifications are mentioned.
Software Dependencies | No | The paper explicitly states: "It is to be noted that, no part of this review was written or edited using large language models like ChatGPT." Although ChatGPT is mentioned, this statement describes what was not used rather than declaring a versioned software dependency. As a survey, the paper reviews and compiles information from other works rather than performing new computational experiments that would require specific software dependencies.
Experiment Setup | No | The paper is a comprehensive survey and does not conduct new experiments. It reviews and compiles the experimental setups and results of other research papers, and therefore does not report its own hyperparameters, model initialization, or training schedules.