SC2 Benchmark: Supervised Compression for Split Computing
Authors: Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This study introduces supervised compression for split computing (SC2) and proposes new evaluation criteria: minimizing computation on the mobile device, minimizing transmitted data size, and maximizing model accuracy. We conduct a comprehensive benchmark study using 10 baseline methods, three computer vision tasks, and over 180 trained models, and discuss various aspects of SC2. |
| Researcher Affiliation | Academia | Yoshitomo Matsubara EMAIL Department of Computer Science University of California, Irvine Ruihan Yang EMAIL Department of Computer Science University of California, Irvine Marco Levorato EMAIL Department of Computer Science University of California, Irvine Stephan Mandt EMAIL Departments of Computer Science and Statistics University of California, Irvine |
| Pseudocode | No | The paper describes methods and processes in detail, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format. |
| Open Source Code | Yes | We also release our code1 and sc2bench,2 a Python package for future research on SC2. Our proposed metrics and package will help researchers better understand the tradeoffs of supervised compression in split computing. 1https://github.com/yoshitomo-matsubara/sc2-benchmark 2https://pypi.org/project/sc2bench/ |
| Open Datasets | Yes | We use image data with relatively high resolution, including ImageNet (ILSVRC 2012) (Russakovsky et al., 2015), COCO 2017 (Lin et al., 2014), and PASCAL VOC 2012 datasets (Everingham et al., 2012). |
| Dataset Splits | Yes | Specifically, we use ImageNet (ILSVRC 2012) (Russakovsky et al., 2015), that consists of 1.28 million training and 50,000 validation samples... For object detection, we use the COCO 2017 dataset (Lin et al., 2014) to fine-tune the models. The training and validation splits in the COCO 2017 dataset have 118,287 and 5,000 annotated images, respectively. For semantic segmentation, we use the PASCAL VOC 2012 dataset (Everingham et al., 2012) with 1,464 and 1,449 samples for training and validation splits, respectively. |
| Hardware Specification | No | The paper discusses FLOPS and MAC as proxies for computing cost but explicitly states these are not well-defined or static values. While Appendix E shows results using 'encoder FLOPS approximated by PyTorch Profiler,' it also notes, 'Note that the reported FLOPS do not cover all the operations in the encoders as PyTorch Profiler supports only specific operations such as matrix multiplication and 2D convolution at the time of writing.' There is no specific mention of the hardware (e.g., GPU/CPU models, memory) used for conducting the experiments. |
| Software Dependencies | No | This Python package is built on PyTorch (Paszke et al., 2019) and torchdistill (Matsubara, 2021) for reproducible SC2 studies, using CompressAI (Bégaint et al., 2020) and PyTorch Image Models (Wightman, 2019) for neural compression modules/models and reference models, respectively. The paper lists several software packages and their foundational papers, but it does not specify the exact version numbers for these software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | Using the Adam optimizer (Kingma & Ba, 2015), we train the student model on the ImageNet dataset for 20 epochs with the batch size of 32. The initial learning rate is set to 10⁻³ and reduced by a factor of 10 at the end of the 5th, 10th, and 15th epochs... We set the training batch size to 2 and 8 for object detection and semantic segmentation tasks, respectively. The learning rate is reduced by a factor of 10 at the end of the 5th and 15th epochs... At the 1st stage, we train the student model for 10 epochs... We use Adam optimizer with batch size of 64 and an initial learning rate of 10⁻³. The learning rate is decreased by a factor of 10 after the end of the 5th and 8th epochs. Once we finish the 1st stage... we use the stochastic gradient descent (SGD) optimizer with an initial learning rate of 10⁻³, momentum of 0.9, and weight decay of 5e-4. We reduce the learning rate by a factor of 10 after the end of the 5th epoch, and the training batch size is set to 128. The balancing weight α and temperature τ for knowledge distillation are set to 0.5 and 1, respectively. |
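The multi-step learning-rate decay quoted in the experiment-setup cell (initial rate 10⁻³, reduced by a factor of 10 at the end of the 5th, 10th, and 15th epochs) can be sketched in plain Python. This is a minimal illustration of the decay rule only, not the authors' code; the function name and 1-indexed epoch convention are our own assumptions.

```python
def lr_at_epoch(epoch, initial_lr=1e-3, milestones=(5, 10, 15), gamma=0.1):
    """Return the learning rate in effect during a given epoch (1-indexed).

    Hypothetical helper: the rate is multiplied by `gamma` at the END of
    each milestone epoch, matching the ImageNet schedule quoted above.
    """
    decays = sum(1 for m in milestones if epoch > m)
    return initial_lr * gamma ** decays

# Epochs 1-5 run at 1e-3, epochs 6-10 at 1e-4, and so on.
```

In a PyTorch training loop, the same schedule would typically be expressed with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[5, 10, 15], gamma=0.1)`, stepping the scheduler once per epoch; whether the authors used that scheduler class is not stated in the excerpt.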