Privacy-Shielded Image Compression: Defending Against Exploitation from Vision-Language Pretrained Models

Authors: Xuelin Shen, Jiayin Xu, Kangsheng Yin, Wenhan Yang

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments across multiple downstream tasks have demonstrated the effectiveness of our design. ... During our experiments, we compare LIC networks both with and without our PSIC implementation to assess perceptual quality and encryption effectiveness. Additionally, we employ a cutting-edge LIC-oriented backdoor attack method to demonstrate the superiority of the proposed PSIC. Furthermore, we conduct comprehensive ablation studies to highlight the effectiveness of our specially designed modules."
Researcher Affiliation | Academia | "1Pengcheng Laboratory, 2Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), 3Shenzhen University. Correspondence to: Wenhan Yang <EMAIL>."
Pseudocode | No | The paper describes the methodology using textual explanations and block diagrams (Figure 2), but does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "Code is available at https://github.com/JiayinXu5499/PSIC."
Open Datasets | Yes | "Image classification: We employ 5,000 images from the ILSVRC 2012 dataset (Deng et al., 2009) (ImageNet-1k). ... Facial attribute analysis: We utilize 10,000 images from the CelebA dataset (Liu et al., 2018)... Image captioning: The entire validation set of the Flickr8k dataset (Hodosh et al., 2013) is employed. ... Human perception: The Kodak dataset (Kodak, 1993) is utilized... trained by randomly selected 70,000 image-text pairs from the CC3M dataset (Sharma et al., 2018)"
Dataset Splits | Yes | "Image classification: We employ 5,000 images from the ILSVRC 2012 dataset (Deng et al., 2009) (ImageNet-1k). ... Facial attribute analysis: We utilize 10,000 images from the CelebA dataset (Liu et al., 2018)... Image captioning: The entire validation set of the Flickr8k dataset (Hodosh et al., 2013) is employed. ... trained by randomly selected 70,000 image-text pairs from the CC3M dataset (Sharma et al., 2018) from scratch for a fair comparison."
Hardware Specification | Yes | "implemented in PyTorch with CUDA support, and trained on a single NVIDIA A8000-80G GPU."
Software Dependencies | No | The paper mentions being "implemented in PyTorch with CUDA support", but does not specify version numbers for PyTorch or CUDA.
Experiment Setup | Yes | "In the first stage of the training process, we use the Adam optimizer with a learning rate of 1e-4 and train for 200 epochs, which will be reduced to 1e-5 and train for an additional 100 epochs for the second stage. Moreover, all models are trained with a batch size of 128, implemented in PyTorch with CUDA support, and trained on a single NVIDIA A8000-80G GPU. They are all trained four times with different Lagrange parameters, obtaining four distinct compression levels regarding bpp."
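The reported setup (two-stage Adam training at 1e-4 then 1e-5, batch size 128, one full run per Lagrange multiplier) can be sketched as a small configuration helper. This is a minimal illustrative sketch, not code from the PSIC repository: the `Stage` class, the `training_runs` helper, and the concrete Lagrange multiplier values are all assumptions, since the paper's quoted text does not list the four lambda values.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One phase of the two-stage schedule (hypothetical structure)."""
    learning_rate: float
    epochs: int


# Stage 1: Adam at lr 1e-4 for 200 epochs; stage 2: lr 1e-5 for 100 more.
SCHEDULE = [Stage(learning_rate=1e-4, epochs=200),
            Stage(learning_rate=1e-5, epochs=100)]
BATCH_SIZE = 128

# Four Lagrange multipliers give four rate-distortion (bpp) levels.
# These values are placeholders; the paper does not state them in the
# quoted setup.
LAGRANGE_MULTIPLIERS = [0.0018, 0.0035, 0.0067, 0.0130]


def training_runs():
    """Yield one full two-stage run configuration per Lagrange multiplier."""
    total_epochs = sum(stage.epochs for stage in SCHEDULE)
    for lam in LAGRANGE_MULTIPLIERS:
        yield {
            "lambda": lam,
            "batch_size": BATCH_SIZE,
            "stages": SCHEDULE,
            "total_epochs": total_epochs,
        }


runs = list(training_runs())
```

Each element of `runs` corresponds to one trained model, so the four runs together produce the four distinct compression levels described above.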