SSHR: More Secure Generative Steganography with High-Quality Revealed Secret Images

Authors: Jiannian Wang, Yao Lu, Guangming Lu

ICML 2025

Reproducibility Assessment
Each entry below lists the variable, the assessed result, and the supporting LLM response (verbatim paper excerpts in quotation marks).
Research Type: Experimental
"Various experiments indicate that our model outperforms existing methods in terms of recovery quality and secret image security." "We conduct extensive experiments to demonstrate that our model surpasses existing methods in terms of stego image quality and security, as well as the quality of the revealed secret images." The paper includes a dedicated "4. Experiments" section with "Datasets and settings", "Benchmarks", "Evaluation Metrics", "Quantitative results", "Qualitative Results", and "Security Analysis" subsections.
Researcher Affiliation: Academia
"1Department of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, China. Correspondence to: Yao Lu <EMAIL>, Guangming Lu <EMAIL>." All authors are affiliated with Harbin Institute of Technology (Shenzhen), and the email domains are ".edu.cn", indicating an academic affiliation.
Pseudocode: No
The paper describes the conceal and reveal processes using mathematical equations (e.g., Equation 12 for conceal, Equation 14 for reveal) and textual descriptions, but it does not include a distinct section or figure explicitly labeled "Pseudocode" or "Algorithm".
Open Source Code: No
The paper states "Our model is implemented with PyTorch" in the "Experimental Setting" section but does not provide any explicit statement about code release or a link to a code repository.
Open Datasets: Yes
"Our model is implemented with PyTorch and trained on the DIV2K (Agustsson & Timofte, 2017) training dataset. The evaluation is performed on the DIV2K (Agustsson & Timofte, 2017) test dataset (100 images), COCO (Lin et al., 2014) (5,000 images), ImageNet (Russakovsky et al., 2015) (10,000 images), and UniStega (Yang et al., 2024) (100 images) at a resolution of 256 × 256."
Dataset Splits: Yes
"Our model is implemented with PyTorch and trained on the DIV2K (Agustsson & Timofte, 2017) training dataset. The evaluation is performed on the DIV2K (Agustsson & Timofte, 2017) test dataset (100 images), COCO (Lin et al., 2014) (5,000 images), ImageNet (Russakovsky et al., 2015) (10,000 images), and UniStega (Yang et al., 2024) (100 images) at a resolution of 256 × 256. Training images are randomly cropped to 256 × 256 and augmented with random horizontal and vertical flips. Comparatively, images in the DIV2K dataset are center-cropped, while in the other datasets, the images are resized to 256 × 256."
Hardware Specification: Yes
"All experiments are conducted on a Nvidia 4090 GPU."
Software Dependencies: No
"Our model is implemented with PyTorch and trained on the DIV2K (Agustsson & Timofte, 2017) training dataset. The AdamW optimizer with an initial learning rate of 1 × 10^-4 is used for training." The paper mentions software such as PyTorch and the AdamW optimizer but does not specify their version numbers.
Experiment Setup: Yes
"Training images are randomly cropped to 256 × 256 and augmented with random horizontal and vertical flips. The AdamW optimizer with an initial learning rate of 1 × 10^-4 is used for training." The total loss L_Total is defined as the weighted sum of the frequency loss L_F and the perceptual loss L_P, formulated as L_Total = λ1 L_F + λ2 L_P, where the trade-off parameters λ1 and λ2 are set to 2.0 and 1.0, respectively, to balance the different losses.
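The training objective and optimizer settings quoted above can be sketched as follows. This is a minimal illustration under stated assumptions: `total_loss` only combines the two loss terms, and the stand-in `model` and the loss inputs are placeholders, not the paper's network or its frequency/perceptual loss implementations.

```python
import torch

# Trade-off weights from the paper: lambda1 = 2.0 (frequency), lambda2 = 1.0 (perceptual).
LAMBDA1, LAMBDA2 = 2.0, 1.0

def total_loss(freq_loss: torch.Tensor, perc_loss: torch.Tensor) -> torch.Tensor:
    """L_Total = lambda1 * L_F + lambda2 * L_P."""
    return LAMBDA1 * freq_loss + LAMBDA2 * perc_loss

# Stand-in module; the actual SSHR architecture is not public.
model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)

# AdamW with the quoted initial learning rate of 1e-4.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```

With `freq_loss = 0.5` and `perc_loss = 0.25`, `total_loss` returns 2.0 × 0.5 + 1.0 × 0.25 = 1.25, matching the stated weighting.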