FreCaS: Efficient Higher-Resolution Image Generation via Frequency-aware Cascaded Sampling

Authors: Zhengqiang Zhang, Ruihuang Li, Lei Zhang

ICLR 2025

| Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments demonstrate that FreCaS significantly outperforms state-of-the-art methods in image quality and generation speed. In particular, FreCaS is about 2.86× and 6.07× faster than ScaleCrafter and DemoFusion in generating a 2048×2048 image using a pre-trained SDXL model, and achieves an FIDb improvement of 11.6 and 3.7, respectively. |
| Researcher Affiliation | Collaboration | Zhengqiang Zhang (1,2), Ruihuang Li (1), Lei Zhang (1,2); (1) The Hong Kong Polytechnic University, (2) OPPO Research Institute |
| Pseudocode | No | The paper describes the method using text and mathematical equations, but it does not include a clearly labeled pseudocode block or algorithm section. |
| Open Source Code | Yes | The source code of FreCaS can be found at https://github.com/xtudbxk/FreCaS. |
| Open Datasets | Yes | "We randomly select 10K, 5K, and 1K prompts from the LAION-5B aesthetic subset for generating images of 1024×1024, 2048×2048, and 4096×4096, respectively." |
| Dataset Splits | Yes | "We randomly select 10K, 5K, and 1K prompts from the LAION-5B aesthetic subset for generating images of 1024×1024, 2048×2048, and 4096×4096, respectively." |
| Hardware Specification | Yes | "We measure the model latency on a single NVIDIA A100 GPU with a batch size of 1." |
| Software Dependencies | No | The paper mentions using specific samplers, such as the DDIM sampler and a flow-matching-based Euler solver, but does not provide version numbers for these or for software libraries such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | "We follow the default settings and perform a 50-step sampling process with the DDIM sampler for SD2.1 and SDXL, and a 28-step sampling process with a flow-matching-based Euler solver for SD3. For 4× experiments, we employ two sampling stages at the training size and the target size, respectively. For 16× experiments, we employ three sampling stages at the training size, 4× the training size, and 16× the training size, respectively. More details can be found in Appendix B." |
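The staging scheme in the Experiment Setup row (two stages for 4× targets, three stages for 16× targets, with area quadrupling per stage) can be sketched as follows. This is a minimal illustration, not code from the paper; the helper name `cascade_stages` and its interface are hypothetical, and the upscale factor is interpreted as an *area* ratio relative to the training resolution, so the side length doubles at each stage.

```python
def cascade_stages(train_size, upscale):
    """Hypothetical sketch of FreCaS-style cascaded stage resolutions.

    train_size: side length of the square training resolution (e.g. 1024).
    upscale: target area ratio relative to the training area (4 or 16).
    Each stage quadruples the area, i.e. doubles the side length.
    """
    stages, factor = [], 1
    while factor <= upscale:
        side = int(train_size * factor ** 0.5)  # area factor -> side scale
        stages.append((side, side))
        factor *= 4
    return stages

# 4x target: two stages (training size, target size)
print(cascade_stages(1024, 4))   # [(1024, 1024), (2048, 2048)]
# 16x target: three stages (training size, 4x, 16x)
print(cascade_stages(1024, 16))  # [(1024, 1024), (2048, 2048), (4096, 4096)]
```

This reproduces the configurations quoted above for a 1024×1024 training resolution: two stages ending at 2048×2048 for 4× experiments, and three stages ending at 4096×4096 for 16× experiments.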