Fast Direct: Query-Efficient Online Black-box Guidance for Diffusion-model Target Generation
Authors: Kim Yong Tan, Yueming Lyu, Ivor Tsang, Yew-Soon Ong
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on twelve high-resolution (1024×1024) image target generation tasks and six 3D-molecule target generation tasks show 6× up to 10× query efficiency improvement and 11× up to 44× query efficiency improvement, respectively. |
| Researcher Affiliation | Academia | (1) College of Computing and Data Science, Nanyang Technological University, Singapore; (2) Centre for Frontier AI Research, Agency for Science, Technology and Research, Singapore; (3) Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore |
| Pseudocode | Yes | Algorithm 1: Guided Noise Sequence Optimization. Input: step size α, target data x*, number of diffusion sampling steps K, pre-trained diffusion sampler Sθ, number of iterations T. Output: generated target x_K. ... Algorithm 2: Fast Direct. Input: max number of batch queries N, batch size B, step size α, number of diffusion sampling steps K, pre-trained diffusion sampler Sθ. Output: set of optimized samples {x_K^1, ..., x_K^B}, pseudo target model that learns from dataset D. |
| Open Source Code | Yes | Our implementation is publicly available at: https://github.com/kimyong95/guide-stable-diffusion/tree/fast-direct |
| Open Datasets | Yes | We use TargetDiff (Guan et al., 2023) as the backbone molecule generative model. It is pre-trained on the CrossDocked2020 dataset (Francoeur et al., 2020)... For aesthetic quality, the aesthetic score is evaluated by the pre-trained LAION aesthetics predictor (?), which is trained on human ratings of the images' aesthetic quality... |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits for its own experimental methodology. It refers to '12 tasks' or '6 tasks' based on prompts/protein receptors, but these are not dataset splits in the conventional sense for reproducibility of data partitioning. |
| Hardware Specification | No | The paper mentions general concepts like 'GPU memory' but does not provide specific hardware details (e.g., exact GPU models, CPU specifications, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions several software components, models, and schedulers, such as 'SDXL-Lightning', 'Gemini 1.5 Flash model (code: gemini-1.5-flash-001)', 'Euler Discrete Scheduler', 'DDIMScheduler', and 'AutoDock Vina' simulation software. While some have associated citations, none are given with specific version numbers in the format typically expected for software dependencies (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | We use N = 50 iterations to utilize the 50 batch query budget, and set the batch size as B = 32 and the step size as α = 80. We use the Euler Discrete Scheduler (Karras et al., 2022) sampler... For molecule tasks, we use N = 50 iterations... and set the batch size as B = 32 and the step size as α = 10^-2. |
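The Guided Noise Sequence Optimization pseudocode quoted above (Algorithm 1) can be illustrated with a minimal sketch. This is not the paper's exact update rule: `toy_sampler` is a hypothetical stand-in for the pre-trained diffusion sampler Sθ, and the noise update (nudging every noise in the sequence toward the target residual x* − x_K) is an illustrative simplification of the guided update, shown only to make the loop structure of the pseudocode concrete.

```python
import numpy as np

def toy_sampler(noises, x0):
    """Hypothetical stand-in for the pre-trained diffusion sampler S_theta:
    each 'denoising' step blends the current state with that step's noise."""
    x = x0
    for z in noises:
        x = 0.9 * x + 0.1 * z
    return x

def guided_noise_sequence_optimization(x_target, K=10, T=100, alpha=0.5, dim=4, seed=0):
    """Sketch of Algorithm 1's loop: run the sampler with the current noise
    sequence, then directly update every noise toward the target residual."""
    rng = np.random.default_rng(seed)
    noises = [rng.standard_normal(dim) for _ in range(K)]  # noise sequence z_1..z_K
    x0 = rng.standard_normal(dim)                          # initial state
    for _ in range(T):
        x_K = toy_sampler(noises, x0)        # generate with current noises
        residual = x_target - x_K            # direction toward the target x*
        # Illustrative direct update (not the paper's exact rule):
        noises = [z + alpha * residual for z in noises]
    return toy_sampler(noises, x0)
```

With the toy sampler above, repeated updates contract the residual, so the generated sample converges toward `x_target`; in the paper's setting the sampler is a real diffusion model and queries to a black-box target score replace the direct residual.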