Collaborative Discrete-Continuous Black-Box Prompt Learning for Language Models

Authors: Hualin Zhang, Haozhen Zhang, Zhekai Liu, Bin Gu, Yi Chang

ICLR 2025

Reproducibility

Variable / Result / LLM Response
Research Type / Experimental / The experiments on different datasets demonstrate significant improvements in various tasks compared to the baselines. Through extensive experiments on various datasets, we demonstrate that ZO-PoG significantly improves the performance of PTMs.
Researcher Affiliation / Academia / (1) Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence; (2) School of Artificial Intelligence, Jilin University; (3) School of Mathematics, Jilin University; (4) International Center of Future Science, Jilin University; (5) Engineering Research Center of Knowledge-Driven Human-Machine Intelligence, MOE
Pseudocode / Yes / Algorithm 1: Black-Box Prompt Learning via Zeroth-Order and Policy Gradient Method
Open Source Code / Yes / Our code is available at: https://github.com/zhanghualin0/ZO-PoG.
Open Datasets / Yes / For performance evaluation, we chose 5 commonly utilized datasets from the GLUE benchmark (Wang et al., 2018): CoLA (Warstadt et al., 2018), MNLI (Williams et al., 2017), QNLI (Wang et al., 2019), SNLI (Bowman et al., 2015), and WNLI (Levesque et al., 2012). These datasets encompass various typical language understanding tasks such as natural language inference. ... we conducted experiments on a challenging mathematical problem-solving dataset GSM8K (Cobbe et al., 2021) in the 64-shot setting.
Dataset Splits / Yes / All experiments are performed under the few-shot learning setting. We assemble the training and development sets by randomly selecting m instances for each class from the original training data. ... 16-shot (per class) setting ... 64-shot setting.
Hardware Specification / Yes / The experiments are executed on a cluster of NVIDIA A40 GPUs.
Software Dependencies / No / We employ RoBERTa-large (Liu et al., 2019), GPT2-XL (Radford et al., 2019), and Llama3 (AI@Meta, 2024) as our backbone models, and all pre-trained weights are sourced directly from Hugging Face.
Experiment Setup / Yes / Comprehensive details of the input templates and hyperparameters used in our experiments can be found in Appendix B. ... Table 6: Main hyperparameters used in our algorithms.
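The pseudocode row names a zeroth-order (black-box) gradient component. As background, a minimal sketch of the standard two-point zeroth-order gradient estimator that such methods build on is shown below; this is a generic illustration, not the paper's Algorithm 1, and the function names (`zo_gradient`, `mu`, `n_samples`) are ours.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, n_samples=100, rng=None):
    """Two-point zeroth-order gradient estimator.

    Approximates grad f(x) using only function evaluations, averaging
    the directional estimate (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u
    over random Gaussian directions u. No backpropagation through f
    is needed, which is why such estimators suit black-box models.
    """
    rng = np.random.default_rng(rng)
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / n_samples

# Illustration on a simple quadratic loss, where the true gradient is 2x.
f = lambda x: float(np.sum(x ** 2))
x = np.array([1.0, -2.0])
g_hat = zo_gradient(f, x, n_samples=500, rng=0)
# g_hat should be close to the true gradient [2.0, -4.0].
```

In a prompt-learning setting, `f` would be the task loss returned by querying the black-box language model with the current prompt parameters `x`; the estimator trades extra forward queries for the ability to optimize without model gradients.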