PEARL: Towards Permutation-Resilient LLMs
Authors: Liang CHEN, Li Shen, Yang Deng, Xiaoyan Zhao, Bin Liang, Kam-Fai Wong
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on synthetic pre-training and real-world instruction tuning tasks demonstrate that PEARL effectively mitigates permutation attacks and enhances performance. |
| Researcher Affiliation | Academia | (1) The Chinese University of Hong Kong; (2) Shenzhen Campus of Sun Yat-sen University; (3) SMU |
| Pseudocode | Yes | Algorithm 1: Adversarial Optimization Algorithm for PEARL |
| Open Source Code | Yes | The code is available at https://github.com/ChanLiang/PEARL. |
| Open Datasets | Yes | We validate our method in two scenarios: (1) pretraining a transformer to in-context learn linear functions (Garg et al., 2022), and (2) instruction tuning of LLMs on the Super-Natural Instructions (Wang et al., 2022). |
| Dataset Splits | Yes | We selected 17 representative tasks, comprising 9 natural language generation (NLG) tasks and 8 natural language understanding (NLU) tasks. Following the methodology of Wang et al. (2022), we randomly designated 4 datasets as held-out test sets and used the remaining 13 datasets for training. Each training dataset contains 150 examples, and each test dataset contains 100 examples, resulting in a training set of 1,950 examples and a test set of 400 examples, as summarized in Table 2. |
| Hardware Specification | Yes | We train the models on the instruction dataset for two epochs using a single NVIDIA A40 GPU, with a batch size of 16, resulting in a total of 246 training steps. |
| Software Dependencies | No | The paper mentions models and optimizers like GPT-2, AdamW, BERT-base, LLaMA3-8B, FLAN-large, and LoRA, but does not provide specific version numbers for any underlying software libraries (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | Key training parameters include a batch size of 128 and 500k training steps. In the PEARL framework, the P-Net is initialized as a BERT-base (Devlin et al., 2019a) and also trained from scratch. ... We train the models on the instruction dataset for two epochs using a single NVIDIA A40 GPU, with a batch size of 16, resulting in a total of 246 training steps. The optimizer used was AdamW. The learning rates for the P-Net and the LLM are set to 1×10⁻⁴ and 3×10⁻⁴, respectively. For the Sinkhorn algorithm, we use 80 iterations, a temperature parameter of 0.1, and an entropy constraint coefficient β = 1.0. Table 6 also lists hyperparameter settings. |