Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization
Authors: Jeonghoon Kim, Jung Hyun Lee, Sungdong Kim, Joonsuk Park, Kang Min Yoo, Se Jung Kwon, Dongsoo Lee
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically validate the effectiveness of our proposed PEQA method by examining its performance in both parameter-efficient fine-tuning (PEFT) and as a quantization method. We achieve this goal by using a series of benchmarks [52, 57], datasets [51, 58, 59], and LLMs [4, 6, 60, 61] that have been publicly introduced. |
| Researcher Affiliation | Collaboration | Jeonghoon Kim (NAVER Cloud), Jung Hyun Lee (NAVER Cloud), Sungdong Kim (NAVER Cloud, KAIST AI), Joonsuk Park (NAVER Cloud, NAVER AI Lab, University of Richmond), Kang Min Yoo (NAVER Cloud, SNU AI Center), Se Jung Kwon (NAVER Cloud), Dongsoo Lee (NAVER Cloud) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | We utilize the Huggingface repository [66] for training, evaluation code and dataset. |
| Open Datasets | Yes | We fine-tune and assess LLMs on the Wikitext2 [51] and Penn Tree Bank (PTB) [58] datasets using PEQA and LoRA [21]. |
| Dataset Splits | No | The paper does not provide specific data split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning. |
| Hardware Specification | Yes | To provide a clear understanding of these benefits, we conducted tests using a single NVIDIA A100-80GB GPU and the causal language modeling code from the Hugging Face repository. |
| Software Dependencies | No | For the common experimental settings, AdamW [64] optimizer and linear-decaying learning rate scheduler were used. We use the Deepspeed repository [65] for FP16 and BF16 training. Additionally, we utilize the Huggingface repository [66] for training, evaluation code and dataset. |
| Experiment Setup | Yes | Batch size and epoch for all experiments are set to 128 and 15 respectively. The learning rates for the experiments of Table 2 are displayed in Table 8. |
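The common settings quoted in the table (AdamW optimizer, linear-decaying learning rate scheduler, batch size 128, 15 epochs) imply a learning-rate schedule of the following shape. This is a minimal sketch: `LR0` and `steps_per_epoch` are illustrative placeholders, not values from the paper (the per-experiment learning rates are reported in the paper's Table 8).

```python
def linear_decay_lr(lr0: float, step: int, total_steps: int) -> float:
    """Linearly decay the learning rate from lr0 down to 0 over total_steps."""
    return lr0 * max(0.0, 1.0 - step / total_steps)

BATCH_SIZE, EPOCHS = 128, 15   # values reported in the paper
LR0 = 2e-4                     # placeholder; actual LRs are per-experiment (Table 8)
steps_per_epoch = 100          # illustrative; depends on dataset size / batch size
total_steps = EPOCHS * steps_per_epoch

# Learning rate used at each optimizer step over the full run.
schedule = [linear_decay_lr(LR0, t, total_steps) for t in range(total_steps)]
```

The schedule starts at `LR0` and decreases linearly to zero, matching the "linear-decaying learning rate scheduler" described in the common experimental settings.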
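The "sub-4-bit integer quantization" named in the title can be illustrated with a generic round-to-nearest integer quantizer at 3 bits. This is a hedged sketch of the general technique only; the row-wise layout, function names, and asymmetric min/max scheme here are assumptions for illustration, not PEQA's exact quantization method, which is defined in the paper.

```python
def quantize_rows(w, bits=3):
    """Asymmetric round-to-nearest integer quantization with one (scale, offset)
    pair per row. Generic illustration of sub-4-bit quantization, not PEQA itself."""
    levels = 2 ** bits - 1  # e.g. 7 representable steps at 3 bits
    out = []
    for row in w:
        lo, hi = min(row), max(row)
        scale = (hi - lo) / levels or 1.0  # avoid zero scale for constant rows
        q = [min(levels, max(0, round((x - lo) / scale))) for x in row]
        out.append((q, scale, lo))        # integers in [0, levels] plus metadata
    return out

def dequantize_rows(qrows):
    """Reconstruct approximate floating-point weights from integer codes."""
    return [[v * scale + lo for v in q] for q, scale, lo in qrows]
```

Round-to-nearest bounds the per-weight reconstruction error by half a quantization step (`scale / 2`), which is why the sketch below checks exactly that.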