GeoLoRA: Geometric integration for parameter efficient fine-tuning

Authors: Steffen Schotthöfer, Emanuele Zangrando, Gianluca Ceruti, Francesco Tudisco, Jonas Kusch

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments on the GLUE benchmark, Vision Transformers, and Stable Diffusion, we show that GeoLoRA outperforms existing PEFT methods both in terms of accuracy and computational efficiency.
Researcher Affiliation | Collaboration | 1 Computer Science and Mathematics Division, Oak Ridge National Laboratory, USA; 2 Gran Sasso Science Institute, L'Aquila, Italy; 3 Department of Mathematics, University of Innsbruck, Austria; 4 School of Mathematics and Maxwell Institute, University of Edinburgh, UK; 5 Miniml.AI Ltd, UK; 6 Department of Data Science, Norwegian University of Life Sciences, Norway
Pseudocode | Yes | Algorithm 1: Single iteration of GeoLoRA. The functions optimizer_step, basis_augmentation, and truncation are detailed in Algorithm 2 in the appendix. Input: initial orthonormal bases U, V ∈ R^{n×r} and diagonal S ∈ R^{r×r}; τ: singular-value threshold for rank truncation; λ: learning rate. Algorithm 2: Various auxiliary functions.
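The truncation step named in the quoted algorithm can be sketched as follows. This is an illustrative reconstruction based only on the inputs listed above (orthonormal bases U, V, small core S, threshold τ), not the paper's actual code; the function name `truncate` and the choice of an SVD on the r×r core are assumptions.

```python
import numpy as np

def truncate(U, S, V, tau):
    """Sketch of a rank-truncation step: keep only singular values of the
    small r x r core S above the threshold tau. Hypothetical implementation,
    not the paper's exact routine."""
    # SVD of the core matrix is cheap: S is only r x r, independent of n.
    P, sigma, Qt = np.linalg.svd(S)
    # New rank = number of singular values above the threshold (at least 1).
    r_new = max(1, int(np.sum(sigma > tau)))
    # Rotate the bases into the singular directions, then truncate.
    # U_new and V_new remain orthonormal because P and Qt are orthogonal.
    U_new = U @ P[:, :r_new]
    V_new = V @ Qt.T[:, :r_new]
    S_new = np.diag(sigma[:r_new])
    return U_new, S_new, V_new
```

The point of operating on S rather than on the full n×n product U S Vᵀ is that the SVD cost stays O(r³), which matches the low-rank efficiency claims quoted elsewhere in this review.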
Open Source Code | No | The paper states "Link to source code" in the header, but does not provide an actual link or an explicit statement of code release for the methodology described in this paper. It mentions using existing open-source implementations for other methods, but not its own.
Open Datasets | Yes | Through extensive experiments on the GLUE benchmark, Vision Transformers, and Stable Diffusion, we show that GeoLoRA outperforms existing PEFT methods both in terms of accuracy and computational efficiency. DeBERTa for GLUE: We evaluate the performance of GeoLoRA by fine-tuning the 183-million-parameter transformer DeBERTaV3-base (He et al., 2023) on the GLUE benchmark (Wang et al., 2019). Vision transformer for object classification: We compare GeoLoRA and AdaLoRA on fine-tuning the ViT-base-patch16-224 Vision Transformer, pre-trained on the ImageNet-1k dataset and fine-tuned on CIFAR-10, CIFAR-100, and Tiny-ImageNet. DreamBooth Stable Diffusion: We test GeoLoRA on fine-tuning Stable Diffusion (Rombach et al., 2021) using DreamBooth (Ruiz et al., 2023) on their original datasets.
Dataset Splits | Yes | Table 5: Summary of GLUE benchmark tasks.
  Corpus | Task | #Train | #Dev | #Test | #Label | Metrics
  Single-Sentence Classification (GLUE):
  CoLA | Acceptability | 8.5k | 1k | 1k | 2 | Matthews corr.
  SST | Sentiment | 67k | 872 | 1.8k | 2 | Accuracy
  Pairwise Text Classification (GLUE):
  MNLI | NLI | 393k | 20k | 20k | 3 | Accuracy
  RTE | NLI | 2.5k | 276 | 3k | 2 | Accuracy
  QQP | Paraphrase | 364k | 40k | 391k | 2 | F1
  MRPC | Paraphrase | 3.7k | 408 | 1.7k | 2 | Accuracy
  QNLI | QA/NLI | 108k | 5.7k | 5.7k | 2 | Accuracy
Hardware Specification | No | The paper evaluates computational efficiency and training speed but does not specify the hardware used for the experiments, such as GPU models, CPU types, or memory amounts.
Software Dependencies | No | The paper mentions using "Hugging Face open source implementations" for reference methods and specifies optimizer parameters such as AdamW (β1, β2), but does not provide version numbers for any software libraries or programming languages used in its own implementation.
Experiment Setup | Yes | Table 6: Hyperparameter setup for the GLUE benchmark. Learning rate, batch size, and number of epochs are adopted from the GitHub repository of AdaLoRA. Table 7: Hyperparameter setup for fine-tuning the ViT-base-patch16-224 vision transformer with GeoLoRA. AdaLoRA uses the same hyperparameters and the same rank budget for the global truncation as GeoLoRA.