WeGeFT: Weight-Generative Fine-Tuning for Multi-Faceted Efficient Adaptation of Large Models

Authors: Chinmay Savadikar, Xi Song, Tianfu Wu

ICML 2025

Reproducibility Checklist (each item lists the variable, the assessed result, and the supporting LLM response):
Research Type: Experimental. "Extensive experiments on commonsense reasoning, arithmetic reasoning, instruction following, code generation, and visual recognition verify the effectiveness of our proposed WeGeFT."
Researcher Affiliation: Academia. "1Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, USA; 2Independent Researcher. Correspondence to: Chinmay Savadikar <EMAIL>, Tianfu Wu <EMAIL>."
Pseudocode: No. The paper describes the methodology using mathematical formulations (e.g., Eqns. 1, 3, and 5–13) and textual descriptions, but does not include a dedicated pseudocode block or a clearly labeled algorithm.
Open Source Code: Yes. Code: https://savadikarc.github.io/wegeft
Open Datasets: Yes. "We conduct extensive experiments across Natural Language Generation and Visual Recognition... on the Math10k benchmark (Hu et al., 2023)... MetaMathQA (Yu et al., 2024)... GSM8k test set (Cobbe et al., 2021)... Commonsense170k (BoolQ (Clark et al., 2019), PIQA (Bisk et al., 2020), SIQA (Sap et al., 2019), HellaSwag (Zellers et al., 2019), WinoGrande (Sakaguchi et al., 2021), ARC-e and ARC-c (Clark et al., 2018), and OBQA (Mihaylov et al., 2018) datasets)... WizardLM dataset (Xu et al., 2024)... MT-Bench dataset (Zheng et al., 2023)... Code-Feedback dataset (Zheng et al., 2024)... HumanEval (Chen et al., 2021)... VTAB-1k benchmark (Zhai et al., 2019)... Caltech-UCSD Birds (Wah et al., 2011), NABirds (Horn et al., 2015), Oxford Flowers (Nilsback & Zisserman, 2008), Stanford Cars (Gebru et al., 2017), and Stanford Dogs (Khosla et al., 2011)... ImageNet-21k dataset (Deng et al., 2009)."
Dataset Splits: Yes. "On the Math10k, we follow (Wu et al., 2024), and tune the hyperparameters by fine-tuning the LLaMA-1 (7B) model on the GSM8k dataset (Cobbe et al., 2021) using a separate validation set constructed from the training set... We use the same train, validation, and test splits as (Shi et al., 2023), except for the Stanford Cars dataset... we create our own training and validation split (with the same number of images as (Shi et al., 2023)) and use the official testing split."
Hardware Specification: Yes. "All our experiments are run on a single NVIDIA A100 GPU."
Software Dependencies: No. The paper mentions a Hugging Face PEFT-based implementation, the timm package, and the AdamW optimizer in the context of the experiments, but does not provide specific version numbers for these software components.
Experiment Setup: Yes. The paper includes multiple tables detailing hyperparameters and experimental settings, such as Table 12 ("Hyperparameters used for the Math10k experiments"), Table 13 ("Hyperparameters used for fine-tuning on MetaMathQA and evaluating on GSM8k"), Table 14 ("Hyperparameters used for the commonsense reasoning experiments"), and Table 17 ("Hyperparameter search space used for FGVC experiments"). These tables specify values for parameters such as max sequence length, optimizer, learning rate, batch size, epochs, rank, scaling factor, warmup ratio, dropout, and fine-tuned layers.
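The Dataset Splits item above describes carving a held-out validation set out of the training data when no official validation split exists (as for the GSM8k hyperparameter tuning and the Stanford Cars split). A minimal stdlib sketch of that procedure, assuming a deterministic seeded shuffle; the function name, the 10% holdout fraction, and the seed are illustrative assumptions, not the paper's exact values:

```python
import random

def make_train_val_split(examples, val_fraction=0.1, seed=0):
    """Hold out a validation set from a training set (illustrative sketch)."""
    # Shuffle indices deterministically so the split is reproducible.
    rng = random.Random(seed)
    indices = list(range(len(examples)))
    rng.shuffle(indices)
    # The first val_fraction of the shuffled order becomes validation;
    # the remainder stays as the new (smaller) training set.
    n_val = int(len(examples) * val_fraction)
    val = [examples[i] for i in indices[:n_val]]
    train = [examples[i] for i in indices[n_val:]]
    return train, val

# Example: split a toy "training set" of 100 items 90/10.
train, val = make_train_val_split(list(range(100)), val_fraction=0.1, seed=0)
```

Reusing the same seed reproduces the split exactly, which is what makes such a construction reportable in a reproducibility checklist.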