On the Utility of Existing Fine-Tuned Models on Data-Scarce Domains

Authors: Md Ibrahim Ibne Alam, Parikshit Ram, Soham Dan, Horst Samulowitz, Koushik Kar

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we explore different techniques for utilizing these existing DAFT models on data-scarce problems, i.e., tasks for which target-domain data is limited or unavailable. We observe that for zero-shot problems, ensembling of DAFT models provides accuracy close to that of the single best model. With few-shot problems (a few samples from the target domain available), this performance can be improved further by selecting, or assigning more weight to, the DAFT models that are expected to perform better on the target task.
Researcher Affiliation | Collaboration | Md Ibrahim Ibne Alam, Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute; Parikshit Ram, IBM; Soham Dan, Microsoft; Horst Samulowitz, IBM; Koushik Kar, Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute
Pseudocode | No | The paper describes methods (DAFT-EZ, DAFT-E) but does not provide structured pseudocode or algorithm blocks.
Open Source Code | No | The paper refers to using existing fine-tuned models from Hugging Face (HF, 2024c; Kim, 2023) and to creating its own DAFT models by fine-tuning base models. No explicit statement or link is provided for open-sourcing the code developed for this paper's methodology.
Open Datasets | Yes | The datasets for Sentiment Analysis are: Amazon Polarity, Cornell Movie, IMDB, SST2, Tweet Sentiment, and Yelp Polarity; and for Textual Similarity: MRPC, QQP, STS-B (details in Appendix A.1). ... The direct links to download these datasets are: https://huggingface.co/datasets/amazon_polarity, https://www.cs.cornell.edu/people/pabo/movie-review-data/, https://huggingface.co/datasets/imdb, https://huggingface.co/datasets/sst2, https://huggingface.co/datasets/mteb/tweet_sentiment_extraction, https://huggingface.co/datasets/yelp_polarity, https://huggingface.co/datasets/nyu-mll/glue
Dataset Splits | Yes | Let us denote the train and test splits of the target dataset as D_T^train and D_T^test, respectively. ... In our experiments with few-shot fine-tuning, we vary n in the range of 2–256 samples. ... For DAFT-E, each dataset was split in half: one half was used to tune the linear weights of DAFT-E, and the other half was used for performance evaluation. ... Due to the small size of each dataset, we repeated the random split 200 times for each of the 9 datasets and report the average performance.
Hardware Specification | Yes | To fine-tune and train these models we used the Google Colab platform with a T4 GPU-equipped machine.
Software Dependencies | No | The paper mentions using 'SGDRegressor from sklearn.linear_model' and 'RandomForestClassifier from sklearn.ensemble', as well as Hugging Face's 'AutoTokenizer.from_pretrained', but does not provide version numbers for these software libraries or packages.
Experiment Setup | Yes | In our experiments with few-shot fine-tuning, we vary n in the range of 2–256 samples. For the (FFT) models, we fine-tune until loss stabilization. ... For LR we used SGDRegressor from sklearn.linear_model with a maximum iteration of 3. ... We also used coefficient initialization = 1/N, where N is the number of DAFT models used... For RF we imported the RandomForestClassifier from sklearn.ensemble and set max depth = 2. ... For both FT and DA(FT)2 on few-shot training, we performed all the runs five times with five different seeds. For DAFT-E, the weight calculations were also done using five different random seeds.
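The weighting setup quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' released code: the synthetic per-model predictions, labels, and variable names are assumptions, while the quoted hyperparameters (SGDRegressor with max_iter=3 and coefficients initialized to 1/N, RandomForestClassifier with max_depth=2) come from the paper.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.ensemble import RandomForestClassifier

# Hypothetical ensemble inputs: preds[i, j] is DAFT model j's positive-class
# probability on few-shot example i (synthetic data for illustration only).
rng = np.random.default_rng(0)
n_samples, n_models = 64, 5
preds = rng.uniform(size=(n_samples, n_models))
labels = (preds.mean(axis=1) > 0.5).astype(int)  # synthetic binary labels

# Linear weighting of DAFT model outputs: SGDRegressor with maximum
# iteration 3 and coefficients initialized to 1/N, per the quoted setup.
lr = SGDRegressor(max_iter=3, random_state=0)
lr.fit(preds, labels, coef_init=np.full(n_models, 1.0 / n_models))
ensemble_scores = preds @ lr.coef_ + lr.intercept_
lr_preds = (ensemble_scores > 0.5).astype(int)

# Random-forest alternative with max depth 2, per the quoted setup.
rf = RandomForestClassifier(max_depth=2, random_state=0)
rf.fit(preds, labels)
rf_preds = rf.predict(preds)
```

Averaging over several random seeds, as the paper reports doing, would wrap the fitting steps above in a loop over `random_state` values.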