Rethinking the Bias of Foundation Model under Long-tailed Distribution
Authors: Jiahao Chen, Bin Qin, Jiangmeng Li, Hao Chen, Bing Su
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we examine how such imbalances from pre-training affect long-tailed downstream tasks. Specifically, we find the imbalance biases inherited in foundation models on downstream tasks as parameter imbalance and data imbalance. ... We achieve at least 1.5%, 1.5%, 2.0% performance gains on ImageNet-LT (Deng et al., 2009), Places365-LT (Liu et al., 2019), and iNaturalist2018 (Van Horn et al., 2018) compared with state-of-the-art methods. |
| Researcher Affiliation | Academia | 1Gaoling School of Artificial Intelligence, Renmin University of China 2Beijing Key Laboratory of Research on Large Models and Intelligent Governance 3Engineering Research Center of Next-Generation Intelligent Search and Recommendation, MOE 4Institute of Software Chinese Academy of Sciences 5University of Chinese Academy of Sciences 6Electrical and Computer Engineering, Carnegie Mellon University. Correspondence to: Bing Su <EMAIL>. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing code or a link to a code repository. |
| Open Datasets | Yes | We achieve at least 1.5%, 1.5%, 2.0% performance gains on ImageNet-LT (Deng et al., 2009), Places365-LT (Liu et al., 2019), and iNaturalist2018 (Van Horn et al., 2018) compared with state-of-the-art methods. |
| Dataset Splits | No | Following OLTR (Liu et al., 2019), we split the classes into three groups named D-Many, D-Medium, and D-Few relying on the number of samples. Similarly, for parameter imbalance, we split the classes into three groups named P-Many, P-Medium, and P-Few relying on b_PP(Y). More details are in the Appendix Sec. A. ... Additionally, in Tab. 11, we provide results highlighting the performance under parameter imbalance. |
| Hardware Specification | Yes | For training resources, all experiments are conducted on Intel(R) Xeon(R) Gold 5318Y CPU @ 2.10GHz with a single RTX A40 GPU. Normally, a GPU with 24GB of memory is sufficient for the reproduction. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, PyTorch/TensorFlow version, CUDA version). |
| Experiment Setup | Yes | We present the details about the hyper-parameters of our experiments on different datasets in Tab. 9, where lr, epochs denote the initial learning rate and training epochs, respectively. We denote batch size in Tab. 9 as the training batch size during the fine-tuning phase. ... The learning rate, number of epochs, and parameter initialization strategies follows (Shi et al., 2024). |
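The Dataset Splits row above quotes an OLTR-style grouping of classes into Many/Medium/Few bins by training-sample count. A minimal sketch of that protocol is below; the thresholds (more than 100 samples for Many, fewer than 20 for Few) follow the common OLTR convention and are an assumption here, as the excerpt does not state them:

```python
# Sketch of OLTR-style class grouping by per-class sample count.
# Thresholds are assumed: >100 samples -> Many, <20 -> Few, otherwise Medium.
from collections import Counter

def split_classes(labels, many_thresh=100, few_thresh=20):
    """Bin class ids into Many/Medium/Few groups by sample count."""
    counts = Counter(labels)
    groups = {"Many": [], "Medium": [], "Few": []}
    for cls, n in counts.items():
        if n > many_thresh:
            groups["Many"].append(cls)
        elif n < few_thresh:
            groups["Few"].append(cls)
        else:
            groups["Medium"].append(cls)
    return groups

# Toy label list: class 0 has 150 samples, class 1 has 50, class 2 has 5.
labels = [0] * 150 + [1] * 50 + [2] * 5
print(split_classes(labels))  # {'Many': [0], 'Medium': [1], 'Few': [2]}
```

The paper's parameter-imbalance split (P-Many/P-Medium/P-Few) applies the same three-way binning, but ranks classes by the quantity b_PP(Y) rather than by sample count.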