On Catastrophic Inheritance of Large Foundation Models
Authors: Hao Chen, Bhiksha Raj, Xing Xie, Jindong Wang
DMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this position paper, we propose to identify a neglected issue deeply rooted in LFMs: Catastrophic Inheritance, describing the weaknesses and limitations inherited from biased large-scale pre-training data to behaviors of LFMs on the downstream tasks... We discuss the challenges behind this issue and propose UIM, a framework to Understand the catastrophic inheritance of LFMs from both pre-training and downstream adaptation, Interpret the implications of catastrophic inheritance on downstream tasks, and how to Mitigate it. |
| Researcher Affiliation | Collaboration | Hao Chen (Carnegie Mellon University); Bhiksha Raj (Carnegie Mellon University); Xing Xie (Microsoft Research); Jindong Wang (Microsoft Research, William & Mary) |
| Pseudocode | No | The paper defines concepts and proposes a framework (UIM) but does not include any specific pseudocode or algorithm blocks. It presents a conceptual framework and discussions without formal algorithms. |
| Open Source Code | No | The paper discusses other models and their training data (e.g., LAION-5B, GPT) as examples to illustrate points about catastrophic inheritance, but it does not provide any specific code for the methodology or framework proposed in this paper. |
| Open Datasets | No | The paper cites numerous external datasets used in other research (e.g., LAION-5B, RedPajama, ImageNet) to illustrate points about biased pre-training data, but it does not run its own experiments on these or any other datasets in the context of the framework it proposes. Therefore, it does not provide concrete access information for a dataset used in its own work. |
| Dataset Splits | No | This paper is a position paper proposing a framework and discussing existing research; it does not describe any experiments that would require dataset splits. |
| Hardware Specification | No | This paper is a position paper proposing a framework and discussing existing research; it does not describe any experiments that would require specific hardware for execution. |
| Software Dependencies | No | This paper is a position paper proposing a framework and discussing existing research; it does not describe any experiments that would require specific software dependencies for execution. |
| Experiment Setup | No | This paper is a position paper proposing a framework and discussing existing research; it does not describe any experiments or their setup, including hyperparameters or training configurations. |