Mitigating Memorization in Language Models
Authors: Mansi Sakarvadia, Aswathy Ajith, Arham Khan, Nathaniel Hudson, Caleb Geniesse, Kyle Chard, Yaoqing Yang, Ian Foster, Michael W. Mahoney
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we investigate methods to mitigate memorization: three regularizer-based, three finetuning-based, and eleven machine unlearning-based methods... We demonstrate that the mitigation methods that we develop using TinyMem can successfully be applied to production-grade LMs, and we determine via experiment that: regularizer-based mitigation methods are slow and ineffective at curbing memorization; fine-tuning-based methods are effective at curbing memorization, but overly expensive, especially for retaining higher accuracies; and unlearning-based methods are faster and more effective... The main contributions of our work are the following: 2. We provide a comprehensive empirical comparison of three classes of mitigation strategies... 3. We present an extensive analysis of each mitigation strategy under various model training recipes... Results are in Tables 1 and 2, with additional results in Fig. 7 under Appendix A.2.3. |
| Researcher Affiliation | Academia | 1University of Chicago, 2Argonne National Laboratory, 3Lawrence Berkeley National Laboratory, 4Dartmouth College, 5International Computer Science Institute, 6University of California, Berkeley |
| Pseudocode | Yes | We detail the weight-level unlearning methods in Algorithms 1–6. In these algorithms, we vary the following input parameters: (1) LM: the original language model; (2) K: the number of weights to drop (as a ratio of the number of model parameters); (3) num epochs: the number of iterations for which to perform the procedure; (4) memorized sequences: the set of sequences memorized by the LM; (5) random sequences: a set of random sequences; (6) loss weight: the weighting coefficient for the Balanced Subnet optimization objective. |
| Open Source Code | Yes | 1. We introduce TinyMem, a computationally efficient suite of GPT2-style models that enables rapid development and evaluation of memorization mitigation strategies. Code: https://github.com/msakarvadia/memorization |
| Open Datasets | Yes | TinyMem: Toy language models. These GPT2-style models are trained on a Wikipedia dataset (Merity et al., 2017)... Production-grade language models. We consider the production-grade Pythia models (Biderman et al., 2023), trained on the Pile dataset (Gao et al., 2020)... We employ the memorized sequences extracted from Chang et al. (2024)... |
| Dataset Splits | Yes | Dataset Size: For both additive and multiplicative data, to better understand the effect of dataset size on memorization, we train three versions of each math model with increasing dataset size. Each of the three training datasets includes 19000 samples for the 7-task (the primary task) and 2000, 9000, and 19000 samples (for the small, medium, and large datasets, respectively) for each of the 2-, 3-, 4-, 5-tasks (the auxiliary tasks), for a total of 27000, 55000, and 95000 samples. We also create a test dataset comprising 5000 clean (i.e., non-perturbed) samples: 1000 each for the 2-, 3-, 4-, 5-, and 7-tasks. |
| Hardware Specification | Yes | Table 6: System Parameters for Polaris (ALCF, 2024) and Perlmutter (NERSC, 2024) used to estimate energy and carbon usage. Polaris: cuf 0.5, ctdp 225, ngpu 4, guf 1, gputdp 250, DRAM 512, PUE 1.58. Perlmutter: cuf 0.5, ctdp 280, ngpu 4, guf 1, gputdp 300, DRAM 256, PUE 1.58. |
| Software Dependencies | No | The paper mentions "GPT2-style models", "Pythia models", and "Flash Attention", but does not specify version numbers for the software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Experimental Design. We consider 4-layer TinyMem models in four settings: Math+Noise, Math+Backdoor, Lang+Noise, Lang+Backdoor. Each model is trained with L2 regularization and an additional regularizer from Section 3.1 over three random seeds. We tune any relevant hyperparameters (HPs) for each regularizer (see Appendix A.2.1). ...For spectral norm regularization, we varied the hyperparameter λ ∈ {0.001, 0.01, 0.1}; λ is the regularization coefficient... For each model in TinyMem, we vary these hyperparameters for various methods: Balanced Subnet: ratio ∈ {0.00001, 0.0001, 0.001, 0.01, 0.05, 0.1, 0.25, 0.3}; num epochs ∈ {1, 10, 20}; loss weight ∈ {0.9, 0.7, 0.5}... |
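The dataset-size arithmetic quoted in the Dataset Splits row can be checked directly. This is a minimal sketch; the variable names are ours, not the paper's:

```python
# Check the TinyMem math-dataset totals quoted above: the primary 7-task
# contributes 19000 samples in every variant, and each of the four
# auxiliary tasks (2-, 3-, 4-, 5-) contributes n samples per variant.
PRIMARY = 19000
AUX_TASKS = 4  # the 2-, 3-, 4-, and 5-tasks

for name, n in [("small", 2000), ("medium", 9000), ("large", 19000)]:
    total = PRIMARY + AUX_TASKS * n
    print(f"{name}: {total}")
# small: 27000, medium: 55000, large: 95000 -- matching the quoted totals

# Test set: 1000 clean samples each for the 2-, 3-, 4-, 5-, and 7-tasks.
test_total = 1000 * 5
print(f"test: {test_total}")  # 5000
```

The totals reproduce the 27000 / 55000 / 95000 figures stated in the paper, which is consistent with the auxiliary-task counts being the only quantity varied across the small, medium, and large datasets.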
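The Table 6 parameters in the Hardware Specification row (cuf = CPU usage factor, ctdp = CPU TDP in watts, ngpu = GPUs per node, guf = GPU usage factor, gputdp = GPU TDP in watts, PUE = power usage effectiveness) suggest a standard utilization-weighted power model. The formula below is an assumption on our part, not the paper's quoted energy model, and it omits the DRAM term:

```python
# Hedged sketch of a common node-power estimate from TDP-style parameters:
# (CPU draw + GPU draw), each scaled by a usage factor, then multiplied by
# the data-center PUE. This is an assumed formula, not the paper's exact
# energy model, and DRAM power is omitted for simplicity.
def node_power_watts(cuf, ctdp, ngpu, guf, gputdp, pue):
    """Estimated per-node power draw in watts."""
    return (cuf * ctdp + ngpu * guf * gputdp) * pue

polaris = node_power_watts(0.5, 225, 4, 1, 250, 1.58)      # Table 6 row 1
perlmutter = node_power_watts(0.5, 280, 4, 1, 300, 1.58)   # Table 6 row 2
print(round(polaris, 1), round(perlmutter, 1))  # 1757.8 2117.2
```

Multiplying such a per-node power figure by wall-clock hours would give energy in watt-hours, which is the usual next step when converting to a carbon estimate.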