Perturbation-Restrained Sequential Model Editing

Authors: Jun-Yu Ma, Hong Wang, Hao-Xiang Xu, Zhen-Hua Ling, Jia-Chen Gu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Systematically, we conduct experiments employing three editing methods on three LLMs across four downstream tasks. The results show that PRUNE can preserve general abilities while maintaining the editing performance effectively in sequential model editing.
Researcher Affiliation | Academia | Jun-Yu Ma (1,2), Hong Wang (1), Hao-Xiang Xu (1,2), Zhen-Hua Ling (1,2), Jia-Chen Gu (3). (1) University of Science and Technology of China; (2) National Engineering Research Center of Speech and Language Information Processing; (3) University of California, Los Angeles
Pseudocode | No | The paper describes methods and formulas but does not include a clearly labeled pseudocode or algorithm block.
Open Source Code | Yes | The code is available at https://github.com/mjy1111/PRUNE.
Open Datasets | Yes | For factual knowledge, two popular model editing datasets Zero-Shot Relation Extraction (ZSRE) (Levy et al., 2017) and COUNTERFACT (Meng et al., 2022) were adopted in our experiments. ... For conceptual knowledge, the ConceptEdit dataset (Wang et al., 2024) was adopted. ... Reasoning on the GSM8K (Cobbe et al., 2021), Summarization on the SAMSum (Gliwa et al., 2019), Open-domain QA on the Natural Question (Kwiatkowski et al., 2019), and Natural language inference (NLI) on the RTE (Dagan et al., 2005).
Dataset Splits | No | For each dataset, some examples were randomly sampled for evaluation. Details of prompts for each task were shown in Appendix B.4.
Hardware Specification | Yes | We used NVIDIA A800 80GB GPU for experiments.
Software Dependencies | No | The paper mentions using a framework called EasyEdit (Wang et al., 2023) and various LLMs (GPT-2 XL, LLaMA-2, LLaMA-3) but does not provide specific version numbers for programming languages, libraries, or other software dependencies crucial for replication.
Experiment Setup | Yes | When conducting experiments, for different editing methods, LLMs and editing datasets, the hyperparameter α in function F of PRUNE is different. Table 4 shows the details of this hyperparameter.
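
The Experiment Setup row mentions a hyperparameter α in a function F but does not define F. As a rough illustration only, the sketch below assumes F is a logarithmic restraint with base α, applied to those singular values of an edit update that exceed the largest singular value of the original weight matrix. That form, and the helper names `restrain_singular_values` and `restrain_update`, are assumptions made for this example and are not taken from this summary. Consult the paper and the linked repository for the actual definition.

```python
import numpy as np


def restrain_singular_values(sigma_hat, sigma_max, alpha):
    """Hypothetical restraint F (assumed form, not quoted from the paper).

    Values at or below sigma_max are left unchanged. Values above it are
    replaced by log_alpha(sigma_hat / sigma_max) + sigma_max, so that they
    grow only logarithmically past the threshold.
    """
    if alpha <= 1.0:
        raise ValueError("alpha must be greater than 1 for a compressing log")
    if sigma_max <= 0.0:
        raise ValueError("sigma_max must be positive")
    sigma_hat = np.asarray(sigma_hat, dtype=float)
    out = sigma_hat.copy()
    mask = sigma_hat > sigma_max
    # Change of base: log_alpha(x) = ln(x) / ln(alpha).
    out[mask] = np.log(sigma_hat[mask] / sigma_max) / np.log(alpha) + sigma_max
    return out


def restrain_update(W, delta_W, alpha):
    """Rebuild delta_W after restraining its singular values.

    W is the original weight matrix and delta_W the accumulated edit update;
    both are plain 2-D NumPy arrays in this sketch.
    """
    sigma_max = np.linalg.svd(W, compute_uv=False).max()
    U, s, Vt = np.linalg.svd(delta_W, full_matrices=False)
    s_restrained = restrain_singular_values(s, sigma_max, alpha)
    # Scale the columns of U by the restrained singular values, then recombine.
    return (U * s_restrained) @ Vt
```

Under this assumed form, a smaller α (closer to 1) restrains large singular values less aggressively and a larger α restrains them more, which is one plausible reason the value would be tuned per editing method, model, and dataset.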