Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning

Authors: Tianci Liu, Ruirui Li, Yunzhe Qi, Hui Liu, Xianfeng Tang, Tianqi Zheng, Qingyu Yin, Monica Cheng, Jun Huan, Haoyu Wang, Jing Gao

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type: Experimental. "Experiments on three LLMs with five editing benchmarks in diverse scenarios show the superiority of our method. Extensive experimental results in Sec 3 demonstrate the superiority of our method for conducting knowledge editing at much better parameter efficiency than existing methods."
Researcher Affiliation: Collaboration. (1) Purdue University, (2) Amazon, (3) UIUC, (4) AWS AI Lab, (5) SUNY Albany.
Pseudocode: No. The paper describes the methodology using mathematical equations (e.g., Eqn (1), Eqn (2), Eqn (3)) and a workflow diagram (Fig 4), but does not include any clearly labeled "Pseudocode", "Algorithm", or structured code-like blocks.
Open Source Code: No. "Our experiments are conducted with EasyEdit (Wang et al., 2024e). More implementation details and hyper-parameters can be found in App C."
Open Datasets: Yes. "Tasks. Following previous works (Wang et al., 2023; Zhang et al., 2024b), we edit different kinds of knowledge: WikiData_recent, WikiData_counterfact (Cohen et al., 2024), WikiBio (Hartvigsen et al., 2024), ConvSent (Mitchell et al., 2022), and ZsRE (Yao et al., 2023)."
Dataset Splits: No. The paper uses the ZsRE dataset for continual and batched editing and evaluates several editing scenarios (Single, Continual, Batched), but it does not report train/test/validation split percentages, sample counts, or explicit instructions for how the datasets were partitioned.
Hardware Specification: Yes. "Table 3: Parameter size and editing time with an NVIDIA V100 32-GB GPU (averaged over 100 samples)."
Software Dependencies: No. The paper mentions using AdamW (Loshchilov & Hutter, 2019) as an optimizer and EasyEdit (Wang et al., 2024e) as a framework, but does not specify version numbers for these or other key software components (e.g., Python, PyTorch/TensorFlow).
Experiment Setup: Yes. "Table 5: Hyper-parameters of different methods. For baselines, we only provided settings that were different from Wang et al. (2024e)." BaFT & ReFT settings: subspace rank 12; positions to intervene: last 3 of input + output; layers to intervene: 9, 18, 24, 28; learning rate 3e-4 for single and continual editing, 1e-4 for batched editing; maximum steps 40 for single and continual editing, 70 for batched editing; locality regularization (BaFT): α = 0.01, β = 0.05, γ = 0.02 / α = 0.01, β = 0.1, γ = 0.05 / α = 0.01, β = 0.1, γ = 0.05 (three triples as listed in Table 5).
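The scenario-dependent settings above can be collected into a small configuration sketch. This is purely illustrative: the key names and the helper function below are hypothetical and not taken from the paper's code; only the numeric values come from Table 5 as quoted.

```python
# Illustrative transcription of the BaFT/ReFT hyper-parameters reported in
# Table 5 of the paper. All dictionary keys are hypothetical names.
BAFT_CONFIG = {
    "subspace_rank": 12,
    "positions": "last 3 of input + output",
    "layers": [9, 18, 24, 28],
    # Learning rate and step budget depend on the editing scenario.
    "learning_rate": {"single": 3e-4, "continual": 3e-4, "batched": 1e-4},
    "max_steps": {"single": 40, "continual": 40, "batched": 70},
}

def hparams_for(scenario: str) -> dict:
    """Resolve the scenario-dependent values for 'single', 'continual', or 'batched'."""
    resolved = {k: v for k, v in BAFT_CONFIG.items() if not isinstance(v, dict)}
    resolved["learning_rate"] = BAFT_CONFIG["learning_rate"][scenario]
    resolved["max_steps"] = BAFT_CONFIG["max_steps"][scenario]
    return resolved
```

For example, `hparams_for("batched")` yields the lower learning rate (1e-4) and the larger step budget (70) that Table 5 reports for batched editing.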