Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning

Authors: Tianci Liu, Ruirui Li, Yunzhe Qi, Hui Liu, Xianfeng Tang, Tianqi Zheng, Qingyu Yin, Monica Cheng, Jun Huan, Haoyu Wang, Jing Gao

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type: Experimental. "Experiments on three LLMs with five editing benchmarks in diverse scenarios show the superiority of our method. Extensive experimental results in Sec 3 demonstrate the superiority of our method for conducting knowledge editing at much better parameter efficiency than existing methods."
Researcher Affiliation: Collaboration. (1) Purdue University, (2) Amazon, (3) UIUC, (4) AWS AI Lab, (5) SUNY Albany.
Pseudocode: No. The paper describes the methodology using mathematical equations (e.g., Eqn (1), Eqn (2), Eqn (3)) and a workflow diagram (Fig 4), but does not include any clearly labeled "Pseudocode", "Algorithm", or structured code-like blocks.
Open Source Code: No. "Our experiments are conducted with EasyEdit (Wang et al., 2024e). More implementation details and hyper-parameters can be found in App C."
Open Datasets: Yes. "Tasks. Following previous works (Wang et al., 2023; Zhang et al., 2024b), we edit different kinds of knowledge: WikiData_recent, WikiData_counterfact (Cohen et al., 2024), WikiBio (Hartvigsen et al., 2024), ConvSent (Mitchell et al., 2022), and ZsRE (Yao et al., 2023)."
Dataset Splits: No. The paper uses the ZsRE dataset for continual and batched editing and evaluates several editing scenarios (Single, Continual, Batched), but it does not report train/test/validation split percentages, sample counts, or explicit instructions for how the datasets were partitioned.
Hardware Specification: Yes. "Table 3: Parameter size and editing time with an NVIDIA V100 32-GB GPU (averaged over 100 samples)."
Software Dependencies: No. The paper mentions using AdamW (Loshchilov & Hutter, 2019) as an optimizer and EasyEdit (Wang et al., 2024e) as a framework, but does not specify version numbers for these or other key software components (e.g., Python, PyTorch/TensorFlow).
Experiment Setup: Yes. "Table 5: Hyper-parameters of different methods. For baselines, we only provided settings that were different from Wang et al. (2024e)." BaFT & ReFT settings: subspace rank 12; positions to intervene: last 3 of input + output; layers to intervene: 9, 18, 24, 28; learning rate 3e-4 for single and continual editing, 1e-4 for batched editing; maximum steps 40 for single and continual editing, 70 for batched editing; locality regularization (BaFT): α = 0.01, β = 0.05, γ = 0.02 / α = 0.01, β = 0.1, γ = 0.05 / α = 0.01, β = 0.1, γ = 0.05 (three triples as listed in Table 5).
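The scenario-dependent settings above can be collected into a small configuration sketch. This is purely illustrative: the key names and the helper function below are hypothetical and not taken from the paper's code; only the numeric values come from Table 5 as quoted.

```python
# Illustrative transcription of the BaFT/ReFT hyper-parameters reported in
# Table 5 of the paper. All dictionary keys are hypothetical names.
BAFT_CONFIG = {
    "subspace_rank": 12,
    "positions": "last 3 of input + output",
    "layers": [9, 18, 24, 28],
    # Learning rate and step budget depend on the editing scenario.
    "learning_rate": {"single": 3e-4, "continual": 3e-4, "batched": 1e-4},
    "max_steps": {"single": 40, "continual": 40, "batched": 70},
}

def hparams_for(scenario: str) -> dict:
    """Resolve the scenario-dependent values for 'single', 'continual', or 'batched'."""
    resolved = {k: v for k, v in BAFT_CONFIG.items() if not isinstance(v, dict)}
    resolved["learning_rate"] = BAFT_CONFIG["learning_rate"][scenario]
    resolved["max_steps"] = BAFT_CONFIG["max_steps"][scenario]
    return resolved
```

For example, `hparams_for("batched")` yields the lower learning rate (1e-4) and the larger step budget (70) that Table 5 reports for batched editing.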