Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing

Authors: Kento Nishi, Rahul Ramesh, Maya Okawa, Mikail Khona, Hidenori Tanaka, Ekdeep Singh Lubana

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through evaluations of edited models on this task, we show that KE inadvertently affects representations of entities beyond the targeted one, distorting relevant structures that allow a model to infer unseen knowledge about an entity. We further corroborate our findings in naturalistic settings with pre-trained Llama and Mamba models as well.
Researcher Affiliation | Collaboration | (1) Harvard College; (2) CBS-NTT Program in Physics of Intelligence, Harvard University; (3) Physics and Informatics Lab, NTT Research Inc.; (4) Computer and Information Science, University of Pennsylvania; (5) Department of Physics, Massachusetts Institute of Technology.
Pseudocode | Yes | Algorithm 1: Generate a single sequence containing a collection of facts.
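For intuition, a hypothetical sketch of what such a sequence generator might look like is given below; this is an illustration of the general idea (sampling fact triples and flattening them into one token sequence), not a reproduction of the paper's Algorithm 1, and the function and variable names are invented here.

```python
import random

def generate_sequence(facts, num_facts=4, seed=0):
    """Sample fact triples and concatenate them into one flat
    token sequence (subject, relation, object, subject, ...).

    `facts` is a list of (subject, relation, object) tuples;
    this toy generator is a hypothetical illustration only."""
    rng = random.Random(seed)
    chosen = rng.sample(facts, num_facts)
    seq = []
    for subject, relation, obj in chosen:
        seq.extend([subject, relation, obj])
    return seq

# Toy knowledge graph over abstract entity/relation tokens.
facts = [("e1", "r1", "e2"), ("e2", "r2", "e3"),
         ("e3", "r1", "e4"), ("e4", "r3", "e1")]
print(generate_sequence(facts, num_facts=2))
```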
Open Source Code | Yes | Please find the source code for our experiments at github.com/Kento_Nishi/KE-ICML-2025.
Open Datasets | Yes | To quantify model performance before and after editing, we adopt the MMLU-Redux reasoning benchmark (Gema et al., 2024) with the Zero Eval prompting framework (Lin, 2024) to elicit chain-of-thought reasoning.
Dataset Splits | No | The paper defines concepts like 'edit sub-graph,' 'retain sub-graph,' and 'test sub-graph' for knowledge editing, and mentions that certain facts are 'held out' for logical and compositional inference tasks. Additionally, for ROME, it states: 'The covariance matrix C is estimated by randomly sampling 10^5 inputs from the validation dataset.' However, specific percentages or absolute sample counts for the main training/validation/test splits of the generated synthetic data are not provided, nor are explicit split details for the MMLU-Redux benchmark used in the LLM experiments.
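The ROME detail quoted above (estimating the covariance matrix C from 10^5 sampled inputs) corresponds to an empirical uncentered second-moment estimate, C ≈ E[k kᵀ] over key vectors k. A minimal pure-Python sketch of that estimate, assuming `keys` holds the key vectors collected from the sampled inputs:

```python
def estimate_covariance(keys):
    """Empirical uncentered covariance C = (1/N) * sum_i k_i k_i^T,
    where `keys` is a list of N equal-length key vectors."""
    n, d = len(keys), len(keys[0])
    C = [[0.0] * d for _ in range(d)]
    for k in keys:
        for i in range(d):
            for j in range(d):
                C[i][j] += k[i] * k[j]
    for i in range(d):
        for j in range(d):
            C[i][j] /= n  # average the outer products
    return C
```

In practice this would be done with batched matrix products on the model's actual hidden states; the loop form here is only to make the estimator explicit.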
Hardware Specification | No | The paper states: 'For all experiments (unless stated otherwise), we use a 2-layer nano GPT Transformer (Karpathy, 2021).' It also mentions using 'pre-trained Llama and Mamba models.' However, no specific GPU models, CPU models, or other hardware specifications used for running the experiments are provided.
Software Dependencies | No | The paper mentions: 'Our Transformer model is a fork of the open-source nano GPT repository (https://github.com/karpathy/nano_GPT).' It also states: 'The value optimization is performed using the Adam optimizer, with hyperparameters lr = 10^-3 and weight decay = 10^-4.' While these refer to software components and tools, specific version numbers for these software dependencies (e.g., Python version, PyTorch version, nano GPT version) are not explicitly provided in the text.
Experiment Setup | Yes | We train a Transformer model using next-token prediction on the synthetic data generated from the above data generation process. For all experiments (unless stated otherwise), we use a 2-layer nano GPT Transformer (Karpathy, 2021). Batch size: 256; Context length: 16; Optimizer: Adam; Learning rate: 6 × 10^-4; Training epochs: 1.5 × 10^5; Decay iterations: 1.5 × 10^5; Momentum: β1 = 0.9, β2 = 0.95; Activation function: GeLU; Block size: 16; Embedding dimensions: 24; Heads: 12.
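The quoted "Learning rate: 6 × 10^-4" together with "Decay iterations: 1.5 × 10^5" suggests a nano GPT-style cosine learning-rate decay. A sketch of that schedule follows; note that the floor `min_lr = max_lr / 10` is an assumption carried over from nano GPT's usual defaults, not a value stated in the paper.

```python
import math

def cosine_lr(it, max_lr=6e-4, decay_iters=150_000, min_lr=6e-5):
    """Cosine decay from max_lr down to min_lr over decay_iters steps.
    min_lr = max_lr / 10 is an assumed nano GPT-style default."""
    if it >= decay_iters:
        return min_lr
    coeff = 0.5 * (1.0 + math.cos(math.pi * it / decay_iters))  # 1 -> 0
    return min_lr + coeff * (max_lr - min_lr)
```

Pairing this with Adam(β1 = 0.9, β2 = 0.95), batch size 256, and context length 16 would reproduce the quoted optimizer configuration under those assumptions.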