UnSTAR: Unlearning with Self-Taught Anti-Sample Reasoning for LLMs
Authors: Yash Sinha, Murari Mandal, Mohan Kankanhalli
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 3 Experiments and Results … 3.1 Experiments … Experimental Setup. We use the identical experimental settings as in the case of RWHP (Liu et al. (2024a)) using the Wikipedia Person Unlearn (WPU) dataset. The LLM must unlearn multiple individuals simultaneously, capturing the nuances of both forgetting and retaining relevant knowledge. … Datasets. … Metrics. We utilize multiple metrics to assess the performance of the model across various dimensions. … Baselines. We evaluate our method against eight baselines. … Models and Implementation. … 3.2 Results. |
| Researcher Affiliation | Academia | Yash Sinha, School of Computing, National University of Singapore; Murari Mandal, RespAI Lab, School of Computer Engineering, KIIT Bhubaneswar, India; Mohan Kankanhalli, School of Computing, National University of Singapore |
| Pseudocode | Yes | Algorithm 1: UnSTAR: This algorithm outlines how to generate anti-samples from the forget set and fine-tune the model while preserving knowledge from the retain set. |
| Open Source Code | Yes | Source code: https://github.com/MachineUnlearn/UnStar |
| Open Datasets | Yes | We use the identical experimental settings as in the case of RWHP (Liu et al. (2024a)) using the Wikipedia Person Unlearn (WPU) dataset. Similar to WPU, the Peter Parker forgetting dataset is constructed using GPT-4-turbo and GPT-3.5-turbo, as presented in Opt-Out (Choi et al. (2025)). The TOFU dataset (Maini et al. (2024)) contains QA pairs about fictitious authors. |
| Dataset Splits | Yes | The WPU dataset includes a diverse set of individuals designated as unlearning targets, along with their associated documents and test data in a free-response question-answering (QA) format. This setup assesses three distinct knowledge types. ❶ Forget QA (FQA): These questions target the unlearning subjects with answers sourced from the unlearning documents. … ❷ Hard-retain QA (HRQA): … ❸ General-retain QA (GRQA): … The dataset includes 100 examples for the forgetting set Df and 300 examples for the retaining set Dr, generated using a diverse set of prompts. The TOFU dataset … is also divided into retain and forget sets. The detailed statistics are presented in Table 3. |
| Hardware Specification | Yes | All experiments were conducted on an Apple M3 Pro chip with 18 GB of unified memory. |
| Software Dependencies | No | We evaluate our approach using the Mistral 7B Instruct v0.3 model, a compact yet powerful language model fine-tuned for instruction-based tasks. We fine-tune the Mistral 7B model using LoRA (Low-Rank Adaptation) via the mlx-lm library. |
| Experiment Setup | Yes | For UnSTAR, we run over multiple iterations. In each iteration, 20 paraphrased questions and incorrect answers are generated. Semantically divergent questions and near-correct incorrect answers are filtered out. Misleading justifications are generated for the retained questions, and the model is fine-tuned for 10 epochs. Iterations continue until the target is unlearned. For WPU and Peter Parker, the training hyperparameters are shown in Table 4. |
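The "Experiment Setup" row above describes UnSTAR's per-iteration filtering of candidate anti-samples: paraphrased questions that drift semantically from the original are dropped, as are incorrect answers that sit too close to the true answer. A minimal, runnable sketch of that filtering stage follows; it uses token-level Jaccard similarity as a crude stand-in for the paper's semantic checks, and all helper names, thresholds, and toy data are hypothetical, not taken from the UnSTAR implementation:

```python
def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity: a simple stand-in for semantic similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def filter_anti_samples(question, correct_answer, candidates,
                        min_q_sim=0.5, max_a_sim=0.5):
    """Keep (paraphrase, wrong_answer) pairs where the paraphrase stays on-topic
    (not semantically divergent) and the wrong answer is not near-correct."""
    kept = []
    for para_q, wrong_a in candidates:
        if jaccard(para_q, question) < min_q_sim:
            continue  # drop semantically divergent paraphrase
        if jaccard(wrong_a, correct_answer) > max_a_sim:
            continue  # drop near-correct incorrect answer
        kept.append((para_q, wrong_a))
    return kept

# Toy demonstration (in the paper, 20 candidates per iteration are generated by an LLM)
question = "where was the target person born"
correct = "born in paris france"
candidates = [
    ("where was the target person born exactly", "born in toronto canada"),  # kept
    ("what is the capital of mongolia", "born in toronto canada"),           # divergent question
    ("where was the target person born", "born in paris france in 1970"),    # near-correct answer
]
kept = filter_anti_samples(question, correct, candidates)
print(len(kept))  # 1
```

The surviving pairs would then be augmented with misleading justifications and used to fine-tune the model (10 epochs per iteration in the reported setup), repeating until the unlearning target is forgotten.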