MUC: Machine Unlearning for Contrastive Learning with Black-box Evaluation
Authors: Yihan Wang, Yiwei Lu, Guojun Zhang, Franziska Boenisch, Adam Dziedzic, Yaoliang Yu, Xiao-Shan Gao
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through empirical comparisons with baseline methods on SimCLR, MoCo, and CLIP, we demonstrate that AC: (1) achieves state-of-the-art performance, approximating exact unlearning (retraining); (2) enables data owners to clearly visualize unlearning effects through black-box evaluation. |
| Researcher Affiliation | Collaboration | Yihan Wang, University of Waterloo; Yiwei Lu, University of Ottawa; Guojun Zhang, Alibaba; Franziska Boenisch, CISPA Helmholtz Center for Information Security; Adam Dziedzic, CISPA Helmholtz Center for Information Security; Yaoliang Yu, University of Waterloo and Vector Institute; Xiao-Shan Gao, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, and University of Chinese Academy of Sciences |
| Pseudocode | No | The paper describes the unlearning methods (Retraining, Fine-tuning, Gradient Ascent, NegGrad, ℓ1-Sparsity, and Alignment Calibration) in textual form, for example, 'Alignment Calibration (AC) optimizes a novel loss function that involves three terms...' without presenting them in structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/EhanW/Alignment-Calibration. |
| Open Datasets | Yes | For unimodal contrastive unlearning, we perform experiments on CIFAR-10/CIFAR-100 Krizhevsky et al. (2009)/SVHN (Netzer et al., 2011)... For multimodal contrastive unlearning, we evaluate CLIP (Radford et al., 2021) on an Image-Text paired dataset called MS-COCO (Lin et al., 2014)... |
| Dataset Splits | Yes | We split the 50K training images into a validation set of 5K images and a training set of 45K images. For example, when the unlearning task is to forget 10% of training data, the unlearn dataset Dunlearn has 4.5K images and the retain dataset Dretain has 40.5K images. SVHN consists of 73,257 training images and 26,032 test images of 10 categories. ... We split 10% of the training images into a validation set and use the rest as the training set. |
| Hardware Specification | Yes | For the pre-trained (clean) CLIP, we train the model for 35 epochs on 2 NVIDIA RTX 4090 GPUs... Experiments are conducted on a server with four NVIDIA A100 GPUs. |
| Software Dependencies | No | The paper mentions optimizers like SGD and AdamW but does not provide specific software library names with version numbers, such as PyTorch or TensorFlow versions, or Python versions. |
| Experiment Setup | Yes | For the pre-trained (clean) models, we train the encoder for 800 epochs using an SGD optimizer with a cosine-scheduled learning rate initialized at 0.06, momentum of 0.9, and weight decay of 0.0005. For unlearning methods: Fine-tuning and NegGrad update the pre-trained encoder for 10 epochs with a learning rate searched in [0.003, 0.03]; Gradient Ascent updates the pre-trained encoder... The linear probing stage trains a linear classifier head for 100 epochs using an SGD optimizer with a cosine-scheduled learning rate initialized at 1.0, and a momentum of 0.9. The batch size is set as 512 for both encoder and linear head training. |
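The split arithmetic and learning-rate schedule quoted above can be made concrete with a small stdlib-only sketch. The function names are illustrative (not from the paper's code), and the cosine formula is the standard annealing schedule, assumed here since the paper only says "cosine-scheduled":

```python
import math

def split_sizes(n_images=50_000, val_frac=0.10, unlearn_frac=0.10):
    """CIFAR-10-style split: 50K -> 5K validation + 45K training,
    then forget `unlearn_frac` of the training set (illustrative helper)."""
    n_val = int(n_images * val_frac)          # 5,000 validation images
    n_train = n_images - n_val                # 45,000 training images
    n_unlearn = int(n_train * unlearn_frac)   # 4,500 images to forget
    n_retain = n_train - n_unlearn            # 40,500 images to retain
    return n_val, n_train, n_unlearn, n_retain

def cosine_lr(epoch, total_epochs=800, lr_init=0.06):
    """Standard cosine annealing from lr_init down to 0 over training
    (assumed form; the paper does not give the exact formula)."""
    return lr_init * 0.5 * (1 + math.cos(math.pi * epoch / total_epochs))
```

With the defaults above, `split_sizes()` reproduces the 5K/45K/4.5K/40.5K counts from the Dataset Splits row, and `cosine_lr(0)` starts at the reported initial learning rate of 0.06.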