Topological Signatures of Adversaries in Multimodal Alignments
Authors: Minh N. Vu, Geigh Zollicoffer, Huy Mai, Ben Nebgen, Boian Alexandrov, Manish Bhattarai
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide extensive experiments showing the presence of adversaries results in a clear distinction in the TC losses, i.e., in most settings, the TC losses monotonically change when more adversarial data is in the data batch. We conduct extensive experiments in 3 datasets (CIFAR-10, CIFAR-100, and ImageNet), 5 CLIP embeddings (ResNet50, ResNet101, ViT-B/16, ViT-L/14, and ViT-L/14@336px), 3 BLIP embeddings (ViT-B/14, ViT-B/129, and ViT-B/129-CapFilt-L), and 6 adversarial generation methods (FGSM, PGD, AutoAttack, APGD, BIM, and Carlini-Wagner (CW)) to demonstrate the advantages of the two above findings. |
| Researcher Affiliation | Academia | Minh Vu¹, Geigh Zollicoffer¹, Huy Mai², Ben Nebgen¹, Boian Alexandrov¹, Manish Bhattarai¹. ¹Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA; ²Independent. Correspondence to: Minh Vu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 outlines the pseudocode for computing the TP and MK losses. |
| Open Source Code | No | The code used in this study is currently under review for release by the organization. We are awaiting approval, and once granted, the code will be made publicly available. |
| Open Datasets | Yes | Using the ImageNet (Deng et al., 2009) and CIFAR10 (Krizhevsky, 2009) datasets with CLIP-ViT-B/32 and CLIP-ViT-L/14@336px, respectively, we demonstrate our proposed Total Persistence (TP) loss L^α_TP and Multi-scale Kernel (MK) loss L^σ_MK under varying proportions of adversarial samples in the data batch. |
| Dataset Splits | Yes | Each MMD test is conducted on two disjoint subsets of clean and adversarial samples, each containing 50 images for CIFAR10 and CIFAR100, and 100 images for ImageNet. |
| Hardware Specification | Yes | Our experiments were conducted on a cluster with nodes featuring four NVIDIA Hopper (H100) GPUs each, paired with NVIDIA Grace CPUs via NVLink-C2C for rapid data transfer essential for intensive computational tasks. Each GPU is equipped with 96GB of HBM2 memory, ideal for handling large models and datasets. |
| Software Dependencies | No | The gradients are computed by back-propagating Eq. 3 and 5 via PyTorch's implementations of the homologies (Aidos Lab, 2023). |
| Experiment Setup | Yes | Each test was conducted over 100 trials with Type-I error controlled at α = 0.05. The sizes of the holdout data Z for the topological features computation (Eq. 6) are 1000 and 3000 for CIFAR10/100 and ImageNet, respectively. ... We employed torchattacks (Kim, 2020) to generate adversarial perturbations with magnitudes ϵ of 1/255, 2/255, 4/255, and 8/255. ... Each test batch consists of 50 clean or adversarial samples. |
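The MMD test described in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal sketch. This is not the authors' code: it is a generic Gaussian-kernel MMD two-sample test with a permutation-based threshold at α = 0.05, applied to placeholder feature arrays standing in for clean and adversarial CLIP/BLIP embeddings; the bandwidth `sigma` and permutation count are illustrative choices, not values from the paper.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise Gaussian kernel values between the rows of X and Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    # Biased estimate of squared MMD between samples X and Y.
    kxx = gaussian_kernel(X, X, sigma).mean()
    kyy = gaussian_kernel(Y, Y, sigma).mean()
    kxy = gaussian_kernel(X, Y, sigma).mean()
    return kxx + kyy - 2.0 * kxy

def mmd_permutation_test(X, Y, sigma=1.0, n_perm=200, alpha=0.05, seed=0):
    # Permutation test: shuffle the pooled sample to estimate the null
    # distribution of MMD^2, then compare the observed statistic.
    rng = np.random.default_rng(seed)
    observed = mmd2(X, Y, sigma)
    pooled = np.vstack([X, Y])
    n = len(X)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        if mmd2(pooled[idx[:n]], pooled[idx[n:]], sigma) >= observed:
            count += 1
    p_value = (count + 1) / (n_perm + 1)
    return p_value, p_value < alpha  # reject H0 (same distribution)?
```

With two disjoint 50-sample batches, as in the paper's CIFAR setting, `mmd_permutation_test(clean_feats, adv_feats)` returns a p-value and a reject/accept decision at the stated Type-I error level.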