Explaining the Role of Intrinsic Dimensionality in Adversarial Training

Authors: Enes Altinisik, Safa Messaoud, Husrev Taha Sencar, Hassan Sajjad, Sanjay Chawla

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate SMAAT across multiple tasks, including text generation, sentiment classification, safety filtering, and retrieval-augmented generation setups, demonstrating superior robustness with comparable generalization to standard training." (Section 6: Experiments)
Researcher Affiliation | Academia | ¹Qatar Computing Research Institute, HBKU, Doha, Qatar; ²Faculty of Computer Science, Dalhousie University, Halifax, Canada. Correspondence to: Enes Altinisik <EMAIL>.
Pseudocode | Yes | Algorithm 1: SMAAT
Open Source Code | Yes | "The code is publicly available at: https://github.com/EnesAltinisik/SMAAT-25/tree/main"
Open Datasets | Yes | AGNEWS (Zhang et al., 2015), IMDB (Maas et al., 2011), and YELP (Zhang et al., 2015) datasets... LAT dataset (Sheshadri et al., 2024)... MT-Bench (Zheng et al., 2024)... AdvBench dataset (Zou et al., 2023) and the Helpfulness-Harmlessness dataset (HH-RLHF) (Bai et al., 2022)... Natural Questions (NQ) (Kwiatkowski et al., 2019)... UltraChat dataset (Ding et al., 2023)... HarmBench (Mazeika et al., 2024)... GLUE and AdvGLUE benchmarks
Dataset Splits | Yes | "For testing, we use a subset of 1,000 test samples from each dataset, following previous work practices... In addition, we randomly sample 10% of the training set for validation in all datasets. For the YELP dataset, we fine-tuned a RoBERTa model for 2 epochs with a learning rate of 1e-05 and a batch size of 32."
Hardware Specification | Yes | "In our evaluation, we use a V100 GPU with 32 GB memory and 64 CPUs."
Software Dependencies | No | The paper mentions using the TextAttack framework (Morris et al., 2020) and NeuroX (Dalvi et al., 2023) but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | "To train the last layer of fθ with adversarial samples, we create adversarial samples using 5-step PGD attacks. During training, we use epsilon values of 0.1, 0.1, and 0.8 for the YELP, AGNEWS, and IMDB datasets, respectively, for the BERT models. For the RoBERTa models, we employ epsilon values of 0.1, 0.6, and 0.03. All models are trained for 10 epochs with a learning rate of 0.1. The model is trained with a learning rate of 2e-4, applying LAT at every even-numbered layer with norm bounds ranging from 1 to 5. In the case of SMAAT, we conducted a grid search over the learning rate, ranging from 0.1 to 0.001, and the ϵ value, ranging from 0.8 to 0.01, using 3-step PGD. In all cases, standard models are trained over 5 epochs with a learning rate of 1e-5. Table 6 details the training hyperparameters for SMAAT."
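The "randomly sample 10% of the training set for validation" protocol quoted under Dataset Splits can be sketched as follows. This is an illustrative stdlib-only sketch, not the authors' code; the `train_val_split` helper, the fixed seed, and the toy data are all assumptions.

```python
import random

def train_val_split(examples, val_frac=0.10, seed=0):
    """Randomly hold out a fraction of the training set for validation."""
    idx = list(range(len(examples)))
    random.Random(seed).shuffle(idx)          # seeded shuffle for reproducibility
    n_val = int(len(examples) * val_frac)     # size of the 10% validation split
    val = [examples[i] for i in idx[:n_val]]
    train = [examples[i] for i in idx[n_val:]]
    return train, val
```

Fixing the seed makes the split reproducible across runs, which matters when comparing the adversarially trained and standard models on identical validation data.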
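The k-step PGD attack quoted in the Experiment Setup row (5 steps, per-dataset ϵ) generates adversarial examples in a continuous space, here taken to be the embedding input of the last layer that SMAAT adversarially trains. The sketch below is a minimal NumPy illustration under that assumption, not the authors' implementation: the logistic "last layer", the step size `alpha`, and the L-infinity threat model are all choices made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(w, b, x, y):
    """Binary cross-entropy of a logistic last layer on input x."""
    p = sigmoid(x @ w + b)
    return -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

def loss_grad_x(w, b, x, y):
    """Closed-form d(BCE)/dx for the logistic layer: (p - y) * w."""
    p = sigmoid(x @ w + b)
    return (p - y) * w

def pgd_attack(w, b, x, y, eps=0.1, alpha=0.04, steps=5):
    """L-inf PGD: k gradient-sign ascent steps, projected into the eps-ball."""
    x_adv = x.copy()
    for _ in range(steps):
        g = loss_grad_x(w, b, x_adv, y)
        x_adv = x_adv + alpha * np.sign(g)        # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the ball
    return x_adv
```

During SMAAT-style training, `pgd_attack` would be called on each batch and the resulting `x_adv` used in place of `x` when updating the last layer's weights; the quoted ϵ values (e.g. 0.1 for YELP with BERT) set the radius of the projection ball.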