On the Robustness of Dataset Inference

Authors: Sebastian Szyller, Rui Zhang, Jian Liu, N. Asokan

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We then confirm empirically that DI in the black-box setting leads to false positives (FPs) with high confidence. We also show that black-box DI suffers from false negatives (FNs): an adversary who has in fact stolen a victim model can avoid detection by regularising their model with adversarial training. We provide empirical evidence that an adversary who steals the victim's dataset itself and adversarially trains a model can evade detection by DI by trading off accuracy of the stolen model. We empirically demonstrate the existence of FPs in a realistic black-box DI setting (Section 3.2.2).
Researcher Affiliation | Academia | Sebastian Szyller (Aalto University), Rui Zhang (Zhejiang University), Jian Liu (Zhejiang University), N. Asokan (University of Waterloo & Aalto University)
Pseudocode | No | The paper describes methods and algorithms in prose but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described in this paper. It mentions using "the official implementation of DI" for some experiments, but this refers to a third-party tool, not the authors' own code.
Open Datasets | Yes | For the original formulation, e.g. for CIFAR10, CIFAR10-train (50,000 samples) is used as S_V, and CIFAR10-test is used as S_0 (10,000 samples). We use an analogous split for CIFAR100.
Dataset Splits | Yes | 1) randomly split CIFAR10-train into two subsets (A_train and B_train) of 25,000 samples each; 2) assign S_V = A_train, and train f_V using it; 3) continue using CIFAR10-test as S_0 (nothing changes), and train f_0 using it; 4) g_V is trained using the embedding for S_0 and the new S_V, obtained from the new f_V; 5) assign S_I = B_train, independent data of a third-party I, who trains their model f_I.
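The split described in steps 1–5 can be sketched as follows. This is a minimal illustration, assuming a NumPy index split in place of the actual CIFAR10 data loaders; the seed and variable names are placeholders, not values from the paper:

```python
import numpy as np

def split_cifar10_train(n_train=50_000, seed=0):
    """Randomly split the 50,000 CIFAR10-train indices into two
    disjoint 25,000-sample index sets, A_train and B_train."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_train)
    half = n_train // 2
    return perm[:half], perm[half:]

# S_V = A_train trains the victim model f_V; S_I = B_train is the
# independent third party's data for f_I; CIFAR10-test remains S_0.
a_train, b_train = split_cifar10_train()
assert len(a_train) == len(b_train) == 25_000
assert len(np.intersect1d(a_train, b_train)) == 0  # disjoint subsets
```

In practice the index sets would be passed to something like `torch.utils.data.Subset` over the CIFAR10-train dataset.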
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using "projected gradient descent (Madry et al., 2018) (PGD)" and "the official implementation of DI", but it does not specify any software libraries or frameworks with version numbers.
Experiment Setup | Yes | With the weights initialized to zero, f learns the weights using gradient descent with learning rate 1 until y·f(x) is maximized. During adversarial training, each training sample (x, y) is replaced with an adversarial example that is misclassified, f_A(x + γ) ≠ y. We use projected gradient descent (Madry et al., 2018) (PGD), and we set γ = 10/255 (under l∞).
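The l∞ PGD perturbation quoted above can be sketched as below. This is illustrative only: a toy linear model stands in for the adversarially trained model f_A, and the step count and step size `alpha` are assumptions, since the excerpt quotes only the budget γ = 10/255:

```python
import numpy as np

GAMMA = 10 / 255  # l-infinity perturbation budget from the paper

def pgd_attack(x, y, w, b, steps=10, alpha=2 / 255):
    """PGD under an l-infinity ball of radius GAMMA around x.
    x: input vector; y: label in {-1, +1}; (w, b): a toy linear
    model f(x) = w.x + b standing in for f_A (hypothetical)."""
    x_adv = x.copy()
    for _ in range(steps):
        # For a linear model the gradient of the margin y*f(x)
        # w.r.t. x is y*w; step against it to reduce the margin.
        grad = y * w
        x_adv = x_adv - alpha * np.sign(grad)
        # Project back into the l-infinity ball around the clean x.
        x_adv = np.clip(x_adv, x - GAMMA, x + GAMMA)
    return x_adv
```

Adversarial training as described would then replace each sample (x, y) with (pgd_attack(x, y, ...), y) when updating the model.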