reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Sample-Level Evaluation and Generative Framework for Model Inversion Attacks

Authors: Haoyang Li, Li Bai, Qingqing Ye, Haibo Hu, Yaxin Xiao, Huadi Zheng, Jianliang Xu

AAAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments verify the effectiveness of our framework on improving state-of-the-art MI attacks over various metrics including DDCS, coverage and FID. Finally, we demonstrate that DDCS can also be useful for MI defense, by identifying samples susceptible to MI attacks in an unsupervised manner.
Researcher Affiliation	Collaboration	1 Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University; 2 Huawei Technology; 3 Department of Computer Science, Hong Kong Baptist University
Pseudocode	Yes	Algorithm 1: DDCS Computation. Input: target dataset D_tar, reconstructed dataset D_rec, constant c. Output: DDCS_avg, DDCS_best. 1: Initialize S_tar^i = ∅ for x_tar^i ∈ D_tar 2: for x_rec^j ∈ D_rec do 3: Record d_tar^i = d(x_tar^i, x_rec^j) into S_tar^i, where x_rec^j and x_tar^i achieve the smallest d_tar^i for x_tar^i ∈ D_tar 4: end for 5: Initialize DDCS_avg = 0, DDCS_best = 0 6: for x_tar^i ∈ D_tar do 7: DDCS_avg += mdf[avg(S_tar^i) + c] 8: DDCS_best += mdf[min(S_tar^i) + c] 9: end for 10: Normalize DDCS_avg and DDCS_best with |D_tar|
Open Source Code	Yes	https://github.com/haoyangli ASTAPLE/DDCS MI.git
Open Datasets	Yes	We conduct experiments on four high-resolution image datasets for two image classification tasks, namely, face recognition and dog breed classification. For face recognition, we use UMDFaces (Bansal et al. 2017) face dataset for training victim models and CelebA-HQ (Karras et al. 2018; Liu et al. 2015) dataset as the attacker's auxiliary dataset. For dog breed classification, Stanford Dogs (Khosla et al. 2011) dataset is selected as the training dataset of victim models and Tsinghua Dogs (Zou et al. 2020) for the auxiliary datasets. We resume pre-trained StyleGAN2 (Karras et al. 2020) of resolution 256 × 256 trained on two datasets, CelebA-HQ for face identification task and LSUN-Dogs (Yu et al. 2015) for dog breed classification task.
Dataset Splits	Yes	Similar to previous works (Chen et al. 2021; Zhang et al. 2020), we take a subset of 1000 identities from UMDFaces that contains 50981 training samples and 3000 test samples as D_tar to train the victim models. Following (Struppek et al. 2022), we use all labels for Stanford Dogs and split them into 18780 training samples and 1800 test samples.
Hardware Specification	No	The paper mentions training models like VGG16, ResNet50, IR50-SE, and AlexNet, and resuming pre-trained StyleGAN2. It does not provide any specific details about the hardware (GPU/CPU models, memory, or specific computing environments) used for these experiments.
Software Dependencies	No	The paper mentions various models and frameworks such as VGG16, ResNet50, IR50-SE, AlexNet, StyleGAN2, KEDMI, PPA, and Inception-v3, but it does not specify any software libraries or dependencies with their version numbers (e.g., PyTorch 1.9, TensorFlow 2.x).
Experiment Setup	Yes	Similar to previous works (Chen et al. 2021; Zhang et al. 2020), we take a subset of 1000 identities from UMDFaces that contains 50981 training samples and 3000 test samples as D_tar to train the victim models. Following (Struppek et al. 2022), we use all labels for Stanford Dogs and split them into 18780 training samples and 1800 test samples. To prevent strong assumptions about attacks, we limit the attacker's sample size in the auxiliary dataset to 30000 for both tasks. Models. We train 4 types of victim models: (1) VGG16 (Simonyan and Zisserman 2015) with Batch Normalization (Ioffe and Szegedy 2015) (VGG16BN), (2) ResNet50 (He et al. 2016), (3) Improved ResNet50 with Squeeze-and-Excitation (Hu, Shen, and Sun 2018) blocks (IR50-SE), and (4) AlexNet (Krizhevsky 2014). We resume pre-trained StyleGAN2 (Karras et al. 2020) of resolution 256 × 256 trained on two datasets, CelebA-HQ for face identification task and LSUN-Dogs (Yu et al. 2015) for dog breed classification task. Since both customized datasets can achieve perfect distance to D_tar, we regularize the range of DDCS by setting c in Algorithm 1 to 1.0.
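
Illustrative sketch (not part of the paper or of this report's source): the Algorithm 1 pseudocode quoted in the Pseudocode row can be read as the Python below. Several pieces are assumptions, because the quoted text does not define them: `mdf` is taken to be a monotonically decreasing function and is set to the reciprocal 1/x here; the distance d is taken to be Euclidean distance over plain feature vectors; and a target sample that no reconstruction maps to is assumed to contribute 0 to both sums. The function and parameter names (`ddcs`, `euclidean`, `reciprocal`) are hypothetical and are not taken from the authors' repository.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Assumed distance d(., .): Euclidean distance between feature vectors."""
    return math.dist(a, b)


def reciprocal(x: float) -> float:
    """Assumed mdf: a monotonically decreasing function, here 1/x."""
    return 1.0 / x


def ddcs(
    d_tar: Sequence[Vector],
    d_rec: Sequence[Vector],
    c: float = 1.0,
    distance: Callable[[Vector, Vector], float] = euclidean,
    mdf: Callable[[float], float] = reciprocal,
) -> Tuple[float, float]:
    """Return (DDCS_avg, DDCS_best) following the quoted Algorithm 1."""
    if not d_tar:
        raise ValueError("target dataset must not be empty")

    # Step 1: one empty distance list per target sample.
    recorded: List[List[float]] = [[] for _ in d_tar]

    # Steps 2-4: assign each reconstruction to its nearest target sample
    # and record that smallest distance in the target's list.
    for x_rec in d_rec:
        dists = [distance(x_tar, x_rec) for x_tar in d_tar]
        nearest = min(range(len(dists)), key=dists.__getitem__)
        recorded[nearest].append(dists[nearest])

    # Steps 5-9: accumulate the per-sample scores.
    ddcs_avg = 0.0
    ddcs_best = 0.0
    for s in recorded:
        if not s:
            # Assumption: an uncovered target sample adds nothing.
            continue
        ddcs_avg += mdf(sum(s) / len(s) + c)
        ddcs_best += mdf(min(s) + c)

    # Step 10: normalize by the size of the target dataset.
    n = len(d_tar)
    return ddcs_avg / n, ddcs_best / n


if __name__ == "__main__":
    targets = [(0.0, 0.0), (10.0, 0.0)]
    reconstructions = [(0.0, 0.0), (0.0, 2.0), (9.0, 0.0)]
    print(ddcs(targets, reconstructions))  # (0.5, 0.75)
```

Under these assumptions, c = 1.0 keeps each per-sample score in (0, 1], so a reconstruction at distance 0 scores exactly 1. This is one plausible reading of the remark in the Experiment Setup row that c regularizes the range of DDCS.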