Differentially Private Gradient Flow based on the Sliced Wasserstein Distance
Authors: Ilana Sebag, Muni Sreenivas Pydi, Jean-Yves Franceschi, Alain Rakotomamonjy, Mike Gartrell, Jamal Atif, Alexandre Allauzen
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our proposed model can generate higher-fidelity data at a low privacy budget compared to a generator-based model, offering a promising alternative. [...] 4 Experiments In this section we evaluate our method within a generative modeling context. The primary objective is to validate our theoretical framework and showcase the behavior of our approach, rather than strive for state-of-the-art results in generative modeling. Our study focuses on a specific claim: in a privacy setting, a gradient flow performs better than generator-based models trained with the same metric. The code for the experiments can be found at https://github.com/ilanasebag/dpswgf. [...] Table 1: FID results for each baseline, dataset and privacy setting, averaged over 5 generation runs. [...] Fig. 2 presents the results from our model and the baseline without any differential privacy applied. Figs. 3 and 4 show the results for ε = 10 and ε = 5, respectively. |
| Researcher Affiliation | Collaboration | Ilana Sebag EMAIL Criteo AI Lab, Paris, France Miles Team, LAMSADE, Université Paris-Dauphine, PSL University, CNRS, Paris, France [...] Mike Gartrell EMAIL Sigma Nova, Paris, France (Work done while working at Criteo AI lab) |
| Pseudocode | Yes | Algorithm 1: DP Sliced Wasserstein Flow with resampling of θ s: DPSWflow-r. [...] Algorithm 2: DP Sliced Wasserstein Flow without resampling of the θs: DPSWflow. |
| Open Source Code | Yes | The code for the experiments can be found at https://github.com/ilanasebag/dpswgf. |
| Open Datasets | Yes | We assessed each method on three datasets: MNIST (LeCun et al., 1998), Fashion MNIST (F-MNIST, Xiao et al., 2017), and CelebA (Liu et al., 2015). |
| Dataset Splits | Yes | In order to uphold the integrity of the differential privacy framework and mitigate potential privacy breaches, we conducted separate pre-training procedures for the autoencoder and the flows / generator using distinct datasets: a publicly available dataset for the autoencoder, and a confidential dataset for the flows / generator. In practice, we partitioned the training set X into two distinct segments of equal size, denoted as Xpub and Xpriv. Subsequently, we conducted training of the autoencoder on Xpub. Then, we compute the encoded representation on Xpriv in the latent space and use it as input to DPSWflow, DPSWflow-r, and DPSWgen. [...] CelebA is comprised of a training, testing, and validation dataset. After conducting initial experiments and analysis with the validation set, we removed it. We then split the training set into two equally-sized public and private datasets. |
| Hardware Specification | Yes | For this project we use 1 NVIDIA Tesla V100 GPU, which was necessary for the pretraining of the autoencoder only. |
| Software Dependencies | No | All neural networks are coded using PyTorch (Paszke et al., 2019). [...] In our code we use the PyTorch implementation of TorchMetrics (Skafte Detlefsen et al., 2022). |
| Experiment Setup | Yes | Here we set Nθ = 200, h = 1, and λ = 0.001. [...] All three DPSWflow, DPSWflow-r, and DPSWgen models are evaluated with a batch size of 250 and for δ = 10⁻⁵, and the latent space (of the autoencoder, used as input of the mechanisms) has size 8. DPSWflow is evaluated over 1500 epochs for all values of ε, while DPSWflow-r and DPSWgen are evaluated on 35 epochs for ε = ∞ and ε = 10, and on 20 epochs for ε = 5. DPSWflow uses Nθ = 31 and Mθ = 25, while DPSWflow-r and DPSWgen use Nθ = 70. [...] For DPSWflow-r and DPSWgen we used the following pairs of σ, ε: σ = 0, ε = ∞; σ = 0.67, ε = 10; and σ = 0.8, ε = 5. |
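The gradient flows above are driven by the sliced Wasserstein distance, which Algorithm 1 estimates with Nθ random projection directions. The sketch below shows a minimal Monte Carlo estimator of that distance, assuming standard Gaussian-normalized directions and equal-size sample sets; `n_proj` plays the role of Nθ, and the function name is hypothetical, not the authors' implementation.

```python
import numpy as np

def sliced_wasserstein(x, y, n_proj=31, rng=None):
    """Monte Carlo estimate of the squared sliced Wasserstein-2 distance.

    x, y: (n, d) arrays of samples. n_proj is the number of random
    projection directions (the role of N_theta in the paper).
    """
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    # Sample directions uniformly on the unit sphere.
    theta = rng.normal(size=(n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project each sample set onto every direction; sorting gives the
    # optimal 1D coupling, so the 1D W2 distance is the mean squared
    # difference of sorted projections, averaged over directions.
    px = np.sort(x @ theta.T, axis=0)
    py = np.sort(y @ theta.T, axis=0)
    return np.mean((px - py) ** 2)
```

For example, the estimate is exactly zero when both inputs are the same sample set, and strictly positive for a shifted copy.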
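The (σ, ε) pairs in the setup are noise-multiplier/privacy-budget pairs for Gaussian noise injection. As a hedged illustration only (a sketch of the standard Gaussian mechanism, not the authors' DP accounting or code), releasing a statistic with noise scaled by its sensitivity times σ looks like this; σ = 0 corresponds to the non-private ε = ∞ setting:

```python
import numpy as np

def gaussian_mechanism(value, sensitivity, sigma, rng=None):
    """Release `value` perturbed by Gaussian noise of scale sensitivity * sigma.

    Sketch of the standard Gaussian mechanism; sigma corresponds to the
    noise multipliers quoted above (e.g. sigma = 0.67 for eps = 10).
    Hypothetical helper, not the authors' implementation.
    """
    rng = np.random.default_rng(rng)
    value = np.asarray(value, dtype=float)
    noise = rng.normal(scale=sensitivity * sigma, size=value.shape)
    return value + noise
```

With sigma = 0 the value is released unchanged (no privacy, ε = ∞); larger σ buys a smaller ε at the cost of noisier sliced Wasserstein estimates.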