Privacy-Aware Time Series Synthesis via Public Knowledge Distillation
Authors: Penghang Liu, Haibei Zhu, Eleonora Kreacic, Svitlana Vyetrenko
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that Pub2Priv consistently outperforms state-of-the-art benchmarks in improving the privacy-utility trade-off across finance, energy, and commodity trading domains. ... In this section, we assess the performance of Pub2Priv by analyzing the privacy-utility trade-off in comparison to state-of-the-art baselines across multiple domains. |
| Researcher Affiliation | Industry | Penghang Liu EMAIL JPMorgan AI Research; Haibei Zhu EMAIL JPMorgan AI Research; Eleonora Kreacic EMAIL JPMorgan AI Research; Svitlana Vyetrenko EMAIL JPMorgan AI Research |
| Pseudocode | Yes | Algorithm 1 Training Algorithm for Differentially Private Generator θDM |
| Open Source Code | No | The paper does not explicitly state that the code for Pub2Priv is open-source or provide a link to a code repository. It mentions using 'Opacus (Yousefpour et al., 2021)' which is a third-party library, but this does not imply the authors' own implementation code is available. |
| Open Datasets | Yes | Electricity usage: The private data contains the daily electricity consumption of 370 users in Évora, Portugal (Bessa et al., 2015; Trindade, 2015) from 2011 to 2014. ... Artur Trindade. ElectricityLoadDiagrams20112014. UCI Machine Learning Repository, 2015. ... Semiconductor trading: We also collected international trading data from the UN Comtrade dataset (https://comtradeplus.un.org/). |
| Dataset Splits | No | The paper describes how data is used for evaluation metrics (e.g., 'The original and synthetic data are evenly distributed in both training and testing datasets' for TSTR discriminative), but it does not provide specific split percentages or sample counts for training, validation, and testing of the primary datasets for their proposed model. |
| Hardware Specification | Yes | All experiments in the paper were conducted on AWS g4dn.4xlarge instances (16 vCPUs, 64 GB RAM, 16 GB GPU). |
| Software Dependencies | No | The paper mentions using 'Opacus (Yousefpour et al., 2021)' but does not specify its version number or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We employ DP-SGD to protect the private data during training, which consists of two major procedures: gradient clipping and gradient noise addition. ... the gradients are clipped according to their ℓ2 norm and the clipping threshold C (we explore C ∈ {0.1, 0.5, 1.0, 1.5, 2.0} and select the one with lowest validation loss). ... Table 3: Hyperparameters of Pub2Priv. ... Table 4: Hyperparameters of baseline models. ... Learning rate 1e-4 ... We train all models with batch size 32 for 100 epochs. |
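The two DP-SGD procedures quoted above (per-sample gradient clipping followed by Gaussian noise addition) can be sketched as follows. This is a minimal NumPy illustration of the general mechanism, not the authors' implementation (the paper uses Opacus); the function name, the `noise_multiplier` parameterization, and the default values are assumptions for the example.

```python
import numpy as np

def dp_sgd_aggregate(per_sample_grads, clip_threshold=1.0,
                     noise_multiplier=1.0, rng=None):
    """Aggregate per-sample gradients with DP-SGD's two steps:
    1) clip each sample's gradient to l2 norm <= clip_threshold (C),
    2) add Gaussian noise scaled by noise_multiplier * C,
    then return the average over the batch."""
    rng = rng if rng is not None else np.random.default_rng(0)
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose l2 norm exceeds the threshold C.
        factor = min(1.0, clip_threshold / (norm + 1e-12))
        clipped.append(g * factor)
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_threshold,
                       size=total.shape)
    return (total + noise) / len(per_sample_grads)
```

With `noise_multiplier=0` this reduces to plain clipped-gradient averaging, which makes the clipping step easy to verify in isolation; the paper's grid search over C ∈ {0.1, 0.5, 1.0, 1.5, 2.0} corresponds to varying `clip_threshold` here.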