Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Making Translators Privacy-aware on the User's Side

Authors: Ryoma Sato

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate the effectiveness of PRISM using real-world translators, T5 and ChatGPT, and the datasets with two languages.
Researcher Affiliation | Academia | Ryoma Sato, EMAIL, Kyoto University, Okinawa Institute of Science and Technology
Pseudocode | Yes | The pseudo code is shown in Algorithm 1. Algorithm 1: PRISM-R; Algorithm 2: PRISM*
Open Source Code | Yes | Reproducibility: Our code and trained dictionaries are available at https://github.com/joisino/prism.
Open Datasets | Yes | We use the MCTest dataset [34] for the documents xi, question qij, and answer aij. [34] M. Richardson, C. J. C. Burges, and E. Renshaw. MCTest: A challenge dataset for the open-domain machine comprehension of text. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP, pages 193-203. ACL, 2013.
Dataset Splits | Yes | Let X = {x1, x2, ..., xN} be a set of test documents to be translated. We use the MCTest dataset [34] for the documents xi, question qij, and answer aij.
Hardware Specification | No | The paper mentions using T5 and GPT-3.5-turbo as the translation algorithm and GPT-3.5-turbo as the evaluator, but does not specify the hardware used to run these models or perform the experiments.
Software Dependencies | No | The paper mentions using T5 and GPT-3.5-turbo as translation models, but does not specify any particular software versions (e.g., Python, PyTorch, TensorFlow versions, etc.) used for implementing PRISM or running the experiments.
Experiment Setup | No | We change the ratio r of NoDecode, PRISM-R, and PRISM* and the parameter λ of PUP to control the trade-off between privacy-preserving score and the quality score. The paper states that it scans these parameters but does not provide specific hyperparameter values or configurations for the experiments.
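
The Experiment Setup entry describes scanning a parameter (the ratio r, or λ for PUP) to trace a trade-off between the privacy-preserving score and the quality score. A minimal sketch of such a scan is given here for illustration only. Everything in it is an assumption: the grid of values, the `sweep` and `pareto_front` helpers, and especially `toy_evaluate`, which is a placeholder and not the paper's evaluation or its results. The paper does not report the grid it used.

```python
from typing import Callable, Iterable, List, Tuple

# (parameter value, privacy score, quality score); higher scores are better.
Point = Tuple[float, float, float]


def sweep(values: Iterable[float],
          evaluate: Callable[[float], Tuple[float, float]]) -> List[Point]:
    """Evaluate each parameter value and record (value, privacy, quality)."""
    results = []
    for v in values:
        privacy, quality = evaluate(v)
        results.append((v, privacy, quality))
    return results


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point beats on both scores."""
    front = []
    for p in points:
        dominated = any(
            q[1] >= p[1] and q[2] >= p[2] and (q[1] > p[1] or q[2] > p[2])
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


def toy_evaluate(r: float) -> Tuple[float, float]:
    # Placeholder only: privacy rises and quality falls linearly with r.
    # A real run would translate the test documents and score the output.
    return r, 1.0 - r


grid = [i / 4 for i in range(5)]  # assumed grid: 0.0, 0.25, 0.5, 0.75, 1.0
trade_off = sweep(grid, toy_evaluate)
frontier = pareto_front(trade_off)
```

Reporting the grid alongside the resulting (privacy, quality) pairs is the kind of detail this variable looks for.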