reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Making Translators Privacy-aware on the User's Side

Authors: Ryoma Sato

TMLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments demonstrate the effectiveness of PRISM using real-world translators, T5 and ChatGPT, and the datasets with two languages.
Researcher Affiliation	Academia	Ryoma Sato, Kyoto University and Okinawa Institute of Science and Technology
Pseudocode	Yes	The pseudocode is shown in Algorithm 1. Algorithm 1: PRISM-R. Algorithm 2: PRISM*.
Open Source Code	Yes	Reproducibility: Our code and trained dictionaries are available at https://github.com/joisino/prism.
Open Datasets	Yes	We use the MCTest dataset [34] for the documents x_i, question q_ij, and answer a_ij. [34] M. Richardson, C. J. C. Burges, and E. Renshaw. MCTest: A challenge dataset for the open-domain machine comprehension of text. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP, pages 193-203. ACL, 2013.
Dataset Splits	Yes	Let X = {x_1, x_2, ..., x_N} be a set of test documents to be translated. We use the MCTest dataset [34] for the documents x_i, question q_ij, and answer a_ij.
Hardware Specification	No	The paper mentions using T5 and GPT-3.5-turbo as the translation algorithm and GPT-3.5-turbo as the evaluator, but does not specify the hardware used to run these models or perform the experiments.
Software Dependencies	No	The paper mentions using T5 and GPT-3.5-turbo as translation models, but does not specify any particular software versions (e.g., Python, PyTorch, TensorFlow versions, etc.) used for implementing PRISM or running the experiments.
Experiment Setup	No	We change the ratio r of No Decode, PRISM-R, and PRISM* and the parameter λ of PUP to control the trade-off between privacy-preserving score and the quality score. The paper states that it scans these parameters but does not provide specific hyperparameter values or configurations for the experiments.