Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Privacy-Preserving Q-Learning with Functional Noise in Continuous Spaces
Authors: Baoxiang Wang, Nidhi Hegde
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments corroborate our theoretical findings and show improvement over existing approaches. |
| Researcher Affiliation | Collaboration | Baoxiang Wang (The Chinese University of Hong Kong; Borealis AI, Edmonton), Nidhi Hegde (Borealis AI, Edmonton) |
| Pseudocode | Yes | Algorithm 1 Differentially Private Q-Learning with Functional Noise |
| Open Source Code | Yes | The implementation is attached along with the manuscript submission. |
| Open Datasets | No | The paper states 'The exact MDP we use is described in Appendix E.1.' implying a custom environment, and does not provide concrete access information (link, citation) to a publicly available or open dataset for training. |
| Dataset Splits | No | The paper mentions 'number of samples the agent has trained on' and 'learning curves' but does not provide specific details on training, validation, or test splits for any dataset, nor does it refer to standard predefined splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'neural network' and 'deep Q-learning' but does not specify any software dependencies (libraries, frameworks) with version numbers. |
| Experiment Setup | Yes | Parameters: target privacy (ϵ, δ), time horizon T, batch size B, action space size m, learning rate α, reset factor J |
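To make the classified variables concrete, the core idea named in the pseudocode row (Q-learning where a randomly sampled *function* of the state is added to each update) can be sketched as follows. This is an illustrative toy, not the paper's Algorithm 1: the random-Fourier-feature noise below is a simplified stand-in for the paper's Gaussian-process functional noise, and the MDP, noise scale `sigma`, and feature count are made-up values, not a calibrated (ϵ, δ) mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: 1-D continuous state in [0, 1], m discrete actions.
m = 2            # action space size (table row: "action space size m")
alpha = 0.5      # learning rate (table row: "learning rate α")
gamma = 0.9      # discount factor
sigma = 0.1      # noise scale; in the paper this would be calibrated to (ϵ, δ)
n_features = 32  # random Fourier features approximating an RBF kernel

# Random Fourier features: phi(s) is a smooth function of the state, so
# w @ phi(s) with Gaussian w is a draw from an approximate GP, i.e. a
# random *function* of the state rather than a per-sample scalar noise.
omega = rng.normal(size=n_features)
b = rng.uniform(0, 2 * np.pi, size=n_features)

def phi(s):
    return np.sqrt(2.0 / n_features) * np.cos(omega * s + b)

def sample_noise_function():
    """Draw one random smooth function g: state -> R."""
    w = rng.normal(scale=sigma, size=n_features)
    return lambda s: w @ phi(s)

# Linear Q-function over the same features, one weight row per action.
W = np.zeros((m, n_features))

def q(s, a):
    return W[a] @ phi(s)

def noisy_q_update(s, a, r, s_next, g):
    """One Q-learning step with the sampled noise function g perturbing the target."""
    target = r + gamma * max(q(s_next, a2) for a2 in range(m)) + g(s_next)
    W[a] += alpha * (target - q(s, a)) * phi(s)

# Toy rollout on a made-up 1-D chain MDP (reward for reaching state 1.0).
s = 0.0
for t in range(200):
    g = sample_noise_function()   # fresh functional noise each step
    a = int(rng.integers(m))      # exploration policy omitted for brevity
    s_next = float(np.clip(s + (0.1 if a == 1 else -0.1), 0.0, 1.0))
    r = 1.0 if s_next >= 1.0 else 0.0
    noisy_q_update(s, a, r, s_next, g)
    s = 0.0 if s_next >= 1.0 else s_next
```

Because the perturbation is a whole function of the continuous state, every query of the noisy Q-function is consistently perturbed, which is the property a per-sample scalar noise cannot provide in continuous spaces.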