A Kernel Perspective on Behavioural Metrics for Markov Decision Processes
Authors: Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We complement our theory with strong empirical results that demonstrate the effectiveness of these methods in practice. |
| Researcher Affiliation | Collaboration | Pablo Samuel Castro (Google DeepMind); Tyler Kastner (University of Toronto); Prakash Panangaden (McGill University); Mark Rowland (Google DeepMind) |
| Pseudocode | No | The paper includes mathematical definitions, theorems, and proofs but does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All the code for these evaluations is available at https://github.com/google-research/google-research/tree/master/ksme. |
| Open Datasets | Yes | The added loss showed statistically significant performance improvements on the challenging Arcade Learning Environment (Bellemare et al., 2013), as well as on the DeepMind control suite (Tassa et al., 2018). Due to the computational expense of running these experiments, we selected four representative Atari 2600 games from the ALE suite (Bellemare et al., 2013). |
| Dataset Splits | No | The paper mentions running "5 independent runs" but does not specify any training, validation, or test dataset splits (e.g., percentages or sample counts) for the mentioned datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper acknowledges the use of tools like NumPy, Matplotlib, and JAX but does not specify their version numbers, which is required for a reproducible description of software dependencies. |
| Experiment Setup | No | We keep all hyperparameters unchanged from those used by Castro et al. (2021). |