Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
A Theoretical Justification for Asymmetric Actor-Critic Algorithms
Authors: Gaspard Lambrechts, Damien Ernst, Aditya Mahajan
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose such a justification for asymmetric actor-critic algorithms with linear function approximators by adapting a finite-time convergence analysis to this setting. The resulting finite-time bound reveals that the asymmetric critic eliminates error terms arising from aliasing in the agent state. |
| Researcher Affiliation | Academia | 1 Montefiore Institute, University of Liège; 2 Department of Electrical and Computer Engineering, McGill University. Correspondence to: Gaspard Lambrechts <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: m-step temporal difference learning algorithm |
| Open Source Code | No | The paper does not contain any explicit statement about providing source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper is theoretical and does not describe experiments involving specific datasets. It references the 'Tiger POMDP' as an example (Figure 1) but not as a dataset used for empirical evaluation with public access information. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with datasets, thus it does not specify any dataset splits. |
| Hardware Specification | No | The paper focuses on theoretical analysis and does not describe any empirical experiments, therefore no hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and does not include empirical experiments, so no specific software dependencies with version numbers are mentioned. |
| Experiment Setup | No | The paper presents a theoretical justification for algorithms and does not include any experimental results or specific details about an experimental setup, such as hyperparameters or training configurations. |