reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Categorical Semantics of Compositional Reinforcement Learning

Authors: Georgios Bakirtzis, Michail Savvas, Ufuk Topcu

JMLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	Using a categorical point of view, we develop a knowledge representation framework for a compositional theory of RL. Our approach relies on the theoretical study of the category MDP, whose objects are Markov decision processes (MDPs) acting as models of tasks. The categorical semantics models the compositionality of tasks through the application of pushout operations akin to combining puzzle pieces. We further prove that properties of the category MDP unify concepts, such as enforcing safety requirements and exploiting symmetries, generalizing previous abstraction theories for RL. We construct a unifying compositional theory for engineering RL systems that mechanizes functional composition into subprocess behaviors. To achieve this explicit definition of compositionality in RL, we give a symbolic and semantic interpretation of compositional phenomena of the problem or task by translating them into categorical properties.
Researcher Affiliation	Academia	Georgios Bakirtzis EMAIL LTCI, Télécom Paris, Institut Polytechnique de Paris; Michail Savvas EMAIL The University of Iowa; Ufuk Topcu EMAIL The University of Texas at Austin
Pseudocode	No	The paper uses mathematical definitions, propositions, theorems, and proofs to describe its theoretical framework, but does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	No	The paper does not contain any explicit statements about releasing source code, nor does it provide links to a code repository or mention code in supplementary materials.
Open Datasets	No	The paper mentions 'grid world (Leike et al., 2017)' as a conceptual example, but it does not use or provide concrete access information for any publicly available or open dataset used in empirical experiments.
Dataset Splits	No	The paper focuses on theoretical development and does not conduct experiments involving datasets, thus no dataset split information is provided.
Hardware Specification	No	The paper presents a theoretical framework and does not report on experimental results that would require specific hardware specifications.
Software Dependencies	No	The paper is theoretical and does not describe any experimental implementations that would necessitate detailing specific software dependencies with version numbers.
Experiment Setup	No	The paper is theoretical and does not present experimental results, therefore no specific experimental setup details like hyperparameter values or training configurations are provided.