Multi-granularity Knowledge Transfer for Continual Reinforcement Learning

Authors: Chaofan Pan, Lingfei Ren, Yihui Feng, Linbo Xiong, Wei Wei, Yonghao Li, Xin Yang

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate the superiority of the proposed MT-Core in handling diverse CRL tasks versus popular baselines. Extensive experiments in MiniGrid provide empirical evidence of MT-Core's effectiveness. In this section, we evaluate our framework in several continual reinforcement learning tasks.
Researcher Affiliation | Academia | Chaofan Pan (1), Lingfei Ren (1), Yihui Feng (1), Linbo Xiong (1), Wei Wei (2), Yonghao Li (1) and Xin Yang (1). (1) Southwestern University of Finance and Economics; (2) Shanxi University.
Pseudocode | No | The paper describes the methodology using textual explanations and diagrams (Figure 2), but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper states: "More details about the experimental implementation are provided in the supplementary material." This is not an explicit statement about releasing code, nor does it provide a direct link to a repository.
Open Datasets | Yes | For our experiments, we utilized a suite of MiniGrid environments [Chevalier-Boisvert et al., 2023] to evaluate the efficacy of MT-Core in addressing CRL tasks.
Dataset Splits | No | The paper mentions: "We crafted a sequence of four distinct tasks within the MiniGrid framework." While this defines the tasks for sequential learning, it does not provide specific training/test/validation splits of a static dataset in terms of percentages, sample counts, or predefined partition files, as typically expected for dataset splits.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. It mentions "More details about the experimental implementation are provided in the supplementary material," but these details are not present in the main text.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments. It mentions "More details about the experimental implementation are provided in the supplementary material," but these details are not present in the main text.
Experiment Setup | Yes | Each experiment is trained in 5M steps and replicated with five random seeds of environments to ensure reliability.
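
The setup reported above (a sequence of four tasks, 5M training steps per experiment, five random seeds) could be laid out as a run schedule along the following lines. This is only a minimal sketch under stated assumptions, not the authors' code: the task names are hypothetical placeholders because the summary does not name the four tasks, the seed values 0 to 4 are assumed, the names `Run`, `build_schedule`, and `summarize` are illustrative, and the summary does not say how the 5M steps are divided across tasks, so the budget is recorded per experiment only.

```python
from dataclasses import dataclass
from statistics import mean, stdev

# Values taken from the summary above.
STEPS_PER_EXPERIMENT = 5_000_000
NUM_SEEDS = 5
NUM_TASKS = 4

# Hypothetical placeholders: the summary does not name the four tasks.
TASK_NAMES = [f"task_{i}" for i in range(1, NUM_TASKS + 1)]


@dataclass(frozen=True)
class Run:
    seed: int      # assumed seed value (0..4); the actual seeds are not reported
    task: str      # placeholder task name
    position: int  # index of the task within the continual sequence


def build_schedule(task_names, num_seeds):
    """Return one Run per (seed, task), keeping the task order fixed for every seed."""
    return [
        Run(seed=seed, task=task, position=pos)
        for seed in range(num_seeds)
        for pos, task in enumerate(task_names)
    ]


def summarize(score_by_seed):
    """Aggregate one score per seed into (mean, sample standard deviation)."""
    values = list(score_by_seed.values())
    return mean(values), stdev(values)


schedule = build_schedule(TASK_NAMES, NUM_SEEDS)
```

Keeping the task order identical across seeds means that any variation between replications comes from the environment seed rather than from the curriculum, which matches the stated purpose of the five-seed replication.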