Pre-trained Language Models Improve the Few-shot Prompt Ability of Decision Transformer
Authors: Yu Yang, Pan Xu
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive empirical studies demonstrate that initializing with a pre-trained language model provides prior knowledge and achieves performance similar to Prompt-DT with only 10% of the data in some MuJoCo control tasks. We also provide a thorough ablation study to validate the effectiveness of each component, including sequence modeling, language models, prompt regularizations, and prompt strategies. |
| Researcher Affiliation | Academia | Yu Yang, Department of Electrical and Computer Engineering, Duke University; Pan Xu, Department of Biostatistics and Bioinformatics, Department of Computer Science, and Department of Electrical and Computer Engineering, Duke University |
| Pseudocode | Yes | Algorithm 1 LPDT: Training and Inference |
| Open Source Code | Yes | All source code and experiment configurations are available at https://github.com/panxulab/LPDT. |
| Open Datasets | Yes | We conduct extensive experiments to assess the capability of our proposed framework in MuJoCo control environments (Fu et al., 2020) and Meta-World ML1 tasks (Yu et al., 2020). |
| Dataset Splits | Yes | We evaluate our approach over MuJoCo control tasks and Meta-World ML1 tasks. We split the tasks in these environments into a training set and a testing set. The tasks in Cheetah-dir and Ant-dir are split by directions. The tasks in Cheetah-vel are split by the goal velocities. The tasks in Point-robot are split by the goal positions, which are uniformly distributed in a unit square. In Meta-World, the tasks are defined by different goal positions. The detailed task indexes can be found in Table 9. |
| Hardware Specification | Yes | All experiments are conducted on a single NVIDIA A6000 GPU with Intel Xeon Ice Lake Gold 5317 processors. |
| Software Dependencies | No | Table 10 lists 'Language initialization GPT-2' and 'Activation ReLU'. GPT-2 is a specific model rather than a versioned software dependency such as a programming language or library, and ReLU is an activation function, not a software component. No software components with specific version numbers are provided. |
| Experiment Setup | Yes | In this section, we show the hyperparameters of LPDT used in the experiments reported in Table 1. The hyperparameters fall into two parts: those for the transformer backbone and those for prompt regularization. We list these hyperparameters in Table 10. Table 10: Details of the hyperparameters used in our experiments in Table 1, split into model-backbone parameters and prompt-regularization parameters. |
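The task-split scheme quoted above (training tasks vs. held-out testing tasks, with Cheetah-vel tasks indexed by goal velocity) can be sketched as a small Python snippet. This is an illustrative reconstruction, not the authors' code: the `split_tasks` helper, the velocity values, and the 20% hold-out fraction are assumptions for demonstration; the actual task indexes are given in Table 9 of the paper.

```python
# Hypothetical sketch of the train/test task split described in the report.
# Actual task indexes come from Table 9 of the paper; values here are illustrative.

def split_tasks(tasks, test_fraction=0.2):
    """Deterministically hold out the last `test_fraction` of tasks for testing."""
    n_test = max(1, int(len(tasks) * test_fraction))
    return tasks[:-n_test], tasks[-n_test:]

# Cheetah-vel: each task is defined by a goal velocity (illustrative grid of 40 tasks).
cheetah_vel_tasks = [round(0.5 + 0.1 * i, 1) for i in range(40)]
train_tasks, test_tasks = split_tasks(cheetah_vel_tasks)
```

The same helper would apply unchanged to the other environments, since each task family (directions, goal velocities, goal positions) is just a list of task parameters.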