ASkDAgger: Active Skill-level Data Aggregation for Interactive Imitation Learning
Authors: Jelle Luijkx, Zlatan Ajanović, Laura Ferranti, Jens Kober
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the effectiveness of ASkDAgger through language-conditioned manipulation tasks in both simulation and real-world environments. |
| Researcher Affiliation | Academia | Jelle Luijkx EMAIL Department of Cognitive Robotics Delft University of Technology Zlatan Ajanović EMAIL Department of Computer Science RWTH Aachen University Laura Ferranti EMAIL Department of Cognitive Robotics Delft University of Technology Jens Kober EMAIL Department of Cognitive Robotics Delft University of Technology |
| Pseudocode | Yes | Algorithm 1: Active Skill-level DAgger (ASkDAgger) [...] Algorithm 2: S-Aware Gating (SAG) [...] Algorithm 3: Foresight Interactive Experience Replay (FIER) [...] Algorithm 4: Prioritized Interactive Experience Replay (PIER) |
| Open Source Code | Yes | Code, data, and videos are available at https://askdagger.github.io. |
| Open Datasets | Yes | First, we performed active dataset aggregation on the MNIST dataset (LeCun et al., 1998) [...] The CLIPort benchmark includes seen and unseen task settings. |
| Dataset Splits | Yes | The CLIPort benchmark includes seen and unseen task settings. In the unseen setting, test-time commands involve different objects, shapes, or colors than during training. |
| Hardware Specification | Yes | The MNIST experiments (Sec. 5.1) and real-world experiments (subsection 5.3 and subsection 5.4) were performed using an RTX 3080 Mobile graphics card. The CLIPort simulation benchmark experiments (Sec. 5.2) were performed using multiple A40 graphics cards on a high-performance computing cluster. |
| Software Dependencies | No | The control scheme is implemented using the EAGERx framework (van der Heijden et al., 2024). The Gradio (Abid et al., 2019) interface allows command input via speech or text. Additionally, existing packages such as Torch Uncertainty (Lafage & Laurent, 2024) facilitate uncertainty quantification in this setting. |
| Experiment Setup | Yes | For SAG we used mode = sensitivity, σ_des = 0.9, N_min = 15, p_rand = 0.2, and for PIER α = 1.5, b = 10, β = 1, and λ = 0.5. Each setting involved training ten CLIPort agents without BC pretraining, collecting 300 interactive demonstrations, and evaluating checkpoints every 100 demonstrations. |
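For quick reference when reproducing the experiments, the SAG and PIER hyperparameters reported in the table above can be collected into plain configuration mappings. This is only a sketch: the dictionary key names are illustrative assumptions, not identifiers from the authors' codebase; the values are the ones quoted in the Experiment Setup row.

```python
# Hyperparameters reported for ASkDAgger's SAG and PIER components.
# Key names are illustrative (assumed), values come from the paper's setup.
SAG_CONFIG = {
    "mode": "sensitivity",  # gating mode reported in the paper
    "sigma_des": 0.9,       # desired sensitivity (sigma_des)
    "n_min": 15,            # minimum sample count (N_min)
    "p_rand": 0.2,          # random-query probability (p_rand)
}

PIER_CONFIG = {
    "alpha": 1.5,   # prioritization exponent
    "b": 10,        # batch/buffer parameter as reported
    "beta": 1.0,    # importance-sampling exponent
    "lambda_": 0.5, # mixing coefficient (lambda)
}

if __name__ == "__main__":
    print(SAG_CONFIG["sigma_des"], PIER_CONFIG["alpha"])
```

Each benchmark run then trains ten CLIPort agents without BC pretraining, collects 300 interactive demonstrations, and evaluates checkpoints every 100 demonstrations.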