C3DM: Constrained-Context Conditional Diffusion Models for Imitation Learning

Authors: Vaibhav Saxena, Yotto Koga, Danfei Xu

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically show that C3DM is robust to out-of-distribution distractors, and consistently achieves high success rates on a wide array of tasks, ranging from table-top manipulation to industrial kitting, that require varying levels of precision and robustness to distractors. ... We evaluated C3DM extensively on 5 tasks in simulation and demonstrated our model on various tasks with real robot arms. For the simulated tasks, we used Implicit Behavioral Cloning (IBC, Florence et al. (2021)), Neural Grasp Distance Fields (NGDF, Weng et al. (2023)), Diffusion Policy (Chi et al., 2023), and an explicit behavior cloning model (Conv. MLP) with the same backbone as our method as the baselines. Additionally, we ablated against the masking version of C3DM, called C3DM-Mask.
Researcher Affiliation | Collaboration | Vaibhav Saxena, School of Interactive Computing, Georgia Institute of Technology; Yotto Koga, Robotics Lab, Autodesk Research; Danfei Xu, School of Interactive Computing, Georgia Institute of Technology
Pseudocode | Yes | Algorithm 1: C3DM Training (f_DDP components) ... Algorithm 2: C3DM Iterative refinement with f_DDP for action prediction during testing
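The test-time procedure named in Algorithm 2 (iterative refinement of an action with a denoising model) can be sketched generically. This is a hedged illustration, not the paper's exact f_DDP update rule: the denoiser, observation format, and update step here are all hypothetical stand-ins, using the 10-step linear schedule the paper reports for evaluation.

```python
import numpy as np

def iterative_refinement(denoiser, obs, action_dim, num_steps=10, seed=0):
    """DDPM-style action refinement: start from Gaussian noise and
    repeatedly apply a learned denoiser over a linear time schedule.
    Sketch only; not the paper's exact f_DDP update."""
    rng = np.random.default_rng(seed)
    action = rng.standard_normal(action_dim)  # initial noisy action
    # Linear schedule from t=1 down toward t=0 over num_steps steps.
    for t in np.linspace(1.0, 0.0, num_steps, endpoint=False):
        action = denoiser(action, obs, t)
    return action

# Toy denoiser: pulls the action toward a fixed target pose; stands in
# for the learned conditional network (hypothetical, illustration only).
def toy_denoiser(action, obs, t):
    return action + 0.5 * (obs["goal"] - action)

obs = {"goal": np.array([0.2, -0.1, 0.4])}
refined = iterative_refinement(toy_denoiser, obs, action_dim=3)
```

With the toy denoiser, each step halves the distance to the target, so ten steps shrink the initial noise by a factor of 2^-10; a trained model would instead move the action toward the demonstrated action distribution conditioned on the observation.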
Open Source Code | No | Project website: https://sites.google.com/view/c3dm-imitation-learning. The paper provides a project website, which is typically for demonstrations and an overview, not a direct link to a source code repository.
Open Datasets | Yes | Place-red-in-green and sweeping-piles were made available by Zeng et al. (2020) as part of the Ravens simulation benchmark... all made available in the SAPIEN dataset (Xiang et al., 2020; Mo et al., 2019; Chang et al., 2015).
Dataset Splits | No | Each task assumed access to 1000 demonstrations from an oracle demonstrator... For the setup that trains on real-world demonstrations, we collected 20 human demonstrations... In the sim-to-real setup, we collected up to 100 demonstrations for the kitting-part and hang-cup tasks (as described in Appendix B) in simulation using a scripted policy. The paper states the total number of demonstrations used but does not specify explicit training, validation, or test splits (e.g., percentages, sample counts, or predefined splits).
Hardware Specification | Yes | with a batch size of 100 demonstrations on a single NVIDIA 2080 Ti GPU.
Software Dependencies | No | We use a deep convolutional neural network with residual connections (ResNet-18, He et al. (2015)), without any pretraining, to process images... We implemented all baselines with the same backbone architecture, tune the learning rate in the range [10^-4, 10^-3], and train using the Adam optimizer (Kingma & Ba, 2014). The paper mentions software components and algorithms used but does not provide specific version numbers for software libraries or environments (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | tune the learning rate in the range [10^-4, 10^-3], and train using the Adam optimizer (Kingma & Ba, 2014) with a batch size of 100 demonstrations... We use rotation augmentation of the demonstrations... During testing, we evaluated all models using a linear schedule of 10 timesteps. Table 6: Hyperparameters used for simulation experiments. Table 7: Hyperparameters used for real robot experiments. Table 8: Hyperparameters used for sim-to-real robot experiments.