Low-Rank Tensor-Network Encodings for Video-to-Action Behavioral Cloning

Authors: Brian Chen, Doruk Aksoy, David J Gorsich, Shravan Veerapaneni, Alex Gorodetsky

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type | Experimental | Empirical results on ATARI games demonstrate that our approach leads to a speedup in the time it takes to encode data and train a predictor using the encodings (between 2.6× and 9.6× compared to autoencoders or variational autoencoders).
Researcher Affiliation | Collaboration | Brian Chen, Department of Mathematics, University of Michigan; Doruk Aksoy, Department of Aerospace Engineering, University of Michigan; David J. Gorsich, Ground Vehicle Systems Center, U.S. Army; Shravan Veerapaneni, Department of Mathematics, University of Michigan; Alex A. Gorodetsky, Department of Aerospace Engineering, University of Michigan
Pseudocode | Yes | Algorithm 1: Mapping unseen data to latent space by orthogonal projections onto TT-cores.
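The projection step that Algorithm 1 describes can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not the paper's code: it assumes each TT-core's left unfolding has orthonormal columns (the property the orthogonal-projection step relies on), so encoding an unseen sample reduces to a chain of transpose-multiplies.

```python
import numpy as np

def tt_encode(x, cores):
    """Project a sample tensor x (shape n1 x ... x nd) onto the column
    spaces of left-orthogonal TT-cores.

    cores: list of d arrays; core k has shape (r_{k-1}, n_k, r_k), with
    r_0 = 1. Each core's left unfolding (r_{k-1}*n_k, r_k) is assumed to
    have orthonormal columns, so the orthogonal projection is U.T @ v.
    Returns a latent vector of length r_d.
    """
    v = x.reshape(1, -1)                  # (r_0 * n_1, n_2 * ... * n_d) after first reshape below
    for G in cores:
        r_prev, n, r = G.shape
        v = v.reshape(r_prev * n, -1)     # group the current mode with the incoming rank
        U = G.reshape(r_prev * n, r)      # left unfolding with orthonormal columns
        v = U.T @ v                       # project onto this core's column space
    return v.ravel()                      # final latent coefficients, shape (r_d,)
```

Because each unfolding is orthonormal, a sample that lies exactly in the span of the cores is encoded losslessly; for unseen frames the result is the nearest point in that span.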
Open Source Code | No | The paper references existing libraries and packages such as PyTorch, RL-Baselines Zoo, Stable Baselines, and the TT-ICE* package (which implements the TT-ICE algorithm), but it neither states that its own implementation is released nor links to a code repository.
Open Datasets | Yes | The dataset consists of game frames from demonstrations on a set of Atari games: Beam Rider, Breakout, Ms Pacman, Pong, Qbert, Seaquest, and Space Invaders. The dataset is available at Chen (2023): Brian Chen. Atari Games Dataset. https://deepblue.lib.umich.edu/data/concern/data_sets/gq67jr854, 2023.
Dataset Splits | Yes | For each game, 200 trajectories are generated; subsets are used for training and validation (details in Section 4.2). Another 50 trajectories form a test set used to evaluate the action-prediction accuracy of the final trained models. Table 2 summarizes the splits: in the Limited Data case, 5 trajectories are used for training, 1 for encoder validation (AE/VAE), 1 for predictor validation, and 50 for testing; in the Moderate Data case, 160 are used for training, 20 for encoder validation, 20 for predictor validation, and 50 for testing.
Hardware Specification | Yes | The predictors and AE/VAE were set up and trained using PyTorch on NVIDIA Tesla V100 GPUs. The TT is trained either on GPU (TT-GPU, NVIDIA Tesla V100) or CPU (TT-CPU, Intel Xeon Gold 6154 processors)...
Software Dependencies | No | The paper mentions several software tools, including PyTorch, RL-Baselines Zoo, Stable Baselines, NumPy, and CuPy, but does not provide version numbers for any of these dependencies.
Experiment Setup | Yes | The AEs and VAEs are hyperparameter-tuned individually for each game over initial learning rates of 10⁻², 10⁻³, and 10⁻⁴ and L2 weight-decay penalties of 1, 10⁻², and 10⁻⁴... The networks are trained using the AdamW optimizer... with cosine learning-rate decay... for fifty epochs in the low data case and five epochs in the moderate data case. Action predictors are likewise trained with the AdamW optimizer... with cosine learning-rate decay... for fifty epochs in the low data case and five epochs in the moderate data case.
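The learning-rate schedule in the setup above can be sketched with a standalone cosine-decay function. This is a minimal reimplementation of the decay rule for illustration (in PyTorch the same curve is produced by `torch.optim.lr_scheduler.CosineAnnealingLR`); the fifty-epoch budget and 10⁻³ initial rate below are just one point from the tuning grid.

```python
import math

def cosine_lr(step, total_steps, lr_init, lr_min=0.0):
    """Cosine learning-rate decay from lr_init down to lr_min over
    total_steps, as used for both the AE/VAE encoders and the
    action predictors."""
    return lr_min + 0.5 * (lr_init - lr_min) * (1 + math.cos(math.pi * step / total_steps))

# Example: the fifty-epoch low-data budget at an initial rate of 1e-3.
schedule = [cosine_lr(e, 50, 1e-3) for e in range(51)]
```

The schedule starts at `lr_init`, decreases monotonically, and reaches `lr_min` exactly at the final step.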