Active Acquisition for Multimodal Temporal Data: A Challenging Decision-Making Task

Authors: Jannik Kossen, Cătălina Cangea, Eszter Vértes, Andrew Jaegle, Viorica Patraucean, Ira Ktena, Nenad Tomasev, Danielle Belgrave

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our agents are able to solve a novel synthetic scenario requiring practically relevant cross-modal reasoning skills. On two large-scale, real-world datasets, Kinetics-700 and Audio Set, our agents successfully learn cost-reactive acquisition behavior. However, an ablation reveals they are unable to learn adaptive acquisition strategies, emphasizing the difficulty of the task even for state-of-the-art models.
Researcher Affiliation | Collaboration | Jannik Kossen (1), Cătălina Cangea (2), Eszter Vértes (2), Andrew Jaegle (2), Viorica Patraucean (2), Ira Ktena (2), Nenad Tomasev (2), Danielle Belgrave (2). (1) OATML, Department of Computer Science, University of Oxford; (2) Google DeepMind
Pseudocode | Yes | Algorithm 1 (A2MT)
Open Source Code | No | The paper does not contain an explicit statement about the release of source code or a link to a code repository for the methodology described.
Open Datasets | Yes | Further, we propose to study A2MT on audio-visual datasets, concretely Audio Set (Gemmeke et al., 2017) and Kinetics-700-2020 (Smaira et al., 2020). These provide a challenging testbed for A2MT and avoid some of the complications of working with medical data.
Dataset Splits | Yes | We split the training set of each dataset into a subset used for model pretraining and a subset used exclusively for agent training, taking up 80% and 20% of the original training set, respectively.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions software components such as Perceiver IO and the Adam optimizer but does not specify their version numbers or other library dependencies needed to replicate the experiments.
Experiment Setup | Yes | We train using a batch size of 256. We use the Adam optimizer with an initial learning rate of 3×10⁻⁴, weight decay of 1×10⁻⁶, and a cosine annealing schedule. For the Perceiver IO encoder we use a single cross-attend block with 4 self-attention operations per Perceiver IO block; we use 128 queries, and the hidden dimension is 128. For the Perceiver IO decoder, we use a single head with 128 queries and a hidden dimension of 128. We train for a total of 2×10⁵ steps. We set the discount factor to γ = 1.
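The reported schedule (initial learning rate 3×10⁻⁴ annealed over 2×10⁵ steps) can be re-created from the standard cosine-annealing formula. The sketch below is an illustration only: the paper does not publish code, so the function name, the minimum learning rate of 0, and the framework-free formulation are assumptions, not the authors' implementation.

```python
import math

# Hyperparameters as reported in the paper's experiment setup.
BASE_LR = 3e-4         # initial Adam learning rate
TOTAL_STEPS = 200_000  # 2 x 10^5 training steps

def cosine_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS, min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate at a given training step.

    Interpolates from base_lr at step 0 down to min_lr at total_steps
    along a half-cosine curve (min_lr = 0 is an assumption here).
    """
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

print(cosine_lr(0))            # starts at 3e-4
print(cosine_lr(100_000))      # halfway: 1.5e-4
print(cosine_lr(TOTAL_STEPS))  # fully annealed to 0
```

In a training loop, the returned value would be written into the optimizer's learning rate once per step, which matches the per-step cosine annealing behavior of common framework schedulers.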