Modular Deep Learning

Authors: Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, Edoardo Ponti

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | "We offer a survey of modular architectures, providing a unified view over several threads of research that evolved independently in the scientific literature. Moreover, we explore various additional purposes of modularity, including scaling language models, causal inference and discovery, programme simulation, and hierarchical reinforcement learning."
Researcher Affiliation | Collaboration | Jonas Pfeiffer (EMAIL), Google DeepMind; Sebastian Ruder (EMAIL), Google DeepMind; Ivan Vulić (EMAIL), University of Cambridge; Edoardo M. Ponti (EMAIL), University of Edinburgh and University of Cambridge
Pseudocode | Yes | Algorithm 1: Forward pass of a modular function

    Inputs: example x, task t
    α ← r_ρ(x, t)               // Routing
    for ϕ_j ∈ M_i do
        h_j ← f(x; θ_i, ϕ_j)    // Computation
    y ← g_γ(α, H)               // Aggregation
Open Source Code | No | The paper does not provide source code for its own methodology. It mentions AdapterHub (Pfeiffer et al., 2020a), which "provides (re)implementations of representative modular NLP architectures", but that refers to other work, not to code for this survey.
Open Datasets | No | The paper is a survey and does not conduct experiments with datasets of its own. It mentions various datasets (e.g., GLUE, SuperGLUE, CIFAR-10) when discussing other research, but provides no access information for its own work.
Dataset Splits | No | The paper is a survey and does not describe experiments that would require dataset splits.
Hardware Specification | No | The paper is a survey and does not conduct experiments; therefore, no hardware specifications are provided.
Software Dependencies | No | The paper is a survey and does not conduct experiments; therefore, no software dependencies with version numbers are listed for its own work. It mentions PyTorch in a footnote about sparse tensor classes, but not as a dependency of its methodology.
Experiment Setup | No | The paper is a survey and does not describe an experimental setup with hyperparameters or training configurations for its own methodology.
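As a rough illustration, the routing → computation → aggregation structure of Algorithm 1 can be sketched in a few lines of Python. The linear modules, task-conditioned softmax router, and weighted-sum aggregator below are hypothetical stand-ins chosen for brevity, not the paper's implementation:

```python
import numpy as np

# Sketch of Algorithm 1: forward pass of a modular function.
# All concrete choices (linear modules, softmax routing, weighted-sum
# aggregation) are illustrative assumptions, not the survey's code.

rng = np.random.default_rng(0)
d = 4                                                      # hidden dimension
modules = [rng.standard_normal((d, d)) for _ in range(3)]  # module params ϕ_1..ϕ_3

def route(x, task_id, n_modules):
    """Routing r_ρ(x, t): task-conditioned scores, softmax-normalised to α."""
    scores = np.ones(n_modules)
    scores[task_id % n_modules] += 1.0     # favour one module per task
    e = np.exp(scores - scores.max())
    return e / e.sum()                     # α, sums to 1

def compute(x, phi):
    """Computation f(x; θ_i, ϕ_j): here a single linear module."""
    return phi @ x                         # h_j

def aggregate(alpha, H):
    """Aggregation g_γ(α, H): weighted sum of module outputs."""
    return sum(a * h for a, h in zip(alpha, H))

x = rng.standard_normal(d)
alpha = route(x, task_id=1, n_modules=len(modules))   # Routing
H = [compute(x, phi) for phi in modules]              # Computation
y = aggregate(alpha, H)                               # Aggregation
print(y.shape)  # (4,)
```

Swapping in learned routing (e.g. a trained gating network) or a different aggregator changes only the `route` and `aggregate` functions, which is the separation of concerns the algorithm's three steps express.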