Modular Deep Learning
Authors: Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, Edoardo Ponti
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We offer a survey of modular architectures, providing a unified view over several threads of research that evolved independently in the scientific literature. Moreover, we explore various additional purposes of modularity, including scaling language models, causal inference and discovery, programme simulation, and hierarchical reinforcement learning. |
| Researcher Affiliation | Collaboration | Jonas Pfeiffer (Google DeepMind); Sebastian Ruder (Google DeepMind); Ivan Vulić (University of Cambridge); Edoardo M. Ponti (University of Edinburgh; University of Cambridge) |
| Pseudocode | Yes | Algorithm 1 (forward pass of a modular function): Inputs: example x, task t. α ← r_ρ(x, t) // Routing. For each ϕ_j ∈ M_i: h_j ← f(x; θ_i, ϕ_j) // Computation. y ← g_γ(α, H) // Aggregation |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. It mentions "Adapter Hub (Pfeiffer et al., 2020a): it provides (re)implementations of representative modular NLP architectures" which refers to other work, not the code for this survey paper. |
| Open Datasets | No | The paper is a survey and does not conduct experiments with specific datasets itself. It mentions various datasets (e.g., GLUE, SuperGLUE, CIFAR-10) in the context of discussing other research, but does not provide access information for its own work. |
| Dataset Splits | No | The paper is a survey and does not describe experiments that would require dataset splits. |
| Hardware Specification | No | The paper is a survey and does not conduct experiments, therefore no hardware specifications are provided. |
| Software Dependencies | No | The paper is a survey and does not conduct experiments, therefore no specific software dependencies with version numbers are listed for its own work. It mentions PyTorch in a footnote about sparse tensor classes, but not as a dependency of its methodology. |
| Experiment Setup | No | The paper is a survey and does not describe any specific experimental setup with hyperparameters or training configurations for its own methodology. |
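The pseudocode row above summarizes the paper's Algorithm 1: a routing function scores the available modules, each module computes an output, and an aggregation function combines them. A minimal sketch of that routing–computation–aggregation pattern is shown below; the function names (`modular_forward`, `route`, `aggregate`) and the linear modules with softmax routing are illustrative assumptions, not details from the paper.

```python
import numpy as np

def modular_forward(x, modules, route, aggregate):
    """Sketch of Algorithm 1: route input x over a module inventory,
    apply each module, then aggregate the per-module outputs."""
    alpha = route(x)                      # Routing: scores over modules
    outputs = [m(x) for m in modules]     # Computation: one output per module
    return aggregate(alpha, outputs)      # Aggregation: combine outputs

# Illustrative instantiation: linear modules, softmax routing,
# weighted-sum aggregation (all hypothetical choices).
rng = np.random.default_rng(0)
d = 4
modules = [
    (lambda x, W=rng.standard_normal((d, d)): W @ x)
    for _ in range(3)
]

def route(x):
    # Placeholder router: random logits turned into a softmax distribution.
    logits = rng.standard_normal(len(modules))
    e = np.exp(logits - logits.max())
    return e / e.sum()

def aggregate(alpha, outputs):
    # Weighted sum of module outputs under the routing distribution.
    return sum(a * h for a, h in zip(alpha, outputs))

x = rng.standard_normal(d)
y = modular_forward(x, modules, route, aggregate)
print(y.shape)  # (4,)
```

Swapping in sparse (top-k) routing or concatenation-based aggregation only changes `route` and `aggregate`; the forward pass itself is unchanged, which is the unified view the survey argues for.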