Building, Reusing, and Generalizing Abstract Representations from Concrete Sequences
Authors: Shuchen Wu, Mirko Thalmann, Peter Dayan, Zeynep Akata, Eric Schulz
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first demonstrate the benefits of abstraction in memory efficiency and sequence parsing by comparing our algorithm with previous chunking models and other dictionary-based compression methods. Then, we show that the model exhibits human-like signatures of abstraction in a memory experiment requiring the transfer of abstract concepts. In the same experiment, we contrast the model's generalization behavior with that of large language models (LLMs). |
| Researcher Affiliation | Academia | Shuchen Wu Helmholtz Munich Max Planck Institute for Biological Cybernetics EMAIL; Mirko Thalmann Institute for Human-Centered AI Helmholtz Munich EMAIL; Peter Dayan Department of Computational Neuroscience Max Planck Institute for Biological Cybernetics EMAIL; Zeynep Akata Helmholtz Munich Technical University of Munich EMAIL; Eric Schulz Institute for Human-Centered AI Helmholtz Munich EMAIL |
| Pseudocode | Yes | Algorithm 1: Pseudocode to generate sequences with nested abstract hierarchies.; Algorithm 2: HVM (online version, for learning sequences from human experiments); Algorithm 3: Pseudocode for traversing a tree to find a path consistent with an upcoming sequence |
| Open Source Code | Yes | The code used for the algorithm and experiments is available under this link. |
| Open Datasets | Yes | CHILDES (MacWhinney, 2000); BNC (BNC Consortium, 2007); Gutenberg (Gerlach & Font-Clos, 2020); Open Subtitles (Lison & Tiedemann, 2016) |
| Dataset Splits | No | The paper mentions generating sequences of desired length and taking random snippets of 1000 characters from real-world datasets, as well as distinguishing between 'training block' and 'transfer block' in an experiment. However, it does not provide specific train/test/validation split percentages, sample counts, or citations to predefined splits for the datasets used in model evaluation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | Memory decay parameter θ = 0.996 |
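The only numeric setup detail reported is the memory decay parameter θ = 0.996. As a minimal sketch of what such a parameter typically does in chunking models, the snippet below applies a multiplicative per-step decay to stored chunk counts; the function name `decay_counts` and the dictionary-of-counts representation are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: multiplicative memory decay applied to chunk
# frequency counts, one plausible reading of the reported theta = 0.996.

THETA = 0.996  # memory decay parameter reported in the paper's setup


def decay_counts(counts: dict, theta: float = THETA) -> dict:
    """Scale every chunk's count by theta (one simulated time step)."""
    return {chunk: c * theta for chunk, c in counts.items()}


counts = {"AB": 10.0, "CD": 4.0}
for _ in range(100):  # simulate 100 time steps
    counts = decay_counts(counts)
# After 100 steps each count is scaled by theta**100, roughly 0.67,
# so infrequently refreshed chunks gradually fade from memory.
```

With θ this close to 1, counts decay slowly, so recently reinforced chunks dominate while stale ones are forgotten over hundreds of steps rather than immediately.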