Towards Automated Knowledge Integration From Human-Interpretable Representations
Authors: Katarzyna Kobalczyk, Mihaela van der Schaar
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To illustrate our claims, we implement an instantiation of informed meta-learning, the Informed Neural Process, and empirically demonstrate the potential benefits and limitations of informed meta-learning in improving data efficiency and generalisation. Through empirical evaluation on both synthetic and real-world datasets, we demonstrate the feasibility of this approach to knowledge integration, as evidenced by improvements in predictive performance and data efficiency. |
| Researcher Affiliation | Academia | Katarzyna Kobalczyk University of Cambridge EMAIL Mihaela van der Schaar University of Cambridge EMAIL |
| Pseudocode | No | The paper describes the model architecture and training process in textual and mathematical form, particularly in Section 4 "Informed Neural Processes" and Appendix A.4 "INP MODEL ARCHITECTURE" and A.5 "INP TRAINING". However, it does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks with structured, code-like steps. |
| Open Source Code | Yes | We also provide access to the anonymous repository containing the source code together with the instructions to reproduce the experiments from sections 5.1 and 5.2.1 (code for the image classification experiments will be made available upon paper acceptance). We also release the synthetic datasets generated with GPT-4 used in experiments 5.2.1 and 5.2.2. Code and data can be found at: https://github.com/kasia-kobalczyk/informed-metalearning and at a wider lab repository: https://github.com/vanderschaarlab/informed-meta-learning. |
| Open Datasets | Yes | We use the sub-hourly temperature dataset from the U.S. Climate Reference Network (USCRN): https://www.ncei.noaa.gov/access/crn/qcdatasets.html. We apply INPs to few-shot classification on the CUB-200-2011 dataset (Wah et al., 2011). |
| Dataset Splits | Yes | For each task, the number of context points n ranges uniformly between 0 and 10; the number of targets, m = 100. Training, validation, and testing collections of tasks are created by randomly selecting 507, 108, and 110 days, respectively, between the years 2021 and 2022 in Aleknagik, Alaska. For each task, the target dataset consists of all 288 measurements in the 24h range. Context observations are sampled by first uniformly sampling 10 data points and then selecting the chronologically first n observations, where n ~ U[0, 10]. We use 100 bird categories for training, 50 for validation, and 50 for testing. For each task, the number of shots k, i.e., the number of example images per class, ranges uniformly between 0 and 10. The target set consists of 20 images per class. |
| Hardware Specification | Yes | All experiments were run on a machine with an AMD Epyc Milan 7713 CPU, 120GB RAM, and using a single NVIDIA A6000 Ada Generation GPU accelerator with 48GB VRAM. |
| Software Dependencies | No | The paper mentions several software components and models used (e.g., "Adam optimiser", "RoBERTa language model", "GPT-4", "CLIP vision and text encoders", "Hugging Face implementation of the CLIP ViT-B/32 model"). However, it does not provide specific version numbers for general programming languages or core machine learning libraries (e.g., Python, PyTorch, TensorFlow) as required. |
| Experiment Setup | Yes | The data encoder, hθe,D, is implemented as a 3-layer MLP. The knowledge encoder, hθe,K, is implemented with the Deep Set architecture Zaheer et al. (2017), made of two 2-layer MLPs. The decoder is a 4-layer MLP. We set the hidden dimension, d = 128 and use the sum & MLP method for the aggregator, a. We use a learning rate of 1e-3 and set the batch size to 64. During training, knowledge is masked at rate 0.3. |
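The context-sampling procedure described under Dataset Splits (for the USCRN temperature tasks) can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the function name `sample_task`, the use of numpy, and the stand-in day series are all assumptions; only the quantities (288 measurements per day, 10 candidate points, n ~ U[0, 10], chronological prefix) come from the report.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task(day_series, n_max=10):
    """Build one temperature task (hypothetical sketch): the full 24h series
    is the target set; the context is a chronological prefix of 10 uniformly
    sampled points, with prefix length n ~ U{0, ..., n_max}."""
    assert len(day_series) == 288  # 288 sub-hourly measurements per day
    t = np.arange(288)
    target_x, target_y = t, day_series  # all 288 points are targets
    # Uniformly sample 10 candidate indices without replacement, then keep
    # the chronologically first n of them.
    cand = np.sort(rng.choice(288, size=10, replace=False))
    n = rng.integers(0, n_max + 1)  # inclusive upper bound
    ctx_idx = cand[:n]
    return (ctx_idx, day_series[ctx_idx]), (target_x, target_y)

# Stand-in for one day's temperature curve (assumed, for illustration only).
day = np.sin(np.linspace(0.0, 2.0 * np.pi, 288))
(ctx_x, ctx_y), (tgt_x, tgt_y) = sample_task(day)
```

Sampling 10 candidates first and then truncating to a chronological prefix (rather than sampling n points directly) matches the two-step procedure quoted in the table.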
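The Experiment Setup row describes the INP components: a 3-layer MLP data encoder, a Deep Set knowledge encoder (two 2-layer MLPs), a 4-layer MLP decoder, hidden dimension d = 128, a sum-&-MLP aggregator, and knowledge masking at rate 0.3. A minimal numpy sketch of that wiring is below; it is not the authors' implementation. Input dimensions (2 for (x, y) pairs, 16 for knowledge elements, scalar targets), the initialisation scale, and the aggregator's layer count are assumptions for illustration.

```python
import numpy as np

def mlp(dims, rng):
    """Initialise an MLP as a list of (W, b) pairs for the given layer dims."""
    return [(rng.standard_normal((i, o)) * 0.02, np.zeros(o))
            for i, o in zip(dims[:-1], dims[1:])]

def forward(params, x):
    """Apply an MLP with ReLU on hidden layers, linear output layer."""
    for k, (W, b) in enumerate(params):
        x = x @ W + b
        if k < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

d = 128  # hidden dimension from the paper
rng = np.random.default_rng(0)

# Data encoder: 3-layer MLP over (x, y) pairs (input dim 2 is assumed).
enc_D = mlp([2, d, d, d], rng)
# Knowledge encoder (Deep Set): per-element 2-layer MLP phi, sum-pool,
# then a second 2-layer MLP rho (per-element dim 16 is assumed).
phi = mlp([16, d, d], rng)
rho = mlp([d, d, d], rng)
# Aggregator a: "sum & MLP" over per-point data embeddings.
agg = mlp([d, d, d], rng)
# Decoder: 4-layer MLP from (data rep, knowledge rep, target x) to y.
dec = mlp([2 * d + 1, d, d, d, 1], rng)

def inp_predict(ctx_xy, knowledge, x_target, mask_knowledge=False):
    r = forward(agg, forward(enc_D, ctx_xy).sum(axis=0))  # data path
    if mask_knowledge:  # knowledge is masked at rate 0.3 during training
        k = np.zeros(d)
    else:
        k = forward(rho, forward(phi, knowledge).sum(axis=0))  # Deep Set
    return forward(dec, np.concatenate([r, k, x_target]))

# Toy forward pass: 5 context pairs, 3 knowledge elements, one target x.
y_hat = inp_predict(rng.standard_normal((5, 2)),
                    rng.standard_normal((3, 16)),
                    np.array([0.5]))
```

Summing embeddings before the aggregator and knowledge MLPs makes both paths permutation-invariant in their inputs, which is the defining property of the Deep Set construction cited in the table.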