CoLLIE: Continual Learning of Language Grounding from Language-Image Embeddings

Authors: Gabriel Skantze, Bram Willemsen

JAIR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We verify the model's performance on two different tasks of identifying the targets of referring expressions, where it has to learn new language use. The results show that the model can efficiently learn and generalize from only a few examples, with little interference with the model's original zero-shot performance.
Researcher Affiliation | Academia | Gabriel Skantze EMAIL Bram Willemsen EMAIL KTH Royal Institute of Technology, Stockholm, Sweden
Pseudocode | No | The paper describes the CoLLIE transformation using mathematical equations and figures (Figures 4 and 5) to illustrate its components (adjustment function a(T), scaling function s(T)) and how they are trained (e.g., "a(T) = βT + m", "We learn s using a regression model"), but it does not include a clearly labeled pseudocode or algorithm block.
Open Source Code | Yes | The code used for running the experiments and reproducing the results in this paper is provided on GitHub (https://github.com/gabriel-skantze/CoLLIE), including necessary data or pointers to data.
Open Datasets | Yes | First, we use the LAD dataset (Large-scale Attribute Dataset) by Zhao et al. (2018), from which we selected a set of 200 categories... we also use the images from the KTH Tangrams dataset (Shore et al., 2018), which were used for the task depicted in Figure 2.
Dataset Splits | Yes | We randomly select a set of N categories, Ctrain (out of the 200 categories), for which we want to teach the model new names. ... At the beginning of each round, we randomly select one image for each of the 200 categories, without ever reusing images between rounds. ... At the end of each round, we add the images from Ctrain and their associated pseudo-words as training examples (i.e., one example per category) to the model, and retrain it. ... This whole procedure is repeated over 50 iterations (with new pseudo-words and categories randomly selected and assigned), in order to get a smooth average performance per round.
Hardware Specification | Yes | On an Intel Core i7-1065G7 CPU, one iteration of Experiment II (i.e., 30 model updates) takes about 1 second for the standard CoLLIE model.
Software Dependencies | No | The models were implemented using scikit-learn (https://scikit-learn.org/) with standard parameters unless stated otherwise.
Experiment Setup | Yes | We learn A using linear regression: a(T) = βT + m, with β ∈ R^(512×512) and m ∈ R^(512). To avoid overfitting (given the limited number of training examples), we use ridge regression (L2 regularization with λ = 0.001). ... For our initial tests, we learn s using support vector regression (SVR) with a linear kernel (coerced in the range [0, 1]). ... The KNN regressor also shows a good performance.
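The setup above can be sketched with scikit-learn using the stated settings (ridge regression with λ = 0.001 for the adjustment function a(T), linear-kernel SVR clipped to [0, 1] for the scaling function s(T)). The synthetic embeddings, the `collie` helper name, and the final interpolation between T and a(T) are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.svm import SVR

rng = np.random.default_rng(0)
d = 512  # CLIP text-embedding dimensionality

# Toy training data (assumption: a few text/image embedding pairs plus
# scaling targets in [0, 1]).
T_train = rng.normal(size=(20, d))    # original text embeddings
I_train = rng.normal(size=(20, d))    # target image embeddings
s_train = rng.uniform(0, 1, size=20)  # scaling targets

# a(T) = beta T + m, learned with L2 regularization (lambda = 0.001).
adjust = Ridge(alpha=0.001).fit(T_train, I_train)

# s(T): support vector regression with a linear kernel, coerced to [0, 1].
scale = SVR(kernel="linear").fit(T_train, s_train)

def collie(T):
    """Apply the learned transformation to a batch of text embeddings."""
    a_T = adjust.predict(T)
    s_T = np.clip(scale.predict(T), 0.0, 1.0)[:, None]
    # Assumed combination: interpolate between the original and adjusted
    # embedding using the predicted scaling factor.
    return s_T * a_T + (1.0 - s_T) * T

out = collie(T_train[:5])
print(out.shape)  # (5, 512)
```

Ridge regression handles the multi-output target (one regression per embedding dimension) in a single `fit` call, which is why a full 512x512 β plus bias m can be learned from only a handful of examples without extra code.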