Zero-Shot Learning with Common Sense Knowledge Graphs
Authors: Nihal V. Nayak, Stephen Bach
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our results show that ZSL-KG improves over existing WordNet-based methods on five out of six zero-shot benchmark datasets in language and vision. ... We evaluate our framework on three zero-shot learning tasks: fine-grained entity typing, intent classification, and object classification. ... Table 1 shows that, on average, ZSL-KG outperforms the best performing graph-based method (SGCN) by 2.18 strict accuracy points. |
| Researcher Affiliation | Academia | Nihal V. Nayak EMAIL Department of Computer Science Brown University Stephen H. Bach EMAIL Department of Computer Science Brown University |
| Pseudocode | Yes | We provide details related to the mapping of classes to ConceptNet, post-processing ConceptNet, sampling, and random walk details in Appendix E and the pseudocode for ZSL-KG in Appendix F. ... Algorithm 1: Forward pass with the ZSL-KG framework. |
| Open Source Code | Yes | Our results show that ZSL-KG improves over existing WordNet-based methods on five out of six zero-shot benchmark datasets in language and vision. The code is available at https://github.com/BatsResearch/zsl-kg. |
| Open Datasets | Yes | We evaluate on popular fine-grained entity typing datasets: OntoNotes (Gillick et al., 2014) and BBN (Weischedel & Brunstein, 2005). ... We evaluate on the main open-source benchmark for intent classification: SNIPS-NLU (Coucke et al., 2018). ... We evaluate our method on the large-scale ImageNet (Deng et al., 2009), Attributes 2 (AWA2) (Xian et al., 2018b), and attribute Pascal Yahoo (aPY) (Farhadi et al., 2009) datasets. |
| Dataset Splits | Yes | We split the dataset into two: coarse-grained labels (e.g., /location) and fine-grained labels (e.g., /location/city). ... The training set has 5 seen classes which we split into 3 train classes and 2 development classes. ... Following prior work (Frome et al., 2013), we evaluate ZSL-KG on three levels of difficulty: 2-hops (1549 classes), 3-hops (7860 classes), and All (20842 classes). ... Following prior work (Xian et al., 2018b), for ZSL, we use the pretrained ResNet101 as the backbone... Following prior work (Min et al., 2020), for GZSL, we use the finetuned ResNet101... |
| Hardware Specification | Yes | We run experiments with the fine-grained entity typing datasets, namely BBN and OntoNotes. We use the same hyperparameters as mentioned in Appendix M and run our experiments on an NVIDIA RTX 3090 with 24GB of GPU memory. |
| Software Dependencies | No | Our framework is built using PyTorch and AllenNLP (Gardner et al., 2018). In all our experiments, we use Adam (Kingma & Ba, 2015) to train our parameters... ...initialized the tokens in the example encoder with 300 dimensional GloVe 840B embeddings. ... we use the ResNet50 model (He et al., 2016) in Torchvision (Marcel & Rodriguez, 2010). |
| Experiment Setup | Yes | In all our experiments, we use Adam (Kingma & Ba, 2015) to train our parameters with a learning rate of 0.001, unless otherwise specified in the experiments. We set the weight decay to 5e-04 for OntoNotes, ImageNet, AWA2, and aPY and 0.0 for BBN. For intent classification, we experiment with a weight decay of 1e-05 and 5e-05. ... For fine-grained entity typing, the methods are trained for 5 epochs by minimizing the cross-entropy loss using Adam with a learning rate of 0.001. ... For AWA2 and aPY, we follow the same L2 training scheme and train for 1000 epochs on 950 random classes from the 1000 ILSVRC 2012 classes, while the remaining 50 classes are used for validation. ... finetune a pretrained ResNet101 backbone on the individual datasets for 25 epochs using SGD with a learning rate of 0.0001 and momentum of 0.9. ... For the ImageNet experiment, we train ZSL-KG by minimizing the L2 distance between the learned class representations and the weights of the fully connected layer of a ResNet50 classifier for 3000 epochs on 1000 classes from the ILSVRC 2012. ... SGD with a learning rate of 0.0001 and momentum of 0.9. ... Table 16 details the output dimensions for the layers in the graph neural networks. ... Table 17 details the hyperparameters used in our transformer graph convolutional networks. |
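To make the Experiment Setup row easier to reuse, the quoted hyperparameters can be collected into a single configuration sketch. The grouping, key names, and structure below are our own; only the numeric values come from the paper's quoted text.

```python
# Hyperparameters quoted in the Experiment Setup row of the table above.
# Dict layout and key names are illustrative, not from the paper.
OPTIMIZERS = {
    # Adam for training ZSL-KG parameters (lr 0.001 unless otherwise specified)
    "zsl_kg": {"name": "adam", "lr": 1e-3},
    # SGD for finetuning the ResNet101 backbone (25 epochs)
    "backbone_finetune": {"name": "sgd", "lr": 1e-4, "momentum": 0.9},
}

WEIGHT_DECAY = {
    "ontonotes": 5e-4,
    "imagenet": 5e-4,
    "awa2": 5e-4,
    "apy": 5e-4,
    "bbn": 0.0,
    # two values were tried for intent classification (SNIPS-NLU)
    "snips_nlu": (1e-5, 5e-5),
}

EPOCHS = {
    "fine_grained_entity_typing": 5,   # cross-entropy loss, Adam
    "awa2_apy": 1000,                  # L2 scheme, 950 train / 50 val ILSVRC classes
    "imagenet": 3000,                  # L2 distance to ResNet50 classifier weights
    "backbone_finetune": 25,           # SGD, lr 0.0001, momentum 0.9
}
```

Such a summary makes it straightforward to spot the per-dataset differences (e.g., BBN is the only dataset with zero weight decay) at a glance.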