Corpus-Level Fine-Grained Entity Typing
Authors: Yadollah Yaghoobzadeh, Heike Adel, Hinrich Schuetze
JAIR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper reports an experimental study: "In this section, we first describe the different setups we use to do our experiments. Next, we present the results of our models followed by further analysis." It also explains the datasets, evaluation metrics, and other experimental setups, and describes the baselines against which the models are compared. |
| Researcher Affiliation | Collaboration | Yadollah Yaghoobzadeh (Microsoft Research Montreal, Montreal, Canada & Center for Information and Language Processing, LMU Munich, Munich, Germany); Heike Adel (Institute for Natural Language Processing, University of Stuttgart, Stuttgart, Germany & Center for Information and Language Processing, LMU Munich, Munich, Germany); Hinrich Schütze (Center for Information and Language Processing, LMU Munich, Munich, Germany) |
| Pseudocode | No | The paper describes algorithms and models using mathematical equations and textual explanations, for example, the definitions of BCE, P_GM(t|e), P_CM(t|c_i), and different MIML algorithms. However, it does not include any clearly labeled pseudocode blocks or algorithms in a structured, code-like format. |
| Open Source Code | No | The paper does not provide any explicit statements about the release of source code for the methodology described. It only mentions the availability of entity datasets at a URL, but not the code itself. |
| Open Datasets | Yes | The entity datasets are available at http://cistern.cis.lmu.de/figment. |
| Dataset Splits | Yes | We split the entities into train (50%), dev (20%) and test (30%) sets. ... For test and dev sets, we sample 300 and 200 random contexts, respectively, for each entity. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. It focuses on models, datasets, and hyperparameters. |
| Software Dependencies | No | The paper mentions using several existing tools and models like WORD2VEC (Mikolov et al., 2013), FASTTEXT (Bojanowski et al., 2017), WANG2VEC (Ling et al., 2015a), and AdaGrad (Duchi et al., 2011). However, it does not specify version numbers for these or any other ancillary software components (e.g., programming languages, libraries, frameworks) required for reproduction. |
| Experiment Setup | Yes | We run SSKIP with the setup (100 dimensions, 10 negative samples, window size 5, and word frequency threshold of 100) on this corpus to learn embeddings for words, entities and FIGER types. ... Our hyperparameter values are optimized on dev. We use AdaGrad (Duchi, Hazan, & Singer, 2011) and minibatch training. For each experiment, we select the best model on dev. The values of our hyperparameters are shown in Table 7 in the appendix. |
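The reported dataset protocol (entities split 50%/20%/30% into train/dev/test, then 300 random contexts sampled per test entity and 200 per dev entity) can be reproduced with a short script. The function names, the fixed seed, and the truncation of the split sizes are assumptions for illustration; the paper does not publish its splitting code.

```python
import random

def split_entities(entities, seed=0):
    """Split a list of entities into train (50%), dev (20%) and test (30%),
    following the proportions reported in the paper. Shuffling and the
    fixed seed are illustrative assumptions."""
    rng = random.Random(seed)
    shuffled = list(entities)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.5 * n)
    n_dev = int(0.2 * n)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test

def sample_contexts(contexts_by_entity, k, seed=0):
    """Sample up to k random contexts per entity; the paper uses
    k=300 for test entities and k=200 for dev entities."""
    rng = random.Random(seed)
    return {e: rng.sample(ctxs, min(k, len(ctxs)))
            for e, ctxs in contexts_by_entity.items()}
```

Sampling `min(k, len(ctxs))` contexts is a defensive choice for entities with fewer than k available contexts; the paper does not state how such entities are handled.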
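The BCE objective mentioned in the pseudocode row is binary cross-entropy, the standard loss for multi-label fine-grained typing, where each type is scored as an independent binary decision. A minimal sketch, assuming probabilities are already computed (the paper defines BCE in equations; this pure-Python form is only an illustration):

```python
import math

def bce(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy summed over types: each type t is treated
    as an independent binary label for the entity. eps guards against
    log(0) for probabilities at exactly 0 or 1."""
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
                for y, p in zip(y_true, y_pred))
```

A perfectly confident correct prediction yields a loss near zero, while an uncertain prediction of 0.5 on a positive type contributes log 2.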