Corpus-Level Fine-Grained Entity Typing

Authors: Yadollah Yaghoobzadeh, Heike Adel, Hinrich Schuetze

JAIR 2018

Reproducibility assessment (variable, result, and LLM response):
Research Type: Experimental. Evidence: "In this section, we first describe the different setups we use to do our experiments. Next, we present the results of our models followed by further analysis." "In this section, we explain the datasets, evaluation metrics and other setups we follow to do our experiments. We also describe the baselines, which we compare our models against."
Researcher Affiliation: Collaboration.
Yadollah Yaghoobzadeh (EMAIL): Microsoft Research Montreal, Montreal, Canada & Center for Information and Language Processing, LMU Munich, Munich, Germany.
Heike Adel (EMAIL): Institute for Natural Language Processing, University of Stuttgart, Stuttgart, Germany & Center for Information and Language Processing, LMU Munich, Munich, Germany.
Hinrich Schütze (EMAIL): Center for Information and Language Processing, LMU Munich, Munich, Germany.
Pseudocode: No. The paper describes its algorithms and models using mathematical equations and textual explanations, for example the definitions of BCE, P_GM(t|e), P_CM(t|c_i), and the different MIML algorithms. However, it does not include any clearly labeled pseudocode blocks or algorithms in a structured, code-like format.
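For reference, the BCE objective mentioned in this row is, in its standard multi-label form, a sum of per-type binary cross-entropies (the notation below is the conventional one and is an assumption; the paper's exact indexing and symbols may differ):

```latex
\mathcal{L}_{\mathrm{BCE}}
  = -\sum_{t \in T} \Big[\, y_t \log P(t \mid e)
  + (1 - y_t) \log\big(1 - P(t \mid e)\big) \Big]
```

where $y_t \in \{0, 1\}$ indicates whether type $t$ applies to entity $e$, and $P(t \mid e)$ is the model's predicted probability for that type.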
Open Source Code: No. The paper makes no explicit statement about releasing source code for the described methodology. It only mentions the availability of the entity datasets at a URL, not the code itself.
Open Datasets: Yes. Evidence: "The entity datasets are available at http://cistern.cis.lmu.de/figment."
Dataset Splits: Yes. Evidence: "We split the entities into train (50%), dev (20%) and test (30%) sets. ... For test and dev sets, we sample 300 and 200 random contexts, respectively, for each entity."
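The 50/20/30 entity split quoted in this row can be reproduced with a shuffle-and-slice over the entity list. A minimal stdlib sketch (the function name, seed, and list-based interface are illustrative assumptions, not from the paper):

```python
import random

def split_entities(entities, seed=0):
    """Shuffle entities and split them into train (50%), dev (20%),
    and test (30%) sets, matching the proportions reported in the paper."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    ents = list(entities)
    rng.shuffle(ents)
    n = len(ents)
    n_train = int(0.5 * n)
    n_dev = int(0.2 * n)
    # Remaining ~30% becomes the test set.
    return (ents[:n_train],
            ents[n_train:n_train + n_dev],
            ents[n_train + n_dev:])
```

The per-entity sampling of 300 test and 200 dev contexts would then be a separate step applied within each split.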
Hardware Specification: No. The paper does not report hardware details such as GPU models, CPU types, or memory sizes used for the experiments; it focuses on models, datasets, and hyperparameters.
Software Dependencies: No. The paper mentions several existing tools and models, including WORD2VEC (Mikolov et al., 2013), FASTTEXT (Bojanowski et al., 2017), WANG2VEC (Ling et al., 2015a), and AdaGrad (Duchi et al., 2011). However, it does not specify version numbers for these or for any other software components (e.g., programming languages, libraries, frameworks) required for reproduction.
Experiment Setup: Yes. Evidence: "We run SSKIP with the setup (100 dimensions, 10 negative samples, window size 5, and word frequency threshold of 100) on this corpus to learn embeddings for words, entities and FIGER types. ... Our hyperparameter values are optimized on dev. We use AdaGrad (Duchi, Hazan, & Singer, 2011) and minibatch training. For each experiment, we select the best model on dev. The values of our hyperparameters are shown in Table 7 in the appendix."
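The AdaGrad optimizer cited in this row keeps a running per-parameter sum of squared gradients and scales each update by the inverse square root of that sum, so frequently updated parameters get smaller steps. A minimal stdlib sketch of one update step (the function name, flat-list parameter layout, and learning rate are illustrative assumptions, not the paper's settings):

```python
import math

def adagrad_update(params, grads, accum, lr=0.1, eps=1e-8):
    """One AdaGrad step over flat lists of parameters and gradients.

    accum holds the running sum of squared gradients per parameter;
    each parameter's effective learning rate is lr / sqrt(accum)."""
    for i, g in enumerate(grads):
        accum[i] += g * g
        params[i] -= lr * g / (math.sqrt(accum[i]) + eps)
    return params, accum
```

For example, repeatedly applying this update with the gradient of f(x) = x^2 drives x toward the minimum at 0, with step sizes that shrink as squared gradients accumulate.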