Corpus-Level Fine-Grained Entity Typing

Authors: Yadollah Yaghoobzadeh, Heike Adel, Hinrich Schuetze

JAIR 2018

Reproducibility assessment (variable, result, and LLM response):
Research Type: Experimental. Evidence: "In this section, we first describe the different setups we use to do our experiments. Next, we present the results of our models followed by further analysis." "In this section, we explain the datasets, evaluation metrics and other setups we follow to do our experiments. We also describe the baselines, which we compare our models against."
Researcher Affiliation: Collaboration.
Yadollah Yaghoobzadeh (EMAIL): Microsoft Research Montreal, Montreal, Canada & Center for Information and Language Processing, LMU Munich, Munich, Germany.
Heike Adel (EMAIL): Institute for Natural Language Processing, University of Stuttgart, Stuttgart, Germany & Center for Information and Language Processing, LMU Munich, Munich, Germany.
Hinrich Schütze (EMAIL): Center for Information and Language Processing, LMU Munich, Munich, Germany.
Pseudocode: No. The paper describes its algorithms and models using mathematical equations and textual explanations, for example the definitions of BCE, P_GM(t|e), P_CM(t|c_i), and the different MIML algorithms. However, it does not include any clearly labeled pseudocode blocks or algorithms in a structured, code-like format.
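For reference, the BCE objective mentioned in this row is, in its standard multi-label form, a sum of per-type binary cross-entropies (the notation below is the conventional one and is an assumption; the paper's exact indexing and symbols may differ):

```latex
\mathcal{L}_{\mathrm{BCE}}
  = -\sum_{t \in T} \Big[\, y_t \log P(t \mid e)
  + (1 - y_t) \log\big(1 - P(t \mid e)\big) \Big]
```

where $y_t \in \{0, 1\}$ indicates whether type $t$ applies to entity $e$, and $P(t \mid e)$ is the model's predicted probability for that type.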
Open Source Code: No. The paper makes no explicit statement about releasing source code for the described methodology. It only mentions the availability of the entity datasets at a URL, not the code itself.
Open Datasets: Yes. Evidence: "The entity datasets are available at http://cistern.cis.lmu.de/figment."
Dataset Splits: Yes. Evidence: "We split the entities into train (50%), dev (20%) and test (30%) sets. ... For test and dev sets, we sample 300 and 200 random contexts, respectively, for each entity."
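The 50/20/30 entity split quoted in this row can be reproduced with a shuffle-and-slice over the entity list. A minimal stdlib sketch (the function name, seed, and list-based interface are illustrative assumptions, not from the paper):

```python
import random

def split_entities(entities, seed=0):
    """Shuffle entities and split them into train (50%), dev (20%),
    and test (30%) sets, matching the proportions reported in the paper."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    ents = list(entities)
    rng.shuffle(ents)
    n = len(ents)
    n_train = int(0.5 * n)
    n_dev = int(0.2 * n)
    # Remaining ~30% becomes the test set.
    return (ents[:n_train],
            ents[n_train:n_train + n_dev],
            ents[n_train + n_dev:])
```

The per-entity sampling of 300 test and 200 dev contexts would then be a separate step applied within each split.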
Hardware Specification: No. The paper does not report hardware details such as GPU models, CPU types, or memory sizes used for the experiments; it focuses on models, datasets, and hyperparameters.
Software Dependencies: No. The paper mentions several existing tools and models, including WORD2VEC (Mikolov et al., 2013), FASTTEXT (Bojanowski et al., 2017), WANG2VEC (Ling et al., 2015a), and AdaGrad (Duchi et al., 2011). However, it does not specify version numbers for these or for any other software components (e.g., programming languages, libraries, frameworks) required for reproduction.
Experiment Setup: Yes. Evidence: "We run SSKIP with the setup (100 dimensions, 10 negative samples, window size 5, and word frequency threshold of 100) on this corpus to learn embeddings for words, entities and FIGER types. ... Our hyperparameter values are optimized on dev. We use AdaGrad (Duchi, Hazan, & Singer, 2011) and minibatch training. For each experiment, we select the best model on dev. The values of our hyperparameters are shown in Table 7 in the appendix."
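The AdaGrad optimizer cited in this row keeps a running per-parameter sum of squared gradients and scales each update by the inverse square root of that sum, so frequently updated parameters get smaller steps. A minimal stdlib sketch of one update step (the function name, flat-list parameter layout, and learning rate are illustrative assumptions, not the paper's settings):

```python
import math

def adagrad_update(params, grads, accum, lr=0.1, eps=1e-8):
    """One AdaGrad step over flat lists of parameters and gradients.

    accum holds the running sum of squared gradients per parameter;
    each parameter's effective learning rate is lr / sqrt(accum)."""
    for i, g in enumerate(grads):
        accum[i] += g * g
        params[i] -= lr * g / (math.sqrt(accum[i]) + eps)
    return params, accum
```

For example, repeatedly applying this update with the gradient of f(x) = x^2 drives x toward the minimum at 0, with step sizes that shrink as squared gradients accumulate.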