Text Rewriting Improves Semantic Role Labeling

Authors: K. Woodsend, M. Lapata

JAIR 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply this idea to semantic role labeling and show that a model trained on rewritten data outperforms the state of the art on the CoNLL-2009 benchmark dataset.
Researcher Affiliation | Academia | Kristian Woodsend EMAIL Mirella Lapata EMAIL Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB
Pseudocode | Yes | Algorithm 1: Learn SRL model M_extended by extending a gold training corpus C_gold through transformation functions G.
Open Source Code | No | The paper refers to third-party tools such as the publicly available system of Björkelund et al. (2009), retrieved from https://code.google.com/p/mate-tools/, and Heilman and Smith (2010)'s software, available at http://www.ark.cs.cmu.edu/mheilman/questions/. However, it does not provide access to the source code for the methodology described in this specific paper by Woodsend and Lapata.
Open Datasets | Yes | We apply this idea to semantic role labeling and show that a model trained on rewritten data outperforms the state of the art on the CoNLL-2009 benchmark dataset. [...] We used the English language benchmark datasets from the CoNLL-2009 shared task to train and evaluate the SRL models. We identified and labeled semantic arguments for nouns and verbs (Hajič, J., Ciaramita, M., Johansson, R., Kawahara, D., Martí, M. A., Màrquez, L., Meyers, A., Nivre, J., Padó, S., Štěpánek, J., Straňák, P., Surdeanu, M., Xue, N., & Zhang, Y., 2009). [...] We also obtain transformation rules from the Paraphrase Database (PPDB; Ganitkevitch et al., 2013), where paraphrases were extracted from bilingual parallel corpora. [...] retrieved from http://paraphrase.org.
Dataset Splits | Yes | We used the training, development, test and out-of-domain test partitions as they were provided, and some statistics on these data sets are shown in Table 6. Training: 39,272 sentences [...] Development: 1,334 sentences [...] Test in-domain: 2,399 sentences [...] Test out-of-domain: 425 sentences
Hardware Specification | No | No specific hardware details (GPU/CPU models, memory, or machine specifications) are mentioned for running the experiments. The paper only refers to the general SRL system and NLP pipeline used.
Software Dependencies | No | We used LibLinear (Fan, Chang, Hsieh, Wang, & Lin, 2008) to train the SVM, and the hyper-parameters of the SVM were tuned by cross-validation on the training set to maximise the area under the ROC curve, using the automatic grid-search utility of the Python package scikit-learn (Pedregosa et al., 2011). The paper names this software but does not provide version numbers for LibLinear or scikit-learn.
Experiment Setup | No | We used LibLinear (Fan, Chang, Hsieh, Wang, & Lin, 2008) to train the SVM, and the hyper-parameters of the SVM were tuned by cross-validation on the training set to maximise the area under the ROC curve, using the automatic grid-search utility of the Python package scikit-learn (Pedregosa et al., 2011). [...] We chose which transformation functions should form the refined set based on whether their corresponding weight was above a global threshold value, and we set the threshold value by maximizing the performance of the resulting SRL model on the development set. While the paper describes the process of hyperparameter tuning and of setting the threshold, it does not state the specific numerical values (e.g., SVM parameters, threshold value) used for the reported results.
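The Algorithm 1 row above describes training on a gold corpus extended through transformation functions. A minimal sketch of that loop, assuming an interface where each transformation maps one annotated sentence to a list of rewritten annotated sentences (all names here are illustrative, not the paper's actual code):

```python
# Hypothetical sketch of Algorithm 1: extend a gold training corpus
# with rewritten sentences, then train on the union. The function
# names and data representation are assumptions for illustration.

def learn_extended_model(gold_corpus, transformations, train_srl):
    """Train an SRL model on gold data plus rewritten variants.

    gold_corpus     : list of annotated sentences
    transformations : functions mapping an annotated sentence to a
                      (possibly empty) list of rewritten annotated
                      sentences with projected role labels
    train_srl       : training procedure for the base SRL system
    """
    extended = list(gold_corpus)          # keep all gold sentences
    for sentence in gold_corpus:
        for g in transformations:
            # Each rule may or may not apply to a given sentence;
            # inapplicable rules simply contribute nothing.
            extended.extend(g(sentence))
    return train_srl(extended)
```

The key design point the algorithm description implies is that rewritten sentences supplement rather than replace the gold data, so the extended corpus is a superset of the original.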
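The tuning procedure quoted in the Software Dependencies and Experiment Setup rows (a LibLinear-trained SVM, grid search via scikit-learn, scored by area under the ROC curve) can be sketched as follows. This is an illustrative reconstruction, not the paper's configuration: the feature matrix, labels, and C grid are placeholders, and `LinearSVC` is used as scikit-learn's LibLinear-backed classifier.

```python
# Illustrative sketch: cross-validated grid search over an SVM's C
# parameter, scored by ROC AUC. Data and grid values are placeholders.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                       # placeholder features
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)

search = GridSearchCV(
    LinearSVC(),                                     # LibLinear-backed SVM
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},        # assumed grid
    scoring="roc_auc",                               # maximise area under ROC
    cv=5,
)
search.fit(X, y)
best_C = search.best_params_["C"]
```

Since the paper does not report the selected hyper-parameter values, only the selection procedure itself is reconstructable; `best_C` here is whatever the placeholder data happens to favour.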