A Distributed Representation-Based Framework for Cross-Lingual Transfer Parsing
Authors: Jiang Guo, Wanxiang Che, David Yarowsky, Haifeng Wang, Ting Liu
JAIR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our combined contributions achieve an average relative error reduction of 10.9% in labeled attachment score as compared with the delexicalized parser, trained on English universal treebank and transferred to three other languages. It also significantly outperforms state-of-the-art delexicalized models augmented with projected cluster features on identical data. Finally, we demonstrate that our models can be further boosted with minimal supervision (e.g., 100 annotated sentences) from target languages, which is of great significance for practical usage. |
| Researcher Affiliation | Collaboration | Jiang Guo (EMAIL) and Wanxiang Che (EMAIL), Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, Harbin, Heilongjiang, China; David Yarowsky (EMAIL), Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, USA; Haifeng Wang (EMAIL), Baidu Inc., Beijing, China; Ting Liu (EMAIL), Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, Harbin, Heilongjiang, China |
| Pseudocode | No | The paper describes the transition actions for arc-standard parsing and presents mathematical equations for the neural network architecture (e.g., h = (W_1 [x^w, x^t, x^l, x^d, x^v] + b_1)^3, y = softmax(W_2 h)). However, it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block with structured, step-by-step instructions in a code-like format. |
| Open Source Code | Yes | Our system is made publicly available at: https://github.com/jiangfeng1124/acl15-clnndep. |
| Open Datasets | Yes | We evaluate our models on the universal multilingual treebanks v2.0 (McDonald et al., 2013). Case studies include transferring from English (EN) to German (DE), Spanish (ES) and French (FR). Experiments show that by incorporating lexical features, the performance of cross-lingual dependency parsing can be improved significantly. By further embedding cross-lingual cluster features (Täckström et al., 2012), we achieve an average relative error reduction of 10.9% in labeled attachment score (LAS), as compared with the delexicalized parsers. |
| Dataset Splits | Yes | We follow the standard split of the treebanks for all languages. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. |
| Software Dependencies | No | The paper mentions several software tools and algorithms like 'word2vec', 'fast_align toolkit in cdec', 'ZPar (Zhang & Clark, 2011)', and 'Mini-batch adaptive stochastic gradient descent (AdaGrad)', but it does not specify any version numbers for these software components or libraries. |
| Experiment Setup | Yes | For the training of the neural network dependency parser, we set the number of hidden units to 400. The dimensions of the embeddings for the different features are given in Table 2 (Word: 50, POS: 50, Label: 50, Distance: 5, Valency: 5, Cluster: 8). Mini-batch adaptive stochastic gradient descent (AdaGrad) (Duchi, Hazan, & Singer, 2011) is used for optimization. We set the window size to 1 in our parsing task. Specifically, in each word-aligned sentence pair of D, we keep all alignments with conditional alignment probability exceeding a threshold δ = 0.95 and discard the others. To reduce noise, we choose a small edit distance threshold τ = 1. We use the same word cluster feature templates from Täckström et al. (2012), and set the number of Brown clusters to 256. No development data is used during this process, thus we simply perform parameter updating for 2,000 iterations. |
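The architecture quoted above (a cube-activation hidden layer over concatenated feature embeddings, followed by a softmax over transition actions) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the input dimension, weight initialization, and the three-action inventory are illustrative assumptions; only the 400 hidden units, the feature embedding dimensions, and the cube activation come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Concatenated embeddings for one feature of each type, using the Table 2
# dimensions (Word 50 + POS 50 + Label 50 + Distance 5 + Valency 5 + Cluster 8).
# The real parser concatenates many feature-template instances, so its input
# vector is much larger; 168 here is purely illustrative.
x = rng.standard_normal(50 + 50 + 50 + 5 + 5 + 8)

hidden = 400     # hidden units, as reported in the paper
n_actions = 3    # e.g., SHIFT / LEFT-ARC / RIGHT-ARC in unlabeled arc-standard

W1 = rng.standard_normal((hidden, x.size)) * 0.01
b1 = np.zeros(hidden)
W2 = rng.standard_normal((n_actions, hidden)) * 0.01

h = (W1 @ x + b1) ** 3           # cube activation: h = (W1 x + b1)^3
scores = W2 @ h                  # unnormalized action scores
y = np.exp(scores - scores.max())
y /= y.sum()                     # softmax over transition actions
```

In training, y would be compared against the oracle transition with a cross-entropy loss and the parameters updated with AdaGrad, as the setup row describes.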
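The alignment-filtering step in the setup row (keep alignments with conditional probability above δ = 0.95, and use a small edit distance threshold τ = 1 to reduce noise) can be sketched as below. The data layout (a list of `(source_word, target_word, prob)` tuples, as one might derive from fast_align output) and the function names are assumptions for illustration; the thresholds are the paper's.

```python
def filter_alignments(aligned_pairs, delta=0.95):
    """Keep word-alignment pairs whose conditional probability exceeds delta.

    `aligned_pairs` is a hypothetical list of (source_word, target_word, prob)
    tuples; delta = 0.95 is the threshold reported in the paper.
    """
    return [(s, t) for s, t, p in aligned_pairs if p > delta]


def edit_distance(a, b):
    """Standard Levenshtein distance; pairs above tau = 1 would be discarded."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]
```

For example, `filter_alignments([("dog", "Hund", 0.97), ("cat", "Auto", 0.30)])` keeps only the high-confidence pair.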