Two-phase Multi-document Event Summarization on Core Event Graphs
Authors: Zengjian Chen, Jin Xu, Meng Liao, Tong Xue, Kun He
JAIR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For experiments in the new task, we construct two large-scale real-world datasets for training and assessment. Extensive evaluations show that the proposed framework significantly outperforms the related baseline methods, with the most dominant event of the articles effectively identified and correctly summarized. (Abstract) ... 4. Experiments |
| Researcher Affiliation | Collaboration | Zengjian Chen EMAIL WeChat, Tencent Inc.; Huazhong University of Science and Technology, Shenzhen, Guangdong, China. Jin Xu (corresponding author) EMAIL School of Future Technology, South China University of Technology, Guangzhou, Guangdong, China. Meng Liao EMAIL, Tong Xue EMAIL WeChat, Tencent Inc., Shenzhen, Guangdong, China. Kun He (corresponding author) EMAIL Huazhong University of Science and Technology, Wuhan, Hubei, China |
| Pseudocode | No | The paper describes methodologies through text and mathematical formulas but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | To facilitate evaluation and further research on MES, we have created two large-scale datasets, one annotated by professional editors, while the other is collected from crawling and search results. (Abstract) ... 2https://drive.google.com/drive/folders/1QX28zDhkh_oHzi_Vy_Ovt_Ym0GvmcPTAdI99 |
| Dataset Splits | Yes | We randomly select 80% of the data as the training data, and use the remaining data for development and test (10% for each). |
| Hardware Specification | Yes | All models are trained on a single Tesla M40 GPU |
| Software Dependencies | No | We implement all the mentioned models in Tensorflow except Trunc., ILP and Graph-gen. (Section 4.3) ... For the two new Chinese datasets (TMES, SMES), we do word segmentation with the Jieba (Sun, 2012) tool for word counting. (Table 1) |
| Experiment Setup | Yes | We use a two-layer bi-directional LSTM-RNN encoder and a one-layer uni-directional LSTM-RNN decoder along with the attention mechanism... The vocabulary size is set to 50k... We initialize a 128-dimensional word embedding... optimized with AdaGrad (batch size = 128). The initial learning rate and the accumulator value were set to 0.15 and 0.1, respectively. We use gradient clipping with a maximum gradient norm of 2... For hyper-parameter settings, we tune γ = 0.2 and λ = 0.3 for our model. At test time, our short event summaries are produced with a decoder whose beam search size is set to 8 and the maximum decoding step size is set to 15. |
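For reference, the hyper-parameters quoted in the Experiment Setup row can be collected into a single configuration sketch. This is a minimal illustration only: the authors did not release code, so all field names below are our own labels for the values reported in the paper, not identifiers from their implementation.

```python
# Hypothetical configuration mirroring the reported experiment setup.
# Field names are illustrative; only the values come from the paper.
config = {
    "encoder": {"type": "bi-LSTM", "layers": 2},
    "decoder": {"type": "uni-LSTM", "layers": 1, "attention": True},
    "vocab_size": 50_000,
    "embedding_dim": 128,
    "optimizer": {"name": "AdaGrad", "lr": 0.15, "accumulator": 0.1},
    "batch_size": 128,
    "max_grad_norm": 2.0,
    "gamma": 0.2,    # hyper-parameter tuned by the authors
    "lambda": 0.3,   # hyper-parameter tuned by the authors
    "beam_size": 8,          # beam search size at test time
    "max_decode_steps": 15,  # maximum decoding step size
}

def summarize_config(cfg: dict) -> str:
    """Render the configuration as a short human-readable line."""
    return (f"{cfg['encoder']['layers']}-layer {cfg['encoder']['type']} encoder, "
            f"{cfg['decoder']['layers']}-layer {cfg['decoder']['type']} decoder, "
            f"vocab={cfg['vocab_size']}, beam={cfg['beam_size']}")

print(summarize_config(config))
```

A dict like this is the minimum a replication attempt would need to pin down before re-implementing the model in TensorFlow, which is the framework the paper names.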