Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Importance-Aware Learning for Neural Headline Editing
Authors: Qingyang Wu, Lei Li, Hao Zhou, Ying Zeng, Zhou Yu
AAAI 2020, pp. 9282-9289 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our method significantly improves the quality of headline editing comparing against previous methods. |
| Researcher Affiliation | Collaboration | 1University of California, Davis, 2Byte Dance, EMAIL, EMAIL |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We have released the pre-trained Chinese-GPT model: https://github.com/qywu/Chinese-GPT |
| Open Datasets | No | The paper describes collecting the Professional Headline Editing Dataset (PHED) and a Large Scale Chinese Corpus for NLP, providing a link for the latter: 'We collect our corpus from Large Scale Chinese Corpus for NLP' (https://github.com/brightmart/nlp_chinese_corpus). However, the PHED dataset, which is central to the main task, is described as 'constructed' by the authors, and no concrete access information (link, DOI, or specific citation for access) is provided for it. |
| Dataset Splits | No | The paper mentions stopping pre-training based on 'validation perplexity' and selecting samples 'from the test set (1,500 samples in total)', but it does not provide specific training/validation/test split percentages or explicit counts for all splits of the PHED dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU/CPU models or specific cloud instances. |
| Software Dependencies | No | The paper mentions using Transformer architecture and BERT, but does not provide specific version numbers for any software libraries or dependencies. |
| Experiment Setup | Yes | We conduct hyper-parameter search to find the best α and β for SIA. ... We set SIA's α = 0.2 and β = 40.0. ... during training the batches will be fed in the same order. ... During inference, we apply beam search decoding with beam size 10 for all models. We add the length normalization technique (Wu et al. 2016). The temperature is set to be 1.0 as it yields the best result. |
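The length normalization referenced in the Experiment Setup row (Wu et al. 2016) rescores each beam hypothesis by dividing its total log-probability by a length penalty, so longer candidates are not unfairly penalized for accumulating more negative log-probability terms. A minimal sketch of that rescoring step, assuming the standard penalty lp(Y) = (5 + |Y|)^α / (5 + 1)^α (the function names and α = 0.6 default here are illustrative, not taken from the paper):

```python
import math

def length_penalty(length, alpha=0.6):
    # Wu et al. (2016): lp(Y) = (5 + |Y|)^alpha / (5 + 1)^alpha
    return ((5.0 + length) ** alpha) / ((5.0 + 1.0) ** alpha)

def rescore(hypotheses, alpha=0.6):
    # hypotheses: list of (token_list, total_log_prob) pairs.
    # Returns the list sorted by normalized score, best first.
    return sorted(
        ((tokens, logp / length_penalty(len(tokens), alpha))
         for tokens, logp in hypotheses),
        key=lambda pair: pair[1],
        reverse=True,
    )

# A 10-token hypothesis with lower raw log-prob can outrank a
# 5-token one once scores are length-normalized.
ranked = rescore([(["a"] * 5, -6.0), (["b"] * 10, -7.5)])
```

With raw log-probabilities the 5-token hypothesis (-6.0 > -7.5) would win; after normalization the 10-token hypothesis ranks first, which is the effect the technique is designed to produce.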