Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates rather than ground truth. Full accuracy metrics and methodology are described in [1]; an illustrative sketch of such a pipeline follows the table below.
Abstract Rule Learning for Paraphrase Generation
Authors: Xianggen Liu, Wenqiang Lei, Jiancheng Lv, Jizhe Zhou
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate the superiority of RULER over previous state-of-the-art methods in terms of paraphrase quality, generalization ability and interpretability. We evaluate the effectiveness of our method on two benchmark paraphrasing datasets, namely, the Quora question pairs and Wikianswers datasets. |
| Researcher Affiliation | Academia | College of Computer Science, Sichuan University |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about open-sourcing the code or a link to a code repository. |
| Open Datasets | Yes | We evaluate RULER on two widely used datasets, namely, the Quora question pairs and Wikianswers datasets. The Wikianswers dataset [Fader et al., 2013] comprises 2.3M pairs of question paraphrases scraped from the Wikianswers website. |
| Dataset Splits | Yes | On these two datasets, we adopt the same data splits with Hosking and Lapata [2021] for a fair comparison. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions components such as the Transformer architecture and SEPARATOR, but does not provide version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The paraphrase generators PGα and PGβ adopt the Transformer architecture and the previous best performing paraphraser model (i.e., SEPARATOR), respectively. To have a fair comparison with SEPARATOR, we use the same hyperparameters with it. ... The minimum number Cmin of matched samples was set to 16. The improvement threshold τ of the generator loss is 0.2. |
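
For readers unfamiliar with how a table like the one above is produced, here is a minimal, hypothetical sketch of an LLM-based classification step with validation against a manually labeled gold set. It is an illustration under stated assumptions, not the actual implementation from [1]: `llm_complete`, the prompt wording, and the Yes/No parsing are all placeholders.

```python
# Illustrative sketch only -- not the pipeline described in [1].
# `llm_complete` stands in for any chat-completion client: a callable
# that takes a prompt string and returns the model's text response.
from typing import Callable

PROMPT = (
    "Does the following paper provide {variable}? "
    "Answer 'Yes' or 'No', then quote the supporting passage.\n\n"
    "{paper_text}"
)

def classify_variable(llm_complete: Callable[[str], str],
                      variable: str, paper_text: str) -> str:
    """Return the LLM's Yes/No verdict for one reproducibility variable."""
    response = llm_complete(PROMPT.format(variable=variable,
                                          paper_text=paper_text))
    return "Yes" if response.strip().lower().startswith("yes") else "No"

def validate(predictions: dict[str, str],
             manual_labels: dict[str, str]) -> float:
    """Accuracy of LLM predictions against a manually labeled gold set."""
    hits = sum(predictions[k] == manual_labels[k] for k in manual_labels)
    return hits / len(manual_labels)
```

Per-variable accuracy over such a gold set is the kind of metric the "full accuracy metrics" in [1] would report; rows for variables with lower validated accuracy deserve correspondingly more skepticism.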