Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates rather than ground truth. Full accuracy metrics and methodology are described in [1]; an illustrative sketch of such a pipeline follows the table below.
Abstract Rule Learning for Paraphrase Generation
Authors: Xianggen Liu, Wenqiang Lei, Jiancheng Lv, Jizhe Zhou
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate the superiority of RULER over previous state-of-the-art methods in terms of paraphrase quality, generalization ability and interpretability. We evaluate the effectiveness of our method on two benchmark paraphrasing datasets, namely, the Quora question pairs and Wikianswers datasets. |
| Researcher Affiliation | Academia | College of Computer Science, Sichuan University |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about open-sourcing the code or a link to a code repository. |
| Open Datasets | Yes | We evaluate RULER on two widely used datasets, namely, the Quora question pairs and Wikianswers datasets. The Wikianswers dataset [Fader et al., 2013] comprises 2.3M pairs of question paraphrases scraped from the Wikianswers website. |
| Dataset Splits | Yes | On these two datasets, we adopt the same data splits with Hosking and Lapata [2021] for a fair comparison. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions components such as the Transformer architecture and SEPARATOR, but does not provide version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The paraphrase generators PGα and PGβ adopt the Transformer architecture and the previous best performing paraphraser model (i.e., SEPARATOR), respectively. To have a fair comparison with SEPARATOR, we use the same hyperparameters with it. ... The minimum number Cmin of matched samples was set to 16. The improvement threshold τ of the generator loss is 0.2. |
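
For readers unfamiliar with how a table like the one above is produced, here is a minimal, hypothetical sketch of an LLM-based classification step with validation against a manually labeled gold set. It is an illustration under stated assumptions, not the actual implementation from [1]: `llm_complete`, the prompt wording, and the Yes/No parsing are all placeholders.

```python
# Illustrative sketch only -- not the pipeline described in [1].
# `llm_complete` stands in for any chat-completion client: a callable
# that takes a prompt string and returns the model's text response.
from typing import Callable

PROMPT = (
    "Does the following paper provide {variable}? "
    "Answer 'Yes' or 'No', then quote the supporting passage.\n\n"
    "{paper_text}"
)

def classify_variable(llm_complete: Callable[[str], str],
                      variable: str, paper_text: str) -> str:
    """Return the LLM's Yes/No verdict for one reproducibility variable."""
    response = llm_complete(PROMPT.format(variable=variable,
                                          paper_text=paper_text))
    return "Yes" if response.strip().lower().startswith("yes") else "No"

def validate(predictions: dict[str, str],
             manual_labels: dict[str, str]) -> float:
    """Accuracy of LLM predictions against a manually labeled gold set."""
    hits = sum(predictions[k] == manual_labels[k] for k in manual_labels)
    return hits / len(manual_labels)
```

Per-variable accuracy over such a gold set is the kind of metric the "full accuracy metrics" in [1] would report; rows for variables with lower validated accuracy deserve correspondingly more skepticism.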