Contradiction Retrieval via Contrastive Learning with Sparsity
Authors: Haike Xu, Zongyu Lin, Kai-Wei Chang, Yizhou Sun, Piotr Indyk
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct contradiction retrieval experiments on Arguana, MSMARCO, and HotpotQA, where our method produces an average improvement of 11.0% across different models. We also validate our method on downstream tasks like natural language inference and cleaning corrupted corpora. |
| Researcher Affiliation | Academia | 1MIT 2University of California, Los Angeles. Correspondence to: Haike Xu <EMAIL>. |
| Pseudocode | No | The paper describes the SPARSECL method and its components in Section 3 and details training procedures in Section 4, but it does not present any formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not explicitly state that code is being released, nor does it provide a link to a code repository in the main text. |
| Open Datasets | Yes | We first evaluate our method on the counter-argument detection dataset Arguana (Wachsmuth et al., 2018) and two contradiction retrieval datasets adapted from HotpotQA (Yang et al., 2018) and MSMARCO (Nguyen et al., 2016). |
| Dataset Splits | Yes | The dataset is split into the training set (60% of the data), the validation set (20%), and the test set (20%). This ensures that data from each individual debate is included in only one set and that debates from every theme are represented in every set. ... We generate the paraphrases and contradictions for the validation set, test set, and a randomly sampled 10000 documents from the training set. Please refer to Appendix G for details. |
| Hardware Specification | Yes | Most of our experiments are not very computationally intensive and can be run on a single A6000 GPU. We run our major experiments on A6000 and A100 GPUs. |
| Software Dependencies | No | Table 10 mentions specific models like GTE-large-en-v1.5, UAE-Large-V1, and bge-base-en-v1.5, and their backbones (BERT + RoPE + GLU), but does not provide specific version numbers for underlying software libraries like Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | Please refer to Table 10 for our training parameters. ... We set the max sequence length to 512 for the Arguana dataset and 256 for the HotpotQA and MSMARCO datasets. ... We select α based on the best NDCG@10 score on the validation set. |
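The setup row above notes that the hyperparameter α is selected by the best NDCG@10 on the validation set. For readers reproducing this selection step, a minimal sketch of the standard NDCG@10 metric follows; the function names are illustrative and not taken from the paper (whose code is not released).

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k ranked relevances."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k: DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(ranked_relevances, k) / ideal_dcg

# Toy example: the single relevant document is ranked second,
# so NDCG@10 = (1/log2(3)) / (1/log2(2)) ≈ 0.631.
print(ndcg_at_k([0, 1, 0]))
```

In practice one would average this score over all validation queries for each candidate α and keep the α with the highest mean.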