Contradiction Retrieval via Contrastive Learning with Sparsity

Authors: Haike Xu, Zongyu Lin, Kai-Wei Chang, Yizhou Sun, Piotr Indyk

ICML 2025

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | We conduct contradiction retrieval experiments on ArguAna, MSMARCO, and HotpotQA, where our method produces an average improvement of 11.0% across different models. We also validate our method on downstream tasks such as natural language inference and cleaning corrupted corpora.
Researcher Affiliation | Academia | ¹MIT, ²University of California, Los Angeles. Correspondence to: Haike Xu <EMAIL>.
Pseudocode | No | The paper describes the SPARSECL method and its components in Section 3 and details training procedures in Section 4, but it does not present any formal pseudocode or algorithm blocks.
Open Source Code | No | The paper does not explicitly state that code is being released, nor does it provide a link to a code repository in the main text.
Open Datasets | Yes | We first evaluate our method on the counter-argument detection dataset ArguAna (Wachsmuth et al., 2018) and two contradiction retrieval datasets adapted from HotpotQA (Yang et al., 2018) and MSMARCO (Nguyen et al., 2016).
Dataset Splits | Yes | The dataset is split into a training set (60% of the data), a validation set (20%), and a test set (20%). This ensures that data from each individual debate is included in only one set and that debates from every theme are represented in every set. ... We generate the paraphrases and contradictions for the validation set, the test set, and a randomly sampled 10,000 documents from the training set. Please refer to Appendix G for details.
Hardware Specification | Yes | Most of our experiments are not very computationally expensive and can be run on a single A6000 GPU. We run our major experiments on A6000 and A100 GPUs.
Software Dependencies | No | Table 10 mentions specific models such as GTE-large-en-v1.5, UAE-Large-V1, and bge-base-en-v1.5, and their backbones (BERT + RoPE + GLU), but it does not provide version numbers for underlying software libraries such as Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | Please refer to Table 10 for our training parameters. ... We set the max sequence length to 512 for the ArguAna dataset and 256 for the HotpotQA and MSMARCO datasets. ... We select α based on the best NDCG@10 score on the validation set.
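Since the Pseudocode row notes that the paper gives no formal algorithm block, a minimal sketch may help make "contrastive learning with sparsity" concrete. This is an illustration, not the authors' implementation: it assumes the retrieval score combines cosine similarity with a Hoyer-style sparsity measure of the embedding difference, weighted by the α that the Experiment Setup row says is tuned on validation data. The function names `hoyer_sparsity` and `contradiction_score` are hypothetical.

```python
import math

def hoyer_sparsity(x):
    """Hoyer sparsity in [0, 1]: 1 for a one-hot vector, 0 for a uniform one."""
    n = len(x)
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    if l2 == 0:
        return 0.0
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def contradiction_score(q_emb, d_emb, alpha=1.0):
    """Assumed scoring: similarity plus a sparsity bonus on the embedding
    difference, so that contradictions (similar topic, sparse difference)
    outrank unrelated documents."""
    diff = [x - y for x, y in zip(q_emb, d_emb)]
    return cosine(q_emb, d_emb) + alpha * hoyer_sparsity(diff)
```

The intuition behind this sketch is that a contradicting document stays close to the query in most embedding dimensions while differing sharply in a few, which a sparsity measure on the difference vector can reward.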
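The debate-level 60/20/20 split described in the Dataset Splits row can be sketched as a group-wise partition, where every example from one debate lands in exactly one split. This sketch omits the theme-level stratification the quote also mentions; `group_split` and its arguments are illustrative names, not the authors' code.

```python
import random
from collections import defaultdict

def group_split(examples, group_key, fracs=(0.6, 0.2, 0.2), seed=0):
    """Split examples 60/20/20 so that all items sharing a group
    (e.g. one debate in ArguAna) land in exactly one split."""
    groups = defaultdict(list)
    for ex in examples:
        groups[group_key(ex)].append(ex)
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)  # deterministic shuffle of group keys
    n = len(keys)
    n_train = round(fracs[0] * n)
    n_val = round(fracs[1] * n)
    train = [ex for k in keys[:n_train] for ex in groups[k]]
    val = [ex for k in keys[n_train:n_train + n_val] for ex in groups[k]]
    test = [ex for k in keys[n_train + n_val:] for ex in groups[k]]
    return train, val, test
```

Splitting by group key rather than by example prevents near-duplicate arguments from the same debate leaking between training and evaluation sets.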
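The Experiment Setup row states that α is chosen by the best NDCG@10 on the validation set. A minimal sketch of that selection loop, using binary-relevance NDCG@10, could look as follows; `select_alpha` and `score_fn` are assumed names, not from the paper.

```python
import math

def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """NDCG@k with binary relevance labels."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc in enumerate(ranked_ids[:k]) if doc in relevant_ids)
    ideal = sum(1.0 / math.log2(i + 2)
                for i in range(min(len(relevant_ids), k)))
    return dcg / ideal if ideal > 0 else 0.0

def select_alpha(alphas, score_fn, val_queries):
    """Pick the alpha maximizing mean NDCG@10 over validation queries.
    `score_fn(alpha, query)` is assumed to return document ids ranked by
    the alpha-weighted retrieval score."""
    return max(alphas, key=lambda a: sum(
        ndcg_at_k(score_fn(a, q["query"]), q["relevant"])
        for q in val_queries))
```

Because α only reweights two precomputed terms, each candidate value can be evaluated by re-ranking cached scores rather than re-embedding the corpus.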