DOGR: Leveraging Document-Oriented Contrastive Learning in Generative Retrieval
Authors: Penghao Lu, Xin Dong, Yuansheng Zhou, Lei Cheng, Chuan Yuan, Linjian Mo
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that DOGR achieves state-of-the-art performance compared to existing generative retrieval methods on two public benchmark datasets. Further experiments have shown that our framework is generally effective for common identifier construction techniques. |
| Researcher Affiliation | Industry | Ant Group |
| Pseudocode | No | The paper describes the proposed scheme and two-stage learning strategy in paragraph form and through a visual diagram (Figure 1), but it does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states: "Our experimental code is implemented on Python 3.8 using transformers 4.37.0, while experiments are conducted on 6 NVIDIA A100 GPUs with 80 GB of memory." This describes the implementation environment but does not explicitly state that the code for the methodology is being released or provide a link to a repository. |
| Open Datasets | Yes | Natural Questions (NQ320k) (Kwiatkowski et al. 2019) contains 320k training data (relevant query-document pairs), 100k documents, and 7,830 test queries... MS MARCO passage ranking (MS MARCO) (Nguyen et al. 2016) is a large-scale benchmark dataset that includes 8.8 million passages collected from Bing search results and 1 million real-world queries, with the test set containing 6,980 queries. |
| Dataset Splits | Yes | Natural Questions (NQ320k) (Kwiatkowski et al. 2019) contains 320k training data (relevant query-document pairs), 100k documents, and 7,830 test queries... We follow the same setup as previous work (Lee et al. 2023) and split the test set into two subsets: seen test and unseen test. |
| Hardware Specification | Yes | experiments are conducted on 6 NVIDIA A100 GPUs with 80 GB of memory. |
| Software Dependencies | Yes | Our experimental code is implemented on Python 3.8 using transformers 4.37.0 |
| Experiment Setup | Yes | In the training phase, batch sizes are set to 256 and 32, and the model is optimized for up to 3M and 1M steps using the Adam optimizer with a learning rate of 5e-5, for the identifier generation stage and the document-level ranking stage, respectively. The number of negatives from retrieval-augmented negative sampling is set to 4 per query, while the prefix-oriented negative sampling employs in-batch negatives. In the document ranking stage, τ is set to 0.5 as the temperature parameter for contrastive learning, and λg is set to 0.1 to balance the generative task and the contrastive learning task. In the inference phase, we use beam search with constrained decoding and set the beam size to 100. |
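The setup row above names two hyperparameters whose roles may not be obvious: the contrastive temperature τ = 0.5 and the weight λg = 0.1 that balances the generative and contrastive objectives. A minimal pure-Python sketch of how such a temperature-scaled (InfoNCE-style) contrastive loss and the weighted combination typically fit together is below; the similarity values are hypothetical, and which of the two terms λg multiplies is an assumption, since the paper only says it "balances" the tasks.

```python
import math

def contrastive_loss(pos_sim, neg_sims, tau=0.5):
    """InfoNCE-style loss: negative log-softmax of the positive
    query-document similarity over the positive plus the sampled
    negatives, with all similarities scaled by temperature tau."""
    logits = [pos_sim / tau] + [s / tau for s in neg_sims]
    # log-sum-exp with max subtraction for numerical stability
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return -(logits[0] - log_norm)

def combined_loss(gen_loss, cl_loss, lambda_g=0.1):
    # Assumption: lambda_g down-weights the generative term against
    # the contrastive term; the paper does not specify the exact form.
    return lambda_g * gen_loss + cl_loss

# Toy example: one positive and 4 negatives per query, matching the
# "4 negatives from retrieval-augmented sampling" setting above.
cl = contrastive_loss(pos_sim=0.9, neg_sims=[0.2, 0.1, 0.05, 0.0], tau=0.5)
total = combined_loss(gen_loss=2.3, cl_loss=cl, lambda_g=0.1)
```

A lower τ sharpens the softmax, so the loss concentrates on the hardest negatives; at τ = 0.5 the distribution is only mildly sharpened relative to τ = 1.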