Distribution-Driven Dense Retrieval: Modeling Many-to-One Query-Document Relationship

Authors: Junfeng Kang, Rui Li, Qi Liu, Zhenya Huang, Zheng Zhang, Yanjiang Chen, Linbo Zhu, Yu Su

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we conduct extensive experiments on real-world datasets, which demonstrate that our method significantly outperforms traditional dense retrieval methods."
Researcher Affiliation | Academia | 1 State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China; 2 Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; 3 School of Computer Science and Artificial Intelligence, Hefei Normal University. EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology through textual explanations, mathematical equations, and a framework diagram (Figure 2), but does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/tojunfeng/DDR
Open Datasets | Yes | "We conducted experiments on MS MARCO (Nguyen et al. 2016), TREC Track 2019 and 2020 (Craswell et al. 2020), followed by additional experiments on zero-shot datasets (Thakur et al. 2021)."
Dataset Splits | No | The paper uses standard datasets such as MS MARCO and the TREC DL tracks, but it does not explicitly describe how these datasets were split into training, validation, and test sets, nor does it give exact percentages, sample counts, or a partitioning methodology.
Hardware Specification | No | The paper mentions "Time for Retrieval per Query on GPU" in Figure 3, but it does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies | No | The paper mentions using PyTorch for implementation and training, the AdamW optimizer, the DistilBERT and ELECTRA architectures, and tools such as FAISS, but it does not specify version numbers for any of these software dependencies.
Experiment Setup | Yes | "We implemented and trained our model using PyTorch, optimizing the network parameters with the AdamW (Loshchilov and Hutter 2017) optimizer. We applied a linear learning rate schedule with a warmup phase of 1,000 steps, setting the learning rate to 2×10⁻⁵. The parameter β was selected from {0.1, 0.2, 0.5, 1, 2, 5, 10}. The mean and variance vectors of the document distribution are set to 768 dimensions. For fair comparison with existing single-vector models, we add dense projection layers on the mean vector and variance vector to make their dimensions 768/2 − 1 = 383. Following previous work (Zamani and Bendersky 2023), we used the pre-trained checkpoints provided by TAS-B (Hofstätter et al. 2021) for initialization and used DistilBERT (Sanh et al. 2019) as our initial model."
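As a minimal sketch, the training schedule and projection dimensions quoted above can be expressed as plain Python. The peak learning rate (2×10⁻⁵), the 1,000-step warmup, and the 768 → 383 projection come from the paper's setup; the total step count is an assumption, since the excerpt does not report it, and the function name is illustrative rather than from the authors' code.

```python
# Sketch of the linear warmup-then-decay learning-rate schedule described
# in the experiment setup. base_lr and warmup match the paper (2e-5, 1,000
# steps); total_steps is an assumed placeholder, not reported in the excerpt.

HIDDEN = 768
PROJ_DIM = HIDDEN // 2 - 1  # 383-dim mean/variance projections, per the setup


def linear_warmup_lr(step, base_lr=2e-5, warmup=1_000, total_steps=100_000):
    """Learning rate at a given optimizer step."""
    if step < warmup:
        # Linear warmup from 0 to base_lr over the first `warmup` steps.
        return base_lr * step / warmup
    # Linear decay from base_lr down to 0 at total_steps.
    frac = (total_steps - step) / (total_steps - warmup)
    return base_lr * max(0.0, frac)


print(PROJ_DIM)                  # 383
print(linear_warmup_lr(500))     # ~1e-5, halfway through warmup
print(linear_warmup_lr(1_000))   # peak rate of 2e-5
```

In a PyTorch training loop this function would typically be wrapped in `torch.optim.lr_scheduler.LambdaLR` as a multiplier on the optimizer's base rate, which matches the "linear learning rate schedule with a warmup phase" the authors describe.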