Locality Preserving Markovian Transition for Instance Retrieval
Authors: Jifei Luo, Wenzheng Wu, Hantao Yao, Lu Yu, Changsheng Xu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results across diverse tasks confirm the effectiveness of LPMT for instance retrieval. |
| Researcher Affiliation | Academia | ¹University of Science and Technology of China, Hefei, China; ²Tianjin University of Technology, Tianjin, China; ³Institute of Automation, Chinese Academy of Sciences, Beijing, China. Correspondence to: Hantao Yao <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 (Bidirectional Collaborative Diffusion). Input: extended adjacency matrix set {W^v}_{v=1}^m, hyper-parameters λ, µ, max number of iterations max_iter. Output: adaptively smoothed similarity matrix F. Algorithm 2 (Effective Solution of Eq. (A.13)). Input: adjacency matrix set {W^v}_{v=1}^m, initial estimate of the similarity matrix F^(0), normalized Kronecker matrix set {S^v}_{v=1}^m, identity matrix I ∈ ℝ^{n²×n²}, max number of iterations max_iter, hyper-parameters µ, λ, iteration tolerance δ. Output: F = vec⁻¹(f). |
| Open Source Code | No | The paper does not provide explicit statements about releasing source code or direct links to code repositories. |
| Open Datasets | Yes | Datasets. To demonstrate the effectiveness of the proposed Locality Preserving Markovian Transition (LPMT) method, we conduct experiments on the revised (Radenović et al., 2018) Oxford5k (ROxf) (Philbin et al., 2007) and Paris6k (RPar) (Philbin et al., 2008) datasets. To further evaluate performance at scale, an extra collection of one million distractor images is incorporated, forming the large-scale ROxf+1M and RPar+1M datasets. Additionally, following the split strategy of Hu et al. (2020), we perform unsupervised content-based image retrieval on datasets such as CUB200 (Wah et al., 2011), Indoor (Quattoni & Torralba, 2009), and Caltech101 (Fei-Fei et al., 2004) to identify images belonging to the same classes. |
| Dataset Splits | Yes | For classical instance retrieval tasks, the image database is further divided into Easy (E), Medium (M), and Hard (H) categories based on difficulty levels. Given that the positive samples are limited in unsupervised content-based retrieval tasks, Recall@1 (R@1) is also reported to quantify the accuracy of the first retrieved image. ... following the split strategy of Hu et al. (2020), we perform unsupervised content-based image retrieval on datasets like CUB200 (Wah et al., 2011), Indoor (Quattoni & Torralba, 2009), and Caltech101 (Fei-Fei et al., 2004) to identify images belonging to the same classes. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'advanced deep retrieval models' (R-GeM, MAC/R-MAC, DELG, DOLG, CVNet, SENet) but does not provide specific version numbers for these models or the underlying software frameworks (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The hyper-parameters k1 and k2 introduced in LSE determine the size of the local region and the number of confident neighborhoods. As shown in Fig. 5(a), retrieval performance peaks at k1 = 60, suggesting that selecting a moderate region size is crucial to incorporate sufficient informative instances. Similarly, Fig. 5(b) shows that k2 = 7 yields optimal performance, highlighting the importance of balancing neighborhood size and the proportion of correct samples for improved representation. ...the reciprocal neighbors enhance the LSE distribution, controlled by a hyper-parameter κ. As illustrated in Fig. 5(c), performance increases with κ and reaches its maximum at κ = 2. Meanwhile, the hyper-parameter θ serves as a balancing weight to fuse the original Euclidean distance with the thermodynamic transition flow cost. Fig. 5(d) reveals that θ = 0.5 yields the optimal result, demonstrating that incorporating the original distance enhances the retrieval robustness. Additional analyses of hyper-parameters such as µ and σ are provided in Fig. 5(e) and (f), confirming the robustness of our approach towards their variations. |
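The pseudocode row above reports only the inputs and outputs of Algorithm 1 (adjacency matrices, hyper-parameters λ and µ, an iteration cap, and a smoothed similarity matrix F), not the update rule itself. As a point of reference for readers checking reproducibility, the sketch below shows a generic regularized similarity-diffusion iteration of the kind such algorithms typically use, with the update F ← λ·S F Sᵀ + (1−λ)·F⁽⁰⁾ assumed purely for illustration; this is not the paper's exact update, and the function name and normalization choice are the sketch's own.

```python
import numpy as np

def diffuse_similarity(W, F0, lam=0.5, max_iter=200, tol=1e-8):
    """Illustrative regularized diffusion over a similarity graph.

    W   : (n, n) symmetric non-negative adjacency matrix
    F0  : (n, n) initial similarity estimate
    lam : diffusion weight in [0, 1); the remainder anchors to F0
    """
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}, so the
    # spectral radius of S is at most 1 and the iteration converges
    # for lam < 1.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    F = F0.copy()
    for _ in range(max_iter):
        # Diffuse similarity mass along graph edges, then pull the
        # estimate back toward the original similarities.
        F_next = lam * S @ F @ S.T + (1.0 - lam) * F0
        if np.abs(F_next - F).max() < tol:
            return F_next
        F = F_next
    return F
```

With symmetric `W` and `F0` the update preserves symmetry, so the smoothed matrix can be read row-wise as refined query-to-database similarities for re-ranking.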