Unaligned Message-Passing and Contextualized-Pretraining for Robust Geo-Entity Resolution
Authors: Yuwen Ji, Wenbo Xie, Jiaqi Zhang, Chao Wang, Ning Guo, Lei Shi, Yue Zhang
AAAI 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method surpasses the baselines, achieving higher F1 scores on 8 real-world geo-datasets under robustness evaluation, with an improvement of up to 7.9%. The ablation study further supports our proposal. |
| Researcher Affiliation | Collaboration | Yuwen Ji (1,2), Wenbo Xie (1)*, Jiaqi Zhang (1), Chao Wang (1), Ning Guo (1), Lei Shi (2), Yue Zhang (3) — 1: Amap, China; 2: Beihang University, China; 3: Westlake University, China |
| Pseudocode | No | The paper describes methods and equations for UMP and CP (e.g., Eq. (6) to (11)), but it does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and processed data can be found in this url: https://github.com/2022neo/ger_ump_cp |
| Open Datasets | Yes | Our experiments use three renowned geo-entity databases: OpenStreetMap (OSM), a collaborative open-source mapping project covering points of interest such as landmarks; and Yelp and Foursquare (FSQ), which provide user-generated content offering insights into business activity, urban mobility, and social dynamics. |
| Dataset Splits | Yes | The annotated data is divided into training, validation, and test sets, as detailed in Table 1. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, memory amounts) used for conducting the experiments. |
| Software Dependencies | No | The paper mentions integrating with 'Geo-ER' and 'BERT' but does not provide specific version numbers for these or other software libraries/dependencies (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | For all additional attention layers introduced, the dimensions of the query and key are set to 256 (corresponding to dq and dk in Eq. (6)); the dimension of the value dv = d matches the output feature dimension of the expanded transformer; in Eq. (11), dimension for aggregation d is set to 256, and the activation function σ is set to be sigmoid or softmax. During pretraining, we empirically retrieve neighbors with a random cutoff: we randomly select 50 to 150 of the nearest neighbors within 1000 meters of the pivot entity. The number of algorithm runs for each reported result is 10. Given a perturbation ratio ρ, we randomly select ρ% of attribute/value token positions in all entities for perturbation, replacing tokens with either [MASK] (50%) or random tokens (50%), to simulate value missing and spelling error respectively. |
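The perturbation protocol quoted above (select ρ% of token positions, then replace half with `[MASK]` and half with random tokens) can be sketched as follows. This is a minimal illustration, not the authors' released implementation; the function name `perturb_tokens` and the string-token representation are assumptions, and the released code at the repository linked above may operate on tokenizer IDs instead.

```python
import random

MASK_TOKEN = "[MASK]"  # simulates a missing attribute value

def perturb_tokens(tokens, rho, vocab, rng=random):
    """Perturb rho% of token positions in an entity's attribute/value tokens.

    Each selected position is replaced with [MASK] (probability 0.5,
    simulating value missing) or a random vocabulary token (probability 0.5,
    simulating a spelling error), as described in the paper's setup.
    """
    n = len(tokens)
    k = max(1, round(n * rho / 100))        # number of positions to perturb
    positions = rng.sample(range(n), k)     # sample without replacement
    out = list(tokens)
    for pos in positions:
        if rng.random() < 0.5:
            out[pos] = MASK_TOKEN           # value-missing perturbation
        else:
            out[pos] = rng.choice(vocab)    # spelling-error perturbation
    return out
```

With a fixed random seed the perturbation is reproducible, which matters when averaging over the 10 runs reported per result.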