reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Fully Dynamic Euclidean Bi-Chromatic Matching in Sublinear Update Time

Authors: Gramoz Goranci, Peter Kiss, Neel Patel, Martin P. Seybold, Eva Szilagyi, Da Wei Zheng

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To complement our primary contribution (Theorem 1.1), we present experimental evaluations of the performance of our dynamic algorithm for estimating the 1-Wasserstein distance in contrast to static state of the art approximate algorithms. Our experimental results reveal that in the dynamic setting the total running time of our algorithm over a series of updates outperforms periodic static recomputation by orders of magnitudes. Furthermore, we find that the approximation ratio obtained by our algorithm on real-world datasets is smaller than that of our theoretical guarantees and almost matches that of static algorithms. Our results confirm the expected approximation ratio drop-off and running time decrease we would expect with the reduction of the ε parameter.
Researcher Affiliation	Academia	1Faculty of Computer Science, University of Vienna, Austria. 2Faculty of Computer Science, University of Vienna, Austria. 3University of Southern California, CA, US 4Faculty of Computer Science, University of Vienna, Austria. 5Faculty of Computer Science, Uni Vie Doctoral School Computer Science Do CS, University of Vienna, Austria. Correspondence to: Eva Szilagyi <EMAIL>. 6University of Illinois at Urbana-Champaign, IL, US.
Pseudocode	Yes	Algorithm 1 INSERT-UPDATE(v, a) ... Algorithm 2 INSERT(a) ... Algorithm 3 DELETE-UPDATE(v, a) ... Algorithm 4 DELETE(a) ... Algorithm 5 INSERT-UPDATE-1(v, a) ... Algorithm 6 INSERT(a)
Open Source Code	Yes	With this work, we provide1 an opensource, single-threaded C++ implementation of our algorithm in Appendix E. 1https://github.com/Zhengdw/dyn-euc-match
Open Datasets	Yes	For the synthetic data, we drew the point coordinates from the uniform distribution on integers between 1 and 500 to obtain the Uniform datasets. For the Gaussian datasets, we drew the point coordinates from the normal distribution (µ = 0.5, σ = 0.25) prior to scaling by a factor of 500 and rounding to integers. For the real data, we extracted all trips from the Yellow-Cap Taxi dataset2 of December 2009... 2See https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page.
Dataset Splits	No	The paper describes using a sliding window for dynamic updates on the Taxi dataset, which is a specific methodology for data usage over time (width of 10 000 samples). However, it does not provide traditional train/test/validation splits for any of the datasets used, nor does it refer to standard benchmark splits for these datasets. For synthetic data, no explicit split information is provided.
Hardware Specification	Yes	we conducted experiments on a 2.2 GHz Ubuntu 22.04.4 system with AMD Opteron 6174, using synthetic and real world data sets.
Software Dependencies	No	With this work, we provide1 an opensource, single-threaded C++ implementation of our algorithm in Appendix E. While C++ is mentioned, no specific version of C++ or any libraries with their version numbers are provided.
Experiment Setup	Yes	Figure 2 shows the speedup factors (p = 8) for inserting 8 000 samples from Uniform vs Guassian (top) and Pickup vs Dropoff (bottom)... Figure 3 shows the empirical 1-Wasserstein distance... using p = 2, p = 8, and p = 32, for up to 2 000 samples... Figure 4 (top) shows the convergence of empirical 1-Wasserstein distance, using p = 2, 4, 8 for approximations... Figure 5 (top) shows the empirical 1-Wasserstein distance... using approximations with p = 4 and p = 16.