Keep your distance: learning dispersed embeddings on $\mathbb{S}_{m}$
Authors: Evgeniia Tokarchuk, Hua Chang Bakker, Vlad Niculae
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate (§4) old and new methods on synthetic small and large scale problems, as well as real-world large-scale applications in computer vision and natural language processing, revealing different trade-offs and throughout confirming the importance of representation dispersion for task performance. ... We demonstrate the application of dispersion objectives and provide a comparative analysis on both synthetic and real-world tasks. |
| Researcher Affiliation | Academia | Evgeniia Tokarchuk (Language Technology Lab, University of Amsterdam), Hua Chang Bakker (University of Amsterdam), Vlad Niculae (Language Technology Lab, University of Amsterdam) |
| Pseudocode | No | The paper describes algorithms like Lloyd's algorithm and Sliced Dispersion through mathematical formulations and textual descriptions (e.g., in Sections 3.2 and 3.3), but it does not include any clearly labeled pseudocode blocks or algorithm figures. |
| Open Source Code | Yes | A reusable library for spherical dispersion is available as open-source software: https://github.com/ltl-uva/ledoh-torch |
| Open Datasets | Yes | Mettes et al. (2019) showed that learning prototypes with dispersion encouraged by minimizing the maximum cosine similarity on a hypersphere improves classification results on ImageNet-200 (Le & Yang, 2015). ... We report results on two WMT translation tasks: WMT 2016 Romanian–English (ro-en) with 612K training samples and WMT 2019 English–German (en-de) with 9.1M training samples (including back-translated data). |
| Dataset Splits | Yes | We report results on two WMT translation tasks: WMT 2016 Romanian–English (ro-en) with 612K training samples and WMT 2019 English–German (en-de) with 9.1M training samples (including back-translated data). We measure translation accuracy on the best checkpoint according to validation BLEU score using SacreBLEU (Papineni et al., 2002; Post, 2018) and COMET (Rei et al., 2020). |
| Hardware Specification | Yes | The authors also thank SURF (www.surf.nl) for the support in using the National Supercomputer Snellius. |
| Software Dependencies | Yes | We used the fairseq (Ott et al., 2019) framework for training our models. Baseline discrete models (Euclidean baseline) are trained with cross-entropy loss, label smoothing equal to 0.1 and effective batch size 65.5K tokens. All models are trained with learning rate 5 × 10⁻⁴ and 10k warm-up steps for 50k steps in total. ... We used SacreBLEU (Post, 2018) with the following signature nrefs:1|case:mixed|eff:no|tok:13a|smooth:exp|version:2.3.1 and COMET (Rei et al., 2020) with the unbabel-comet library version 2.2.2 and the Unbabel/wmt22-comet-da model. |
| Experiment Setup | Yes | Baseline discrete models (Euclidean baseline) are trained with cross-entropy loss, label smoothing equal to 0.1 and effective batch size 65.5K tokens. All models are trained with learning rate 5 × 10⁻⁴ and 10k warm-up steps for 50k steps in total. Spherical baseline and models with dispersion regularizer are trained by defining the decoder's embedding layer as a manifold parameter. We tune the learning rate for Riemannian Adam (Bécigneul & Ganea, 2019) in the range {5 × 10⁻⁵, 5 × 10⁻⁴, 5 × 10⁻³} and report results with the learning rate 5 × 10⁻³. |
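The Open Datasets row quotes the paper's description of dispersion via minimizing the maximum pairwise cosine similarity on the hypersphere (following Mettes et al., 2019). A minimal NumPy sketch of that criterion is below; the function name `max_cosine_dispersion` is a hypothetical illustration, not the API of the authors' ledoh-torch library, and a real training setup would compute this differentiably (e.g. in PyTorch) as a regularizer.

```python
import numpy as np

def max_cosine_dispersion(embeddings: np.ndarray) -> float:
    """Maximum pairwise cosine similarity of a set of embeddings.

    Lower is better: minimizing this value pushes the points apart
    on the unit sphere (the dispersion criterion quoted above).
    """
    # Project rows onto the unit sphere so dot products are cosines.
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = X @ X.T
    # Exclude self-similarity (always 1) before taking the maximum.
    np.fill_diagonal(sims, -np.inf)
    return float(sims.max())
```

For two antipodal points the objective reaches its minimum of -1, while duplicated embeddings score the worst possible value of 1, matching the intuition that dispersion penalizes collapsed representations.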