Creating Coherence in Federated Non-Negative Matrix Factorization

Authors: Sebastian Dalleiger, Aristides Gionis

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type: Experimental. Evidence: "On a diverse set of real-world and synthetic datasets, we demonstrate the effectiveness of our methods." From Section 4 (Experiments): "Having introduced our algorithms, we now systematically evaluate their practical performance. We implemented our methods in the Julia programming language and ran experiments on 16 cores of an AMD EPYC 7702 CPU and a single NVIDIA A40 GPU, reporting wall-clock time."
Researcher Affiliation: Academia. Evidence: "Sebastian Dalleiger, Aristides Gionis, KTH Royal Institute of Technology, Stockholm, Sweden."
Pseudocode: Yes. The paper includes Algorithm 1 (Fixed-point Barycenter) and Algorithm 2 (FNMF: Federated NMF).
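To make the federated-NMF setting concrete, the following is a minimal illustrative sketch in Python (the paper's actual implementation is in Julia). It assumes the common federated-NMF layout in which each client holds a block of samples and a private factor, while a shared dictionary is aggregated by the server; here the server simply averages the clients' dictionary copies, which is a hypothetical stand-in for the paper's barycenter-based aggregation, not the authors' algorithm:

```python
import numpy as np

def local_nmf_step(X, W, H, eps=1e-9):
    # One multiplicative-update step (Lee-Seung style) on one client's
    # data block X ~= W @ H.  W is the client-local factor, H the
    # shared dictionary over features.
    W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    H *= (W.T @ X) / ((W.T @ W) @ H + eps)
    return W, H

def federated_round(blocks, Ws, H):
    # One communication round: every client refines its local factor and
    # a private copy of the shared dictionary; the server then aggregates
    # the copies by plain averaging (an assumption made for illustration;
    # the paper instead uses a fixed-point barycenter).
    H_copies = []
    for i, X in enumerate(blocks):
        Ws[i], Hc = local_nmf_step(X, Ws[i], H.copy())
        H_copies.append(Hc)
    return Ws, np.mean(H_copies, axis=0)

rng = np.random.default_rng(0)
k, d = 5, 30
blocks = [rng.random((20, d)) for _ in range(3)]  # three clients' data
Ws = [rng.random((20, k)) for _ in blocks]        # client-local factors
H = rng.random((k, d))                            # shared dictionary

err0 = sum(np.linalg.norm(X - W @ H) for X, W in zip(blocks, Ws))
for _ in range(50):
    Ws, H = federated_round(blocks, Ws, H)
err = sum(np.linalg.norm(X - W @ H) for X, W in zip(blocks, Ws))
```

Multiplicative updates keep all factors non-negative by construction, which is why they are a natural local step for NMF; the aggregation rule is the part the paper's barycenter algorithm is designed to do better than naive averaging.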
Open Source Code: Yes. Replication material: https://doi.org/10.5281/zenodo.14501532. Evidence: "We provide the source code, datasets, synthetic dataset generator, hyperparameters, and other information needed for reproducibility."
Open Datasets: Yes. Replication material: https://doi.org/10.5281/zenodo.14501532. Evidence: "We provide the source code, datasets, synthetic dataset generator, hyperparameters, and other information needed for reproducibility." The paper uses Goodreads (Kotkov et al. 2022) (user-book interactions), Netflix Prize (Netflix, Inc. 2009), and MovieLens 25M (Harper and Konstan 2015) (user-movie ratings). For the biomedical domain, it uses TCGA (National Cancer Institute 2005) cancer gene expressions as well as the HPA single-cell brain protein data from the Human Protein Atlas (Sjöstedt, Zhong, et al. 2020). For computer vision, it uses MNIST handwritten digits (LeCun et al. 1998), Fashion-MNIST clothing images (Xiao, Rasul, and Vollgraf 2017), and CIFAR-10 tiny images (Krizhevsky and Hinton 2009). For natural language processing, it extracts 200 000 random rows from the tf-idf matrix of all stopword-free, lemmatized arXiv abstracts (arXiv.org Collaborations 2024).
Dataset Splits: No. The paper describes generating synthetic data and distributing it across clients (e.g., "distribute a fixed amount of data (left) and proportionally-growing data (right) to an increasing number of clients"), and mentions using the real-world datasets with 50 clients. However, it does not specify how the datasets were split into training, validation, or test sets (no percentages, sample counts, or splitting methodology for reproduction).
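The client-distribution setup quoted above (a fixed pool of samples partitioned across a growing number of clients) can be sketched as follows; the function name and the random row partitioning are illustrative assumptions, since the paper's exact assignment scheme is not given in the main text:

```python
import numpy as np

def distribute_rows(X, n_clients, seed=0):
    # Randomly partition the rows (samples) of a data matrix across
    # n_clients clients, approximating the "fixed amount of data"
    # setting.  The actual split used in the paper is not specified,
    # so this scheme is purely illustrative.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(X.shape[0])
    return [X[idx] for idx in np.array_split(perm, n_clients)]

X = np.arange(1000 * 4, dtype=float).reshape(1000, 4)
clients = distribute_rows(X, 50)  # e.g. 50 clients, as for the real data
```

With `np.array_split`, every sample lands on exactly one client, so the union of the client blocks recovers the full dataset.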
Hardware Specification: Yes. Evidence: "We implemented our methods in the Julia programming language and ran experiments on 16 cores of an AMD EPYC 7702 CPU and a single NVIDIA A40 GPU, reporting wall-clock time."
Software Dependencies: No. The paper states that the methods were implemented in the Julia programming language but gives no version numbers for Julia or for any other libraries, frameworks, or solvers used in the implementation.
Experiment Setup: No. The paper states: "We provide the source code, datasets, synthetic dataset generator, hyperparameters, and other information needed for reproducibility" (replication material: https://doi.org/10.5281/zenodo.14501532). However, it does not explicitly list concrete hyperparameter values, training configurations, or system-level settings within the main text.