Aggregating Algorithm and Axiomatic Loss Aggregation

Authors: Armando J. Cabrera Pacheco, Rabanus Derr, Robert C. Williamson

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we conceptually and empirically argue that our generalized loss aggregation functions express the attitude of the learner towards losses. We provide experimental evidence corroborating this statement, which closes the loop by motivating the use of generalized aggregations in the first place. [...] We provide the log-loss (sometimes called cross-entropy loss) profile of several applications of the AA-QS for different aggregations on a sequential weather classification task. We use the Aggregating Algorithm to aggregate probabilistic predictions of 9 simple classification algorithms. The AA-QS was implemented using the log-loss. The task is detailed in Appendix E. Table 2 summarizes our findings for the data collection from the Zugspitze (Germany). See Appendix E.1 for further experiments.
Researcher Affiliation | Academia | Armando J. Cabrera Pacheco (EMAIL), University of Tübingen and Tübingen AI Center, Tübingen, 72070, Germany; Rabanus Derr (EMAIL), University of Tübingen and Tübingen AI Center, Tübingen, 72070, Germany; Robert C. Williamson (EMAIL), University of Tübingen and Tübingen AI Center, Tübingen, 72070, Germany
Pseudocode | Yes | Algorithm 1: Aggregating Algorithm (AA); Algorithm 2: Aggregating Pseudo-Algorithm for Quasi-Sums (APA-QS)
Open Source Code | No | The paper mentions that "The code was run on Mac Book Pro with Intel Core i9." but provides no statement about, or link to, an open-source release of the implementation of the described methodology.
Open Datasets | Yes | We run a tabular classification task on a weather data collection. The data is curated and provided by the DWD (German Weather Agency). The data collections are publicly available at https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/daily/kl/historical/. For the file names see Table 3.
Dataset Splits | Yes | We use the first (in chronological order) 80% of the data points as the training set for the following classifiers... Then, we run the classifiers on the remaining 20% of the data. [...] For statistics of the data collections see Table 3. [Table 3 includes 'Final Number of Days (Train/Test)' with specific counts.]
Hardware Specification | Yes | The code was run on a MacBook Pro with an Intel Core i9.
Software Dependencies | Yes | We used Python 3.10.2, scikit-learn 1.4.0, pandas 1.4.1, numpy 1.22.2, matplotlib 3.5.1.
Experiment Setup | Yes | Slightly arbitrarily, we chose η = 1 for the sum and focal aggregations, η = 2 for L0.5, η = 0.001 for L10, and η = 0.5 for L2.
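
The Aggregating Algorithm used to combine the nine classifiers' probabilistic predictions (Research Type row) can be sketched as follows. This is a minimal illustration for binary outcomes under the log-loss, not the authors' implementation: the function name and array shapes are our own assumptions, and the update shown is the standard exponential-weights rule, which for the (mixable) log-loss with η = 1 makes the master prediction a Bayesian mixture of the experts.

```python
import numpy as np

def aggregate_log_loss(expert_probs, outcomes, eta=1.0):
    """Minimal Aggregating Algorithm sketch for binary log-loss.

    expert_probs: (T, N) array, expert i's predicted P(y_t = 1).
    outcomes:     (T,) array of realized labels in {0, 1}.
    Returns the (T,) master predictions P(y_t = 1).
    """
    T, N = expert_probs.shape
    log_w = np.zeros(N)                      # log-weights, uniform prior
    master_preds = np.empty(T)
    for t in range(T):
        w = np.exp(log_w - log_w.max())      # stabilized softmax of log-weights
        w /= w.sum()
        master_preds[t] = w @ expert_probs[t]  # weighted mixture prediction
        # per-expert log-loss on the revealed outcome (clipped for safety)
        p = np.clip(expert_probs[t], 1e-12, 1 - 1e-12)
        loss = -(outcomes[t] * np.log(p) + (1 - outcomes[t]) * np.log(1 - p))
        log_w -= eta * loss                  # exponential weight update
    return master_preds
```

With two experts, one consistently accurate and one consistently wrong, the master prediction starts at their uniform mixture and moves toward the accurate expert as its weight grows.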
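
The chronological 80/20 split described in the Dataset Splits row can be sketched as below. The column name "date" and the DataFrame schema are placeholders, not the paper's; the point is only that the split is by time order, not random.

```python
import pandas as pd

def chronological_split(df, date_col="date", train_frac=0.8):
    """Split a time-indexed DataFrame: first 80% (by date) train, rest test."""
    df = df.sort_values(date_col)            # enforce chronological order
    cut = int(len(df) * train_frac)
    return df.iloc[:cut], df.iloc[cut:]      # earliest rows train, latest test
```

Every training-set date precedes every test-set date, so the evaluation respects the sequential nature of the weather data.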
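
For reference, the learning rates quoted in the Experiment Setup row can be collected in a plain configuration mapping; the dictionary keys are our own labels for the aggregations named in the text, not identifiers from the paper's code.

```python
# Learning rates η per aggregation, as quoted in the Experiment Setup row.
ETA = {
    "sum": 1.0,     # sum aggregation
    "focal": 1.0,   # focal aggregation
    "L0.5": 2.0,
    "L10": 0.001,
    "L2": 0.5,
}
```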