Scenario-Based Robust Optimization of Tree Structures

Authors: Spyros Angelopoulos, Christoph Dürr, Alex Elenter, Georgii Melidi

AAAI 2025

Reproducibility variables, results, and LLM responses:
Research Type: Experimental. "In Section 5, we provide an experimental evaluation of all our algorithms over real data. To evaluate and compare our algorithms from Sections 2 and 3, we also provide mixed integer linear program formulations for all objectives, which allows us to compute the optimal trees for small instances. GUROBI solved the MILPs for our experimental setting in times ranging from 7 to 15 seconds on a standard laptop. In contrast, R-BST and R-HT run in less than 0.1 seconds, using a Python implementation. Table 1 summarizes the results of the experiments. We report experiments on our Pareto-optimal algorithm of Section 4, denoted by PO."
Researcher Affiliation: Academia. Affiliations listed: (1) Sorbonne University, LIP6, Paris, France; (2) CNRS; (3) International Laboratory on Learning Systems, Montreal, Canada.
Pseudocode: Yes. The paper presents three algorithms:
- Algorithm 1 (R-BST). Input: k scenarios described by frequency vectors F1, ..., Fk. Output: a robust BST represented as a level vector L.
- Algorithm 2: a recursive procedure A.
- Algorithm 3 (R-HT). Input: k scenarios described by frequency vectors F1, ..., Fk. Output: a robust Huffman tree T.
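The pseudocode itself is not reproduced in this report, but the robust objective the algorithms target can be illustrated. Below is a minimal sketch, assuming the robust cost of a tree is its maximum expected depth over the k scenario frequency vectors; the heuristic of building a Huffman tree on the pointwise maximum of the scenario frequencies is purely illustrative and is not the paper's R-HT procedure:

```python
import heapq

def huffman_depths(freqs):
    """Build a standard Huffman tree for one frequency vector and
    return the depth (codeword length) of each of the n keys."""
    n = len(freqs)
    # Heap entries: (weight, tie-breaking counter, key indices in subtree).
    heap = [(f, i, [i]) for i, f in enumerate(freqs)]
    heapq.heapify(heap)
    depths = [0] * n
    counter = n
    while len(heap) > 1:
        w1, _, keys1 = heapq.heappop(heap)
        w2, _, keys2 = heapq.heappop(heap)
        for k in keys1 + keys2:  # every merged key moves one level deeper
            depths[k] += 1
        heapq.heappush(heap, (w1 + w2, counter, keys1 + keys2))
        counter += 1
    return depths

def worst_case_cost(depths, scenarios):
    """Robust objective: maximum expected depth over all k scenarios."""
    return max(sum(d * f for d, f in zip(depths, F)) for F in scenarios)

# Two toy scenarios (frequency vectors over 4 keys).
scenarios = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]
# Illustrative heuristic (NOT the paper's R-HT): Huffman tree built on
# the pointwise maximum of the scenario frequencies.
agg = [max(col) for col in zip(*scenarios)]
depths = huffman_depths(agg)
print(worst_case_cost(depths, scenarios))
```

The sub-0.1-second running times reported for R-BST and R-HT are consistent with this kind of heap-based construction, which runs in O(n log n) time per merge sequence.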
Open Source Code: Yes. Code: https://gitlab.lip6.fr/gmelidi/robust-tree
Open Datasets: Yes. "We used open data from (Wikipedia 2024). Specifically, we chose ten European languages as corresponding to ten different scenarios, based on the frequency of each letter in the corresponding language."
Dataset Splits: No. The paper defines how the different scenarios (datasets for evaluation) are generated from language letter frequencies or city names, but it does not specify any training/validation/test splits. For example: "For all a, b ∈ {0, ..., 11}, we generated all strings s of size a + b, i.e., a keys from scenario 0 (country 0) and b keys from scenario 1 (country 1)."
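The quoted generation scheme can be sketched as follows. This is a minimal sketch assuming each of the a (resp. b) positions is filled independently from the scenario-0 (resp. scenario-1) key set; the key names are hypothetical and the paper's exact enumeration may differ:

```python
from itertools import product

def mixed_strings(keys0, keys1, a, b):
    """All access strings containing a keys from scenario 0 followed by
    b keys from scenario 1 (hypothetical key sets; the paper draws keys
    from city names of two countries)."""
    pool0 = list(product(keys0, repeat=a))  # all length-a prefixes
    pool1 = list(product(keys1, repeat=b))  # all length-b suffixes
    return [p0 + p1 for p0 in pool0 for p1 in pool1]

# Toy example: two keys per country, a = b = 1.
strings = mixed_strings(["Paris", "Lyon"], ["Rome", "Milan"], 1, 1)
print(len(strings))  # 2 * 2 = 4 combinations
```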
Hardware Specification: No. "GUROBI solved the MILPs for our experimental setting in times ranging from 7 to 15 seconds on a standard laptop." The term "standard laptop" does not provide specific hardware details (e.g., CPU or GPU model, RAM).
Software Dependencies: No. "To evaluate and compare our algorithms from Sections 2 and 3, we also provide mixed integer linear program formulations for all objectives, which allows us to compute the optimal trees for small instances. This allows us to compute optimal trees with the help of commercial MILP solvers, such as GUROBI." "R-BST and R-HT run in less than 0.1 seconds, using a Python implementation." The paper mentions GUROBI and Python but does not provide version numbers for any software dependency.
Experiment Setup: Yes. "We used open data from (Wikipedia 2024). Specifically, we chose ten European languages as corresponding to ten different scenarios, based on the frequency of each letter in the corresponding language. The universe of all possible keys is represented by the names of cities from 10 different countries (the same countries as in Section 5.1). From this data, a string s of length 2n, for some chosen n, is generated by selecting two of the above countries (which we call country 0 and country 1) and the n largest-in-population cities from each country. We run algorithm PO and find the regret-based Pareto front for each string s generated as above, choosing n = 30."
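The regret-based Pareto front mentioned above can be illustrated with a generic non-domination filter. This is a sketch under the assumption that each candidate tree is summarized by a per-scenario regret vector and that smaller regret is better in every coordinate; it illustrates only the Pareto-front notion that PO outputs, not the paper's PO algorithm itself:

```python
def pareto_front(points):
    """Keep the non-dominated vectors (minimization in every coordinate).
    Each point stands for one candidate tree's per-scenario regrets:
    a tree is dropped iff some other tree is at least as good everywhere."""
    front = []
    for p in points:
        dominated = any(
            all(q[i] <= p[i] for i in range(len(p))) and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Toy regret vectors for three candidate trees over two scenarios:
# (3, 3) is dominated by (2, 2); the other two trade off against each other.
print(pareto_front([(1, 3), (2, 2), (3, 3)]))  # [(1, 3), (2, 2)]
```

This quadratic filter is fine at the scale of the reported experiments (n = 30 keys, two scenarios per string); faster sort-based front extraction exists but is unnecessary here.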