Scenario-Based Robust Optimization of Tree Structures

Authors: Spyros Angelopoulos, Christoph Dürr, Alex Elenter, Georgii Melidi

AAAI 2025

Reproducibility variables, results, and LLM responses:
Research Type: Experimental. "In Section 5, we provide an experimental evaluation of all our algorithms over real data. To evaluate and compare our algorithms from Sections 2 and 3, we also provide mixed integer linear program formulations for all objectives, which allows us to compute the optimal trees for small instances. GUROBI solved the MILPs for our experimental setting in times ranging from 7 to 15 seconds on a standard laptop. In contrast, R-BST and R-HT run in less than 0.1 seconds, using a Python implementation. Table 1 summarizes the results of the experiments. We report experiments on our Pareto-optimal algorithm of Section 4, denoted by PO."
Researcher Affiliation: Academia. Affiliations listed: (1) Sorbonne University, LIP6, Paris, France; (2) CNRS; (3) International Laboratory on Learning Systems, Montreal, Canada.
Pseudocode: Yes. The paper presents three algorithms:
- Algorithm 1 (R-BST). Input: k scenarios described by frequency vectors F1, ..., Fk. Output: a robust BST represented as a level vector L.
- Algorithm 2: a recursive procedure A.
- Algorithm 3 (R-HT). Input: k scenarios described by frequency vectors F1, ..., Fk. Output: a robust Huffman tree T.
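The pseudocode itself is not reproduced in this report, but the robust objective the algorithms target can be illustrated. Below is a minimal sketch, assuming the robust cost of a tree is its maximum expected depth over the k scenario frequency vectors; the heuristic of building a Huffman tree on the pointwise maximum of the scenario frequencies is purely illustrative and is not the paper's R-HT procedure:

```python
import heapq

def huffman_depths(freqs):
    """Build a standard Huffman tree for one frequency vector and
    return the depth (codeword length) of each of the n keys."""
    n = len(freqs)
    # Heap entries: (weight, tie-breaking counter, key indices in subtree).
    heap = [(f, i, [i]) for i, f in enumerate(freqs)]
    heapq.heapify(heap)
    depths = [0] * n
    counter = n
    while len(heap) > 1:
        w1, _, keys1 = heapq.heappop(heap)
        w2, _, keys2 = heapq.heappop(heap)
        for k in keys1 + keys2:  # every merged key moves one level deeper
            depths[k] += 1
        heapq.heappush(heap, (w1 + w2, counter, keys1 + keys2))
        counter += 1
    return depths

def worst_case_cost(depths, scenarios):
    """Robust objective: maximum expected depth over all k scenarios."""
    return max(sum(d * f for d, f in zip(depths, F)) for F in scenarios)

# Two toy scenarios (frequency vectors over 4 keys).
scenarios = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]
# Illustrative heuristic (NOT the paper's R-HT): Huffman tree built on
# the pointwise maximum of the scenario frequencies.
agg = [max(col) for col in zip(*scenarios)]
depths = huffman_depths(agg)
print(worst_case_cost(depths, scenarios))
```

The sub-0.1-second running times reported for R-BST and R-HT are consistent with this kind of heap-based construction, which runs in O(n log n) time per merge sequence.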
Open Source Code: Yes. Code: https://gitlab.lip6.fr/gmelidi/robust-tree
Open Datasets: Yes. "We used open data from (Wikipedia 2024). Specifically, we chose ten European languages as corresponding to ten different scenarios, based on the frequency of each letter in the corresponding language."
Dataset Splits: No. The paper defines how the different scenarios (datasets for evaluation) are generated from language letter frequencies or city names, but it does not specify any training/validation/test splits. For example: "For all a, b ∈ {0, ..., 11}, we generated all strings s of size a + b, i.e., a keys from scenario 0 (country 0) and b keys from scenario 1 (country 1)."
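The quoted generation scheme can be sketched as follows. This is a minimal sketch assuming each of the a (resp. b) positions is filled independently from the scenario-0 (resp. scenario-1) key set; the key names are hypothetical and the paper's exact enumeration may differ:

```python
from itertools import product

def mixed_strings(keys0, keys1, a, b):
    """All access strings containing a keys from scenario 0 followed by
    b keys from scenario 1 (hypothetical key sets; the paper draws keys
    from city names of two countries)."""
    pool0 = list(product(keys0, repeat=a))  # all length-a prefixes
    pool1 = list(product(keys1, repeat=b))  # all length-b suffixes
    return [p0 + p1 for p0 in pool0 for p1 in pool1]

# Toy example: two keys per country, a = b = 1.
strings = mixed_strings(["Paris", "Lyon"], ["Rome", "Milan"], 1, 1)
print(len(strings))  # 2 * 2 = 4 combinations
```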
Hardware Specification: No. "GUROBI solved the MILPs for our experimental setting in times ranging from 7 to 15 seconds on a standard laptop." The term "standard laptop" does not provide specific hardware details (e.g., CPU or GPU model, RAM).
Software Dependencies: No. "To evaluate and compare our algorithms from Sections 2 and 3, we also provide mixed integer linear program formulations for all objectives, which allows us to compute the optimal trees for small instances. This allows us to compute optimal trees with the help of commercial MILP solvers, such as GUROBI." "R-BST and R-HT run in less than 0.1 seconds, using a Python implementation." The paper mentions GUROBI and Python but does not provide version numbers for any software dependency.
Experiment Setup: Yes. "We used open data from (Wikipedia 2024). Specifically, we chose ten European languages as corresponding to ten different scenarios, based on the frequency of each letter in the corresponding language. The universe of all possible keys is represented by the names of cities from 10 different countries (the same countries as in Section 5.1). From this data, a string s of length 2n, for some chosen n, is generated by selecting two of the above countries (which we call country 0 and country 1) and the n largest-in-population cities from each country. We run algorithm PO and find the regret-based Pareto front for each string s generated as above, choosing n = 30."
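The regret-based Pareto front mentioned above can be illustrated with a generic non-domination filter. This is a sketch under the assumption that each candidate tree is summarized by a per-scenario regret vector and that smaller regret is better in every coordinate; it illustrates only the Pareto-front notion that PO outputs, not the paper's PO algorithm itself:

```python
def pareto_front(points):
    """Keep the non-dominated vectors (minimization in every coordinate).
    Each point stands for one candidate tree's per-scenario regrets:
    a tree is dropped iff some other tree is at least as good everywhere."""
    front = []
    for p in points:
        dominated = any(
            all(q[i] <= p[i] for i in range(len(p))) and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Toy regret vectors for three candidate trees over two scenarios:
# (3, 3) is dominated by (2, 2); the other two trade off against each other.
print(pareto_front([(1, 3), (2, 2), (3, 3)]))  # [(1, 3), (2, 2)]
```

This quadratic filter is fine at the scale of the reported experiments (n = 30 keys, two scenarios per string); faster sort-based front extraction exists but is unnecessary here.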