HiClass: a Python Library for Local Hierarchical Classification Compatible with Scikit-learn

Authors: Fábio M. Miranda, Niklas Köhnecke, Bernhard Y. Renard

JMLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental In Figure 2, we compare the hierarchical F-score, computational resources (measured with the command time) and disk usage. This comparison was performed between two flat classifiers, from the library scikit-learn and Microsoft's LightGBM (Ke et al., 2017), versus the local hierarchical classifiers implemented in HiClass. To avoid bias, cross-validation and hyperparameter tuning were performed on both the local hierarchical classifiers and the flat classifiers. For comparison purposes, we used a snapshot from 02/11/2022 of the consumer complaints data set provided by the Consumer Financial Protection Bureau of the United States (Bureau and General, 2022).
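The hierarchical F-score used in this comparison (the hF of Kiritchenko et al., micro-averaged over ancestor sets) can be sketched in plain Python. The label paths below are illustrative toy values, not categories from the consumer complaints data set:

```python
def hierarchical_f1(y_true, y_pred):
    """Hierarchical F-score: each label is the full path from the root,
    expanded into the set of its ancestors (prefixes), and precision/recall
    are micro-averaged over those ancestor sets."""
    inter = pred_total = true_total = 0
    for true_path, pred_path in zip(y_true, y_pred):
        # ("Credit card", "Billing") expands to {("Credit card",),
        # ("Credit card", "Billing")} -- the node plus its ancestors.
        true_set = {true_path[: i + 1] for i in range(len(true_path))}
        pred_set = {pred_path[: i + 1] for i in range(len(pred_path))}
        inter += len(true_set & pred_set)
        pred_total += len(pred_set)
        true_total += len(true_set)
    hp = inter / pred_total  # hierarchical precision
    hr = inter / true_total  # hierarchical recall
    return 2 * hp * hr / (hp + hr)

# Toy example: one exact match, one prediction correct only at the top level.
y_true = [("Debt", "Collection"), ("Credit card", "Billing")]
y_pred = [("Debt", "Collection"), ("Credit card", "Fees")]
print(hierarchical_f1(y_true, y_pred))  # → 0.75
```

A flat classifier that misses the leaf still earns partial credit for a correct top-level category, which is why this metric, rather than flat F1, is the fair yardstick for the flat-versus-hierarchical comparison above.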
Researcher Affiliation Academia Fábio M. Miranda (EMAIL), Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, 14482 Potsdam, Germany; Department of Mathematics and Computer Science, Free University of Berlin, 14195 Berlin, Germany. Niklas Köhnecke (EMAIL), Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, 14482 Potsdam, Germany. Bernhard Y. Renard (EMAIL), Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, 14482 Potsdam, Germany.
Pseudocode No The paper describes the algorithms for Local Classifier Per Node, Local Classifier Per Parent Node, and Local Classifier Per Level in Appendix C using descriptive text and figures, and defines training policies in tables, but does not provide structured pseudocode or algorithm blocks.
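Although the paper gives no pseudocode, the training-set construction behind a Local Classifier Per Node is easy to sketch. The snippet below assumes one of the paper's negative training policies (the "siblings" policy: positives are examples routed through a child node, negatives are examples routed through its siblings); the toy label paths are hypothetical:

```python
from collections import defaultdict

def siblings_policy_sets(paths, parent):
    """For a Local Classifier Per Node under the 'siblings' negative
    training policy: for each child of `parent`, positives are the
    indices of examples routed through that child, negatives are the
    indices of examples routed through its sibling nodes."""
    by_child = defaultdict(list)
    depth = len(parent)
    for i, path in enumerate(paths):
        if path[:depth] == parent and len(path) > depth:
            by_child[path[depth]].append(i)
    sets = {}
    for child, pos in by_child.items():
        neg = [i for c, idxs in by_child.items() if c != child for i in idxs]
        sets[child] = (pos, neg)
    return sets

# Toy label paths (hypothetical, not from the complaints data set)
paths = [("Loan", "Mortgage"), ("Loan", "Student"), ("Card", "Fees")]
print(siblings_policy_sets(paths, parent=()))
```

One binary classifier would then be trained per child node on its (positives, negatives) pair; the other policies defined in the paper's tables differ only in which examples land in the negative set.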
Open Source Code Yes Source code and documentation are available at https://github.com/scikit-learn-contrib/hiclass.
Open Datasets Yes For comparison purposes, we used a snapshot from 02/11/2022 of the consumer complaints data set provided by the Consumer Financial Protection Bureau of the United States (Bureau and General, 2022), which after preprocessing contained 727,495 instances for cross-validation and hyperparameter tuning as well as training and 311,784 more for validation.
Dataset Splits Yes First, the data set was split, with 70% of the data being used for hyperparameter tuning and training, while 30% was held out for a final evaluation. The 70% subset held for training was further split into 5 subsets for 5-fold cross-validation and identification of the best hyperparameter combination.
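As a quick sanity check, the instance counts reported in the paper are consistent with the stated 70/30 split:

```python
# Counts reported in the paper: 727,495 instances for tuning/training,
# 311,784 held out for final validation.
train, valid = 727_495, 311_784
total = train + valid

print(round(train / total, 2))  # → 0.7 (fraction used for tuning + training)

# Each of the 5 cross-validation folds then covers roughly a fifth
# of the 70% subset.
print(train // 5)  # → 145499 instances per fold (approximately)
```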
Hardware Specification Yes The benchmark was computed on multiple cluster nodes running GNU/Linux with 512 GB physical memory and 128 cores provided by two AMD EPYC 7742 processors.
Software Dependencies No The paper mentions 'Packages for Python 3.7-3.9' and various libraries such as 'scikit-learn', 'NumPy', 'NetworkX', 'Ray', 'Joblib', 'Hydra', 'Optuna', and 'LightGBM'. While Python has a version range, specific version numbers for the other key software components used in the methodology are not explicitly provided in the text.
Experiment Setup Yes For hyperparameter tuning, the models were trained using 4 folds as training data and validated on the remaining one. This process was repeated 5 times, with a different fold combination used in each iteration, and the average hierarchical F-score was reported as the performance metric. The selection of the best hyperparameters was assisted by Hydra (Meta, 2022) and its plugin Optuna (Akiba et al., 2019), through a grid search over the combinations of hyperparameters described in Tables 2-4.
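The paper's actual search was driven by Hydra and Optuna; the pure-Python sketch below only illustrates the bookkeeping of a grid search with 5-fold cross-validation. The grid values and the scoring function are placeholders, not the hyperparameters from Tables 2-4:

```python
from itertools import product
from statistics import mean

def grid_search_cv(grid, evaluate, n_folds=5):
    """Exhaustive grid search: score every hyperparameter combination by
    its mean score across the folds, and return the best (score, params).
    `evaluate(params, fold)` stands in for training on 4 folds and
    validating on the held-out one."""
    best = None
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = mean(evaluate(params, fold) for fold in range(n_folds))
        if best is None or score > best[0]:
            best = (score, params)
    return best

# Toy objective (hypothetical): scores peak at C == 1.0
grid = {"C": [0.1, 1.0, 10.0], "penalty": ["l1", "l2"]}
score, params = grid_search_cv(grid, lambda p, fold: -abs(p["C"] - 1.0))
print(params)
```

With the grid sizes in Tables 2-4, every combination is trained 5 times, which is why the benchmark also tracks computational resources alongside the hierarchical F-score.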