Hierarchical Vector Quantization for Unsupervised Action Segmentation
Authors: Federico Spurio, Emad Bahrami, Gianpiero Francesca, Juergen Gall
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on three public datasets, namely Breakfast, YouTube Instructional and IKEA ASM. Our approach outperforms the state of the art in terms of F1 score, recall and JSD. ... We provide additional implementation details in the supp. material, which also includes additional ablation studies on the impact of the hyperparameters. |
| Researcher Affiliation | Collaboration | 1University of Bonn, Germany 2Toyota Motor Europe, Belgium 3Lamarr Institute for Machine Learning and Artificial Intelligence, Germany |
| Pseudocode | No | The paper describes the method's steps within the main text and equations (1-9) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/FedeSpu/HVQ |
| Open Datasets | Yes | We evaluate the proposed method on three public datasets: Breakfast (Kuehne, Arslan, and Serre 2014), YouTube Instructional (Alayrac et al. 2016) and IKEA ASM (Ben-Shabat et al. 2021). |
| Dataset Splits | No | Following the protocol that has been introduced by Kukleva et al. (2019), we apply our approach to all videos of each activity separately. K is set to the max number of subactions that appear for each activity. This is required for a fair comparison of the methods. We establish the mapping between predicted cluster segments and ground-truth segments via Hungarian matching, which is computed over all videos and their frames of one activity. |
| Hardware Specification | Yes | We measured the runtime on the same workstation with Intel i9-13900k CPU with 24 cores and one NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The overall loss for training our model is then the combination of the commitment loss for the vector quantization and the reconstruction loss of the auto-encoder: L = L_commit^z + L_commit^q + λ_rec · L_rec (7), where λ_rec weights the two loss terms. ... We also study the impact of the weight parameter λ_rec for the loss term (7) and we report the results in Tab. 5. Using λ_rec = 0.002 performs well and we use it for all other experiments. ... In Tab. 6, we show the results on the Breakfast dataset considering different numbers of prototypes in Z, which is steered by different values of α. |
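The evaluation protocol quoted under "Dataset Splits" maps predicted cluster segments to ground-truth labels via Hungarian matching computed over all frames of one activity. A minimal sketch of that mapping step is below; the function name and toy data are assumptions, and brute-force permutation search stands in for the Hungarian algorithm (equivalent result for small K; in practice one would use `scipy.optimize.linear_sum_assignment` on the negated overlap matrix).

```python
from itertools import permutations

def match_clusters(pred, gt, k):
    """One-to-one mapping from predicted cluster ids to ground-truth
    labels that maximizes the number of correctly matched frames,
    computed jointly over all frames of one activity."""
    # Co-occurrence counts: overlap[c][g] = frames where cluster c
    # coincides with ground-truth label g.
    overlap = [[0] * k for _ in range(k)]
    for c, g in zip(pred, gt):
        overlap[c][g] += 1
    # Brute-force search over all assignments (fine for small k).
    best_perm, best_score = None, -1
    for perm in permutations(range(k)):
        score = sum(overlap[c][perm[c]] for c in range(k))
        if score > best_score:
            best_perm, best_score = perm, score
    return dict(enumerate(best_perm)), best_score

# Toy example: frame-level predictions pooled over one activity.
pred = [0, 0, 1, 1, 2, 2, 2, 0]
gt   = [1, 1, 0, 0, 2, 2, 2, 1]
mapping, matched = match_clusters(pred, gt, k=3)
print(mapping)  # → {0: 1, 1: 0, 2: 2}
print(matched)  # → 8
```

F1, recall, and accuracy are then computed on the relabeled predictions, which is why the matching must be done once per activity rather than per video.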
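The quoted training objective, Eq. (7), combines two commitment losses from the hierarchical vector quantization with the auto-encoder's reconstruction loss. A hedged sketch of how the terms combine is below; the mean-squared-error choice for each term and all argument names are assumptions, only the structure of Eq. (7) and λ_rec = 0.002 come from the paper.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def total_loss(z_enc, z_code, q_enc, q_code, x, x_rec, lambda_rec=0.002):
    """Eq. (7): L = L_commit^z + L_commit^q + lambda_rec * L_rec.
    Commitment terms pull encoder outputs toward their assigned
    codebook entries at the two quantization levels (assumed MSE);
    lambda_rec = 0.002 is the value the authors report as working well."""
    l_commit_z = mse(z_enc, z_code)
    l_commit_q = mse(q_enc, q_code)
    l_rec = mse(x, x_rec)
    return l_commit_z + l_commit_q + lambda_rec * l_rec

# Toy usage with 2-d vectors.
loss = total_loss(z_enc=[1.0, 0.0], z_code=[0.0, 0.0],
                  q_enc=[0.5, 0.5], q_code=[0.5, 0.0],
                  x=[1.0, 1.0], x_rec=[0.0, 0.0])
print(loss)
```

With a small λ_rec, the reconstruction term contributes little to the total, which matches the ablation in Tab. 5 where λ_rec mainly balances quantization quality against reconstruction fidelity.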