Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Higher Order Transformers With Kronecker-Structured Attention

Authors: Soroush Omranpour, Guillaume Rabusseau, Reihaneh Rabbany

TMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments on 2D and 3D datasets show that HOT achieves competitive performance in multivariate time series forecasting and image classification, with significantly reduced computational and memory costs. Visualizations of mode-wise attention matrices further reveal interpretable high-order dependencies learned by HOT, demonstrating its versatility for complex multiway data across diverse domains. The implementation of our proposed method is publicly available at https://github.com/s-omranpour/HOT.
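The abstract's claim of reduced computational and memory cost comes from factorizing attention mode-wise rather than attending over the flattened token grid. As a rough NumPy sketch of that general idea (not the paper's exact formulation; all shapes, names, and the single-channel simplification are illustrative assumptions), a Kronecker-structured score matrix A1 ⊗ A2 can be applied via the identity vec(A2 · X · A1ᵀ) = (A1 ⊗ A2) · vec(X), without ever materializing the (n1·n2)² matrix:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n1, n2, d = 4, 3, 8  # two modes (e.g. time x variables) and a feature dim

# Mode-wise queries/keys (hypothetical; for illustration only).
Q1, K1 = rng.normal(size=(n1, d)), rng.normal(size=(n1, d))
Q2, K2 = rng.normal(size=(n2, d)), rng.normal(size=(n2, d))

A1 = softmax(Q1 @ K1.T / np.sqrt(d))  # n1 x n1 attention over mode 1
A2 = softmax(Q2 @ K2.T / np.sqrt(d))  # n2 x n2 attention over mode 2

X = rng.normal(size=(n2, n1))  # one feature channel on the n2 x n1 grid

# Efficient application of (A1 kron A2) to vec(X): two small matmuls,
# O(n1^2 + n2^2) score entries instead of (n1*n2)^2.
y_fast = (A2 @ X @ A1.T).reshape(-1, order="F")

# Reference: the explicit Kronecker product gives the same result.
y_full = np.kron(A1, A2) @ X.reshape(-1, order="F")
assert np.allclose(y_fast, y_full)
```

The `order="F"` reshapes implement column-stacking vec(), which is what the Kronecker identity assumes.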
Researcher Affiliation Academia Soroush Omranpour (Mila, McGill University); Reihaneh Rabbany (Mila, CIFAR AI Chair, McGill University); Guillaume Rabusseau (Mila, DIRO, CIFAR AI Chair, University of Montreal)
Pseudocode No The paper describes methods and formulas but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code Yes The implementation of our proposed method is publicly available at https://github.com/s-omranpour/HOT.
Open Datasets Yes Datasets We include four real-world datasets in our experiments, including ECL, Traffic, Weather used by Autoformer (Wu et al., 2021), and Solar-Energy proposed in LSTNet (Lai et al., 2017). Further dataset details are in the Appendix. (...) Dataset MedMNIST v2 (Yang et al., 2023) is a large-scale benchmark for medical image classification on standardized MNIST-like 2D and 3D images with diverse modalities, dataset scales, and tasks. (...) Dataset The SSL4EO-L Benchmark dataset (Stewart et al., 2023) is a collection of Landsat images paired with land cover classification masks.
Dataset Splits Yes We follow the data processing and train-validation-test split protocol used in TimesNet (Wu et al., 2023), ensuring datasets are chronologically split to prevent any data leakage. For forecasting tasks, we use a fixed lookback window of 96 time steps for the Weather, ECL, Solar-Energy, and Traffic datasets, with prediction lengths of 96, 192, 336, and 720. Further dataset details are presented in Table 4. (...) We follow the official split of training/validation/test sets. (...) The dataset includes 25,000 labeled images divided into training (20,000), validation (2,500), and test (2,500) splits, ensuring balanced data distribution.
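A leakage-free chronological split with the quoted fixed 96-step lookback might look like the following minimal sketch (the 70/10/20 ratios and the toy series are illustrative assumptions; the paper follows the TimesNet protocol):

```python
import numpy as np

def chronological_split(n_steps, train_frac=0.7, val_frac=0.1):
    # Contiguous, time-ordered segments: no future data leaks into training.
    i = int(n_steps * train_frac)
    j = int(n_steps * (train_frac + val_frac))
    return slice(0, i), slice(i, j), slice(j, n_steps)

def sliding_windows(series, lookback=96, horizon=96):
    # (input, target) pairs: 96-step lookback, one of the paper's horizons
    # (96 / 192 / 336 / 720).
    return [
        (series[t:t + lookback], series[t + lookback:t + lookback + horizon])
        for t in range(len(series) - lookback - horizon + 1)
    ]

series = np.arange(1000.0)                 # stand-in univariate series
tr, va, te = chronological_split(len(series))
train_pairs = sliding_windows(series[tr])  # windows stay inside the split
x0, y0 = train_pairs[0]
assert x0[-1] < y0[0]                      # targets lie strictly after inputs
```

Because windows are built per-segment, no training window ever overlaps the validation or test ranges.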
Hardware Specification Yes All the experiments are implemented in PyTorch and conducted on a single NVIDIA A100 GPU (80 GB) with an x86_64 CPU (6 cores) and 64 GB of RAM.
Software Dependencies No All the experiments are implemented in PyTorch and conducted on a single NVIDIA A100 GPU with 80 GB of memory, x86_64 CPU with 6 cores and 64 GB of RAM. We utilize ADAM (Kingma & Ba, 2017) with an initial learning rate of 2×10⁻⁴ and L2 loss for the time series forecasting task and cross-entropy loss for the medical image classification task. The paper mentions PyTorch and ADAM, but does not provide specific version numbers for them.
Experiment Setup Yes We utilize ADAM (Kingma & Ba, 2017) with an initial learning rate of 2×10⁻⁴ and L2 loss for the time series forecasting task and cross-entropy loss for the medical image classification task. Our experiments show that using weight decay is crucial for avoiding overfitting in most cases. The batch size is uniformly set to 32, and the number of training epochs is fixed to 100. We conduct hyperparameter tuning based on the search space shown in Table 6.
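The quoted optimizer settings (ADAM with initial learning rate 2×10⁻⁴, L2 loss) can be sketched with the standard Adam update of Kingma & Ba (2017) in plain NumPy; the toy quadratic objective, step count, and weight-decay value are assumptions for illustration, since the paper notes only that weight decay matters, not its magnitude:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=2e-4, b1=0.9, b2=0.999, eps=1e-8,
              weight_decay=0.0):
    # Classic Adam update; L2-style weight decay folds into the gradient.
    grad = grad + weight_decay * theta
    m = b1 * m + (1 - b1) * grad          # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - b1 ** t)             # bias correction
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy L2 (squared-error) objective: minimize ||theta - target||^2.
target = np.array([0.3, -0.2, 0.1])
theta = np.zeros(3)
m = np.zeros(3)
v = np.zeros(3)
for t in range(1, 5001):
    grad = 2.0 * (theta - target)
    theta, m, v = adam_step(theta, grad, m, v, t)
```

In practice one would use `torch.optim.Adam(model.parameters(), lr=2e-4, weight_decay=...)` with the paper's batch size of 32 and 100 epochs; the NumPy version just makes the update rule explicit.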