A Spectral Algorithm for Inference in Hidden semi-Markov Models

Authors: Igor Melnyk, Arindam Banerjee

JMLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical evaluations on synthetic and real data demonstrate the advantage of the algorithm over EM in terms of speed and accuracy, especially for large data sets. Keywords: Graphical models, hidden semi-Markov model, spectral algorithm, tensor analysis, aviation safety"
Researcher Affiliation | Collaboration | Igor Melnyk (EMAIL), IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA; Arindam Banerjee (EMAIL), Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55414, USA
Pseudocode | Yes | Algorithm 1: Basic Spectral Algorithm for HSMM Inference. Input: training sequences S_i = {o^i_1, ..., o^i_{T_i}}, i = 1, ..., N; testing sequence S_test = {o^test_1, ..., o^test_T}. Output: p(S_test)
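The quoted pseudocode evaluates the likelihood of a test sequence from learned spectral parameters. The core evaluation step in spectral methods of this family is an observable-operator product; the sketch below shows that step for a plain HMM for intuition only (the paper's algorithm handles HSMMs and estimates the operators via tensor methods). All names, shapes, and numbers here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from itertools import product

def spectral_likelihood(obs, b1, binf, B):
    """Compute p(o_1, ..., o_T) as binf^T B_{o_T} ... B_{o_1} b1."""
    state = b1
    for o in obs:
        state = B[o] @ state  # propagate one observation through its operator
    return float(binf @ state)

# Toy 2-state, 2-symbol HMM (illustrative numbers; columns sum to 1).
Tmat = np.array([[0.7, 0.2],
                 [0.3, 0.8]])   # Tmat[j, i] = p(x'=j | x=i)
Omat = np.array([[0.9, 0.4],
                 [0.1, 0.6]])   # Omat[o, i] = p(o | x=i)
pi = np.array([0.6, 0.4])       # initial state distribution
ones = np.ones(2)               # normalization vector
# Observable operator per symbol: B_o = T diag(O[o, :])
B = {o: Tmat @ np.diag(Omat[o]) for o in range(2)}

# Sanity check: probabilities of all length-3 sequences sum to 1.
total = sum(spectral_likelihood(s, pi, ones, B)
            for s in product(range(2), repeat=3))
```

In a spectral algorithm the operators are not built from known T and O as above, but recovered (up to a similarity transform) from empirical moments of observation triples; the likelihood evaluation, however, takes exactly this product form.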
Open Source Code | No | The paper makes no explicit statement about code availability and includes no link to a code repository. Mentions of algorithms such as EM refer to existing methods, not to the authors' implementation.
Open Datasets | Yes | NASA Flight data set. Available at https://c3.nasa.gov/dashlink/projects/85/.
Dataset Splits | Yes | For this, we defined two HSMMs, one with n_o = 3, n_x = 2, n_d = 2 and another with n_o = 5, n_x = 4, n_d = 6. For each model, we generated a set of N_train = {500, 1000, 5000, 10^4, 10^5} training and N_test = 1000 testing sequences, each of length T = 100.
Hardware Specification | No | The paper thanks the Minnesota Supercomputing Institute (MSI) for computing support, but does not specify any particular hardware components such as CPU or GPU models, or memory.
Software Dependencies | No | The paper mentions algorithms like "expectation maximization (EM)" and the "Baum-Welch algorithm" but does not specify any software libraries or packages with version numbers used for implementation.
Experiment Setup | Yes | For this, we defined two HSMMs, one with n_o = 3, n_x = 2, n_d = 2 and another with n_o = 5, n_x = 4, n_d = 6. For each model, we generated a set of N_train = {500, 1000, 5000, 10^4, 10^5} training and N_test = 1000 testing sequences, each of length T = 100.
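The quoted setup samples training and testing sequences from explicit-duration HSMMs with n_o observation symbols, n_x hidden states, and n_d duration values. A minimal generator in that spirit is sketched below; the random parameterization (uniform draws, no self-transitions, column-stochastic normalization) is an illustrative assumption, not the paper's protocol:

```python
import numpy as np

def sample_hsmm(n_seq, T, n_o, n_x, n_d, rng):
    """Sample n_seq observation sequences of length T from a random
    explicit-duration HSMM (illustrative parameter choices)."""
    A = rng.random((n_x, n_x))
    np.fill_diagonal(A, 0)                      # no self-transitions
    A /= A.sum(axis=0, keepdims=True)           # A[j, i] = p(x'=j | x=i)
    D = rng.random((n_d, n_x)); D /= D.sum(axis=0)  # D[d, i] = p(dur=d+1 | x=i)
    O = rng.random((n_o, n_x)); O /= O.sum(axis=0)  # O[o, i] = p(o | x=i)
    pi = rng.random(n_x); pi /= pi.sum()

    seqs = []
    for _ in range(n_seq):
        obs, x = [], rng.choice(n_x, p=pi)
        while len(obs) < T:
            d = 1 + rng.choice(n_d, p=D[:, x])  # draw a segment duration
            for _ in range(d):
                if len(obs) == T:
                    break
                obs.append(int(rng.choice(n_o, p=O[:, x])))
            x = rng.choice(n_x, p=A[:, x])      # switch state after the segment
        seqs.append(obs)
    return seqs

# e.g. the smaller model from the quote: n_o = 3, n_x = 2, n_d = 2
train = sample_hsmm(500, 100, n_o=3, n_x=2, n_d=2,
                    rng=np.random.default_rng(0))
```

Scaling n_seq over {500, 1000, 5000, 10^4, 10^5} reproduces the training-set sizes mentioned in the quote; a held-out call with n_seq = 1000 would give the test set.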