reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Frontiers to the learning of nonparametric hidden Markov models

Authors: Kweku Abraham, Elisabeth Gassiat, Zacharie Naulet

JMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	This is the ﬁrst work conducting a precise nonasymptotic, nonparametric analysis of the minimax risk taking into account all aspects of the hardness of the problem, in the case of two populations. Our analysis reveals an unexpected interplay between the distance to the i.i.d. boundary and the relative smoothnesses of the two populations: a surprising and intriguing transition occurs in the rate when the two densities have diﬀering smoothnesses. We obtain upper and lower bounds revealing that, close to the i.i.d. boundary, it is possible to borrow strength from the estimator of the smoother density to improve the risk of the other. Keywords: Hidden Markov Models, Mixture Models, Inverse Problems, Nonparametric Estimation, Minimax
Researcher Affiliation	Academia	Kweku Abraham EMAIL University of Cambridge, Statistical Laboratory Wilberforce Road Cambridge CB3 0WB, UK Elisabeth Gassiat EMAIL Universit e Paris-Saclay, CNRS Laboratoire de math ematiques d Orsay 91405, Orsay, France Zacharie Naulet EMAIL Universit e Paris-Saclay, INRAE Ma IAGE 78350, Jouy-en-Josas, France
Pseudocode	Yes	Algorithm 1 Estimation procedure Require: An observed chain (Y1, . . . , Yn). Ensure: Estimators ˆp, ˆq, ˆf0 and ˆf1. 1: Estimation of a good separating function ψ2 (see Section 3.3) 2: Estimation of (φ1, φ2) and then (p, q) (see Section 3.4) 3: Estimation of (f0, f1) using block thresholding with estimators of the wavelet coeﬃcients (see Section 3.5 for the case s0 = s1, or Section 3.6 otherwise).
Open Source Code	No	Our algorithm is thus simple and computationally eﬃcient, avoiding any non-convex optimization step. It thus provides a promising alternative to existing methods. Further practical implementation may require additional work on tuning the hyperparameters, which is beyond the scope of this paper and a consideration for future research.
Open Datasets	No	The paper is theoretical, focusing on the mathematical analysis of minimax risk in Hidden Markov Models. It does not describe experiments using any specific, named datasets, nor does it provide links or citations for data access.
Dataset Splits	No	The paper is theoretical and focuses on minimax rates and estimation procedures for nonparametric HMMs. It does not involve empirical experiments with specific datasets, and therefore no mention of dataset splits is present.
Hardware Specification	No	The paper focuses on theoretical analysis, presenting minimax rates and estimation algorithms. It discusses computational complexity in general terms (e.g., O(n log(n))) but does not describe any specific hardware used for running experiments.
Software Dependencies	No	The paper is theoretical, describing algorithms (Algorithm 1, 2, 3) and mathematical methods like wavelet analysis. While it references theoretical work on wavelets (Cohen et al., 1993) and other statistical methods, it does not specify any software libraries or tools with version numbers required for implementation or reproduction.
Experiment Setup	No	The paper is theoretical, focused on developing and analyzing minimax estimation rates for Hidden Markov Models. It describes estimation procedures and algorithms but does not include an experimental setup with hyperparameters, training configurations, or system-level settings for empirical evaluation.