reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Dimension-free Concentration Bounds on Hankel Matrices for Spectral Learning

Authors: François Denis, Mattias Gybels, Amaury Habrard

JMLR 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The theoretical bounds described in the previous Sections have been evaluated on the benchmark of PAutomaC (Verwer et al., 2012). These experiments show that the bounds derived from our theoretical results for bounded random matrices are quite tight compared to the exact values and that they significantly improve existing bounds, even on matrices of fixed dimensions. (Section 1 - Introduction)
Researcher Affiliation	Academia	François Denis EMAIL Mattias Gybels EMAIL Aix Marseille Université, CNRS LIF UMR 7279 13288 Marseille Cedex 9, FRANCE Amaury Habrard EMAIL Université de Lyon, UJM-Saint-Etienne, CNRS, UMR 5516 Laboratoire Hubert Curien F-42023 Saint-Etienne, FRANCE
Pseudocode	No	The paper describes the 'Spectral Algorithm for Learning Rational Stochastic Languages' in Section 2.3, listing steps verbally (e.g., 'choose sets $U, V \subset \Sigma^*$ and build the Hankel matrix $H_S^{U,V}$', 'compute a SVD of $H_S^{U,V}$') rather than presenting them in a structured pseudocode block or algorithm environment.
Open Source Code	No	The paper does not contain an explicit statement about releasing source code for the methodology described, nor does it provide any links to a code repository. It only mentions the use of 'standard SVD algorithms available from NumPy or SciPy', referring to third-party tools.
Open Datasets	Yes	The theoretical bounds described in the previous Sections have been evaluated on the benchmark of PAutomaC (Verwer et al., 2012). The benchmark of PAutomaC provides samples of strings generated from probabilistic automata and designed to evaluate probabilistic automata learning.
Dataset Splits	No	Each problem comprises a sample S of strings independently drawn from the target. We provide the cardinal of S, the maximal length and the average length of strings in S. The paper uses a single sample (S) for evaluation and does not specify any training/test/validation splits for the datasets.
Hardware Specification	No	The paper mentions that 'the sparsity of the Hankel matrices makes the use of standard SVD algorithms available from NumPy or SciPy possible,' and refers to 'our computing resources,' but it does not specify any particular hardware (e.g., GPU/CPU models, memory, or cloud instances) used for the experiments.
Software Dependencies	No	The paper mentions the use of 'NumPy or SciPy' for SVD algorithms, but it does not provide specific version numbers for these or any other software dependencies.
Experiment Setup	Yes	For each problem, the exact value of $\|H_S^{U,V} - H_p^{U,V}\|_2$ is computed for sets U and V of the form $\Sigma^{\le l}$, where we have maximized l according to our computing resources. This value is compared to the bounds provided by Theorem 7 and Equation (3), with δ = 0.05 (Table 2).
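
The quantity in the Experiment Setup row can be illustrated with a small sketch. This is not the authors' code, which the paper does not release. It assumes that the empirical Hankel matrix has entries equal to the relative frequency of the string uv in the sample S, that U = V is the set of all strings of length at most l, and that the target distribution is known exactly. The toy distribution and all function names are hypothetical, and a dense NumPy norm is used for brevity instead of the sparse SVD routines mentioned in the paper.

```python
import random
from collections import Counter
from itertools import product

import numpy as np


def words_up_to(alphabet, max_len):
    """All strings over `alphabet` of length <= max_len, shortest first."""
    return ["".join(t)
            for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)]


def empirical_hankel(sample, prefixes, suffixes):
    """Entry (u, v) is the relative frequency of the string u + v in the sample."""
    counts = Counter(sample)
    n = len(sample)
    return np.array([[counts[u + v] / n for v in suffixes] for u in prefixes])


def exact_hankel(p, prefixes, suffixes):
    """Entry (u, v) is the true probability p(u + v)."""
    return np.array([[p(u + v) for v in suffixes] for u in prefixes])


def toy_p(w):
    # Hypothetical target over {a, b}: at each step, stop, emit 'a',
    # or emit 'b', each with probability 1/3.
    return (1.0 / 3.0) ** (len(w) + 1)


def draw_toy_string(rng):
    """Draw one string from the toy target above."""
    w = []
    while True:
        r = rng.randrange(3)
        if r == 0:
            return "".join(w)
        w.append("a" if r == 1 else "b")


def spectral_deviation(sample, p, alphabet, max_len):
    """Spectral norm of (empirical Hankel - exact Hankel) on strings of length <= max_len."""
    basis = words_up_to(alphabet, max_len)
    H_S = empirical_hankel(sample, basis, basis)
    H_p = exact_hankel(p, basis, basis)
    # ord=2 on a 2-D array returns the largest singular value.
    return np.linalg.norm(H_S - H_p, 2)


rng = random.Random(0)
sample = [draw_toy_string(rng) for _ in range(5000)]
print(spectral_deviation(sample, toy_p, "ab", 2))
```

In the paper's experiments this exact deviation is the value compared against the theoretical bounds. The sketch only computes the deviation and does not implement Theorem 7 or Equation (3).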