Sample Complexity of Branch-length Estimation by Maximum Likelihood

Authors: David Clancy Jr., Hanbaek Lyu, Sebastien Roch

ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical In this work, we provide the first theoretical guarantee as to why this might be the case. We show that deep inside the Kesten-Stigum reconstruction regime, provided with polynomially many m samples (assuming the tree is balanced), there exists a universal parameter regime (independent of the size of the tree) where the log-likelihood function is strongly concave and smooth with high probability. On this high-probability likelihood landscape event, we show that the standard coordinate maximization algorithm converges exponentially fast to the maximum likelihood estimator, which is within O(1/ m) from the true parameter, provided a sufficiently close initial point.
Researcher Affiliation Academia 1Department of Mathematics, University of Wisconsin-Madison, WI, USA. Correspondence to: David Clancy, Jr. <EMAIL>, Hanbaek Lyu <EMAIL>, Sebastien Roch <EMAIL>.
Pseudocode Yes See Alg. 1 in Appendix A for a detailed implementation of the algorithm.
Open Source Code No No explicit statement about code availability or a link to a code repository is provided in the paper.
Open Datasets No The paper describes generating synthetic data for a theoretical problem: 'm samples σ(1), . . . , σ(m) from the process as described above sampled from the true model Pθ'. It does not refer to any external or publicly available datasets.
Dataset Splits No The paper is theoretical and focuses on mathematical proofs and algorithm convergence. It describes a simulated process ('m independent samples σ(1), . . . , σ(m) from the process as described above') rather than using pre-existing datasets that would require explicit training/test/validation splits.
Hardware Specification No The paper presents theoretical analysis and mathematical proofs; it does not describe any empirical experiments or the hardware used to run them.
Software Dependencies No The paper is theoretical and does not mention any specific software dependencies with version numbers required to replicate its findings or algorithms.
Experiment Setup No The paper is theoretical, providing mathematical guarantees for maximum likelihood estimation and algorithm convergence. It does not describe an experimental setup, hyperparameters, or training configurations.