Uniform Mean Estimation for Heavy-Tailed Distributions via Median-of-Means

Authors: Mikael Møller Høgsgaard, Andrea Paudice

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this work, we analyze its performance in the task of simultaneously estimating the mean of each function in a class $F$ when the data distribution possesses only the first $p$ moments for $p \in (1, 2]$. We prove a new sample complexity bound using a novel symmetrization technique that may be of independent interest. Additionally, we present applications of our result to k-means clustering with unbounded inputs and linear regression with general losses, improving upon existing works.
Researcher Affiliation | Academia | Department of Computer Science, Aarhus University, Denmark. Correspondence to: Mikael Møller Høgsgaard <EMAIL>, Andrea Paudice <EMAIL>.
Pseudocode | Yes | Algorithm 1: Median of Means (MoM) Estimator. Input: sample $\mathbf{X} = (X_1, \dots, X_\kappa)$, $X_i \in \mathcal{X}^m$, $m, \kappa \in \mathbb{N}$, function $f : \mathcal{X} \to \mathbb{R}$. Return: $\mathrm{MOM}(f, \mathbf{X}) = \mathrm{MEDIAN}(\mu_{f,X_1}, \dots, \mu_{f,X_\kappa})$.
Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the methodology is openly available.
Open Datasets | No | The paper discusses theoretical applications to 'k-means clustering over unbounded spaces' and 'linear regression with general losses' but does not specify or provide access information for any particular datasets used for experimental purposes.
Dataset Splits | No | The paper is theoretical and does not describe any experimental evaluation using specific datasets; therefore, no dataset splits are mentioned.
Hardware Specification | No | The paper presents theoretical analysis and does not involve experimental work requiring specific hardware. Therefore, no hardware specifications are provided.
Software Dependencies | No | The paper focuses on theoretical contributions and does not describe any implementation details or experiments that would require specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical, presenting mathematical proofs and applications of a novel symmetrization technique, and therefore does not include details on experimental setup, hyperparameters, or training configurations.
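
The Median-of-Means estimator quoted in the Pseudocode row can be illustrated with a short Python sketch. This is not code from the paper (which releases none); it is a minimal illustration under stated assumptions. The function name `median_of_means`, the flat-list input, and the choice to split the sample into κ contiguous blocks of size m = n // κ (dropping any leftover points) are all assumptions made here for illustration; Algorithm 1 itself takes the κ blocks as already given.

```python
import statistics
from typing import Callable, Sequence


def median_of_means(
    f: Callable[[float], float],
    sample: Sequence[float],
    kappa: int,
) -> float:
    """Median of the per-block empirical means of f.

    Assumption (not from the paper): the flat sample is cut into
    `kappa` contiguous blocks of equal size m = len(sample) // kappa,
    and any leftover points at the end are ignored.
    """
    n = len(sample)
    if kappa < 1 or kappa > n:
        raise ValueError("kappa must satisfy 1 <= kappa <= len(sample)")
    m = n // kappa  # block size

    # Empirical mean of f on each block (the mu_{f, X_i} of Algorithm 1).
    block_means = [
        statistics.fmean([f(x) for x in sample[i * m:(i + 1) * m]])
        for i in range(kappa)
    ]
    # statistics.median averages the two middle values when kappa is even.
    return statistics.median(block_means)


# One extreme observation dominates the plain empirical mean
# but affects only a single block mean.
data = [1, 2, 3, 4, 5, 1000]
print(median_of_means(lambda x: x, data, kappa=3))  # 3.5
print(statistics.fmean(data))
```

In this toy example the blocks are [1, 2], [3, 4] and [5, 1000], with means 1.5, 3.5 and 502.5, so the median of the block means is 3.5, while the plain average is pulled far upward by the single large value. The paper's guarantees concern uniform estimation over a function class under heavy tails; the sketch only shows the estimator for one function f.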