Uniform Mean Estimation for Heavy-Tailed Distributions via Median-of-Means
Authors: Mikael Møller Høgsgaard, Andrea Paudice
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we analyze its performance in the task of simultaneously estimating the mean of each function in a class F when the data distribution possesses only the first p moments for p ∈ (1, 2]. We prove a new sample complexity bound using a novel symmetrization technique that may be of independent interest. Additionally, we present applications of our result to k-means clustering with unbounded inputs and linear regression with general losses, improving upon existing works. |
| Researcher Affiliation | Academia | Department of Computer Science, Aarhus University, Denmark. Correspondence to: Mikael Møller Høgsgaard <EMAIL>, Andrea Paudice <EMAIL>. |
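| Pseudocode | Yes | Algorithm 1: Median-of-Means (MoM) Estimator. Input: sample $\mathbf{X} = (X_1, \dots, X_\kappa)$ with $X_i \in \mathcal{X}^m$, $m, \kappa \in \mathbb{N}$, and a function $f : \mathcal{X} \to \mathbb{R}$. Return: $\mathrm{MOM}(f, \mathbf{X}) = \mathrm{MEDIAN}(\mu_{f,X_1}, \dots, \mu_{f,X_\kappa})$. |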
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the methodology is openly available. |
| Open Datasets | No | The paper discusses theoretical applications to 'k-means clustering over unbounded spaces' and 'linear regression with general losses' but does not specify or provide access information for any particular datasets used for experimental purposes. |
| Dataset Splits | No | The paper is theoretical and does not describe any experimental evaluation using specific datasets; therefore, no dataset splits are mentioned. |
| Hardware Specification | No | The paper presents theoretical analysis and does not involve experimental work requiring specific hardware. Therefore, no hardware specifications are provided. |
| Software Dependencies | No | The paper focuses on theoretical contributions and does not describe any implementation details or experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical, presenting mathematical proofs and applications of a novel symmetrization technique, and therefore does not include details on experimental setup, hyperparameters, or training configurations. |
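
The Median-of-Means estimator quoted in the Pseudocode row can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the paper, which releases none. The function name `median_of_means`, the flat-list input, and the choice to split the sample into κ consecutive blocks of size m = n // κ (dropping any leftover points) are assumptions made here for the example. The algorithm as stated takes the κ blocks as given.

```python
import statistics
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def median_of_means(f: Callable[[T], float], sample: Sequence[T], kappa: int) -> float:
    """Median of the per-block empirical means of f (hypothetical helper).

    Assumption: the sample arrives as one flat sequence and is cut into
    `kappa` consecutive blocks of equal size m = len(sample) // kappa;
    any leftover points at the end are ignored.
    """
    if kappa < 1:
        raise ValueError("kappa must be a positive integer")
    m = len(sample) // kappa  # block size
    if m < 1:
        raise ValueError("need at least kappa sample points")

    block_means = []
    for j in range(kappa):
        block = sample[j * m:(j + 1) * m]
        # Empirical mean of f on block j (mu_{f, X_j} in the table's notation).
        block_means.append(sum(f(x) for x in block) / m)

    # For even kappa, statistics.median averages the two middle block means.
    return statistics.median(block_means)


# A single large outlier shifts only one block mean, so the median is unaffected.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]
estimate = median_of_means(lambda x: x, data, kappa=3)
```

In this toy input the block means are 1.5, 3.5, and 502.5, so the estimate is 3.5, whereas the plain empirical mean would be pulled to roughly 169 by the outlier.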