Variance-Aware Estimation of Kernel Mean Embedding

Authors: Geoffrey Wolfer, Pierre Alquier

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Section 6.2, we put our methods into practice, first in the context of hypothesis testing, and second by improving the results of Briol et al. (2019) and Chérief-Abdellatif and Alquier (2022) in the context of robust parametric maximum mean discrepancy estimation. ... Figure 1: Comparison of the test based on the Bernstein empirical (EmpBer) bound, versus the test based on the McDiarmid bound (McDia), and the test based on the Monte-Carlo estimation of the quantile q_{1-α}. Frequency of rejection of H_0 : P ∈ {N((1, 1), I_2)} as a function of σ with P = N(0, σ^2 I_2).
Researcher Affiliation | Academia | Geoffrey Wolfer: Center for Data Science, Waseda University, 1-6-1 Nishiwaseda, Shinjuku-ku, Tokyo 169-8050, Japan. Pierre Alquier: ESSEC Business School, Asia-Pacific campus, 5 Nepal Park, 575749 Singapore.
Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. The methods are described through mathematical formulations, theorems, and proofs.
Open Source Code | No | The paper includes a license statement: 'License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v26/23-0161.html.' This refers to the paper's license, not the availability of source code for the methodology. No other concrete statement or link regarding code release is present.
Open Datasets | No | The experiments in Section 6 use synthetic data, such as 'P = N(0, σ^2 I_2) with P_θ = N(θ, I_2)' for simulations, rather than established public datasets with access information.
Dataset Splits | No | The paper describes simulation-based experiments using generated data (e.g., Gaussian distributions) and does not specify any training, testing, or validation splits for a dataset.
Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies or version numbers needed to replicate the experiments.
Experiment Setup | Yes | The kernel used is a Gaussian kernel with γ = 1, and we consider sample sizes n ∈ {16, 40, 100, 250} (Section 6.1.1). For comparisons, experiments are run for 'fixed sample size (n = 10000), fixed confidence level (δ = 0.1), fixed variance parameter (σ = 3) and two different contamination levels (ξ = 0.01 and ξ = 0.2)' (Section 6.2.2).
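
Since the paper releases no code, the setup quoted in the Experiment Setup row can only be approximated. The following is a minimal Python sketch, not the authors' implementation. It assumes the Gaussian kernel is parametrized as k(x, y) = exp(-||x - y||^2 / γ^2), that contamination means drawing each point from N(0, σ^2 I_2) with probability ξ and from N(θ, I_2) otherwise, and that a biased (V-statistic) estimate of the squared MMD is an acceptable stand-in for the quantities studied in the paper. The function names gaussian_kernel, mmd2_biased, and sample_contaminated are hypothetical.

```python
import numpy as np


def gaussian_kernel(x, y, gamma=1.0):
    """Gram matrix of k(x, y) = exp(-||x - y||^2 / gamma^2).

    The parametrization in terms of gamma is an assumption; the summary
    above only states that gamma = 1.
    """
    sq = (np.sum(x**2, axis=1)[:, None]
          + np.sum(y**2, axis=1)[None, :]
          - 2.0 * x @ y.T)
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq, 0.0) / gamma**2)


def mmd2_biased(x, y, gamma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (gaussian_kernel(x, x, gamma).mean()
            + gaussian_kernel(y, y, gamma).mean()
            - 2.0 * gaussian_kernel(x, y, gamma).mean())


def sample_contaminated(n, theta, sigma, xi, rng):
    """Draw n points from (1 - xi) * N(theta, I_2) + xi * N(0, sigma^2 I_2).

    This mixture form of the contamination is an assumption.
    """
    clean = rng.normal(size=(n, 2)) + np.asarray(theta, dtype=float)
    noise = sigma * rng.normal(size=(n, 2))
    is_outlier = rng.random(n) < xi
    return np.where(is_outlier[:, None], noise, clean)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sigma = 3.0
    # Sample sizes quoted from Section 6.1.1 of the paper.
    for n in (16, 40, 100, 250):
        x = sigma * rng.normal(size=(n, 2))          # sample from N(0, sigma^2 I_2)
        ref = rng.normal(size=(n, 2)) + 1.0          # sample from N((1, 1), I_2)
        print(n, mmd2_biased(x, ref, gamma=1.0))
```

The Gram matrices are dense, so memory grows quadratically in n. For the n = 10000 setting quoted from Section 6.2.2, a blocked or subsampled computation would likely be needed.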