Variance-Aware Estimation of Kernel Mean Embedding
Authors: Geoffrey Wolfer, Pierre Alquier
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 6.2, we put our methods into practice, first in the context of hypothesis testing, and second by improving the results of Briol et al. (2019) and Chérief-Abdellatif and Alquier (2022) in the context of robust parametric maximum mean discrepancy estimation. ... Figure 1: Comparison of the test based on the Bernstein empirical (EmpBer) bound, versus the test based on McDiarmid bound (McDia), and the test based on the Monte-Carlo estimation of the quantile q_{1−α}. Frequency of rejection of H₀ : P ∈ {N((1, 1), I₂)} as a function of σ with P = N(0, σ²I₂). |
| Researcher Affiliation | Academia | Geoffrey Wolfer (EMAIL), Center for Data Science, Waseda University, 1-6-1 Nishiwaseda, Shinjuku-ku, Tokyo 169-8050, Japan; Pierre Alquier (EMAIL), ESSEC Business School, Asia-Pacific campus, 5 Nepal Park, 575749 Singapore |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. The methods are described through mathematical formulations, theorems, and proofs. |
| Open Source Code | No | The paper includes a license statement: 'License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v26/23-0161.html.' This refers to the paper's license, not the availability of source code for the methodology. No other concrete statement or link regarding code release is present. |
| Open Datasets | No | The experiments in Section 6 describe using synthetic data, such as 'P = N(0, σ²I₂) with P_θ = N(θ, I₂)' for simulations, rather than referring to established public datasets with access information. |
| Dataset Splits | No | The paper describes simulation-based experiments using generated data (e.g., Gaussian distributions) and does not specify any training, testing, or validation splits for a dataset. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or version numbers needed to replicate the experiments. |
| Experiment Setup | Yes | The kernel used is a Gaussian kernel with γ = 1, and we consider sample sizes n ∈ {16, 40, 100, 250} (Section 6.1.1). For comparisons, experiments are run for 'fixed sample size (n = 10000), fixed confidence level (δ = 0.1), fixed variance parameter (σ = 3) and two different contamination levels (ξ = 0.01 and ξ = 0.2)' (Section 6.2.2). |
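
Since the paper ships neither code nor pseudocode, the following is a minimal sketch of the kind of synthetic setup the table describes, not the authors' implementation. It draws two-dimensional Gaussian samples and computes a plain plug-in (V-statistic) estimate of the squared maximum mean discrepancy with a Gaussian kernel and γ = 1. The kernel parametrization `exp(-||x - y||² / γ²)`, the two-sample form of the estimate, the seed, and all function names are assumptions made for illustration; the Bernstein and McDiarmid bounds compared in the paper are not reproduced here.

```python
import math
import random


def gaussian_kernel(x, y, gamma=1.0):
    # Assumed parametrization: k(x, y) = exp(-||x - y||^2 / gamma^2).
    # The paper may scale the bandwidth differently.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / gamma ** 2)


def mean_kernel(xs, ys, gamma=1.0):
    # Average of k(x, y) over all pairs, i.e. the inner product of the two
    # empirical kernel mean embeddings.
    total = sum(gaussian_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, gamma=1.0):
    # Biased (V-statistic) estimate of the squared MMD between two samples.
    return (
        mean_kernel(xs, xs, gamma)
        + mean_kernel(ys, ys, gamma)
        - 2.0 * mean_kernel(xs, ys, gamma)
    )


def sample_gaussian(rng, n, mean, std):
    # n i.i.d. draws from N(mean, std^2 * I_2).
    return [(rng.gauss(mean[0], std), rng.gauss(mean[1], std)) for _ in range(n)]


rng = random.Random(0)  # hypothetical seed, chosen only for repeatability
n = 100                 # one of the sample sizes listed in the table
sigma = 3.0             # variance parameter quoted from Section 6.2.2

observed = sample_gaussian(rng, n, (0.0, 0.0), sigma)   # P = N(0, sigma^2 I_2)
reference = sample_gaussian(rng, n, (1.0, 1.0), 1.0)    # N((1, 1), I_2)

estimate = mmd_squared(observed, reference, gamma=1.0)
print(f"squared MMD estimate: {estimate:.4f}")
```

The pure-Python double loop costs O(n²) kernel evaluations, which is fine for n up to a few hundred; for the n = 10000 setting quoted above, a vectorized implementation would be the more practical choice.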