Learning Parametric Distributions from Samples and Preferences

Authors: Marc Jourdan, Gizem Yüce, Nicolas Flammarion

ICML 2025

Reproducibility assessment (Variable, Result, LLM Response):
Research Type: Experimental. "6. Experiments. In this section, we compare the empirical performance of the different estimators introduced in this paper. For preferences based on r_θ = log p_θ, we conduct a set of experiments for Gaussian distributions, and defer to Appendix H.1 for experiments on Laplace and Rayleigh distributions. In particular, we consider a uniformly drawn mean parameter θ ∼ U([1, 2]^d) and the isotropic covariance Σ = I_d. For sample size n ∈ [N_max] with N_max = 10^4, we compute the estimation errors ∥θ̂_n − θ∥_2. We repeat this process for N_runs different instances and for various choices of d."
Researcher Affiliation: Academia. "School of Computer and Communication Sciences, EPFL, Lausanne, Vaud, Switzerland. Correspondence to: Marc Jourdan <EMAIL>."
Pseudocode: No. The paper describes methods and mathematical formulations (e.g., in Section 3, "Preference-based M-estimator", and Section 4, "Beyond M-estimators") but contains no explicit pseudocode or algorithm blocks, nor structured steps formatted like code.
Open Source Code: Yes. "Reproducibility. Code for reproducing our empirical results is available at https://github.com/tml-epfl/learning-parametric-distributions-from-samples-and-preferences."
Open Datasets: No. The paper conducts experiments on data generated from Gaussian distributions (Section 6) and Laplace and Rayleigh distributions (Appendix H.1), specifying parameters such as a uniformly drawn mean parameter θ ∼ U([1, 2]^d) and the isotropic covariance Σ = I_d. It does not reference or provide access information for any publicly available or open datasets.
Dataset Splits: No. The paper describes generating synthetic data for experiments (e.g., "For sample size n ∈ [N_max] with N_max = 10^4, we compute the estimation errors ∥θ̂_n − θ∥_2"). It mentions sample sizes but does not specify any training, validation, or test splits, since data are generated from parametric distributions rather than drawn from pre-defined datasets.
Hardware Specification: Yes. "Our experiments are conducted on 12 Intel(R) Core(TM) Ultra 7 165U 4.9GHz CPU."
Software Dependencies: No. The paper states: "Our code is implemented in Julia (Bezanson et al., 2017), version 1.11.5." While the Julia version is given, the other mentioned packages and solvers (StatsPlots, JuMP, Ipopt, and HiGHS) are listed without explicit version numbers, so specific versions are missing for multiple key components.
Experiment Setup: Yes. The paper details its experimental setup, including the parameters for synthetic data generation: "we consider a uniformly drawn mean parameter θ ∼ U([1, 2]^d) and the isotropic covariance Σ = I_d." It also specifies the sample size and number of runs: "For sample size n ∈ [N_max] with N_max = 10^4, we compute the estimation errors ∥θ̂_n − θ∥_2. We repeat this process for N_runs different instances and for various choices of d." For Figure 2, n = 10^4 and N_runs = 10^3.
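The synthetic setup described above (θ ∼ U([1, 2]^d), isotropic Gaussian samples, estimation error ∥θ̂_n − θ∥_2 averaged over N_runs instances) can be sketched in a few lines. This is a minimal illustration using only the sample-only Gaussian MLE (the sample mean) as the estimator; it does not reproduce the paper's preference-based estimators, and the function names (`estimation_error`, `run_experiment`) are hypothetical, not from the authors' Julia code.

```python
import numpy as np

def estimation_error(d: int, n: int, rng: np.random.Generator) -> float:
    """One instance of the synthetic setup: draw theta ~ U([1, 2]^d),
    generate n samples from N(theta, I_d), estimate theta by the sample
    mean (the Gaussian MLE from samples alone), and return the L2 error."""
    theta = rng.uniform(1.0, 2.0, size=d)                     # uniformly drawn mean
    samples = rng.normal(loc=theta, scale=1.0, size=(n, d))   # isotropic covariance I_d
    theta_hat = samples.mean(axis=0)                          # sample-only MLE
    return float(np.linalg.norm(theta_hat - theta))

def run_experiment(d: int = 2, n_max: int = 10_000,
                   n_runs: int = 100, seed: int = 0) -> float:
    """Average estimation error at sample size n_max over n_runs instances."""
    rng = np.random.default_rng(seed)
    return float(np.mean([estimation_error(d, n_max, rng) for _ in range(n_runs)]))
```

At n = 10^4 the sample-mean error concentrates around √(d/n), so the averaged error for small d is on the order of 10^-2; the paper's point of comparison is how much further the preference-based estimators reduce this error.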