Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Counterfactual Mean Embeddings

Authors: Krikamol Muandet, Motonobu Kanagawa, Sorawit Saengkyongam, Sanparith Marukatat

JMLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Our experimental results on synthetic data and off-policy evaluation tasks demonstrate the advantages of the proposed estimator.
Researcher Affiliation Academia Krikamol Muandet (EMAIL), Max Planck Institute for Intelligent Systems, Tübingen, Germany; Motonobu Kanagawa (EMAIL), Data Science Department, EURECOM, Sophia Antipolis, France; Sorawit Saengkyongam (EMAIL), University of Copenhagen, Copenhagen, Denmark; Sanparith Marukatat (EMAIL), National Electronics and Computer Technology Center, National Science and Technology Development Agency, Pathumthani, Thailand
Pseudocode Yes Algorithm 1: Sampling from a counterfactual mean embedding estimate; Algorithm 2: Off-Policy Evaluation using the CME estimator (18)
Open Source Code Yes The codes to reproduce the experiments are available at https://github.com/sorawitj/counterfactual-mean-embedding.
Open Datasets Yes For our real data experiment, we use the data from the Microsoft Learning to Rank Challenge dataset (MSLR-WEB30K) (Qin and Liu, 2013) and treat them as an off-policy evaluation problem.
Dataset Splits Yes We set the kernel ℓ on the outcome space as the Gaussian kernel ℓ(y, y′) = exp(−‖y − y′‖²₂ / (2σ²_Y)), whose bandwidth parameter σ_Y is chosen by the median heuristic using (y_i)_{i=1}^n. We also set the kernel k on the covariate space as the Gaussian kernel k(x, x′) = exp(−‖x − x′‖²₂ / (2σ²_X)), whose parameter σ_X, as well as the regularization constant ε in the CME estimator, are chosen by 5-fold cross-validation from σ_X ∈ {0.01, 0.1, 1, 10} and ε ∈ {0.01, 0.1, 1, 10}.
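The kernel setup quoted above (a Gaussian kernel whose bandwidth is set by the median heuristic over the sample) can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' code; the function names and the toy data are assumptions.

```python
import numpy as np

def median_heuristic_bandwidth(Y):
    """Median heuristic: set the bandwidth to the median pairwise distance."""
    diffs = Y[:, None, :] - Y[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Median over strictly positive distances (exclude the zero diagonal).
    return np.median(dists[dists > 0])

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
Y = rng.normal(size=(100, 1))          # toy outcomes (y_i), 1-dimensional here
sigma_Y = median_heuristic_bandwidth(Y)
L = gaussian_kernel(Y, Y, sigma_Y)     # outcome-space Gram matrix
```

In the paper, σ_X and the regularization constant ε are instead tuned by 5-fold cross-validation over the small grids quoted above, with only σ_Y set by the median heuristic.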
Hardware Specification No The paper does not explicitly mention specific hardware details such as CPU/GPU models, memory, or cloud computing resources used for experiments.
Software Dependencies No The paper does not explicitly list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, scikit-learn 0.x).
Experiment Setup Yes Throughout the experiment, we set β = [0.1, 0.2, 0.3, 0.4, 0.5]⊤, α = [0.05, 0.04, 0.03, 0.02, 0.01]⊤, α₀ = 0.05, and σ²_ε = σ²_x = 0.1. We set b = 0 for Scenario I and b = 2 for Scenario II. For Scenario III, we set b = 2z − 1, where z ∈ {0, 1} is an independent Bernoulli random variable z ∼ Bernoulli(0.5) generated for every observation. We perform 5-fold CV over parameter grids, i.e., the number of hidden units n_h ∈ {50, 100, 150, 200} for the Direct and DR estimators, and the regularization parameter ε ∈ {10⁻⁸, …, 10⁰} for our CME.
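The CME estimator at the heart of these experiments is a kernel-ridge-style re-weighting: weights β = (K + nεI)⁻¹ K̃ 1_m/m transfer observed outcomes from the logged covariates to the target covariate distribution. The sketch below is a minimal illustration of that form under assumed toy data; it is not the authors' implementation (their code is in the linked repository), and the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def cme_weights(X_src, X_tgt, sigma, eps):
    """Ridge weights beta = (K + n*eps*I)^{-1} K_cross 1_m / m (illustrative form)."""
    n = X_src.shape[0]
    K = gaussian_kernel(X_src, X_src, sigma)
    K_cross = gaussian_kernel(X_src, X_tgt, sigma)   # n x m cross-Gram matrix
    return np.linalg.solve(K + n * eps * np.eye(n), K_cross.mean(axis=1))

rng = np.random.default_rng(1)
X_src = rng.normal(size=(200, 3))                    # covariates under the logged policy
y_src = X_src.sum(axis=1) + 0.1 * rng.normal(size=200)
X_tgt = rng.normal(size=(150, 3))                    # covariates under the target policy
beta = cme_weights(X_src, X_tgt, sigma=1.0, eps=0.01)
value_estimate = beta @ y_src                        # plug-in estimate of the mean outcome
```

In the paper's off-policy evaluation experiments, ε is chosen from the grid {10⁻⁸, …, 10⁰} by 5-fold cross-validation rather than fixed as above.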