Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Quantifying Treatment Effects: Estimating Risk Ratios via Observational Studies
Authors: Ahmed Boughdiri, Julie Josse, Erwan Scornet
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through analyses on simulated and real-world datasets, we evaluate the performance of these estimators in terms of bias, efficiency, and robustness to generative data models. We also examine the coverage and length of the associated confidence intervals. ... In Section 4, we evaluate all estimators on observational data, and study the empirical properties of confidence intervals in terms of coverage and lengths. In Section 5, we extend our analysis to a semi-synthetic and a real-world dataset. |
| Researcher Affiliation | Academia | ¹INRIA Sophia-Antipolis ²Sorbonne Université and Université Paris Cité. Correspondence to: Ahmed Boughdiri <EMAIL>. |
| Pseudocode | No | The paper describes methods mathematically and textually but does not include any explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | The paper does not provide any statement or link indicating the release of source code for the methodology described. |
| Open Datasets | Yes | To better illustrate the practical application and behavior of our estimators, we include a real-world study from Mayer et al. (2020) involving 8,270 patients with traumatic brain injury (TBI), using data extracted from the Traumabase. |
| Dataset Splits | No | The paper mentions generating datasets for simulations and subsampling real data for different sample sizes, but it does not provide specific training/test/validation splits with percentages, counts, or references to predefined splits for reproduction. |
| Hardware Specification | Yes | All our experiments were run on a 8GB M1 Mac. |
| Software Dependencies | No | For the simulations we have implemented all estimators in Python using Scikit-Learn for our regression and classification models. While Python and Scikit-Learn are mentioned, specific version numbers for these software dependencies are not provided. |
| Experiment Setup | No | The paper mentions estimating nuisance components via parametric (linear/logistic regression) or non-parametric methods (random forests) and using a 'high regularization parameter' or 'parameters determined by the training data size'. However, it does not provide concrete hyperparameter values (e.g., learning rates, batch sizes, specific regularization strengths) or detailed optimizer settings in the main text. |