reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Unlinked Monotone Regression

Authors: Fadoua Balabdaoui, Charles R. Doss, Cécile Durot

JMLR 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We develop a second order algorithm for its computation, and we demonstrate its use on synthetic data. Finally, we apply our method (in a fully data driven way, without knowledge of the error distribution) on longitudinal data from the US Consumer Expenditure Survey.
Researcher Affiliation	Academia	Fadoua Balabdaoui EMAIL Seminar für Statistik ETH, Zurich Rämistrasse 101 8092, Zurich, Switzerland, Charles R. Doss EMAIL School of Statistics University of Minnesota Minneapolis, MN 55455, USA, Cécile Durot EMAIL Modal'X Université Paris Nanterre F92000, Nanterre, France
Pseudocode	Yes	Algorithm 1: Active set algorithm, Algorithm 2: Activate constraints: group (approximately) non-unique entries in m
Open Source Code	No	The paper does not contain an explicit statement about the release of source code for the methodology described, nor does it provide a link to a code repository.
Open Datasets	Yes	Finally, we apply our method (in a fully data driven way, without knowledge of the error distribution) on longitudinal data from the US Consumer Expenditure Survey. Steven Ruggles, Sarah Flood, Ronald Goeken, Josiah Grover, Erin Meyer, Jose Pacas, and Matthew Sobek. IPUMS, USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS, 2020.
Dataset Splits	No	For synthetic data, the paper mentions sample sizes n=100 and n=1000 but does not specify train/test/validation splits. For the CEX data, it describes a selection criterion for individuals (surveyed in both quarter 2 and quarter 3) but no explicit data splits for experimental reproduction.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments.
Software Dependencies	No	The paper discusses algorithms and cites references for optimization methods, but does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup	Yes	Our simulations were performed taking both Laplace and Gaussian errors with standard deviation 1 (and both unlinked methods are well-specified). The error distribution is assumed to be Laplace distributed with λ = √(σ̂²/2), where σ̂² is the empirical variance of ϵ1, ..., ϵ2164. We used a four component Gaussian mixture (with no variance constraint) to approximate the error distribution... The estimate has weights (.56, .05, .33, .06), means/locations ( 345, 416, 326, 1710) and standard deviations (322, 75, 562, 1286).
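
The Experiment Setup entry ties the Laplace scale to the error variance: a mean-zero Laplace distribution with scale λ has variance 2λ², so λ = √(σ²/2). The sketch below illustrates that relation and the "standard deviation 1" simulation setting only; it is not the authors' code. The function names, the seed, and the sampling scheme (a difference of two exponential draws) are assumptions introduced here for illustration. The sample size n = 1000 is one of the two synthetic sample sizes the table mentions.

```python
import math
import random
import statistics


def laplace_scale_from_variance(variance):
    """Return the Laplace scale lam with 2 * lam**2 == variance."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    return math.sqrt(variance / 2.0)


def draw_errors(n, family, rng):
    """Draw n mean-zero errors with standard deviation 1 (hypothetical helper)."""
    if family == "gaussian":
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    if family == "laplace":
        lam = laplace_scale_from_variance(1.0)
        # The difference of two i.i.d. exponentials with mean lam
        # follows a Laplace(0, lam) distribution.
        return [
            rng.expovariate(1.0 / lam) - rng.expovariate(1.0 / lam)
            for _ in range(n)
        ]
    raise ValueError("family must be 'gaussian' or 'laplace'")


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
eps = draw_errors(1000, "laplace", rng)

# Plug-in estimate of the scale from the empirical variance of the errors.
lam_hat = laplace_scale_from_variance(statistics.pvariance(eps))
print(f"true scale: {laplace_scale_from_variance(1.0):.4f}")
print(f"estimated scale: {lam_hat:.4f}")
```

With standard deviation 1 the true scale is √(1/2), and the plug-in estimate should land near that value for samples of this size.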