Stage-Aware Learning for Dynamic Treatments

Authors: Hanwen Ye, Wenzhuo Zhou, Ruoqing Zhu, Annie Qu

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, we evaluate the proposed method in extensive simulated environments and a real case study for the COVID-19 pandemic."
Researcher Affiliation | Academia | Hanwen Ye (EMAIL), Department of Statistics, University of California, Irvine, CA 92617, USA; Wenzhuo Zhou (EMAIL), Department of Statistics, University of California, Irvine, CA 92617, USA; Ruoqing Zhu (EMAIL), Department of Statistics, University of Illinois Urbana-Champaign, IL 61820, USA; Annie Qu (EMAIL), Department of Statistics, University of California, Irvine, CA 92617, USA
Pseudocode | Yes | "Algorithm 1: Stage Weighted Learning"
Open Source Code | Yes | "We also include the reproducible code implementations in this GitHub repository: https://github.com/hanweny/SAL.git."
Open Datasets | Yes | "In this section, we apply the proposed method to UC COVID Research Data Sets (UC CORDS) (University of California Health), which combines timely COVID-related testing and hospitalization healthcare data from six University of California schools and systems. As of December 2022, UC CORDS includes a total of 108,914 COVID patients, of whom 31,520 had been hospitalized and 2,333 had been admitted to the ICU." Cited data source: University of California Health, "University of California Health creates centralized data set to accelerate COVID-19 research."
Dataset Splits | Yes | "All methods are trained using 80% of the simulated training data, and evaluated on the 20% testing set via value functions and the matching accuracy between the estimated and optimal treatment regimes. Additionally, since the true data-generating process is known in the simulation, we progress the patient's health variables according to the treatments assigned by the regime and calculate their total rewards accordingly. ... In addition, we randomly select 80% of the data as a training set and repeat the process 20 times to obtain a Monte-Carlo sample of the model performance scores."
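The evaluation protocol quoted above (a random 80/20 train/test split, repeated 20 times to obtain a Monte-Carlo sample of performance scores) can be sketched as follows. This is not the authors' code; the function name and parameters are illustrative, and it only produces index sets, leaving model fitting and value-function evaluation to the caller:

```python
import random

def monte_carlo_splits(n_samples, train_frac=0.8, n_repeats=20, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated random splits.

    Each repeat shuffles all sample indices and assigns the first
    train_frac fraction to training, the rest to testing, mirroring
    the paper's repeated 80/20 protocol.
    """
    rng = random.Random(seed)
    n_train = int(round(train_frac * n_samples))
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield idx[:n_train], idx[n_train:]

# Example: 100 patients split 80/20, 20 independent repetitions.
splits = list(monte_carlo_splits(100))
assert len(splits) == 20
train_idx, test_idx = splits[0]
assert len(train_idx) == 80 and len(test_idx) == 20
```

Collecting the 20 per-repeat test scores then gives the Monte-Carlo distribution of model performance that the paper reports.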
Hardware Specification | No | The paper does not provide hardware details (e.g., GPU/CPU models, processor types, memory amounts, or machine specifications) used for running its experiments.
Software Dependencies | No | The paper mentions software components such as LSTMs, an FC network, logistic regression, SGD, the Adam optimizer, a cosine annealing warm-restart schedule, and random forests, but provides no version numbers for any of them; it describes the general tools used without versioned dependencies.
Experiment Setup | Yes | "For illustration purposes, we adopt neural networks and optimize via the standard SGD method. But one can also choose to first parameterize the functions and estimate the function parameters through the more conventional Broyden-Fletcher-Goldfarb-Shanno (BFGS) method (Head and Zerner, 1985), or the Nelder-Mead method (Olsson and Nelson, 1975) on the computed loss. In the following, we further introduce techniques that could be considered to improve neural network convergence and estimation results. For instance, the Adam optimizer (Kingma and Ba, 2014) could be a suitable alternative to improve the convergence performance of SGD on highly complex and non-convex objective functions. In addition, instead of setting learning rates to be constant as presented in the algorithm, utilizing the cosine annealing warm-restart schedule (Loshchilov and Hutter, 2016) and different initialization seeds (Diamond et al., 2016) could improve the optimization to achieve better local convergence. Furthermore, we can also tune the hyper-parameters, such as the learning rate and the number of network hidden layers, by conducting a d-fold cross-validation on the dataset."
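The cosine annealing warm-restart schedule cited above (Loshchilov and Hutter, 2016) can be illustrated with a minimal, dependency-free sketch. This is a generic implementation of the published schedule, not the paper's own code; the parameter names (eta_max, T_0, T_mult) follow the original paper's notation:

```python
import math

def cosine_annealing_warm_restarts(eta_max, eta_min=0.0, T_0=10, T_mult=2, n_epochs=70):
    """Return the per-epoch learning-rate sequence.

    The rate decays from eta_max to eta_min along a cosine curve over
    T_i epochs, then is reset to eta_max (a "warm restart"), with the
    cycle length T_i multiplied by T_mult after each restart.
    """
    lrs = []
    T_i, T_cur = T_0, 0
    for _ in range(n_epochs):
        lr = eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * T_cur / T_i))
        lrs.append(lr)
        T_cur += 1
        if T_cur >= T_i:  # warm restart: jump back to eta_max, lengthen the cycle
            T_cur = 0
            T_i *= T_mult

    return lrs

lrs = cosine_annealing_warm_restarts(eta_max=0.1, T_0=10)
assert abs(lrs[0] - 0.1) < 1e-12    # starts at eta_max
assert abs(lrs[10] - 0.1) < 1e-12   # warm restart after T_0 epochs
assert lrs[5] < lrs[0]              # cosine decay within a cycle
```

In practice one would feed each lrs[t] to the optimizer before epoch t (deep-learning frameworks ship an equivalent built-in scheduler); the restarts help the optimizer escape shallow local minima on non-convex losses, which is the motivation the paper gives for preferring this schedule over a constant learning rate.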