Bayesian Nonparametrics Meets Data-Driven Distributionally Robust Optimization
Authors: Nicola Bariletto, Nhat Ho
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we provide insights into the workings of our method by applying it to a variety of tasks based on simulated and real datasets. |
| Researcher Affiliation | Academia | Nicola Bariletto Department of Statistics and Data Sciences The University of Texas at Austin Austin, TX 78712 EMAIL Nhat Ho Department of Statistics and Data Sciences The University of Texas at Austin Austin, TX 78712 EMAIL |
| Pseudocode | Yes | Algorithm 1 in Appendix B details the procedure... |
| Open Source Code | Yes | Code to replicate our experiments can be found at the following link: https://github.com/nbariletto/BNP_for_DRO. |
| Open Datasets | Yes | we applied our method to predict diabetes development based on a host of features, as collected in the popular and public Pima Indian Diabetes dataset. ... The Wine Quality dataset [8] and the Liver Disorders dataset [15]. |
| Dataset Splits | Yes | To test our method, we randomly select 300 training observations and leave out the rest as a test sample. Then, we randomly split the training data into 15 folds of size 20 and select, via k-fold cross validation, the optimal DP concentration parameter α over a wide grid of values. |
| Hardware Specification | Yes | All experiments were performed on a desktop with 12th Gen Intel(R) Core(TM) i9-12900H, 2500 Mhz, 14 Core(s), 20 Logical Processor(s) and 32.0 GB RAM. |
| Software Dependencies | No | The paper mentions 'scikit-learn' [31] as a used library, but does not provide specific version numbers for it or any other software dependencies crucial for reproduction. |
| Experiment Setup | Yes | Robust Criterion Parameters. For each simulated sample, we run our robust procedure setting the following parameter values: ϕ(t) = β exp(t/β) − β for β ∈ {1, ∞}, α = a/n for a ∈ {1, 2, 5, 10}, and p0 = N(0, I), where the β = ∞ setting corresponds to Ridge regression with regularization parameter α (see Proposition 2.1). Finally, we run 300 Monte Carlo simulations to approximate the criterion, and truncate the Multinomial-Dirichlet approximation at T = 50. Stochastic Gradient Descent Parameters. We initialize the algorithm at θ = (0, . . . , 0) and set the step size at ηₜ = 50/(100 + t). The number of passes over the data is set after visual inspection of convergence of the criterion value. |
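The reported setup combines a few concrete formulas: the divergence function ϕ(t) = β exp(t/β) − β (which tends to t as β → ∞), the step-size schedule ηₜ = 50/(100 + t), and a Monte Carlo approximation of the criterion via Dirichlet weights truncated at T = 50 atoms with concentration α = a/n. The sketch below illustrates these pieces; it is not the paper's Algorithm 1, and the function names and the symmetric-Dirichlet weight draw are illustrative assumptions.

```python
import numpy as np

def step_size(t):
    # SGD schedule reported in the paper: eta_t = 50 / (100 + t)
    return 50.0 / (100.0 + t)

def phi(t, beta):
    # phi(t) = beta * exp(t / beta) - beta; as beta -> infinity, phi(t) -> t,
    # the setting the paper associates with Ridge regression.
    if np.isinf(beta):
        return t
    return beta * np.exp(t / beta) - beta

def dirichlet_criterion_weights(n_obs, a=1, T=50, n_sims=300, seed=0):
    # Illustrative stand-in for the paper's truncated Multinomial-Dirichlet
    # approximation: draw n_sims weight vectors over T atoms from a symmetric
    # Dirichlet with per-atom concentration alpha = a / n_obs (an assumption,
    # not the exact construction in Algorithm 1 of the paper).
    rng = np.random.default_rng(seed)
    alpha = a / n_obs
    return rng.dirichlet(np.full(T, alpha), size=n_sims)

# Example: weights for n = 300 training observations, a = 1.
weights = dirichlet_criterion_weights(300)
print(weights.shape)  # each of the 300 simulations yields 50 atom weights
```

Each row of `weights` sums to one, so averaging a ϕ-transformed loss over the rows gives a Monte Carlo estimate of a Dirichlet-weighted criterion in the spirit of the reported 300-simulation, T = 50 setup.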