Stochastic-Constrained Stochastic Optimization with Markovian Data

Authors: Yeongjong Kim, Dabeen Lee

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Section 5, we provide numerical results from experiments on a classification problem with fairness constraints. Specifically, we take the logistic regression formulation proposed in Zafar et al. (2019). The numerical results on random problem instances demonstrate the efficacy of our proposed algorithmic frameworks for solving SCSO with Markovian data. ... We set parameters to p = 0.001 and c = 0.5, with which we ran the list of algorithms with the same initial parameters and sequence of states. We first ran MDPP with 25,000 iterations, which created 101,034 samples. The results on the optimality gap are summarized in Figure 2, the results on the infeasibility are presented in Figure 3 and Table 3, and the results on the regret and the cumulative constraint violations are shown in Figures 4 and 5, respectively.
Researcher Affiliation | Academia | Yeongjong Kim (EMAIL), Department of Mathematics, POSTECH, Pohang 37673, South Korea; Dabeen Lee (EMAIL), Department of Industrial and Systems Engineering, KAIST, Daejeon 34141, South Korea
Pseudocode | Yes | Algorithm 1: Ergodic Drift-Plus-Penalty (EDPP); Algorithm 2: Drift-Plus-Penalty with Data Drop (DPP-DD); Algorithm 3: MLMC Adaptive Drift-Plus-Penalty (MDPP); Algorithm 4: Adaptive Drift-Plus-Penalty
Open Source Code | No | The paper does not explicitly state that source code for the methodology is provided, nor does it include a link to a code repository or mention code in supplementary materials.
Open Datasets | No | We examine the performance of the ergodic drift-plus-penalty algorithm (Algorithm 1) for the known mixing time case and the MLMC adaptive drift-plus-penalty algorithm (Algorithm 3) for the unknown mixing time case on a linear classification problem with fairness constraints using synthetic data. We adopt Zafar et al. (2019) for creating data points and sensitive features (with φ = π/2) and imposing fairness constraints. ... For each set of data points, we generate two clusters, each of which has 1,000 data points in R² sampled from a multivariate normal distribution.
Dataset Splits | No | The paper describes generating synthetic data and its characteristics ('two clusters, each of which has 1,000 data points in R² sampled from a multivariate normal distribution') but does not specify how this data is split into training, validation, or test sets for the experiments.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU models, CPU types, memory).
Software Dependencies | No | The paper mentions using 'logistic regression classifiers' but does not specify any software libraries, frameworks, or version numbers (e.g., Python, PyTorch, TensorFlow, or scikit-learn versions) used for the implementation.
Experiment Setup | Yes | We use logistic regression classifiers with the following loss functions... We set parameters to p = 0.001 and c = 0.5, with which we ran the list of algorithms with the same initial parameters and sequence of states. We first ran MDPP with 25,000 iterations, which created 101,034 samples. ... For Algorithm 1, we use the parameters V_t = (τ_mix t)^β and α_t = τ_mix t as before. For Algorithm 3, we define the MLMC estimator g_{t,i} using {g_{t,i}^(j)}_{j=1}^{N_t} for each i ∈ [n], and define a_0 = S_0 = δ, a_t = F_t²/4 + Σ_{i=1}^n R²G_{t,i}² + Σ_{i=1}^n H_{t,i}², S_t = δ + Σ_{j=1}^t a_j, where G_{t,i} = ‖∇g_{t,i}(x_t)‖ and H_{t,i} = |g_{t,i}(x_t)|. The parameters V_t and α_t are defined as in (1).
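The Research Type and Experiment Setup rows quote a fairness-constrained logistic regression in the style of Zafar et al. (2019). A minimal sketch of such an objective and its decision-boundary-covariance constraint follows; the data, names, and threshold semantics here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def logistic_loss(theta, X, y):
    """Average logistic loss with labels y in {-1, +1}."""
    margins = y * (X @ theta)
    return np.mean(np.log1p(np.exp(-margins)))

def fairness_constraint(theta, X, z, c):
    """Decision-boundary covariance constraint in the spirit of
    Zafar et al. (2019): |Cov(z, theta^T x)| - c <= 0, where z is a
    binary sensitive feature. Feasible iff the return value is <= 0."""
    cov = np.mean((z - z.mean()) * (X @ theta))
    return abs(cov) - c

# Tiny illustrative instance (not the paper's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=100))
z = (X[:, 1] > 0).astype(float)  # hypothetical sensitive attribute
theta = np.zeros(2)
print(logistic_loss(theta, X, y))             # log(2) at theta = 0
print(fairness_constraint(theta, X, z, 0.5))  # exactly -c at theta = 0
```

At θ = 0 the loss is log 2 and the constraint is slack by exactly c, which makes the sketch easy to sanity-check.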
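All four algorithms listed under Pseudocode build on the drift-plus-penalty template: a primal step that trades off the objective gradient (weighted by V_t) against virtual queues that accumulate constraint violation. The following is a schematic single iteration, simplified relative to Algorithms 1–4 (which differ in their gradient estimators and in the V_t, α_t schedules); the ball-shaped feasible set is an assumption for illustration.

```python
import numpy as np

def dpp_step(x, Q, grad_f, grad_g, g_vals, V, alpha, radius):
    """One schematic drift-plus-penalty iteration.

    x       : current iterate in R^d
    Q       : virtual queue lengths, one per constraint
    grad_f  : (estimated) objective gradient at x
    grad_g  : (n, d) array of (estimated) constraint gradients at x
    g_vals  : (estimated) constraint values g_i(x)
    V, alpha: penalty and step-size parameters (schedules vary by algorithm)
    radius  : radius R of the Euclidean-ball feasible set (an assumption)
    """
    # Primal step: minimize V*<grad_f, x'> + sum_i Q_i*<grad_g_i, x'>
    #              + alpha*||x' - x||^2 over the ball of radius R.
    d = V * grad_f + grad_g.T @ Q
    x_new = x - d / (2.0 * alpha)
    norm = np.linalg.norm(x_new)
    if norm > radius:
        x_new *= radius / norm
    # Dual step: queues accumulate positive constraint violation.
    Q_new = np.maximum(Q + g_vals, 0.0)
    return x_new, Q_new

x, Q = dpp_step(np.zeros(2), np.zeros(1),
                grad_f=np.array([1.0, 0.0]),
                grad_g=np.array([[0.0, 1.0]]),
                g_vals=np.array([0.5]),
                V=1.0, alpha=1.0, radius=10.0)
print(x, Q)  # x = (-0.5, 0), Q = (0.5,)
```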
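The Open Datasets row describes synthetic data: two Gaussian clusters of 1,000 points each in R², with sensitive features obtained following Zafar et al. (2019) with φ = π/2. A rough generator along those lines might look like this; the means, covariances, and the sensitive-attribute rule below are placeholders, not the paper's exact recipe.

```python
import numpy as np

def make_synthetic(n_per_cluster=1000, phi=np.pi / 2, seed=0):
    """Two Gaussian clusters in R^2 with labels y, plus a binary sensitive
    feature z derived from the data rotated by angle phi, loosely following
    the synthetic-data recipe of Zafar et al. (2019). All distribution
    parameters here are illustrative placeholders."""
    rng = np.random.default_rng(seed)
    mu1, mu2 = np.array([2.0, 2.0]), np.array([-2.0, -2.0])
    cov = np.array([[5.0, 1.0], [1.0, 5.0]])
    X1 = rng.multivariate_normal(mu1, cov, n_per_cluster)
    X2 = rng.multivariate_normal(mu2, cov, n_per_cluster)
    X = np.vstack([X1, X2])
    y = np.concatenate([np.ones(n_per_cluster), -np.ones(n_per_cluster)])
    # Rotate the data by phi; tying the sensitive feature to the rotated
    # coordinates makes the unconstrained classifier unfair by design.
    R = np.array([[np.cos(phi), -np.sin(phi)],
                  [np.sin(phi),  np.cos(phi)]])
    Xr = X @ R.T
    z = (Xr[:, 0] > 0).astype(float)  # simplified sensitive-attribute rule
    return X, y, z

X, y, z = make_synthetic()
print(X.shape, y.shape, z.shape)  # (2000, 2) (2000,) (2000,)
```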
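Algorithm 3 (MDPP) handles the unknown-mixing-time case with a multilevel Monte Carlo (MLMC) estimator, which is why 25,000 iterations consumed 101,034 samples: each iteration draws a random number N_t of samples. Below is a hedged sketch of a standard MLMC construction of this flavor; the paper's exact estimator, truncation level, and constants may differ.

```python
import numpy as np

def mlmc_estimate(draw, j_max, rng):
    """Sketch of a multilevel Monte Carlo estimator of the flavor used by
    MDPP. `draw(n)` must return an array of n consecutive samples from the
    Markov chain; names and constants here are illustrative assumptions."""
    # Random truncated-geometric level: P(J = j) = 2^-j for j < j_max,
    # with the remaining mass lumped at j_max.
    J = min(int(rng.geometric(0.5)), j_max)
    samples = draw(2 ** J)                   # N_t = 2^J samples this round
    fine = samples.mean()                    # average of all 2^J samples
    coarse = samples[: 2 ** (J - 1)].mean()  # average of the first half
    # Telescoping correction debiases the single-sample baseline.
    return samples[0] + (2 ** J) * (fine - coarse), 2 ** J

rng = np.random.default_rng(1)
est, n_used = mlmc_estimate(lambda n: np.ones(n), j_max=10, rng=rng)
print(est, n_used)  # a constant sample stream yields the estimate 1.0
```

Because the level J is geometric, the expected per-iteration sample count stays small even though occasional iterations draw 2^j_max samples, which matches the roughly four samples per iteration reported above.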