Accelerating Stochastic Composition Optimization
Authors: Mengdi Wang, Ji Liu, Ethan X. Fang
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 describes an application of ASC-PG to reinforcement learning and gives numerical experiments. Figures 1, 2, and 3 show empirical convergence rates. |
| Researcher Affiliation | Academia | Mengdi Wang: Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, USA; Ji Liu: Department of Computer Science and Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627, USA; Ethan X. Fang: Department of Statistics and Department of Industrial and Manufacturing Engineering, Pennsylvania State University, University Park, PA 16802, USA. All listed institutions are universities. |
| Pseudocode | Yes | Algorithm 1 Accelerated Stochastic Compositional Proximal Gradient (ASC-PG) |
| Open Source Code | No | The paper does not provide any specific statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper uses "Baird's example (Baird, 1995)" and generates a Markov decision problem (MDP) using a "similar setup as in White and White (2016)". Baird's example is a well-known benchmark, but not a dataset with direct access information. For the MDP, the paper states, "In each instance, we randomly generate an MDP which contains S = 100 states...", indicating data generation rather than the use of a pre-existing, openly accessible dataset with explicit access details. |
| Dataset Splits | No | The paper describes generating MDP instances randomly for experiments 2 and 3, and using a known example (Baird's example) for experiment 1. It does not mention any explicit training/test/validation splits for any dataset, as the data is either generated or part of a small illustrative example. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not list any specific software dependencies, libraries, or solvers with their version numbers. |
| Experiment Setup | Yes | "We choose the step sizes via comparison studies as in Dann et al. (2014)": for ASC-PG, αk = k^{-1} and βk = k^{-1}; for a-SCGD, αk = k^{-1} and βk = k^{-4/5}. In the third experiment, an ℓ1-regularization term λ‖w‖₁ is added; Figure 3's legend shows λ = 1e-3 and λ = 5e-4. |
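The step-size pattern reported above (αk = k^{-1}, βk = k^{-1}, plus an ℓ1 proximal term) can be illustrated with a minimal sketch of an accelerated stochastic compositional proximal-gradient loop on a toy ℓ1-regularized least-squares composition. The problem data `A`, `b`, the noise model, and all constants here are illustrative assumptions, not the paper's experimental setup; the update structure (proximal step, extrapolated query point, running estimate of the inner function) follows the general shape of Algorithm 1, simplified.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (handles the l1-regularized case).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(0)
d = 5
A = np.eye(d) + 0.1 * rng.normal(size=(d, d))  # toy, well-conditioned linear map
b = rng.normal(size=d)
lam = 1e-3  # regularization weight, matching the order shown in Figure 3's legend

def objective(x):
    # Deterministic objective: 0.5 * ||A x - b||^2 + lam * ||x||_1.
    return 0.5 * np.linalg.norm(A @ x - b) ** 2 + lam * np.abs(x).sum()

x = np.zeros(d)
y = np.zeros(d)  # running estimate of the inner function g(x) = E_w[g_w(x)]
f0 = objective(x)

for k in range(1, 20001):
    alpha, beta = 1.0 / k, 1.0 / k  # the reported ASC-PG step sizes (hedged)
    # Noisy inner map g_w(x) = (A + N_w) x - b; outer f(y) = 0.5 * ||y||^2,
    # so the chain-rule gradient estimate is grad g_w(x)^T @ grad f(y).
    N = 0.1 * rng.normal(size=(d, d))
    grad = (A + N).T @ y
    x_new = soft_threshold(x - alpha * grad, alpha * lam)
    # Extrapolated query point: z = x + (1/beta) * (x_new - x).
    z = x + (x_new - x) / beta
    # Track g(z) with a fresh noisy sample via exponential averaging.
    N2 = 0.1 * rng.normal(size=(d, d))
    y = (1 - beta) * y + beta * ((A + N2) @ z - b)
    x = x_new
```

With only one stochastic sample of the inner map per iteration, the auxiliary variable `y` is what makes compositional gradients tractable: it amortizes the inner expectation across iterations, while the extrapolated point `z` compensates for `y` lagging behind the moving iterate.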