SIFAR: A Simple Faster Accelerated Variance-Reduced Gradient Method
Author: Zhize Li
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, the numerical experiments show that SIFAR converges faster than the previous state-of-the-art Varag, validating our theoretical results and confirming the practical superiority of SIFAR. In Figure 1, the x-axis and y-axis represent the number of data passes (i.e., n stochastic gradients are computed per data pass) and the training loss, respectively. The experiments in Figure 1 are conducted on six different datasets, with one plot per dataset. |
| Researcher Affiliation | Academia | Zhize Li Singapore Management University EMAIL |
| Pseudocode | Yes | Algorithm 1 SIFAR: SImple Faster Accelerated variance Reduced gradient |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the methodology described is open-source or publicly available. |
| Open Datasets | Yes | All datasets used in our experiments are downloaded from LIBSVM [Chang and Lin, 2011]. |
| Dataset Splits | No | The paper mentions using datasets for a logistic regression problem but does not provide specific details on training, validation, or test splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper describes numerical experiments but does not specify any hardware details such as CPU, GPU models, or memory. |
| Software Dependencies | No | The paper mentions 'LIBSVM [Chang and Lin, 2011]' as the source for datasets but does not provide specific version numbers for LIBSVM or any other software libraries or programming languages used for implementation. |
| Experiment Setup | No | Given the parameter L, we are ready to set all other hyperparameters for GD (see Corollary 2.1.2 in [Nesterov, 2004]), for Varag (see Theorem 1 in [Lan et al., 2019]) and for SIFAR (see our Theorem 1). Note that all of these three algorithms only require L for setting their (hyper)parameters. |
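The last row notes that GD, Varag, and SIFAR all set their hyperparameters from the smoothness constant L alone. As an illustrative sketch (not the paper's code), the snippet below shows what this means for the paper's ℓ2-regularized logistic regression problem: L is computed in closed form from the data matrix, and a plain GD baseline then uses step size 1/L. The data here is synthetic; the function names are our own.

```python
import numpy as np

def logistic_loss(w, A, b, lam):
    """l2-regularized logistic loss: mean log(1 + exp(-b_i * a_i^T w)) + (lam/2)||w||^2."""
    z = -b * (A @ w)
    return np.mean(np.logaddexp(0.0, z)) + 0.5 * lam * np.dot(w, w)

def logistic_grad(w, A, b, lam):
    """Gradient of the loss above."""
    z = -b * (A @ w)
    s = 1.0 / (1.0 + np.exp(-z))          # sigmoid(z)
    return A.T @ (-b * s) / len(b) + lam * w

def gd(A, b, lam, iters=200):
    """Plain gradient descent whose only required parameter is L."""
    n = A.shape[0]
    # Standard smoothness constant for logistic regression:
    # L = ||A||_2^2 / (4n) + lam  (spectral norm of the data matrix).
    L = np.linalg.norm(A, 2) ** 2 / (4 * n) + lam
    w = np.zeros(A.shape[1])
    for _ in range(iters):
        w -= (1.0 / L) * logistic_grad(w, A, b, lam)   # step size 1/L
    return w

# Synthetic stand-in for a LIBSVM dataset.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
b = np.sign(rng.standard_normal(100))
w = gd(A, b, lam=1e-3)
```

Varag and SIFAR use the same L but plug it into accelerated, variance-reduced update rules per their respective theorems, which is why the comparison in the paper needs no per-algorithm tuning.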