HVAdam: A Full-Dimension Adaptive Optimizer
Authors: Yiheng Zhang, Shaowu Wu, Yuanzhuo Xu, Jiajun Wu, Shang Xu, Steve Drew, Xiaoguang Niu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validated HVAdam in extensive experiments, showing its faster convergence, higher accuracy, and more stable performance on image classification, image generation, and natural language processing tasks. Particularly, HVAdam achieves a significant improvement on GANs compared with other state-of-the-art methods, especially in Wasserstein-GAN (WGAN) and its improved version with gradient penalty (WGAN-GP). |
| Researcher Affiliation | Academia | 1 School of Computer Science, Wuhan University; 2 Department of Electrical and Software Engineering, University of Calgary; 3 Department of Computer Science, University College London. EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Adam Optimizer. Input: α₁, β₁, β₂, ϵ. Initialize θ₀, m₀ ← 0, v₀ ← 0, t ← 0. 1: while θₜ not converged do... Algorithm 2: HVAdam Optimizer. Input: α₁, β₁, β₂, ϵ, γ. Initialize θ₀, α₁, α₂, m₀ ← 0, s₀ ← 0, v₀ ← 0, t ← 0, t₂ ← 1, δ₀ ← 0. 1: while θₜ not converged do... |
| Open Source Code | No | The paper does not state that the authors' implementation of HVAdam is open-source, nor does it provide a link to a code repository. It mentions using 'the official implementation of AdaBelief' for some experiments, but this is not the authors' own code. |
| Open Datasets | Yes | The experiments cover (a) image classification on the CIFAR-10 and CIFAR-100 datasets (Krizhevsky, Hinton et al. 2009) with VGG (Simonyan and Zisserman 2014), ResNet (He et al. 2016), and DenseNet (Huang et al. 2017); (b) natural language processing tasks with LSTM (Ma et al. 2015) on the Penn Treebank dataset (Marcus, Santorini, and Marcinkiewicz 1993) and Transformer on the IWSLT14 dataset; (c) WGAN (Arjovsky, Chintala, and Bottou 2017), WGAN-GP (Gulrajani et al. 2017), and Spectral-Norm GAN (SNGAN) (Miyato et al. 2018) on CIFAR-10. The authors then train a ResNet-50 on ImageNet (Deng et al. 2009) and report accuracy on the validation set in Table 2. |
| Dataset Splits | No | The paper uses well-known datasets (CIFAR-10, CIFAR-100, Penn Treebank, ImageNet). While these datasets have standard splits, the main text does not give the specific percentages, sample counts, or splitting methodology used in the experiments. For instance, it says 'report the accuracy on the validation set' without specifying the size of that validation set or how it was created relative to the training set. |
| Hardware Specification | No | The numerical calculations in this paper have been done on the supercomputing system in the Supercomputing Center of Wuhan University. This statement is too general and does not specify any particular GPU or CPU models, processor types, or memory details. |
| Software Dependencies | No | The paper mentions using 'the official implementation of AdaBelief' and refers to various models (VGG, ResNet, DenseNet, LSTM, Transformer, WGAN, etc.) that imply an underlying framework such as PyTorch or TensorFlow, but it does not give version numbers for any software libraries or frameworks used in the authors' own implementation. |
| Experiment Setup | No | The hyperparameter settings and searching are shown in supplementary material. This indicates that experimental setup details exist, but they are not provided within the main body of the paper as requested. |
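For reference, the Algorithm 1 pseudocode extracted in the table above is the standard Adam update (first- and second-moment exponential moving averages with bias correction). The sketch below is a minimal NumPy rendering of that baseline only; HVAdam's additional state (s₀, t₂, δ₀ and the extra learning rate α₂ / hyperparameter γ) is paper-specific and not reproduced here.

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of the standard Adam update (Algorithm 1 baseline)."""
    t += 1
    m = beta1 * m + (1 - beta1) * grad        # first-moment EMA
    v = beta2 * v + (1 - beta2) * grad**2     # second-moment EMA
    m_hat = m / (1 - beta1**t)                # bias correction
    v_hat = v / (1 - beta2**t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v, t

# Toy usage: minimize f(x) = x^2 (gradient 2x) starting from x = 5.0.
theta = np.array([5.0])
m, v, t = np.zeros_like(theta), np.zeros_like(theta), 0
for _ in range(2000):
    theta, m, v, t = adam_step(theta, 2 * theta, m, v, t, alpha=0.05)
print(theta)  # ends near the minimum at 0
```

This illustrates the state that any reimplementation must track per parameter (m, v, step count); HVAdam, per the extracted pseudocode, carries additional per-parameter state on top of this.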