Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Global Optimization with a Power-Transformed Objective and Gaussian Smoothing
Authors: Chen Xu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The majority of experiments in Section 5 show that the GS-PowerOpt-based algorithm (e.g., EPGS, introduced later) outperforms other algorithms that also apply the smoothing technique. |
| Researcher Affiliation | Academia | 1Department of Engineering, Shenzhen MSU-BIT University, Shenzhen, P. R. China. Correspondence to: Chen Xu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 PGS/EPGS for Solving (1) |
| Open Source Code | Yes | All our codes can be found at http://github.com/chen-research/GS-PowerTransform. |
| Open Datasets | Yes | Experiments on MNIST. For each image am that is randomly drawn from the dataset... MNIST handwritten digits (LeCun et al., 1998) and the CIFAR-10 set (Krizhevsky & Hinton, 2009). |
| Dataset Splits | Yes | In the MNIST attacks, our trained classifier C has a classification accuracy of 97.4% on the testing images. In the CIFAR-10 attacks, the trained C has a test accuracy of 86.2%. With 100 randomly drawn images from the CIFAR-10 test set, we repeat the per-image targeted adversarial attacks. |
| Hardware Specification | No | The paper mentions using TensorFlow for model training but does not provide any specific details about the hardware used (e.g., CPU, GPU models, or memory). |
| Software Dependencies | No | The paper mentions 'TensorFlow (Abadi et al., 2015)' and 'CMA-ES/pycma on Github, 2019' but does not provide specific version numbers for these or any other software components used in their methodology. |
| Experiment Setup | Yes | For experiments on the benchmark test functions (Table 1 and 2), the set of hyper-parameter values with the smallest mean square error (averaged over the 100 experiments) between the true and estimated solutions are selected. The set of candidate values, as well as the selected values, are listed in Table 6 and 7. |
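As context for the table above: the paper's title refers to applying a Gaussian smoothing kernel to a power-transformed objective. The sketch below illustrates that general idea with a simple Monte Carlo estimator; the function name, parameters, and test objective are assumptions for illustration, not the paper's PGS/EPGS algorithm.

```python
import numpy as np

def smoothed_power_objective(f, x, power=4.0, sigma=0.5, n_samples=2000, seed=0):
    """Monte Carlo estimate of the Gaussian smoothing of f(.)**power at x.

    Approximates E[ f(x + sigma * Z)**power ] with Z ~ N(0, I).
    Illustrative sketch only -- not the paper's PGS/EPGS implementation.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_samples, x.size))
    vals = np.array([f(x + sigma * zi) for zi in z])
    return float(np.mean(vals ** power))

# A multimodal 1-D objective (positive everywhere) with global maximum at x = 0.
f = lambda x: np.exp(-x[0] ** 2) * (1.0 + 0.1 * np.cos(10 * x[0]))

# Raising f to a power sharpens the global peak relative to local ones,
# and Gaussian smoothing yields a better-behaved surrogate to optimize.
near = smoothed_power_objective(f, np.array([0.0]))
far = smoothed_power_objective(f, np.array([2.0]))
assert near > far  # the smoothed surrogate still favors the global maximizer
```

The surrogate can then be maximized with any stochastic optimizer; the power transform is what concentrates the smoothed landscape around the global optimum.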