A stochastic gradient descent algorithm with random search directions
Authors: Eméric Gbaguidi
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The purpose of this section is to show the behavior of the SCORS algorithm on simulated data. For that goal, we consider the logistic regression model (Bach, 2014; Bercu et al., 2020) associated with the classical minimization problem (P) of the convex function f given, for all x ∈ R^d, by [...] We carry out the experiments on simulated data in order to illustrate the almost sure convergence (Theorem 1), the central limit theorem (Theorem 2) and the non-asymptotic L2 rate of convergence (Theorem 4). [...] Figure 1 illustrates the almost sure convergence of the algorithms. [...] Figure 2: We used 1000 samples, where each one was obtained by running the associated algorithm for n = 500000 iterations. [...] In Table 1, we report a comparison of the algorithm performances based on the computational cost. [...] Figure 3: Mean squared error with respect to epochs. |
| Researcher Affiliation | Academia | Eméric Gbaguidi, Institut de Mathématiques de Bordeaux, Université de Bordeaux |
| Pseudocode | No | The paper describes algorithms (SGD, SCGD, SCORS) using mathematical formulations and prose, but does not include a dedicated or clearly labeled pseudocode or algorithm block with structured steps. |
| Open Source Code | No | The paper does not contain any explicit statement about the release of source code for the described methodology, nor does it provide any links to a code repository. |
| Open Datasets | No | In Section 5, the authors state: 'We carry out the experiments on simulated data in order to illustrate the almost sure convergence [...] Therefore, we consider an independent and identically distributed collection {(w_1, y_1), . . . , (w_N, y_N)} where the covariate w ∼ N_d(0, I_d) and the response y ∈ {0, 1} is sampled such that P(y = 1 | w) = 1/(1 + e^{−⟨w, x⟩}). The true model parameter x ∈ R^d is selected uniformly from the unit sphere. Furthermore, we set the sample size N = 50000 and the parameter dimension d = 50.' The dataset is simulated based on a described procedure rather than being an external, publicly available dataset with a direct link, DOI, or citation. |
| Dataset Splits | No | The paper describes the generation of simulated data but does not specify any training, validation, or test splits for this data. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to conduct the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions or specific solver versions) used in the experiments. |
| Experiment Setup | Yes | The stepsize γ_n = 1/n is used, where n ≥ 1 stands for the iterations. Here, we will compare the four methods (U), (NU), (G) and (S) described in Section 3. Let us define the initial value g_{1,k} given, for any k = 1, . . . , N, by g_{1,k} = ∇f_k(X_1). Moreover, the sequence (g_{n,k}) is updated, for all n ≥ 1 and 1 ≤ k ≤ N, as g_{n+1,k} = ∇f_k(X_n) if U_{n+1} = k, and g_{n+1,k} = g_{n,k} otherwise. For the non-uniform search distribution (NU), the probabilities p_{n,j} are computed, for all n ≥ 2 and j ∈ [[1, d]], as p_{n,j} = \|g_{n−1}^{(j)}\| / Σ_{i=1}^{d} \|g_{n−1}^{(i)}\| if j = j*, and p_{n,j} = (1 − p_{n,j*}) / (d − 1) otherwise, where j* = argmax_j \|g_{n−1}^{(j)}\| and g_{n−1} = Σ_{k=1}^{N} g_{n−1,k}. [...] we set the sample size N = 50000 and the parameter dimension d = 50. [...] Each epoch consists of running 1000 iterations. |
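Since the paper describes its simulation procedure in prose but releases no code, the data generation it specifies (w ∼ N_d(0, I_d), true parameter uniform on the unit sphere, Bernoulli responses with logistic link, N = 50000, d = 50) can be sketched in NumPy as follows; all variable names and the seed are ours:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50_000, 50  # sample size and parameter dimension from the paper

# True parameter x, selected uniformly from the unit sphere
# (a standard normal vector normalized to unit length).
x = rng.standard_normal(d)
x /= np.linalg.norm(x)

# Covariates w_i ~ N_d(0, I_d) stacked as rows of W, and Bernoulli
# responses with P(y = 1 | w) = 1 / (1 + exp(-<w, x>)).
W = rng.standard_normal((N, d))
p = 1.0 / (1.0 + np.exp(-(W @ x)))
y = (rng.uniform(size=N) < p).astype(int)
```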
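The gradient table (g_{n,k}) quoted in the Experiment Setup row is updated lazily: only the entry selected by U_{n+1} is recomputed at iteration n. A minimal sketch, assuming the standard logistic negative log-likelihood f_k(x) = log(1 + e^{⟨w_k, x⟩}) − y_k ⟨w_k, x⟩ (the paper's exact per-sample loss and normalization are not quoted here, so this form is an assumption):

```python
import numpy as np

def grad_fk(x, w_k, y_k):
    """Gradient of the assumed logistic loss f_k(x) = log(1 + e^<w_k,x>) - y_k <w_k,x>,
    which works out to (sigmoid(<w_k, x>) - y_k) * w_k."""
    s = 1.0 / (1.0 + np.exp(-(w_k @ x)))
    return (s - y_k) * w_k

def update_table(G, x_n, W, y, u):
    """Lazy table update: g_{n+1,k} = grad f_k(X_n) if U_{n+1} = k, else g_{n,k}.
    G has one row per sample k; only row u is refreshed."""
    G[u] = grad_fk(x_n, W[u], y[u])
    return G
```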
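The non-uniform search distribution (NU) can likewise be sketched from the quoted formula. Our reading of the extraction-damaged text: the coordinate j* with the largest absolute gradient component receives mass \|g^{(j*)}\| / Σ_i \|g^{(i)}\|, and the remaining mass is spread uniformly over the other d − 1 coordinates (this uniform spread is our inference from the requirement that the p_{n,j} sum to 1, not a verbatim quote):

```python
import numpy as np

def nonuniform_probs(g):
    """Search distribution over coordinates given an aggregated gradient
    estimate g: p[j*] = |g[j*]| / sum_i |g[i]| at j* = argmax_j |g[j]|,
    with the leftover mass split evenly over the other coordinates."""
    a = np.abs(g)
    d = a.size
    j_star = int(np.argmax(a))
    p_star = a[j_star] / a.sum()  # assumes g is not the zero vector
    p = np.full(d, (1.0 - p_star) / (d - 1))
    p[j_star] = p_star
    return p
```

A coordinate drawn from this distribution (e.g. via `rng.choice(d, p=p)`) would then play the role of the random search direction at that iteration.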