Convergence and Implicit Bias of Gradient Descent on Continual Linear Classification

Authors: Hyunji Jung, Hanseul Cho, Chulhee Yun

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we theoretically study continual linear classification via sequentially running gradient descent (GD) on the unregularized logistic loss for a fixed budget of iterations at every stage. When all tasks are jointly separable and revealed in the cyclic order (as studied by Evron et al. (2023)), we show that sequential GD converges in the direction of the offline max-margin solution, unlike SMM. ... Lastly, in Section 5 we consider the case where the tasks are no longer jointly separable... Experiments on a Real-World Dataset. For those interested, we also provide an experiment on a real-world dataset, CIFAR-10 (Krizhevsky, 2009), which is not guaranteed to be linearly separable: see Appendix C.5.
Researcher Affiliation | Academia | Hyunji Jung (Graduate School of Artificial Intelligence, POSTECH); Hanseul Cho and Chulhee Yun (Kim Jaechul Graduate School of AI, KAIST)
Pseudocode | No | The paper describes the sequential gradient descent algorithm using mathematical equations and prose (Section 2.2, "ALGORITHM: SEQUENTIAL GRADIENT DESCENT") but does not present it in a structured pseudocode or algorithm block.
Open Source Code | Yes | We ran numerical experiments running the SMM iterations, which are done by solving the constrained minimization problems using fmincon in MATLAB Optimization Toolbox. The code is provided in our supplementary material.
Open Datasets | Yes | For those interested, we also provide an experiment on a real-world dataset, CIFAR-10 (Krizhevsky, 2009), which is not guaranteed to be linearly separable: see Appendix C.5.
Dataset Splits | No | The paper mentions generating synthetic datasets (Appendix C.2.1) and using CIFAR-10 (Appendix C.5), stating, "We choose two classes from the CIFAR-10 dataset and design 3 tasks that have 512 data points from the two classes (airplane and automobile)." However, it does not explicitly specify train/test/validation splits (e.g., percentages, sample counts, or references to standard splits) for these datasets in the main text or appendices.
Hardware Specification | No | The paper does not describe the hardware used to run the experiments; no details on GPU models, CPU types, or memory are provided.
Software Dependencies | Yes | We ran numerical experiments running the SMM iterations, which are done by solving the constrained minimization problems using fmincon in MATLAB Optimization Toolbox.
Experiment Setup | Yes | Optimization. We run sequential GD for 300 stages in total. Since there are three tasks, for the cyclic ordering case, it is equivalent to J = 100. The step size we used is η = 0.1. Also, we allow and conduct K = 1,000 updates per stage. For the joint training case, we run full-batch GD on the union of all datasets for MJK = 300,000 steps.
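The reported setup (M = 3 tasks visited cyclically, J = 100 cycles, K = 1,000 GD updates per stage, step size η = 0.1) can be sketched as a short script. The paper presents the algorithm only in equations and prose, so this is an illustrative reconstruction, not the authors' code; the tiny separable tasks below are placeholder data.

```python
import numpy as np

def sequential_gd(tasks, eta=0.1, K=1000, cycles=100):
    """Sequential GD: in each cycle, visit the tasks in order and run K
    full-batch GD steps on that task's unregularized logistic loss,
    warm-starting each stage from the previous stage's iterate.
    With M = 3 tasks, cycles = J = 100, and K = 1,000, the total number
    of updates is M*J*K = 300,000, matching the joint-training budget."""
    d = tasks[0][0].shape[1]
    w = np.zeros(d)
    for _ in range(cycles):            # J cycles over the task sequence
        for X, y in tasks:             # M tasks -> one stage per task
            for _ in range(K):         # fixed per-stage iteration budget
                margins = y * (X @ w)
                # gradient of (1/n) * sum_i log(1 + exp(-y_i <w, x_i>))
                grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)
                w -= eta * grad
    return w

# Placeholder tasks: two linearly separable toy problems in R^2.
task1 = (np.array([[2.0, 0.0], [-2.0, 0.0]]), np.array([1.0, -1.0]))
task2 = (np.array([[0.0, 2.0], [0.0, -2.0]]), np.array([1.0, -1.0]))
w = sequential_gd([task1, task2], eta=0.1, K=200, cycles=5)
```

After training, the final iterate should classify every point of every task correctly (all margins positive), consistent with the paper's jointly separable setting.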
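The SMM baseline is computed with fmincon from the MATLAB Optimization Toolbox. For readers without MATLAB, a rough open-source analogue can use scipy.optimize.minimize; the sketch below assumes the Sequential Max-Margin update of Evron et al. (2023), in which each stage projects the previous iterate onto the margin-feasible set of the current task. This is our reading of the baseline, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def smm_stage(w_prev, X, y):
    """One assumed SMM stage: solve min_w ||w - w_prev||^2 subject to
    y_i * <w, x_i> >= 1 for every point (x_i, y_i) of the current task,
    i.e. the kind of constrained problem fmincon would be given."""
    cons = [{"type": "ineq",
             "fun": lambda w, xi=xi, yi=yi: yi * (xi @ w) - 1.0}
            for xi, yi in zip(X, y)]
    res = minimize(lambda w: np.sum((w - w_prev) ** 2), w_prev,
                   constraints=cons, method="SLSQP")
    return res.x

# From w = 0, projecting onto a single task's feasible set recovers
# that task's max-margin direction (here w = [0.5, 0]).
X = np.array([[2.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, -1.0])
w_next = smm_stage(np.zeros(2), X, y)
```

Running smm_stage cyclically over the tasks would then mimic the SMM iterations whose limit the paper contrasts with sequential GD.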