Last Iterate Convergence of Incremental Methods as a Model of Forgetting

Authors: Xufeng Cai, Jelena Diakonikolas

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We further provide illustrative numerical results in Fig. 2 to facilitate our discussion. In particular, we choose L = 2, T ∈ {100, 150, 200}, δ_t = 1/t (t ∈ [T-1]), and δ_T = T for the example f(x) = (L/(2T)) Σ_{t=1}^{T} (x - δ_t)^2 used in the proof of Theorem 3. In Fig. 2(a), we plot the optimality gap at the last iterate, i.e., the excess forgetting, against the step sizes after K = 10^4 epochs.
Researcher Affiliation | Academia | Xufeng Cai, Jelena Diakonikolas; Department of Computer Sciences, University of Wisconsin-Madison; EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1 Incremental Gradient Descent (IGD) [...] Algorithm 2 Incremental Proximal Method (IPM)
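To make the two algorithms concrete, here is a minimal sketch (not the paper's code, which is unreleased) of one update of each method on a scalar quadratic component f_t(x) = (L/2)(x - d)^2, the component form used in the paper's synthetic example; the function names and the closed-form proximal solution are our own illustration.

```python
L = 2.0  # smoothness constant, matching the paper's example


def igd_step(x, d, eta):
    """Algorithm 1 (IGD): explicit gradient step on the current component.

    Gradient of (L/2)(x - d)^2 is L*(x - d).
    """
    return x - eta * L * (x - d)


def ipm_step(x, d, eta):
    """Algorithm 2 (IPM): implicit (proximal) step on the current component.

    Solves argmin_z (L/2)(z - d)^2 + (1/(2*eta))(z - x)^2, which for a
    quadratic component has the closed form below.
    """
    return (x + eta * L * d) / (1 + eta * L)
```

Both steps move the iterate toward the component minimizer d; a standard contrast is that the proximal step is stable for any eta > 0, whereas the explicit gradient step requires eta < 2/L.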
Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to a code repository.
Open Datasets | No | The numerical results use a synthetically generated function: "we choose L = 2, T ∈ {100, 150, 200}, δ_t = 1/t (t ∈ [T-1]), and δ_T = T for the example f(x) = (L/(2T)) Σ_{t=1}^{T} (x - δ_t)^2 used in the proof of Theorem 3." This does not refer to a publicly available dataset.
Dataset Splits | No | The paper uses a synthetically generated function for numerical illustration rather than a traditional dataset; therefore, the concept of training/test/validation splits does not apply in the conventional sense.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the numerical experiments.
Software Dependencies | No | The paper does not mention any specific software names or version numbers (e.g., programming languages, libraries, or solvers) used for implementation or experimentation.
Experiment Setup | Yes | In particular, we choose L = 2, T ∈ {100, 150, 200}, δ_t = 1/t (t ∈ [T-1]), and δ_T = T for the example f(x) = (L/(2T)) Σ_{t=1}^{T} (x - δ_t)^2 used in the proof of Theorem 3. In Fig. 2(a), we plot the optimality gap at the last iterate, i.e., the excess forgetting, against the step sizes after K = 10^4 epochs.
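Since no code is released, the quoted setup is simple enough to sketch from the description alone. The following is a hedged reproduction sketch for the single choice T = 100 (all variable names are our own): run IGD cyclically over the quadratic components and report the last-iterate optimality gap, i.e., the excess forgetting.

```python
# Hypothetical reproduction of the quoted synthetic setup; one T shown.
L, T, K = 2.0, 100, 10**4
delta = [1.0 / t for t in range(1, T)] + [float(T)]  # delta_t = 1/t, delta_T = T


def f(x):
    # f(x) = (L/(2T)) * sum_t (x - delta_t)^2, as in the proof of Theorem 3
    return L / (2 * T) * sum((x - d) ** 2 for d in delta)


x_star = sum(delta) / T  # a quadratic is minimized at the mean of the deltas
f_star = f(x_star)


def excess_forgetting(eta, epochs=K):
    """Run IGD for `epochs` cyclic passes; return the last-iterate gap."""
    x = 0.0
    for _ in range(epochs):
        for d in delta:                # fixed cyclic order over components
            x -= eta * L * (x - d)     # gradient of (L/2)(x - d)^2
    return f(x) - f_star


gap = excess_forgetting(eta=1e-3)
```

Sweeping `eta` over a grid and plotting `excess_forgetting(eta)` would mimic Fig. 2(a); the gap stays strictly positive at any constant step size because the last iterate sits on the limit cycle of the incremental pass rather than at the minimizer.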