Learning to Stop: Deep Learning for Mean Field Optimal Stopping

Authors: Lorenzo Magnino, Yuchen Zhu, Mathieu Lauriere

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the effectiveness and the scalability of our work through numerical experiments on 6 different problems in spatial dimension up to 300. To the best of our knowledge, this is the first work to formalize and computationally solve MFOS in discrete time and finite space, opening new directions for scalable MAOS methods."
Researcher Affiliation | Academia | "(1) Shanghai Frontiers Science Center of Artificial Intelligence and Deep Learning, NYU Shanghai; (2) Machine Learning Center, Georgia Institute of Technology; (3) NYU-ECNU Institute of Mathematical Sciences, NYU Shanghai."
Pseudocode | Yes | "Short versions of the pseudocodes are presented in Alg. 2 and 1. Long versions are in Appx. C (see Alg. 3 and 4)."
Open Source Code | No | "Code is available at Learning-to-Stop." This statement is ambiguous, as it does not provide a direct link to a repository or explicitly state that the code is in the supplementary materials.
Open Datasets | No | "We demonstrate the effectiveness and the scalability of our work through numerical experiments on 6 different problems in spatial dimension up to 300." The paper defines its own synthetic problems/environments (e.g., Example 1: Towards the Uniform; Example 2: Rolling a Die) rather than using external publicly available datasets.
Dataset Splits | No | The paper does not use external datasets that would typically require train/validation/test splits; instead, it defines problems and environments for simulation. For Example 6, it mentions "Tested with the randomly sampled initial distribution," but this is not a traditional dataset split.
Hardware Specification | Yes | "Computational Resources: We run all the numerical experiments on an RTX 4090 GPU and a MacBook Pro with an M2 chip. For any of the test cases, one run took at most 3 minutes on GPUs and 7 minutes on CPUs."
Software Dependencies | No | The paper mentions using "neural networks" and the "AdamW optimizer" but does not provide specific version numbers for software such as Python, PyTorch/TensorFlow, or other libraries used in the implementation.
Experiment Setup | Yes | "Training Hyperparameters: For all the experiments, we choose an initial learning rate of 10^-4 for the AdamW optimizer. Each training is at most 10^4 iterations, with a batch size of 128. Our neural network takes an input pair (x, t)... Our neural network has the following structure... with k residual blocks, each containing 4 linear layers with hidden dimension D and SiLU activation. For all the test cases we have experimented with, we use k = 3, D = 128 for all the 1D experiments and k = 5, D = 256 for the 2D experiments."
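The quoted experiment setup can be sketched in PyTorch as follows. This is a hedged reconstruction, not the authors' code: the report only quotes the block structure (k residual blocks of 4 linear layers with hidden dimension D and SiLU activations, input pair (x, t), AdamW at learning rate 10^-4, batch size 128), so the output head (a scalar sigmoid, read here as a stopping probability) and the skip-connection placement are assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """One residual block: 4 linear layers with SiLU activations plus a
    skip connection. Only the '4 linear layers, hidden dimension D, SiLU'
    part is quoted from the paper; the skip placement is assumed."""
    def __init__(self, dim: int):
        super().__init__()
        layers = []
        for _ in range(4):
            layers += [nn.Linear(dim, dim), nn.SiLU()]
        self.body = nn.Sequential(*layers)

    def forward(self, h):
        return h + self.body(h)

class StoppingNet(nn.Module):
    """Residual MLP taking the pair (x, t). The sigmoid output head
    (a scalar stopping probability) is a hypothetical choice."""
    def __init__(self, x_dim: int, k: int = 3, D: int = 128):
        super().__init__()
        self.embed = nn.Linear(x_dim + 1, D)  # concatenate state x and time t
        self.blocks = nn.Sequential(*[ResidualBlock(D) for _ in range(k)])
        self.head = nn.Linear(D, 1)

    def forward(self, x, t):
        h = self.embed(torch.cat([x, t], dim=-1))
        return torch.sigmoid(self.head(self.blocks(h)))

# Quoted training setup: k = 3, D = 128 for the 1D experiments,
# AdamW with initial learning rate 10^-4, batch size 128.
net = StoppingNet(x_dim=1, k=3, D=128)
opt = torch.optim.AdamW(net.parameters(), lr=1e-4)
out = net(torch.randn(128, 1), torch.rand(128, 1))
print(out.shape)  # one value in (0, 1) per batch element
```

For the 2D experiments the quoted configuration would instead be `StoppingNet(x_dim=2, k=5, D=256)`.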