Learning Dynamics under Environmental Constraints via Measurement-Induced Bundle Structures
Authors: Dongzhe Zheng, Wenjie Mei
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive simulations demonstrate significant improvements in both learning efficiency and constraint satisfaction over traditional methods, especially under limited and uncertain sensing conditions. We design a comprehensive experimental framework to evaluate our proposed method against state-of-the-art approaches, focusing on three interconnected research directions: learning-based safety control, geometric structure learning, and safe control under uncertainty. The experiments are constructed to highlight key methodological differences while ensuring fair comparison through standardized implementations and evaluation protocols. Table 1 presents the quantitative results. |
| Researcher Affiliation | Academia | ¹Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China; ²School of Automation and Key Laboratory of MCCSE of Ministry of Education, Southeast University, Nanjing, China. Correspondence to: Wenjie Mei <EMAIL; EMAIL>. |
| Pseudocode | No | The paper describes the learning framework and algorithms using mathematical equations and descriptive text, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our implementation is publicly available at https://github.com/ContinuumCoder/Measurement-Induced-Bundle-for-Learning-Dynamics/. |
| Open Datasets | No | The experimental tasks are conducted in a simulation environment built on the Genesis physics engine. While Appendix D mentions studying |
| Dataset Splits | No | The experiments involve generating scenarios randomly for each trial within a simulation environment, rather than using a fixed, pre-split dataset. For example: "For all three tasks, the workspace is configured as a 2 m × 2 m × 2 m arena with randomly placed obstacles. The obstacles' positions are sampled uniformly within the workspace..." and "We test the worm robot (500 trials), Franka arm (400 trials), and quadrotor (300 trials) under various task scenarios." While Appendix C.7 mentions a "20% validation split for monitoring training progress" for the neural networks, this pertains to the internal training of the models within the RL framework, not the partitioning of a fixed dataset for the primary experimental evaluation. |
| Hardware Specification | Yes | All networks are trained with Adam optimizer using mixed precision training on an NVIDIA RTX 3090 GPU. Our experiments are conducted on a workstation equipped with an Intel Xeon CPU, NVIDIA RTX 3090 GPU (24GB GDDR6X), and 64GB DDR4 RAM. |
| Software Dependencies | Yes | All experiments are implemented in Python using PyTorch... The software stack consists of Python 3.9 and PyTorch 1.12.0, supported by CUDA 11.7 and cuDNN 8.5 for GPU acceleration. |
| Experiment Setup | Yes | Our SAC implementation follows the standard architecture with carefully tuned hyperparameters. The framework uses a discount factor γ of 0.99 and a soft update coefficient τ of 0.005. The target entropy is set to the negative dimension of the action space, following common practice. All policy components (actor, critic, and entropy networks) use a learning rate of 3×10⁻⁴. The replay buffer maintains 1×10⁶ transitions... The neural network architecture consists of three hidden layers (128-64-32 units) with ReLU activations throughout. Layer normalization is applied after each hidden layer to stabilize training. For barrier functions, we add a tanh activation in the output layer to ensure the boundedness of safety certificates... The training process employs the Adam optimizer with β1 = 0.9 and β2 = 0.999, coupled with a cosine annealing learning rate schedule starting at 5×10⁻⁴. We use a batch size of 256 to fully utilize the GPU memory while maintaining stable gradients. Gradient clipping with a maximum norm of 1.0 prevents extreme parameter updates. Early stopping with a patience of 20 epochs prevents overfitting, and we maintain a 20% validation split for monitoring training progress. To ensure reproducibility, we fix random seeds to 42 across all experiments... |
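Two of the scalar update rules quoted in the Experiment Setup row — the soft target update (τ = 0.005) and the cosine-annealed learning rate starting at 5×10⁻⁴ — can be sketched in plain Python. This is a minimal illustration of the standard formulas; the function names and default values below follow the quoted hyperparameters but are not taken from the authors' released code.

```python
import math

def soft_update(target: float, online: float, tau: float = 0.005) -> float:
    """SAC-style Polyak (soft) target update: theta' <- tau*theta + (1-tau)*theta'.

    With tau = 0.005 as quoted, the target network tracks the online
    network slowly, which stabilizes the critic's bootstrap targets.
    """
    return tau * online + (1.0 - tau) * target

def cosine_annealed_lr(step: int, total_steps: int,
                       lr_max: float = 5e-4, lr_min: float = 0.0) -> float:
    """Cosine annealing schedule: starts at lr_max, decays to lr_min.

    lr(t) = lr_min + (lr_max - lr_min) * (1 + cos(pi * t / T)) / 2
    """
    cos_term = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cos_term

# The schedule opens at 5e-4 and reaches lr_min at the final step.
print(cosine_annealed_lr(0, 100))    # 0.0005 at step 0
print(cosine_annealed_lr(100, 100))  # lr_min at the end
print(soft_update(target=1.0, online=0.0))  # 0.995 after one soft update
```

In a full PyTorch setup, the same behavior is usually obtained via `torch.optim.lr_scheduler.CosineAnnealingLR` and a per-parameter loop applying the soft update to the target critic.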