MIND over Body: Adaptive Thinking using Dynamic Computation

Authors: Mrinal Mathur, Barak Pearlmutter, Sergey Plis

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of this method on language modeling and computer vision tasks. Notably, our model achieves 96.62% accuracy on ImageNet with just a three-layer network, surpassing much larger ResNet-50 and EfficientNet. When applied to a transformer architecture, the approach achieves 95.8%/88.7% F1 scores on the SQuAD v1.1/v2.0 datasets at negligible parameter cost. These results showcase the potential for dynamic and reflective computation, contributing to the creation of intelligent systems that efficiently manage resources based on input data complexity.
Researcher Affiliation | Academia | Mrinal Mathur, TReNDS Center, Georgia State University, Atlanta, GA, USA; Barak A. Pearlmutter, Dept of Computer Science, Maynooth University, Co. Kildare, W23 A3HY, Ireland; Sergey Plis, TReNDS Center, Georgia State University, Atlanta, GA, USA
Pseudocode | Yes | Algorithm 1: Training Procedure for the MIND Model; Algorithm 2: Backward Propagation in the MIND Model; Algorithm 3: Forward Propagation for the MIND Model
Open Source Code | No | The paper does not contain any explicit statement about the release of source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | In language modeling, we evaluate on SQuAD (Rajpurkar et al., 2016) and WikiText (Gardent et al., 2017). To demonstrate that it is domain-agnostic, we also validate on vision benchmarks including CIFAR-100 (Krizhevsky, 2009) and ImageNet (Deng et al., 2009). ... WikiText-2 and WikiText-103 datasets (Merity et al., 2016).
Dataset Splits | Yes | All models were validated using 9-fold cross-validation with 10 different random seeds to ensure stability and robustness of results. ... For vision tasks, we evaluated the MIND model on the CIFAR-100 and ImageNet datasets. CIFAR-100 consists of 60,000 32x32 images in 100 classes, while ImageNet has 1.28M images in 1,000 classes.
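The quoted protocol (9-fold cross-validation repeated over 10 random seeds) can be made concrete with a short stdlib-only sketch. This is not the authors' code; the function names and the round-robin fold assignment are illustrative assumptions, with 60,000 samples chosen to match the CIFAR-100 figure above.

```python
import random

def kfold_indices(n_samples, n_folds=9, seed=0):
    """Shuffle sample indices with a fixed seed, then split them
    into n_folds folds whose sizes differ by at most one."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[f::n_folds] for f in range(n_folds)]  # round-robin assignment

def cv_splits(n_samples, n_folds=9, seeds=range(10)):
    """Yield (seed, fold, train_idx, val_idx) for every seed/fold pair."""
    for seed in seeds:
        folds = kfold_indices(n_samples, n_folds, seed)
        for f, val_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield seed, f, train_idx, val_idx

# 9 folds x 10 seeds = 90 train/validation splits per model.
splits = list(cv_splits(n_samples=60000))
```

Repeating the folds under 10 seeds, as the paper describes, is what lets a mean and variance be reported per metric rather than a single point estimate.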
Hardware Specification | Yes | All experiments were conducted using PyTorch (Paszke et al., 2019) on NVIDIA A40 GPUs with 20GB memory. ... the experiments were conducted in a controlled environment using the ImageNet dataset for classification tasks and NVIDIA A100 GPUs for training and inference.
Software Dependencies | No | All experiments were conducted using PyTorch (Paszke et al., 2019) on NVIDIA A40 GPUs with 20GB memory. The paper mentions PyTorch but does not specify a version number.
Experiment Setup | Yes | The MIND model was optimized using the Adam optimizer (Kingma & Ba, 2014) with an initial learning rate of 1 x 10^-3, decayed by a factor of 0.1 every 30 epochs. The batch size was set to 64. Hyperparameters α, β, γ, and δ in Equation 5 were fine-tuned to 0.5, 0.2, 0.2, and 0.1 respectively, while λ for L_introspect was set to 0.6. Each model was trained for 100 epochs with early stopping, triggered when validation loss did not improve over 10 epochs. Fixed-point iteration (FPI) tolerance for the MIND architecture was set to 1 x 10^-5, with a maximum of 100 iterations per layer.
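Three mechanical details in this setup — the step learning-rate decay, the patience-based early stopping, and the fixed-point iteration cutoff — can be sketched in a few stdlib-only lines. These are illustrative reimplementations of the stated rules, not the authors' code; the function names are assumptions, and the toy contraction map stands in for a MIND layer update.

```python
import math

def step_lr(epoch, base_lr=1e-3, gamma=0.1, step=30):
    """Learning rate of 1e-3 decayed by a factor of 0.1 every 30 epochs."""
    return base_lr * gamma ** (epoch // step)

def should_stop(val_losses, patience=10):
    """Early stopping: True once the best validation loss has not
    improved during the last `patience` epochs."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    return min(val_losses[-patience:]) >= best_before

def fixed_point_iterate(f, x0, tol=1e-5, max_iter=100):
    """Iterate x <- f(x) until the update falls below tol (the paper's
    1e-5 FPI tolerance) or max_iter (100) iterations are reached."""
    x = x0
    for i in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, i + 1   # converged
        x = x_next
    return x, max_iter             # hit the per-layer iteration cap

# Toy contraction: cos(x) has a unique fixed point near 0.739.
x_star, n_iters = fixed_point_iterate(math.cos, x0=1.0)
```

The tolerance/max-iteration pair is what makes the per-layer compute adaptive: easy inputs converge in a few iterations, while the cap of 100 bounds the worst case.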