Nonparametric Modern Hopfield Models

Authors: Jerry Yao-Chieh Hu, Bo-Yu Chen, Dennis Wu, Feng Ruan, Han Liu

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type: Experimental. "Empirically, we validate our framework in both synthetic and realistic settings for memory retrieval and learning tasks. Code is available at GitHub; future updates are on arXiv. ... We conduct numerical experiments to support our framework in Appendix G. ... G Experimental Studies"
Researcher Affiliation: Academia. "1Department of Computer Science, Northwestern University, Evanston, IL, USA 2Department of Physics and Computer Science, National Taiwan University, Taipei, Taiwan 3Department of Statistics and Data Science, Northwestern University, Evanston, IL, USA. Correspondence to: Jerry Yao-Chieh Hu <EMAIL>, Bo-Yu Chen <EMAIL>, Dennis Wu <EMAIL>, Feng Ruan <EMAIL>, Han Liu <EMAIL>."
Pseudocode: No. The paper describes its methods using mathematical equations and prose. It contains no clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured steps formatted like code.
Open Source Code: Yes. "Empirically, we validate our framework in both synthetic and realistic settings for memory retrieval and learning tasks. Code is available at GitHub; future updates are on arXiv."
Open Datasets: Yes. "In the memory retrieval task, we examine two datasets: MNIST (sparse) and CIFAR10 (dense). ... We evaluate Dense Hopfield, Sparse Hopfield, and Top-K Hopfield models on a Multiple Instance Learning (MIL) task using MNIST bags. ... We utilize four datasets: Elephant, Fox, and Tiger for image annotation (Ilse et al., 2018), and UCSB breast cancer classification (Kandemir et al., 2014). ... We conduct the experiments on five real-world multivariate time series datasets: ETTh1 (Electricity Transformer Temperature-hourly), ETTm1 (Electricity Transformer Temperature-minutely), WTH (Weather), ECL (Electricity Consuming Load), and Traffic."
Dataset Splits: Yes. "In each fold, we utilize a stratified sampling process to partition the data into a training set and a validation set, with a split rate of 0.1."
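The stratified split described above can be sketched as follows. This is a minimal illustration assuming scikit-learn's `train_test_split`; the paper does not name the tool it uses, and the features and labels here are synthetic placeholders.

```python
# Sketch of a stratified train/validation split with a 0.1 split rate,
# assuming scikit-learn (the paper does not specify the implementation).
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))     # placeholder features
y = rng.integers(0, 2, size=1000)  # placeholder binary labels

# test_size=0.1 sends 10% of the fold to validation;
# stratify=y preserves the class proportions in both partitions.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.1, stratify=y, random_state=0
)
```

With 1,000 examples this yields 900 training and 100 validation samples, with the label distribution of `y_val` matching that of `y` up to rounding.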
Hardware Specification: Yes. "All experiments are conducted on a platform with an NVIDIA GeForce RTX 2080 Ti and an Intel Xeon Silver 4214 @ 2.20 GHz."
Software Dependencies: Yes. "We use PyTorch 1.8.0 for all experiments, and use Ray Tune for hyperparameter search."
Experiment Setup: Yes. "All models are trained using the AdamW optimizer for 150 epochs, with a cosine annealing learning rate decay applied to all models." Table 2 lists the hyperparameters used in the MIL MNIST experiment:
- batch size: 256
- learning rate: 1e-3
- embedding dimension: 256
- number of heads: 4
- head dimension: 64
- test set size: 500
- train set size: 2000
- scaling: 0.1
- number of patterns: 2
- epochs: 150
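The stated optimization setup (AdamW with cosine annealing over 150 epochs) maps directly onto standard PyTorch components. The sketch below uses a placeholder linear model rather than the paper's Hopfield networks, and omits the actual data loading and loss computation.

```python
# Minimal sketch of the reported optimization setup: AdamW + cosine
# annealing LR decay over 150 epochs. The model is a stand-in; real
# training would compute a loss and backpropagate before each step.
import torch

model = torch.nn.Linear(256, 2)  # placeholder, not the Hopfield model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=150)

for epoch in range(150):
    # ... iterate over batches of size 256, compute loss, loss.backward() ...
    optimizer.step()   # placeholder step; gradients would be applied here
    scheduler.step()   # follow the cosine decay schedule once per epoch
```

With `T_max=150`, the learning rate follows half a cosine period from 1e-3 down to (near) zero by the final epoch.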