Nonparametric Modern Hopfield Models
Authors: Jerry Yao-Chieh Hu, Bo-Yu Chen, Dennis Wu, Feng Ruan, Han Liu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we validate our framework in both synthetic and realistic settings for memory retrieval and learning tasks. Code is available at GitHub; future updates are on arXiv. ... We conduct numerical experiments to support our framework in Appendix G. ... G Experimental Studies |
| Researcher Affiliation | Academia | 1Department of Computer Science, Northwestern University, Evanston, IL, USA 2Department of Physics and Computer Science, National Taiwan University, Taipei, Taiwan 3Department of Statistics and Data Science, Northwestern University, Evanston, IL, USA. Correspondence to: Jerry Yao-Chieh Hu <EMAIL>, Bo-Yu Chen <EMAIL>, Dennis Wu <EMAIL>, Feng Ruan <EMAIL>, Han Liu <EMAIL>. |
| Pseudocode | No | The paper describes methods using mathematical equations and prose. It does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code. |
| Open Source Code | Yes | Empirically, we validate our framework in both synthetic and realistic settings for memory retrieval and learning tasks. Code is available at GitHub; future updates are on arXiv. |
| Open Datasets | Yes | In the memory retrieval task, we examine two datasets: MNIST (sparse) and CIFAR10 (dense). ... We evaluate Dense Hopfield, Sparse Hopfield, and Top-K Hopfield models on a Multiple Instance Learning (MIL) task using MNIST bags. ... We utilize four datasets: Elephant, Fox, and Tiger for image annotation (Ilse et al., 2018), and UCSB breast cancer classification (Kandemir et al., 2014). ... We conduct the experiments on five multivariate time series real-world datasets: ETTh1 (Electricity Transformer Temperature-hourly), ETTm1 (Electricity Transformer Temperature-minutely), WTH (Weather), ECL (Electricity Consuming Load), and Traffic. |
| Dataset Splits | Yes | In each fold, we utilize a stratified sampling process to partition the data into a training set and a validation set, with a split rate of 0.1. |
| Hardware Specification | Yes | All experiments are conducted on the platform with NVIDIA GEFORCE RTX 2080 Ti and INTEL XEON SILVER 4214 @ 2.20GHz. |
| Software Dependencies | Yes | We use PyTorch 1.8.0 for all experiments, and use Ray Tune for hyperparameter search. |
| Experiment Setup | Yes | All models are trained using the AdamW optimizer for 150 epochs, with a cosine annealing learning rate decay applied to all models. ... Table 2. Hyperparameters used in the MIL MNIST experiment: batch size 256; learning rate 1e-3; embedding dimension 256; number of heads 4; head dimension 64; test set size 500; train set size 2000; scaling 0.1; number of patterns 2; epochs 150. |
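
The training recipe quoted above (AdamW for 150 epochs with cosine annealing from a base learning rate of 1e-3) can be sketched without any framework dependency. This is a minimal sketch, not the authors' code: the hyperparameter names mirror Table 2, the schedule below is the standard cosine annealing rule, and the minimum learning rate `eta_min = 0.0` is an assumption not stated in the paper.

```python
import math

# Hyperparameters quoted from Table 2 (MIL MNIST experiment).
HPARAMS = {
    "batch_size": 256,
    "learning_rate": 1e-3,
    "embedding_dimension": 256,
    "num_heads": 4,
    "head_dimension": 64,
    "test_set_size": 500,
    "train_set_size": 2000,
    "scaling": 0.1,
    "num_patterns": 2,
    "epochs": 150,
}

def cosine_annealing_lr(epoch, base_lr=1e-3, total_epochs=150, eta_min=0.0):
    """Standard cosine annealing schedule (eta_min=0 is an assumption).

    Decays from base_lr at epoch 0 to eta_min at total_epochs,
    matching what PyTorch's CosineAnnealingLR scheduler computes.
    """
    return eta_min + 0.5 * (base_lr - eta_min) * (
        1 + math.cos(math.pi * epoch / total_epochs)
    )

# The schedule starts at the base learning rate and decays to eta_min:
lr_start = cosine_annealing_lr(0, HPARAMS["learning_rate"], HPARAMS["epochs"])
lr_end = cosine_annealing_lr(HPARAMS["epochs"], HPARAMS["learning_rate"], HPARAMS["epochs"])
```

In a PyTorch 1.8.0 setup like the one the paper reports, this corresponds to pairing `torch.optim.AdamW` with `torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=150)`.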