InstaTrain: Adaptive Training via Ultra-Fast Natural Annealing within Dynamical Systems
Authors: Chuan Liu, Ruibing Song, Chunshu Wu, Pouya Haghi, Tong Geng
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on highly dynamic datasets demonstrate that our method achieves orders-of-magnitude improvements in training speed and energy efficiency while delivering superior accuracy compared to baselines running on GPUs. |
| Researcher Affiliation | Academia | University of Rochester, Rochester, NY, USA. Email: {phaghi}@ur.rochester.edu |
| Pseudocode | Yes | Algorithm 1 Iterative Natural Annealing Training. Input: training set T = {s_1, s_2, ..., s_M}, initial J_0, h_0, learning rate η, and training epochs N_iter. Output: trained Hamiltonian parameters J, h. 1: Initialize J ← J_0, h ← h_0. 2: for i = 1 to N_iter do 3: for each s_j = (s_j^t, s_j^{t+1}) in T do 4: Clamp the first half of the nodes to s_j^t 5: Perform natural annealing to obtain ŝ_j^{t+1} 6: Get ⟨σ_i σ_j⟩_model and ⟨σ_i²⟩_model based on (s_j^t, ŝ_j^{t+1}) 7: Get ⟨σ_i σ_j⟩_data and ⟨σ_i²⟩_data based on (s_j^t, s_j^{t+1}) 8: Update J_ij ← J_ij − η(⟨σ_i σ_j⟩_model − ⟨σ_i σ_j⟩_data) 9: Update h_i ← h_i − η(−⟨σ_i²⟩_model + ⟨σ_i²⟩_data) 10: end for 11: end for 12: return J, h |
| Open Source Code | No | The paper does not contain any explicit statement about providing source code or a link to a code repository. |
| Open Datasets | Yes | Carbon-Oxide consists of sampled time series data collected from a gas delivery platform facility, capturing readings from chemical sensors exposed to varying concentrations of carbon oxide and ethylene mixtures (Fonollosa et al., 2015b). Similarly, Methane includes sampled data from chemical sensors exposed to mixtures of methane and ethylene at varying concentration levels (Fonollosa et al., 2015b). Stock contains sampled stock data of S&P-500 (Nasdaq). Ammonia includes sampled time series recordings from a chemical detection platform, featuring data from 72 metal-oxide sensors across six different locations, all maintained under consistent wind speed and operating temperatures (Fonollosa et al., 2015a). Toluene comprises sampled time series recordings from 72 sensors at one location, collected under ten varying conditions (two wind speeds and five operating temperatures) from a chemical detection platform (Fonollosa et al., 2015a). |
| Dataset Splits | Yes | Models are trained on the first 25% of each dataset and evaluated on the remaining 75%. We compare against Graph Neural Networks (GNNs), Transformer-based time series prediction models, and NPGL (Wu et al., 2024). ... In particular, the models are updated once after observing 1,000 snapshots, equivalent to 10 seconds in the real world. After each update, the model is tested on the subsequent 1,000 snapshots. ... InstaTrain is updated once every 100 snapshots (equivalent to 1 second in real time), leveraging its rapid adaptation capability. After each update, the model is evaluated on the subsequent 100 snapshots. |
| Hardware Specification | Yes | We evaluate the accuracy and inference latency of GNNs, Transformer-based models, and online learning models using an NVIDIA A100-40GB GPU. ... The accuracy and inference latency of NPGL, along with the accuracy, training latency, and inference latency of InstaTrain, are assessed using a CUDA-accelerated Finite Element Analysis (FEA) software simulator implemented based on BRIM (Afoakwa et al., 2021). |
| Software Dependencies | No | The paper mentions a CUDA-accelerated Finite Element Analysis (FEA) software simulator and the Cadence Mixed-Signal Design Environment, but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | Algorithm 1 Iterative Natural Annealing Training. Input: training set T = {s_1, s_2, ..., s_M}, initial J_0, h_0, learning rate η, and training epochs N_iter. ... In particular, the models are updated once after observing 1,000 snapshots, equivalent to 10 seconds in the real world. ... InstaTrain is updated once every 100 snapshots (equivalent to 1 second in real time). |
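The training loop quoted in the Pseudocode row can be sketched in Python. This is a minimal illustration, not the paper's implementation: `anneal` is a hypothetical stand-in for the hardware natural-annealing step (clamp the first half of the nodes to the current snapshot, then read out the predicted next state), and the statistics are computed from single samples rather than hardware averages. The update signs follow Algorithm 1 as quoted.

```python
import numpy as np

def natural_annealing_train(T, J0, h0, eta, n_iter, anneal):
    """Sketch of Algorithm 1 (Iterative Natural Annealing Training).

    T       : list of (s_t, s_t1) snapshot pairs
    anneal  : hypothetical callable anneal(J, h, s_t) -> predicted s_{t+1}
    """
    J, h = J0.copy(), h0.copy()
    for _ in range(n_iter):
        for s_t, s_t1 in T:
            s_hat = anneal(J, h, s_t)  # clamp first half, natural annealing
            # correlation statistics under the model vs. under the data
            model_state = np.concatenate([s_t, s_hat])
            data_state = np.concatenate([s_t, s_t1])
            corr_model = np.outer(model_state, model_state)
            corr_data = np.outer(data_state, data_state)
            # contrastive updates, signs as in lines 8-9 of Algorithm 1
            J -= eta * (corr_model - corr_data)
            h -= eta * (-np.diag(corr_model) + np.diag(corr_data))
    return J, h
```

With a real annealer the model statistics would be averaged over annealing runs; here a single readout keeps the sketch short.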
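The streaming protocol described in the Dataset Splits and Experiment Setup rows (update on a window of snapshots, then test on the next window; window = 1,000 for the baselines, 100 for InstaTrain) can be sketched as a generic prequential loop. The function names and the exact windowing are assumptions for illustration, not the paper's harness.

```python
def streaming_eval(snapshots, update_model, test_model, window):
    """Hypothetical sketch of the streaming protocol: after each update
    on `window` snapshots, the model is scored on the next `window`."""
    scores = []
    for start in range(0, len(snapshots) - 2 * window + 1, window):
        train_chunk = snapshots[start:start + window]
        test_chunk = snapshots[start + window:start + 2 * window]
        update_model(train_chunk)            # e.g. one InstaTrain update
        scores.append(test_model(test_chunk))  # score on the next window
    return scores
```

Under this sketch, the baselines would be called with `window=1000` and InstaTrain with `window=100`, matching the cadences quoted from the paper.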