NExUME: Adaptive Training and Inference for DNNs under Intermittent Power Environments

Authors: Cyan Subhra Mishra, Deeksha Chaudhary, Jack Sampson, Mahmut Kandemir, Chita Das

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our results show a 6% to 22% improvement in accuracy over current methods, with an increase of less than 5% in computational overhead. This paper details the development of the adaptive training framework, describes the integration of energy profiles with dropout and quantization adjustments, and presents a comprehensive evaluation using real-world data.
Researcher Affiliation | Academia | Cyan Subhra Mishra, Deeksha Chaudhary, Jack Sampson, Mahmut Taylan Kandemir, Chita Das, CSE Department, The Pennsylvania State University
Pseudocode | Yes | Appendix F (Pseudo Codes), F.1 Depth-wise Separable Convolution 2D using TI LEA: Algorithm 1, Implementing Depth-wise Separable Convolution DWSepConv2D() using CONV1D(); Algorithm 2, Depth-wise Separable Convolution 2D using TI LEA; Algorithm 3, Task-Based CONV2D using TI LEA.
Open Source Code | No | The paper does not explicitly provide a link to the source code for the methodology described. It provides a link only for a novel dataset developed by the authors.
Open Datasets | Yes | We have developed a first-of-its-kind machine status monitoring dataset, available at https://hackmd.io/@Galben/rk7YN6jmR, which involves mounting multiple types of sensors at various locations on a Bridgeport machine to monitor its activity status. For image data, we consider the Fashion-MNIST (Xiao et al., 2017) and CIFAR10 (Alex, 2009) datasets; for time series sensor data, we focus on popular human activity recognition (HAR) datasets, MHEALTH (Banos et al., 2014) and PAMAP2 (Reiss & Stricker, 2012); and for audio, we use the Audio MNIST (Becker et al., 2023) dataset.
Dataset Splits | No | The paper mentions several datasets (e.g., Fashion-MNIST, CIFAR10, MHEALTH, PAMAP2, Audio MNIST, and a new machine status monitoring dataset) and presents accuracy results. However, it does not explicitly state the training, validation, or test splits (e.g., percentages, sample counts, or references to predefined splits) used for these datasets, which would be necessary for reproducibility.
Hardware Specification | Yes | Our training infrastructure utilizes NVIDIA A6000 GPUs with 48 GiB of memory, supported by a 24-core Intel Xeon Gold 6336Y CPU. ... For commercial off-the-shelf micro-controllers, we choose the Texas Instruments MSP430FR5994 (Instruments, 2024a) and Arduino Nano 33 BLE Sense (Arduino, 2024) as our deployment platforms, with a Pixel 5 phone as the host device. ... Table 2 presents the energy efficiency in MOps/Joule for each dataset on different hardware platforms using piezoelectric and thermal energy harvesting. NExUME achieves the highest energy efficiency across all platforms and datasets. This demonstrates that NExUME not only improves accuracy but also enhances energy utilization, making it highly suitable for deployment in energy-constrained intermittent environments.
Software Dependencies | Yes | Our training infrastructure utilizes NVIDIA A6000 GPUs with 48 GiB of memory, supported by a 24-core Intel Xeon Gold 6336Y CPU. We employ PyTorch v2.3.0 coupled with CUDA version 11.8 as our primary training framework.
Experiment Setup | No | The paper describes dynamic adjustment mechanisms for parameters such as dropout rates and quantization levels based on energy availability (equations 1 and 2), along with an adaptive regularization strategy. However, it does not provide concrete hyperparameters (e.g., initial learning rates, batch sizes, number of epochs) or fixed experimental settings (e.g., specific values for d_max, E_max, q_min, q_max, lambda_1, lambda_2, theta_low, theta_high, or random seeds) that would be needed to reproduce the main experimental results tables. The latency SLO (500 ms) for baselines is mentioned as a constraint for evaluation, not a hyperparameter for training NExUME. The sensitivity study varies latency and capacitance but does not provide the fixed values used for the reported main results.
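The pseudocode row above (Algorithm 1) describes building a depth-wise separable Conv2D out of CONV1D calls, the FIR-style primitive exposed by TI's Low Energy Accelerator (LEA) on the MSP430FR5994. A minimal NumPy sketch of that decomposition, assuming unit stride and "valid" padding; the paper's exact tiling and task structure are not reproduced here:

```python
import numpy as np

def conv1d_valid(x, k):
    # 1-D "valid" sliding dot product: the kind of FIR/MAC routine
    # an accelerator such as TI's LEA provides as its primitive.
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

def dwsep_conv2d(x, dw_k, pw_k):
    """Depth-wise separable Conv2D assembled from CONV1D calls.

    x    : (C, H, W) input feature map
    dw_k : (C, Kh, Kw) one spatial kernel per input channel
    pw_k : (Cout, C) 1x1 pointwise kernels
    """
    C, H, W = x.shape
    _, Kh, Kw = dw_k.shape
    Ho, Wo = H - Kh + 1, W - Kw + 1
    # Depth-wise stage: each channel convolves with its own kernel,
    # realized as Kh row-wise 1-D convolutions accumulated per output row.
    dw = np.zeros((C, Ho, Wo))
    for c in range(C):
        for i in range(Ho):
            acc = np.zeros(Wo)
            for r in range(Kh):
                acc += conv1d_valid(x[c, i + r], dw_k[c, r])
            dw[c, i] = acc
    # Pointwise stage: the 1x1 convolution mixes channels,
    # which is just a matrix multiply over the channel axis.
    return np.tensordot(pw_k, dw, axes=([1], [0]))
```

The channel-separated loops mirror why this factorization suits a 1-D accelerator: every inner step is a short FIR call rather than a full 2-D kernel sweep.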
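Equations 1 and 2 cited in the Experiment Setup row tie the dropout rate and quantization level to available energy via the bounds d_max, E_max, q_min, and q_max. Since the report does not reproduce the equations themselves, the linear schedules below are only an illustrative assumption of how such energy-adaptive rules could look, not the paper's exact formulas:

```python
def adaptive_dropout(E, E_max, d_max):
    """Hypothetical rule: less harvested energy -> heavier dropout.

    E, E_max, d_max follow the symbol names quoted in the report;
    the linear interpolation itself is an assumption.
    """
    E = min(max(E, 0.0), E_max)  # clamp energy to [0, E_max]
    return d_max * (1.0 - E / E_max)

def adaptive_quant_bits(E, E_max, q_min, q_max):
    """Hypothetical rule: more energy -> higher quantization precision."""
    E = min(max(E, 0.0), E_max)
    return round(q_min + (q_max - q_min) * (E / E_max))
```

Under this sketch a fully charged harvester (E = E_max) yields zero extra dropout and full-width quantization, while an energy drought pushes the network toward its most aggressive d_max / q_min operating point.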