PerfSeer: An Efficient and Accurate Deep Learning Models Performance Predictor

Authors: Xinlong Zhao, Jiande Sun, Jia Zhang, Tong Liu, Ke Liu

IJCAI 2025

Reproducibility assessment (Variable / Result / LLM Response):
Research Type: Experimental. LLM Response: "We constructed a dataset containing performance metrics for 53k+ model configurations, including execution time, memory usage, and Streaming Multiprocessor (SM) utilization during both training and inference. The evaluation results show that PerfSeer outperforms nn-Meter, BRP-NAS, and DIPPM."
Researcher Affiliation: Collaboration. LLM Response: "Xinlong Zhao, Jiande Sun, Jia Zhang, and Ke Liu (Shandong Normal University); Tong Liu (IEIT SYSTEMS Co., Ltd.)."
Pseudocode: No. LLM Response: The paper describes the workflow and update functions of the Seer Block using equations and textual descriptions (e.g., $e'_j = \phi^e(e_j, v_{s_j}, v_{t_j})$ and $v'_i = \phi^v(\bar{e}'_i, v_i, z, u)$) and an architectural diagram (Figure 2). However, it does not present these as a structured pseudocode or algorithm block with typical formatting elements such as loops, conditional statements, or explicit labels like "Algorithm 1".
Open Source Code: Yes. LLM Response: "We construct a performance dataset (https://github.com/upuuuuuu/PerfSeer)."
Open Datasets: Yes. LLM Response: "We constructed a dataset with over 53k model configurations, covering key performance metrics such as execution time, memory usage, and Streaming Multiprocessor (SM) utilization during both training and inference on an NVIDIA GeForce RTX 3090. [...] We construct a performance dataset (https://github.com/upuuuuuu/PerfSeer)."
Dataset Splits: Yes. LLM Response: "The dataset is divided 2:1:1 for training, validation, and testing."
Hardware Specification: Yes. LLM Response: "We constructed a dataset with over 53k model configurations, covering key performance metrics such as execution time, memory usage, and Streaming Multiprocessor (SM) utilization during both training and inference on an NVIDIA GeForce RTX 3090. [...] We evaluated the overhead of PerfSeer on an Intel i7-11700 CPU."
Software Dependencies: No. LLM Response: "PerfSeer is compatible with multiple DL frameworks, such as PyTorch, TensorFlow, and MXNet, unlike other predictors that support only a few. [...] We use a batch size of 128 and an initial learning rate of 1e-3, halving it after five epochs without improvement, down to 1e-6. Training runs for up to 500 epochs, with Mean Squared Error (MSE) as the loss function and Adam as the optimizer." Although deep learning frameworks and an optimizer are mentioned, no specific version numbers for these software components are provided to ensure reproducibility.
Experiment Setup: Yes. LLM Response: "We use a batch size of 128 and an initial learning rate of 1e-3, halving it after five epochs without improvement, down to 1e-6. Training runs for up to 500 epochs, with Mean Squared Error (MSE) as the loss function and Adam as the optimizer."
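The Seer Block update functions quoted in the Pseudocode row follow the general Graph Network pattern of per-edge and per-node updates. A minimal NumPy sketch of that pattern, as an illustration rather than the authors' implementation: the stand-in tanh/linear functions and the mean aggregation of edges to their target node are assumptions.

```python
import numpy as np

# Illustrative sketch (not the authors' code) of GN-style updates:
#   e'_j = phi_e(e_j, v_{s_j}, v_{t_j})   per-edge update
#   v'_i = phi_v(e_bar'_i, v_i, z, u)     per-node update
# phi_e / phi_v are stand-in tanh-linear maps; aggregation is a mean.
def seer_block_step(v, e, src, tgt, z, u, W_e, W_v):
    """v: (N, d) node feats, e: (E, d) edge feats,
    src/tgt: (E,) int index arrays, z/u: (d,) global vectors,
    W_e: (3d, d), W_v: (4d, d) stand-in parameter matrices."""
    # Per-edge update from the edge and its endpoint nodes.
    e_new = np.tanh(np.concatenate([e, v[src], v[tgt]], axis=1) @ W_e)
    # Mean-aggregate updated edges onto their target node (e_bar'_i).
    agg = np.zeros_like(v)
    np.add.at(agg, tgt, e_new)
    deg = np.maximum(np.bincount(tgt, minlength=v.shape[0]), 1)[:, None]
    agg = agg / deg
    # Per-node update from aggregated edges, node feats, and globals.
    n = v.shape[0]
    glob = np.concatenate([np.tile(z, (n, 1)), np.tile(u, (n, 1))], axis=1)
    v_new = np.tanh(np.concatenate([agg, v, glob], axis=1) @ W_v)
    return v_new, e_new
```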
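The 2:1:1 train/validation/test split stated in the Dataset Splits row can be sketched as follows; the shuffling, seed, and index bookkeeping are assumptions, since the paper's exact splitting protocol is not quoted.

```python
import random

# Minimal sketch of a 2:1:1 split (assumed shuffled with a fixed seed;
# the paper only states the ratio, not the protocol).
def split_2_1_1(items, seed=0):
    idx = list(range(len(items)))
    random.Random(seed).shuffle(idx)
    n = len(items)
    n_train = n // 2            # 2 parts out of 4
    n_val = n // 4              # 1 part
    train = [items[i] for i in idx[:n_train]]
    val = [items[i] for i in idx[n_train:n_train + n_val]]
    test = [items[i] for i in idx[n_train + n_val:]]  # remaining 1 part
    return train, val, test
```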
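The learning-rate schedule in the Experiment Setup row (start at 1e-3, halve after five epochs without improvement, floor at 1e-6) corresponds to a ReduceLROnPlateau-style policy. A framework-free sketch, with the counter-reset behaviour after each halving as an assumption:

```python
# Sketch of the reported schedule: lr starts at 1e-3, halves after
# five epochs without validation improvement, and never drops below
# 1e-6. Resetting the patience counter after a halving is an assumption.
class HalvingScheduler:
    def __init__(self, lr=1e-3, patience=5, min_lr=1e-6):
        self.lr, self.patience, self.min_lr = lr, patience, min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Call once per epoch with the validation loss; returns the lr."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr = max(self.lr / 2, self.min_lr)
                self.bad_epochs = 0
        return self.lr
```

In PyTorch terms this would be roughly `ReduceLROnPlateau(optimizer, factor=0.5, patience=5, min_lr=1e-6)` wrapped around the Adam optimizer the paper reports.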