Open-CK: A Large Multi-Physics Fields Coupling benchmarks in Combustion Kinetics

Authors: Zaige Fei, Fan Xu, Junyuan Mao, Yuxuan Liang, Qingsong Wen, Kun Wang, Hao Wu, Yang Wang

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type | Experimental | In this paper, we use the Fire Dynamics Simulator (FDS) combined with supercomputer support to create a Combustion Kinetics (CK) dataset for machine learning and scientific research. This dataset captures the development of fires in industrial parks with high-precision Computational Fluid Dynamics (CFD) simulations. It includes various physical fields such as temperature and pressure, and covers multiple environmental combinations for exploring multi-physics field coupling phenomena. Additionally, we evaluate several advanced machine learning architectures across our Open-CK benchmark using a substantial computational setup of 64 NVIDIA A100 GPUs... We also introduce three benchmarks to demonstrate their potential in enhancing the exploration of downstream tasks... The Open-CK dataset and benchmarks aim to advance research in combustion kinetics driven by machine learning, providing a reliable baseline for developing and comparing cutting-edge technologies and models.
Researcher Affiliation | Collaboration | Zaige Fei (1), Fan Xu (1), Junyuan Mao (1), Yuxuan Liang (2), Qingsong Wen (3), Kun Wang (1,4), Hao Wu (1), Yang Wang (1). (1) University of Science and Technology of China; (2) The Hong Kong University of Science and Technology (Guangzhou); (3) Squirrel Ai Learning; (4) Nanyang Technological University.
Pseudocode | Yes | Figure 8: Example of Python code for processing multiple NumPy files and creating a sliding window view of the data. Figure 14: Pseudocode for processing an FDS file and adding sensor data in Python. Figure 15: Pseudocode for filtering and cleaning a CSV file, transforming its dimensions, and finally saving it as a .npy file in Python.
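The sliding-window preprocessing described in the paper's Figure 8 can be sketched as follows. This is a minimal illustration using NumPy's `sliding_window_view`, not the authors' code; the array shapes, function name, and window length are assumptions for demonstration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(field, window=10):
    """Build overlapping temporal windows over a (T, H, W) field array.

    Hypothetical helper: returns shape (T - window + 1, window, H, W),
    i.e. each sample is `window` consecutive timesteps of the 2D field.
    """
    # sliding_window_view appends the window axis at the end: (T-w+1, H, W, w)
    views = sliding_window_view(field, window_shape=window, axis=0)
    # Move the window axis next to the sample axis for model input
    return np.moveaxis(views, -1, 1)

# Toy temperature field: 20 timesteps on an 8x8 grid
field = np.random.rand(20, 8, 8)
windows = make_windows(field, window=10)
print(windows.shape)  # (11, 10, 8, 8)
```

`sliding_window_view` returns a zero-copy view, so this scales to the large simulation arrays the dataset describes without duplicating memory until a model actually consumes the windows.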
Open Source Code | Yes | The Large ST benchmark dataset is released under a CC BY-NC 4.0 International License: https://creativecommons.org/licenses/by-nc/4.0. Our code implementation is released under the MIT License: https://opensource.org/licenses/MIT.
Open Datasets | Yes | Open-CK is the first open-source benchmark dedicated to the study of combustion fluid dynamics, created through over 360 hours of numerical simulations supported by supercomputers... The Large ST benchmark dataset is released under a CC BY-NC 4.0 International License: https://creativecommons.org/licenses/by-nc/4.0.
Dataset Splits | No | We select partial data for our main experiment. Specifically, with a heat release rate of 5 MW, a single fire source, and one wind direction, we simulate wind speeds of 1 m/s, 2 m/s, 3 m/s, 4 m/s, and 5 m/s... Through numerical simulation, we obtain temperature data during the fire evolution, which we use as the original training and testing dataset. The paper mentions using generated data for training and testing but does not specify explicit percentages, counts, or methodologies for splitting the dataset into distinct train, validation, and test sets.
Hardware Specification | Yes | Additionally, we evaluate several advanced machine learning architectures across our Open-CK benchmark using a substantial computational setup of 64 NVIDIA A100 GPUs... We train on 64 NVIDIA 40G-A100 GPUs.
Software Dependencies | No | We generate this data using version 6.9.1 of the FDS... This process generates the FDS files... Python is a widely used programming language... We used AutoCAD software... Python scripts (see Figure 15) to filter and clean the data... we used Python scripts... The process uses the NumPy library in PyTorch for numerical computations. The paper mentions software like FDS (with version 6.9.1), AutoCAD, PyroSim, Python, NumPy, and PyTorch, but only FDS is provided with a specific version number. Other key libraries or frameworks used for the ML experiments do not have explicit version numbers.
Experiment Setup | Yes | All backbones in this paper train with MSE loss, use the Adam optimizer Kingma & Ba (2014), and set the learning rate to 1e-3. The batch size is 50, and training early stops within 500 epochs. We train on 64 NVIDIA 40G-A100 GPUs.
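The reported setup (MSE loss, Adam with learning rate 1e-3, batch size 50, early stopping within a 500-epoch cap) corresponds to a PyTorch configuration along these lines. This is a sketch, not the authors' code: the backbone and data are placeholders, and the patience value is an assumption, since the paper states only the epoch cap.

```python
import torch
import torch.nn as nn

# Placeholder backbone and data; the real benchmark trains on Open-CK fields
model = nn.Sequential(nn.Flatten(), nn.Linear(64, 64))
x, y = torch.randn(200, 8, 8), torch.randn(200, 64)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(x, y), batch_size=50, shuffle=True
)

criterion = nn.MSELoss()                                    # MSE loss, as reported
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # Adam, lr 1e-3

best_loss, patience, bad_epochs = float("inf"), 10, 0       # patience is an assumption
for epoch in range(500):                                    # early stop within 500 epochs
    epoch_loss = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    # Stop when the training loss has not improved for `patience` epochs
    if epoch_loss < best_loss:
        best_loss, bad_epochs = epoch_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
```

Early stopping on training loss is the simplest reading of "training early stops within 500 epochs"; a held-out validation loss would be the more common criterion, but the paper does not describe a validation split.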