Unisoma: A Unified Transformer-based Solver for Multi-Solid Systems

Authors: Shilong Tao, Zhe Feng, Haonan Sun, Zhanxing Zhu, Yunhuai Liu

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Experimentally, Unisoma achieves consistent state-of-the-art performance across seven well-established datasets and two complex multi-solid tasks. Code is available at https://github.com/therontau0054/Unisoma. We evaluate Unisoma in extensive experiments, comprising two essential tasks and seven well-established datasets, covering multi-solid systems of varying complexity. Our experiments span settings from a few solids to many solids, all in 3D space. Deforming Plate (Pfaff et al., 2021), Cavity Grasping (Linkerhägner et al., 2023), Tissue Manipulation (Linkerhägner et al., 2023) and Rice Grip (Li et al., 2019) are public datasets widely used for autoregressive tasks. To explore more complex scenarios, we construct three datasets: Bilateral Stamping, Unilateral Stamping and Cavity Extruding... We comprehensively compare Unisoma against more than ten baselines within the implicit modeling paradigm. For Unisoma, we set the number of processors, the hidden channels and the slice number to 2, 128 and 32, respectively, across all experiments. All experiments are conducted on a single RTX 3090 GPU and repeated three times. We use Relative L2 and Root Mean Square Error (RMSE) as evaluation metrics for long-time prediction and autoregressive simulation, respectively.
Researcher Affiliation Academia 1 School of Computer Science, Peking University, Beijing, China; 2 School of Electronics and Computer Science, University of Southampton, UK.
Pseudocode No The paper describes the model architecture and components using mathematical equations and textual descriptions (e.g., in Sections 3.1, 3.2, 3.3, and Appendix A), but it does not contain a clearly labeled pseudocode block, algorithm block, or structured code-like procedure.
Open Source Code Yes Code is available at https://github.com/therontau0054/Unisoma.
Open Datasets Yes Deforming Plate (Pfaff et al., 2021), Cavity Grasping (Linkerhägner et al., 2023), Tissue Manipulation (Linkerhägner et al., 2023) and Rice Grip (Li et al., 2019) are public datasets widely used for autoregressive tasks.
Dataset Splits Yes Deforming Plate (Pfaff et al., 2021)... we use 1,000 samples for training, 100 for validation, and 100 for testing. Cavity Grasping (Linkerhägner et al., 2023)... 600 samples are used for training, 120 for validation, and 120 for testing. Tissue Manipulation (Linkerhägner et al., 2023)... we allocate 600 samples for training, 120 for validation, and 120 for testing. Rice Grip (Li et al., 2019)... the numbers of samples used for training, validation and testing are 1,000, 100 and 100, respectively. Bilateral Stamping... 1,000 are used for training, 100 for validation, and 100 for testing. Unilateral Stamping... 1,000 for training, 100 for validation, and 100 for testing. Cavity Extruding... 1,000 are used for training, 100 for validation, and 100 for testing. OOD generalization... The numbers of samples in the training, validation and test sets are 900, 83 and 217.
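The split sizes quoted above can be collected in one place for sanity-checking. A minimal sketch (the dataset names are shorthand for this report, not identifiers from the released code):

```python
# Hypothetical summary of the train/validation/test counts reported in the
# paper; not taken from the authors' repository.
SPLITS = {
    "Deforming Plate":     (1000, 100, 100),
    "Cavity Grasping":     (600, 120, 120),
    "Tissue Manipulation": (600, 120, 120),
    "Rice Grip":           (1000, 100, 100),
    "Bilateral Stamping":  (1000, 100, 100),
    "Unilateral Stamping": (1000, 100, 100),
    "Cavity Extruding":    (1000, 100, 100),
    "OOD generalization":  (900, 83, 217),
}

def total_samples(name):
    """Total sample count for one dataset (train + val + test)."""
    train, val, test = SPLITS[name]
    return train + val + test
```

For example, `total_samples("OOD generalization")` gives 1,200 samples overall.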
Hardware Specification Yes All experiments are conducted on a single RTX 3090 GPU (24GB memory) and repeated three times.
Software Dependencies Yes These datasets are calculated by ABAQUS software (Abaqus, 2011).
Experiment Setup Yes For Unisoma, we set the number of processors, the hidden channels and the slice number to 2, 128 and 32, respectively, across all experiments. All experiments are conducted on a single RTX 3090 GPU and repeated three times. We use Relative L2 and Root Mean Square Error (RMSE) as evaluation metrics for long-time prediction and autoregressive simulation, respectively. As shown in Table 7, Unisoma and all baseline models are trained and tested using the same training strategy. For long-time prediction, the batch size refers to the number of samples in a batch. For autoregressive simulation, only one sample is processed during each forward and backward pass, and the batch size corresponds to the number of time steps in the sample. For the autoregressive simulation task, we uniformly add noise with a mean of 0 and a variance of 0.001 to mitigate error accumulation during rollout.
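The two metrics named above, plus the training-time noise injection, can be sketched under their standard definitions. This is a hedged illustration, not the authors' released code; the function names are our own:

```python
import numpy as np

def relative_l2(pred, target):
    """Relative L2 error: ||pred - target||_2 / ||target||_2 over flattened arrays."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    return np.linalg.norm(pred - target) / np.linalg.norm(target)

def rmse(pred, target):
    """Root Mean Square Error over flattened arrays."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    return np.sqrt(np.mean((pred - target) ** 2))

def add_rollout_noise(x, variance=1e-3, rng=None):
    """Zero-mean Gaussian noise with the stated variance (std = sqrt(variance)),
    added to training inputs to mitigate error accumulation during rollout."""
    rng = rng if rng is not None else np.random.default_rng(0)
    x = np.asarray(x, float)
    return x + rng.normal(0.0, np.sqrt(variance), size=x.shape)
```

Relative L2 is the natural choice for long-time prediction because it normalizes away the scale of the deformation field, while RMSE reports absolute per-step error, which suits autoregressive simulation.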