Deep Generative Model for Mechanical System Configuration Design
Authors: Yasaman Etesam, Hyunmin Cheong, Mohammadmehdi Ataei, Pradeep Kumar Jayaraman
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate our approach, we solve a gear train synthesis problem by first creating a synthetic dataset using a domain-specific language, a parts catalogue, and a physics simulator. We then train a Transformer-based model using this dataset, named GearFormer, which can not only generate quality solutions on its own, but also augment traditional search methods such as an evolutionary algorithm and Monte Carlo tree search. We show that GearFormer outperforms such search methods on their own in terms of satisfying the specified design requirements with orders of magnitude faster generation time. |
| Researcher Affiliation | Collaboration | 1Autodesk Research, Toronto, Ontario, Canada 2School of Computing Science, Simon Fraser University, Burnaby, BC, Canada |
| Pseudocode | No | The paper describes the methods used, such as EDA and MCTS, and their integration with the Transformer model, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | A link to our code, data, and supplementary material referenced in this paper can be found at https://gearformer.github.io/. |
| Open Datasets | Yes | The creation of the GearFormer dataset, the first dataset for gear train synthesis, created with a domain-specific language and augmented with physics simulations. A link to our code, data, and supplementary material referenced in this paper can be found at https://gearformer.github.io/. |
| Dataset Splits | Yes | Out of these, 0.05% (3,681) were randomly selected for validation, another 0.05% for testing, and the remaining sequences for training. |
| Hardware Specification | Yes | A Tesla V100SXM2-16GB GPU was used for training and evaluation. |
| Software Dependencies | Yes | We implemented our model using the x-transformers library and PyTorch (Ansel et al. 2024). Dymos is an open-source Python library that enables simulation of time-dependent systems. |
| Experiment Setup | Yes | Training was conducted for a maximum of 20 epochs, but stopped if the validation loss increased at any point. We set α = w₁ cos(w_ϵ), where w₁ is a model parameter and w_ϵ is determined based on the epoch number ϵ as w_ϵ = max(0, π/6) to bias the model toward the cross-entropy loss during the initial epochs. To balance having fewer parameters with achieving good results, we chose a dimension of 512, a depth of 6, and 8 attention heads. |
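The dataset-split row above reports that 0.05% of the generated sequences (3,681) were randomly held out for validation, another 0.05% for testing, and the rest used for training. A minimal sketch of such a split is shown below; the function and variable names (`split_dataset`, `sequences`) are illustrative assumptions, not taken from the GearFormer code.

```python
import random

def split_dataset(sequences, frac=0.0005, seed=0):
    """Randomly partition sequences into train/val/test splits,
    holding out `frac` of the data for each of val and test."""
    rng = random.Random(seed)
    idx = list(range(len(sequences)))
    rng.shuffle(idx)
    n = max(1, round(len(sequences) * frac))
    val = [sequences[i] for i in idx[:n]]
    test = [sequences[i] for i in idx[n:2 * n]]
    train = [sequences[i] for i in idx[2 * n:]]
    return train, val, test

# On 10,000 sequences, 0.05% per held-out split gives 5 each.
train, val, test = split_dataset([f"seq{i}" for i in range(10_000)])
print(len(train), len(val), len(test))  # 9990 5 5
```

With the paper's reported 0.05% fraction, 3,681 validation sequences implies a corpus of roughly 7.4 million generated sequences.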
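The experiment-setup row describes a strict early-stopping rule: train for at most 20 epochs, but stop the moment validation loss increases (i.e., a patience of zero). A minimal sketch of that loop, with hypothetical `train_epoch` and `validate` callables standing in for the actual training and evaluation code:

```python
def train_with_early_stopping(train_epoch, validate, max_epochs=20):
    """Run up to max_epochs, stopping as soon as validation loss
    rises above the best value seen so far (zero patience)."""
    best = float("inf")
    for epoch in range(max_epochs):
        train_epoch(epoch)          # one pass over the training data
        val_loss = validate()       # loss on the held-out validation split
        if val_loss > best:         # any increase triggers a stop
            break
        best = val_loss
    return best

# Toy usage: a validation curve that dips, then rises at epoch 3.
losses = iter([0.9, 0.5, 0.4, 0.6, 0.3])
best = train_with_early_stopping(lambda e: None, lambda: next(losses))
print(best)  # 0.4 — training halts when the loss climbs to 0.6
```

A zero-patience rule like this is aggressive compared with the more common patience-of-several-epochs variant, but it matches the paper's stated behavior of stopping "if the validation loss increased at any point."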