SketchDNN: Joint Continuous-Discrete Diffusion for CAD Sketch Generation
Authors: Sathvik Reddy Chereddy, John Femiani
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our diffusion model, SketchDNN, against prior art. We also perform ablation studies to show that our approaches to addressing the heterogeneity and permutation invariance of sketches improve the performance of our model. We present metrics for likelihood and sample quality. Additionally, we provide a qualitative comparison between our model and the dataset in Figure 4. Our model reduces the Fréchet Inception Distance (FID) from 16.04 to 7.80 and the Negative Log Likelihood (NLL) from 84.8 to 81.33 on the SketchGraphs dataset, demonstrating significant improvements in both fidelity and diversity. |
| Researcher Affiliation | Academia | 1Department of Computer Science, Miami University, Oxford OH, USA. Correspondence to: Sathvik Chereddy (M.S.) <EMAIL>, John Femiani <EMAIL>. |
| Pseudocode | No | The paper describes the methodology using mathematical equations and textual explanations, but no structured pseudocode or algorithm blocks are explicitly presented. |
| Open Source Code | No | The paper does not explicitly state that source code for the described methodology is being released, nor does it provide any links to a code repository or mention code in supplementary materials. |
| Open Datasets | Yes | We used the CAD sketch dataset introduced in SketchGraphs by Seff et al. (Seff et al., 2020), which consists of 15 million human-created CAD sketches extracted from Onshape, a cloud-based CAD platform. |
| Dataset Splits | Yes | Lastly, we split the remaining 1.4 million CAD sketches into 3 subsets: we reserved 90% for training, 5% for validation, and 5% for testing. |
| Hardware Specification | Yes | The model was trained for 1000 epochs using a batch size of 8 × 512, distributed across 8 NVIDIA A30 GPUs. |
| Software Dependencies | No | The paper mentions general software components or frameworks (e.g., diffusion models), but does not specify any particular software libraries with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | We use an embedding size of 512 with a depth of 32 transformer blocks. We trained our model to generate samples over T = 2000 timesteps. ... We use reconstruction loss, L_RECON, composed of Mean-Squared Error (MSE) loss for continuous variables and Cross Entropy (CE) loss for discrete variables: L_RECON = λ·L_MSE + L_CE if x ≤ 150, and L_MSE + L_CE if x > 150, where we set λ = 16. ... The model was trained for 1000 epochs using a batch size of 8 × 512, distributed across 8 NVIDIA A30 GPUs. A constant learning rate of 1 × 10⁻⁴ was used throughout training. |
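The piecewise reconstruction loss quoted above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`mse`, `cross_entropy`, `recon_loss`) and the interpretation of `x` as the diffusion timestep are assumptions for the sake of the example; only the structure (λ·MSE + CE below the threshold of 150, plain MSE + CE above it, with λ = 16) comes from the paper.

```python
import numpy as np

LAMBDA = 16       # weighting factor from the paper
THRESHOLD = 150   # switch point in the piecewise loss

def mse(pred, target):
    """Mean-squared error over the continuous sketch parameters."""
    pred, target = np.asarray(pred, dtype=float), np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

def cross_entropy(logits, label):
    """Cross-entropy of one discrete prediction (numerically stable softmax)."""
    logits = np.asarray(logits, dtype=float)
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return float(-log_probs[label])

def recon_loss(cont_pred, cont_target, logits, label, x):
    """Piecewise reconstruction loss: lambda*MSE + CE if x <= 150, else MSE + CE."""
    l_mse = mse(cont_pred, cont_target)
    l_ce = cross_entropy(logits, label)
    weight = LAMBDA if x <= THRESHOLD else 1.0
    return weight * l_mse + l_ce
```

With uniform logits the CE term is log 2 ≈ 0.693, so e.g. `recon_loss([2], [0], [0, 0], 0, x=100)` gives 16·4 + log 2, while `x=200` gives 4 + log 2, showing the threshold in action.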