SketchDNN: Joint Continuous-Discrete Diffusion for CAD Sketch Generation

Authors: Sathvik Reddy Chereddy, John Femiani

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically validate our diffusion model, SketchDNN, against prior art. We also perform ablation studies to show that our approaches to addressing the heterogeneity and permutation invariance of sketches improve the performance of our model. We present metrics for likelihood and sample quality. Additionally, we provide a qualitative comparison between our model and the dataset in Figure 4. Our model reduces the Fréchet Inception Distance (FID) from 16.04 to 7.80 and the Negative Log Likelihood (NLL) from 84.8 to 81.33 on the SketchGraphs dataset, demonstrating significant improvements in both fidelity and diversity.
Researcher Affiliation | Academia | Department of Computer Science, Miami University, Oxford OH, USA. Correspondence to: Sathvik Chereddy <EMAIL>, John Femiani <EMAIL>.
Pseudocode | No | The paper describes the methodology using mathematical equations and textual explanations, but no structured pseudocode or algorithm blocks are explicitly presented.
Open Source Code | No | The paper does not explicitly state that source code for the described methodology is being released, nor does it provide any links to a code repository or mention code in supplementary materials.
Open Datasets | Yes | We used the CAD sketch dataset introduced in SketchGraphs by Seff et al. (Seff et al., 2020), which consists of 15 million human-created CAD sketches extracted from Onshape, a cloud-based CAD platform.
Dataset Splits | Yes | Lastly, we split the remaining 1.4 million CAD sketches into 3 subsets: we reserved 90% for training, 5% for validation, and 5% for testing.
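The quoted 90/5/5 split can be sketched as follows. This is a minimal illustration, not the paper's code; the function name, seed handling, and exact rounding are assumptions.

```python
import random

def split_dataset(items, seed=0):
    """Shuffle and split a dataset into 90% train / 5% val / 5% test.

    Hypothetical helper illustrating the split described in the paper;
    the paper does not publish its splitting procedure.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # deterministic shuffle for reproducibility
    n = len(items)
    n_train = int(0.90 * n)
    n_val = int(0.05 * n)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test
```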
Hardware Specification | Yes | The model was trained for 1000 epochs using a batch size of 8 × 512, distributed across 8 NVIDIA A30 GPUs.
Software Dependencies | No | The paper mentions general software components or frameworks (e.g., diffusion models), but does not specify any particular software libraries with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | We use an embedding size of 512 with a depth of 32 transformer blocks. We trained our model to generate samples over T = 2000 timesteps. ... We use a reconstruction loss, $\mathcal{L}_{\text{RECON}}$, composed of Mean-Squared Error (MSE) loss for continuous variables and Cross-Entropy (CE) loss for discrete variables: $\mathcal{L}_{\text{RECON}} = \begin{cases} \lambda \mathcal{L}_{\text{MSE}} + \mathcal{L}_{\text{CE}} & \text{if } x \le 150 \\ \mathcal{L}_{\text{MSE}} + \mathcal{L}_{\text{CE}} & \text{if } x > 150 \end{cases}$ where we set λ = 16. ... The model was trained for 1000 epochs using a batch size of 8 × 512, distributed across 8 NVIDIA A30 GPUs. A constant learning rate of 1 × 10⁻⁴ was used throughout training.
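The piecewise reconstruction loss quoted above (λ·MSE + CE below the timestep cutoff, plain MSE + CE above it) can be sketched as below. This is an assumed reading of the paper's formula, not its implementation; the function names, the single-sample shapes, and the interpretation of the threshold variable as a timestep are all assumptions.

```python
import numpy as np

LAMBDA = 16.0   # lambda = 16 from the paper
CUTOFF = 150    # threshold in the piecewise loss

def mse(pred, target):
    # Mean-squared error over continuous sketch parameters
    return float(np.mean((pred - target) ** 2))

def cross_entropy(logits, target_class):
    # Softmax cross-entropy for one sample, computed in log space for stability
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return float(-log_probs[target_class])

def recon_loss(pred_cont, target_cont, logits, target_class, t):
    """lambda * L_MSE + L_CE if t <= 150, else L_MSE + L_CE."""
    weight = LAMBDA if t <= CUTOFF else 1.0
    return weight * mse(pred_cont, target_cont) + cross_entropy(logits, target_class)
```

The up-weighted MSE term below the cutoff makes the continuous parameters dominate the gradient in that regime, while the cross-entropy term on discrete variables is applied with unit weight throughout.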