DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation
Authors: Jiwook Kim, Seonho Lee, Jaeyo Shin, Jiho Choi, Hyunjung Shim
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method through qualitative and quantitative comparisons and user studies with baseline methods. Our results demonstrate that the proposed method outperforms the baselines in both editing speed and quality. Moreover, our method shows outstanding results on both NeRF and 3DGS. These results indicate DreamCatalyst is a model-agnostic 3D editing framework, rather than a method specifically tailored to a particular 3D representation (Chen et al., 2024b). To the best of our knowledge, DreamCatalyst is the first to achieve state-of-the-art results and provide extensive experiments on both NeRF and 3DGS editing. |
| Researcher Affiliation | Academia | Jiwook Kim, Seonho Lee, Jaeyo Shin, Jiho Choi & Hyunjung Shim, Graduate School of Artificial Intelligence, KAIST, Republic of Korea |
| Pseudocode | No | The paper describes the methodology using mathematical formulations and textual explanations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Please refer to our official repository: https://github.com/kaist-cvml/DreamCatalyst. This repository provides the source code, setup, and datasets used in the experiments. |
| Open Datasets | Yes | We conduct experiments on real scenes using datasets from IN2N and PDS. [...] This repository provides the source code, setup, and datasets used in the experiments. |
| Dataset Splits | No | The paper mentions using "real scenes" and evaluating "eight scenes with 40 pairs of source and target text prompts." It describes training steps for different methods but does not provide specific training/test/validation dataset splits or reference any predefined splits for reproducibility. |
| Hardware Specification | Yes | All experiments are conducted on a single NVIDIA A6000 GPU. |
| Software Dependencies | No | The paper mentions software tools and models such as Nerfstudio, InstructPix2Pix (IP2P), the Adam optimizer, the DDIM scheduler, Threestudio, and Stable Diffusion. However, it does not provide specific version numbers for any of these software components, which is required for a reproducible description of ancillary software. |
| Experiment Setup | Yes | We set different training steps for each method: 3,000 iterations for our method, 1,000 iterations for our method with the fast mode, 15,000 iterations for IN2N, and 30,000 iterations for PDS, following the original settings of their respective baselines. Based on previous research (Poole et al., 2022; Katzir et al., 2023) indicating that higher classifier guidance may lead to over-saturated and poor edited results, we select a weight of 7.5 for classifier-free guidance in our method. During editing, we train the model using the Adam optimizer and an exponential decay learning rate scheduler. Specifically, since our method requires fewer iterations than PDS, we set smaller warm-up steps for the learning rate schedulers: 100 for proposal networks and fields, and 300 for camera optimizers. We use a DDIM scheduler with 500 inference steps. [...] We set χ = 0.075, δ = 0.2, and γ = 0.8 for all experiments. [...] For model-specific methods, we choose 1,500 steps for our method with GE in high-quality mode and 1,200 steps in fast mode. Additionally, we set 500 steps and 1,500 steps for DGE and GE respectively, following the original settings of their baselines. [...] Additionally, to prevent over-densification in GE, we limit densification phases to 1,200 iterations in the high-quality mode of our method, ensuring that excessive cloning or splitting does not occur at the end of the editing process. |
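The experiment-setup cell above can be summarized as a configuration sketch. The snippet below is a hypothetical illustration, not code from the DreamCatalyst repository: the names `EditConfig` and `lr_at_step` are invented, and the exponential decay rate is an assumed placeholder since the paper quote does not specify it. Only the values actually quoted (3,000/1,000 iterations, CFG weight 7.5, 500 DDIM steps, warm-ups of 100/300, χ, δ, γ) come from the table.

```python
from dataclasses import dataclass


@dataclass
class EditConfig:
    """Hypothetical container for the settings quoted in the table."""
    iterations: int = 3000           # high-quality mode (1,000 in fast mode)
    cfg_weight: float = 7.5          # classifier-free guidance weight
    ddim_inference_steps: int = 500  # DDIM scheduler inference steps
    chi: float = 0.075               # χ
    delta: float = 0.2               # δ
    gamma: float = 0.8               # γ
    warmup_fields: int = 100         # proposal networks and fields
    warmup_camera: int = 300         # camera optimizers


def lr_at_step(step: int, base_lr: float, warmup: int,
               decay_rate: float = 0.99) -> float:
    """Linear warm-up followed by exponential decay.

    The paper quote only names 'exponential decay' with per-component
    warm-up steps; the linear ramp and decay_rate here are assumptions.
    """
    if step < warmup:
        # Ramp linearly from ~0 up to base_lr over the warm-up period.
        return base_lr * (step + 1) / warmup
    # Decay exponentially after warm-up.
    return base_lr * decay_rate ** (step - warmup)
```

A caller would build one `EditConfig` per mode (high-quality vs. fast) and query `lr_at_step` with the matching warm-up length for each parameter group.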