MiraGe: Editable 2D Images using Gaussian Splatting
Authors: Joanna Waczynska, Tomasz Szczepanik, Piotr Borycki, Slawomir Tadeja, Thomas Bohné, Przemysław Spurek
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental section is split into two main parts. First, we demonstrate that our approach achieves high-quality 2D reconstruction by comparing it with existing models. Second, we highlight the versatility of MiraGe in editing full scenes (Fig. 8, 13) and selected objects (Fig. 1), presenting examples of user-driven modifications and demonstrations involving physical simulations (Fig. 3, 7). Reconstruction quality: Our image reconstruction assessment utilizes two widely recognized datasets. Specifically, we employ the Kodak dataset, which includes 24 images at a resolution of 768 × 512, alongside the DIV2K validation set (Agustsson & Timofte, 2017). In Tab. 1, we demonstrate the performance outcomes of different methods on the Kodak and DIV2K datasets. |
| Researcher Affiliation | Academia | 1Jagiellonian University, Faculty of Mathematics and Computer Science 2Doctoral School of Exact and Natural Sciences 3University of Cambridge 4IDEAS Research Institute. Correspondence to: Joanna Waczyńska <EMAIL>, Przemysław Spurek <EMAIL>. |
| Pseudocode | No | The paper describes the methodology using mathematical equations (e.g., Eq. 1, 2) and prose, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format. |
| Open Source Code | No | The source code for this project is available under . |
| Open Datasets | Yes | Our image reconstruction assessment utilizes two widely recognized datasets. Specifically, we employ the Kodak dataset (https://r0k.us/graphics/kodak/), which includes 24 images at a resolution of 768 × 512, alongside the DIV2K validation set (Agustsson & Timofte, 2017). Additionally, we generated our own 2D images using DALL-E 3 to illustrate the benefits of our method. We demonstrate examples of modifications using datasets such as DIV2K, Kodak, and Animals (https://www.kaggle.com/datasets/alessiocorrado99/animals10). |
| Dataset Splits | No | The paper mentions using the "Kodak dataset" and the "DIV2K validation set" but does not specify any training/test/validation splits (e.g., percentages or sample counts) for reproducing the experimental setup. It refers to the DIV2K validation set, but not how it is used in conjunction with other splits or if other splits were used for training. |
| Hardware Specification | Yes | Computational experiments in the main paper were conducted using an NVIDIA GeForce RTX 4070 Laptop GPU and an NVIDIA GeForce RTX 2080. Appendix time comparisons were reported using an NVIDIA GeForce RTX 2080. For training we used 100K initial Gaussians; 5k, 10k, and 30k iterations on the V100 GPU. |
| Software Dependencies | Yes | For the 2D representation (2D-MiraGe) we used Taichi_elements; for the 3D representations (Amorphous-MiraGe, Graphite-MiraGe) we use Blender (https://www.blender.org, version 3.6). |
| Experiment Setup | Yes | The selection of hyperparameters, including the number of iterations, was inspired by the principles of 3DGS. In our approach, we model flat objects within 3D space, where the camera distance parameter effectively controls the perceived scale of the object. We first calculate the deviation from 0 on the Z axis using the similarity of triangles: dev_z = cam_dist · tan(½ Fov_vert), where cam_dist and Fov_vert are the camera distance from the XZ plane and the camera's vertical field of view, respectively. The deviation on the X axis can then be computed by multiplying this value by the camera aspect ratio. Consequently, the initialization of Gaussians is consistently performed on the XZ plane; however, we have opted to permit their movement within the 3D space. We use the classical L1 loss function combined with a D-SSIM term: L = (1 − λ)·L1(I, GS(I)) + λ·L_D-SSIM(I, GS(I)), where I is the input image and GS(I) is the reconstruction obtained by the Gaussian renderer. In our model, where all Gaussians are constrained to a 2D plane at rendering time, we consider only the rotation angle, denoted as ϕ, as the primary rotation parameter. |
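The plane-extent geometry and the combined loss quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the field of view is given in radians, and the default λ = 0.2 is borrowed from the standard 3DGS setup since the excerpt does not quote a value.

```python
import math

def plane_extent(cam_dist, fov_vert_rad, aspect):
    """Half-extents of the visible XZ plane seen by the camera.

    By similar triangles, dev_z = cam_dist * tan(Fov_vert / 2);
    the X half-extent scales by the camera aspect ratio.
    """
    dev_z = cam_dist * math.tan(0.5 * fov_vert_rad)
    dev_x = dev_z * aspect
    return dev_x, dev_z

def combined_loss(l1_term, d_ssim_term, lam=0.2):
    """L = (1 - lambda) * L1 + lambda * L_D-SSIM.

    lam = 0.2 is an assumption following common 3DGS defaults;
    the excerpt does not state the value used.
    """
    return (1.0 - lam) * l1_term + lam * d_ssim_term
```

For example, with a 90° vertical field of view and the camera one unit from the plane, the Z half-extent is exactly 1, and the X half-extent equals the aspect ratio, which is how the camera distance controls the perceived scale of the flat object.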