Remove the lamp on the desk
Change the color of a pillow on the bed to DarkOrchid
Add a table in front of the sofa
Move the paper close to the bottle
Replace the laptop on the desk with a book
The creation of 3D scenes has traditionally been both labor-intensive and costly, requiring designers to meticulously configure 3D assets and environments. Recent advancements in generative AI, including text-to-3D and image-to-3D methods, have dramatically reduced the complexity and cost of this process. However, current techniques for editing complex 3D scenes still rely on interactive, multi-step 2D-to-3D projection methods and diffusion-based techniques, which often lack precise control and hamper real-time performance. In this work, we propose 3DSceneEditor, a fully 3D-based paradigm for real-time, precise editing of intricate 3D scenes using Gaussian Splatting. Unlike conventional methods, 3DSceneEditor operates through a streamlined 3D pipeline, enabling direct manipulation of Gaussians for efficient, high-quality edits based on input prompts. The proposed framework (i) integrates a pre-trained instance segmentation model for semantic labeling; (ii) employs a zero-shot grounding approach with CLIP to align target objects with user prompts; and (iii) applies scene modifications (object addition, repositioning, recoloring, replacement, and deletion) directly on Gaussians. Extensive experimental results show that 3DSceneEditor achieves superior editing precision and speed compared with current state-of-the-art 3D scene editing approaches, establishing a new benchmark for efficient and interactive 3D scene customization.
Our paradigm, named 3DSceneEditor, consists of three key steps. First, a pre-trained instance segmentation model is applied to understand the input scene and assign a semantic label to each Gaussian. Next, an Open Vocabulary Object Grounding module grounds the target objects in the semantically labeled Gaussians and generates the region of interest (ROI) for those objects. Finally, we execute the specified scene editing operation within the ROI based on the prompt and render the edited views.
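The three steps can be illustrated with a toy NumPy sketch. It is not the authors' implementation: the scene layout, the function names (`ground`, `roi_mask`, `remove`, `recolor`, `move`), and the two-dimensional stand-in embeddings are all assumptions made for illustration. A real pipeline would obtain instance labels from the segmentation model and embeddings from CLIP, and each Gaussian would also carry covariance, opacity, and other appearance parameters that are omitted here.

```python
import numpy as np

# DarkOrchid is the CSS named color #9932CC, scaled here to [0, 1].
DARK_ORCHID = np.array([153, 50, 204]) / 255.0

def make_scene():
    """Toy scene (hypothetical layout): one row per Gaussian."""
    return {
        "centers": np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                             [2.0, 0.0, 0.0], [2.1, 0.0, 0.0]]),
        "colors": np.full((4, 3), 0.5),
        # Step 1 (assumed already done): instance id per Gaussian.
        "instance_ids": np.array([0, 0, 1, 1]),
    }

def ground(prompt_emb, instance_embs):
    """Step 2: pick the instance whose embedding is most similar to the prompt.

    Uses cosine similarity; the embeddings stand in for CLIP features.
    """
    ids = sorted(instance_embs)
    feats = np.stack([instance_embs[i] for i in ids])
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    query = prompt_emb / np.linalg.norm(prompt_emb)
    return ids[int(np.argmax(feats @ query))]

def roi_mask(scene, instance_id):
    """Boolean mask selecting the Gaussians of the grounded instance (the ROI)."""
    return scene["instance_ids"] == instance_id

def remove(scene, mask):
    """Step 3a: delete the Gaussians inside the ROI."""
    return {key: value[~mask] for key, value in scene.items()}

def recolor(scene, mask, rgb):
    """Step 3b: overwrite the color of the Gaussians inside the ROI."""
    out = {key: value.copy() for key, value in scene.items()}
    out["colors"][mask] = np.asarray(rgb, dtype=float)
    return out

def move(scene, mask, offset):
    """Step 3c: translate the centers of the Gaussians inside the ROI."""
    out = {key: value.copy() for key, value in scene.items()}
    out["centers"][mask] += np.asarray(offset, dtype=float)
    return out

scene = make_scene()
instance_embs = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0])}  # stand-ins
prompt_emb = np.array([0.1, 0.9])                                   # stand-in

target = ground(prompt_emb, instance_embs)
mask = roi_mask(scene, target)
recolored = recolor(scene, mask, DARK_ORCHID)
pruned = remove(scene, mask)
moved = move(scene, mask, [0.0, 1.0, 0.0])
```

Each operation returns a new scene dictionary and leaves the input untouched, which keeps the edits easy to compare side by side. Because every edit is a masked array operation over the ROI, no 2D projection or per-view optimization is involved in this sketch.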
@article{yan20243DScene,
title = {3DSceneEditor: Controllable 3D Scene Editing with Gaussian Splatting},
author = {Ziyang Yan and Lei Li and Yihua Shao and Siyu Chen and Wuzong Kai and Jenq-Neng Hwang and Hao Zhao and Fabio Remondino},
journal = {arXiv preprint arXiv:2412.01583},
year = {2024}
}