

Poster

Rethinking Score Distilling Sampling for 3D Editing and Generation

Xingyu Miao · Haoran Duan · Yang Long · Jungong Han

West Exhibition Hall B2-B3 #W-219
Thu 17 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

Score Distillation Sampling (SDS) has emerged as a prominent method for text-to-3D generation by leveraging the strengths of 2D diffusion models. However, SDS is limited to generation tasks and lacks the capability to edit existing 3D assets. Conversely, variants of SDS that introduce editing capabilities often cannot generate new 3D assets effectively. In this work, we observe that the generation and editing processes within SDS and its variants share a common underlying structure in their gradient terms. Building on this insight, we propose Unified Distillation Sampling (UDS), a method that seamlessly integrates both the generation and editing of 3D assets. Essentially, UDS refines the gradient terms used in vanilla SDS methods, unifying them to support both tasks. Extensive experiments demonstrate that UDS not only outperforms baseline methods in generating 3D assets with richer details but also excels in editing tasks, thereby bridging the gap between 3D generation and editing.
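For readers unfamiliar with the gradient term the abstract refers to, the sketch below illustrates the standard SDS update from the original text-to-3D literature, not UDS itself: noise a rendered image, ask a denoiser for its noise estimate, and use the weighted residual `w(t) * (eps_hat - eps)` as the gradient (skipping the U-Net Jacobian, as is conventional). The `toy_denoiser` is a hypothetical stand-in for a text-conditioned diffusion model `eps_theta(x_t; y, t)`.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(x0, eps, alpha_bar_t):
    """Forward diffusion: x_t = sqrt(a_t) * x0 + sqrt(1 - a_t) * eps."""
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

def sds_gradient(x0, denoiser, alpha_bar_t, w_t):
    """Standard SDS gradient w.r.t. the rendered image x0:
    w(t) * (eps_hat - eps). SDS variants (including editing-oriented
    ones) differ mainly in how this residual term is constructed."""
    eps = rng.standard_normal(x0.shape)
    x_t = add_noise(x0, eps, alpha_bar_t)
    eps_hat = denoiser(x_t, alpha_bar_t)  # stand-in for eps_theta(x_t; y, t)
    return w_t * (eps_hat - eps)

def toy_denoiser(x_t, alpha_bar_t):
    """Toy model whose implicit 'clean' target is the all-zero image:
    it attributes everything in x_t to noise."""
    return x_t / np.sqrt(1.0 - alpha_bar_t)

# Repeated SDS updates pull the image toward the denoiser's target.
x = rng.standard_normal((4, 4))
for _ in range(200):
    x -= 0.05 * sds_gradient(x, toy_denoiser, alpha_bar_t=0.5, w_t=1.0)
print(float(np.abs(x).mean()))
```

With this toy denoiser the residual reduces to a pull toward the zero image, so the mean absolute value of `x` shrinks toward zero over the iterations; a real diffusion model would instead pull renders toward images matching the text prompt.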

Lay Summary:

Creating or modifying 3D shapes with artificial intelligence is difficult because common tools either generate new objects or edit existing ones, but not both, and their results often look blurry or unnatural. We studied several leading methods and observed that they follow the same basic steps. Building on this observation, we present Unified Distillation Sampling, a single procedure that can both generate and edit 3D assets by combining a clean estimate from the model with guidance from a text-based image generator. Our method runs in about the same time as earlier techniques and requires less manual adjustment of its settings. In tests across many scenes, it produced sharper, more realistic objects and more faithful edits than previous methods. This work makes it easier for creators and researchers to turn text into high-quality 3D assets and then reshape them without switching between separate tools.
