Blended-NeRF: Zero-Shot Object Generation and Blending in Existing Neural Radiance Fields

School of Computer Science and Engineering
The Hebrew University of Jerusalem, Israel
Full Flow

Training and Rendering Flow. Given a NeRF scene, our pipeline trains a NeRF generator model guided by a similarity loss defined by a language-image model such as CLIP, to synthesize a new object inside a user-specified ROI. This is achieved by casting rays and sampling points for the rendering process only inside the ROI box. Additionally, our method introduces augmentations and priors to get more natural results. After training, we render the edited scene by blending the sample points generated by the two models along each view ray.
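Restricting rays and samples to the ROI box can be illustrated with a standard ray/axis-aligned-box slab test. The sketch below is a minimal illustration, not the authors' implementation: the function names, the midpoint sampling scheme, and the assumption that the ROI is axis-aligned are all hypothetical.

```python
def ray_box_interval(origin, direction, box_min, box_max, eps=1e-9):
    """Slab test: return (t_near, t_far) where the ray lies inside the box, or None."""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if abs(d) < eps:
            # Ray is parallel to this pair of faces: it must start between them.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near, t_far


def sample_in_roi(origin, direction, box_min, box_max, n_samples=4):
    """Return evenly spaced sample points (bin midpoints) along the ray, inside the box only."""
    interval = ray_box_interval(origin, direction, box_min, box_max)
    if interval is None:
        return []  # the ray misses the ROI, so it contributes no samples
    t_near, t_far = interval
    step = (t_far - t_near) / n_samples
    ts = [t_near + (i + 0.5) * step for i in range(n_samples)]
    return [tuple(o + t * d for o, d in zip(origin, direction)) for t in ts]
```

Rays that miss the box return no samples, so under this sketch only the region inside the ROI would receive points from the generator.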


Editing a local region or a specific object in a 3D scene represented by a NeRF is challenging, mainly due to the implicit nature of the scene representation. Consistently blending a new realistic object into the scene adds a further level of difficulty. We present Blended-NeRF, a robust and flexible framework for editing a specific region of interest in an existing NeRF scene, based on text prompts or image patches, along with a 3D ROI box. Our method leverages a pretrained language-image model to steer the synthesis towards a user-provided text prompt or image patch, along with a 3D MLP model initialized on an existing NeRF scene to generate the object and blend it into a specified region in the original scene. We allow local editing by localizing a 3D ROI box in the input scene, and seamlessly blend the content synthesized inside the ROI with the existing scene using a novel volumetric blending technique. To obtain natural-looking and view-consistent results, we leverage existing and new geometric priors and 3D augmentations that improve the visual fidelity of the final result. We test our framework both qualitatively and quantitatively on a variety of real 3D scenes and text prompts, demonstrating realistic, multi-view consistent results with greater flexibility and diversity than the baselines. Finally, we show the applicability of our framework to several 3D editing applications, including adding new objects to a scene, removing, replacing, or altering existing objects, and texture conversion.
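The steering by a language-image model can be pictured as a similarity loss between embeddings of rendered views and the embedding of the prompt. The sketch below is an assumption-laden illustration: plain Python lists stand in for the outputs of an image encoder and a text encoder such as CLIP's, and the function names are hypothetical. The page does not give the exact loss, so a negative mean cosine similarity is used here as a common choice.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_loss(render_embeddings, target_embedding):
    """Negative mean cosine similarity between rendered-view embeddings and the target.

    Minimizing this value pushes the rendered views towards the prompt embedding.
    """
    sims = [cosine_similarity(e, target_embedding) for e in render_embeddings]
    return -sum(sims) / len(sims)
```

In a real pipeline the embeddings would come from the pretrained encoders and the loss would be backpropagated through the renderer into the generator weights. This sketch only shows the scalar being minimized.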

Texture Conversion

We enable texture editing by training only the color-related layers and freezing all the other layers. For seamless blending results, we utilize our suggested distance smoothing operator.
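Training only the color-related layers amounts to applying gradient updates to a subset of the parameters while leaving the rest untouched. The following is a minimal sketch under stated assumptions: the parameter names (`color_head`, `density_mlp`) and the plain SGD update are hypothetical and are not taken from the authors' code.

```python
def sgd_step(params, grads, trainable_prefixes, lr=0.1):
    """Update only parameters whose name starts with a trainable prefix.

    params and grads map parameter names to lists of floats.
    trainable_prefixes is a tuple of name prefixes (hypothetical naming scheme).
    """
    updated = {}
    for name, values in params.items():
        if name.startswith(trainable_prefixes):
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)  # frozen: copied unchanged
    return updated


# Hypothetical parameter layout: geometry in density_mlp, appearance in color_head.
params = {"density_mlp.w": [1.0, 2.0], "color_head.w": [0.5, -0.5]}
grads = {"density_mlp.w": [1.0, 1.0], "color_head.w": [1.0, -1.0]}
new_params = sgd_step(params, grads, trainable_prefixes=("color_head",))
```

Because the density-related weights never change in this sketch, the geometry stays fixed and only appearance is edited, which matches the intent of a texture-only edit.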

Large Object Replacement

"A DLSR photo of dunes of sand"

"A DLSR photo of ice and snow"

We perform large object replacement by localizing the ROI box to include the sea and the bottom of the ship, and by requiring a low 𝜏 value in the transmittance loss so that the generated object fills most of the edited region.
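The effect of 𝜏 can be seen in a small sketch of a transmittance loss. The exact formula is not given on this page, so the form below, a reward on mean transmittance capped at 𝜏 in the style of Dream Fields, is an assumption, and the function name is hypothetical.

```python
def transmittance_loss(transmittances, tau):
    """Reward average transparency, but only up to the cap tau.

    transmittances holds per-ray accumulated transmittance values in [0, 1].
    Once the mean exceeds tau, the loss is constant and gives no further
    incentive to empty the region.
    """
    mean_t = sum(transmittances) / len(transmittances)
    return -min(tau, mean_t)
```

With a low 𝜏 the cap is reached while the region is still mostly opaque, so under this assumed form the optimizer has little incentive to clear space and the generated content can occupy most of the ROI.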

Objects Blending

Original Scene

"A few green and yellow bananas"

"A cluster of different types of mushrooms"

Original Scene

"Purple, white and blue flowers petals on the ground"

"A pile of snow"

Demonstration of our suggested blending procedure for blending the original and synthesized objects inside the ROI.
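One simple way to picture blending two radiance fields inside the ROI is to combine their per-sample densities and colors and then apply standard volume rendering. The sketch below assumes densities are summed and colors are averaged with density weights. This is an illustrative choice with hypothetical function names, not a statement of the authors' exact procedure, and it omits the distance smoothing operator.

```python
import math


def blend_sample(sigma_orig, rgb_orig, sigma_new, rgb_new):
    """Combine one sample from the original field with one from the generator."""
    sigma = sigma_orig + sigma_new
    if sigma == 0.0:
        return 0.0, (0.0, 0.0, 0.0)  # empty space in both fields
    rgb = tuple(
        (sigma_orig * co + sigma_new * cn) / sigma
        for co, cn in zip(rgb_orig, rgb_new)
    )
    return sigma, rgb


def render_ray(samples, delta):
    """Standard volume rendering over (sigma, rgb) samples ordered near to far.

    delta is the (uniform) distance between consecutive samples.
    Returns the accumulated color and the remaining transmittance.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for sigma, rgb in samples:
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for k in range(3):
            color[k] += weight * rgb[k]
        transmittance *= 1.0 - alpha
    return tuple(color), transmittance
```

Where only one field has density, the blended sample reduces to that field's color, so content outside the synthesized object is rendered as in the original scene.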

Object Insertion/Replacement

Object insertion/replacement in an existing NeRF scene, achieved by steering the model weights towards the user-provided text prompt and using our suggested depth loss to encourage the generator to synthesize volumetric 3D shapes.
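The page does not state the form of the depth loss. As a purely illustrative assumption, a regularizer that rewards variance in the rendered depth, capped at a threshold, would discourage flat, billboard-like results. The function name and the cap `rho` below are hypothetical.

```python
def depth_variance_loss(depths, rho):
    """Reward variance of per-pixel rendered depth, capped at rho (assumed form).

    A flat surface facing the camera has near-zero depth variance and gets no
    reward. A volumetric shape varies in depth across the view and is rewarded
    until the variance reaches rho.
    """
    mean_d = sum(depths) / len(depths)
    variance = sum((d - mean_d) ** 2 for d in depths) / len(depths)
    return -min(rho, variance)
```

The cap keeps the term from dominating the objective once the shape already has reasonable depth variation.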


      title={Blended-NeRF: Zero-Shot Object Generation and Blending in Existing Neural Radiance Fields},
      author={Gordon, Ori and Avrahami, Omri and Lischinski, Dani},
      journal={arXiv preprint arXiv:2306.12760},