Neutral Editing Framework for Diffusion-based Video Editing

Korea Advanced Institute of Science and Technology (KAIST)

NeutralEdit (NeuEdit) enables current editing systems to perform effective editing, including motion change, style transfer, and object overlay.

Abstract

Text-conditioned image editing has succeeded at various types of edits within the diffusion framework. Unfortunately, this success has not carried over to video, which remains challenging. Existing video editing systems are still limited to rigid edits such as style transfer and object overlay. To this end, this paper proposes the Neutral Editing (NeuEdit) framework, which enables complex non-rigid editing that changes the motion of a person or object in a video, a capability not attempted before. NeuEdit introduces a concept of 'neutralization' that enhances the tuning-editing process of diffusion-based editing systems in a model-agnostic manner by leveraging only the input video and text, without any auxiliary aids (e.g., visual masks, video captions). Extensive experiments on numerous videos demonstrate the adaptability and effectiveness of the NeuEdit framework. The code will be made publicly available.
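To make the tuning-editing process concrete, below is a minimal, runnable sketch of the generic two-phase loop that diffusion-based video editors follow: fine-tune a denoiser to reconstruct the source video under the source prompt, then denoise under the target prompt. The toy denoiser, tensor shapes, simplified noising/denoising updates, and the neutralize hook are illustrative assumptions; the hook only marks where a NeuEdit-style neutralization of the video/text inputs could plug in, and none of this is the paper's actual implementation.

    # Illustrative sketch only; toy modules, not the NeuEdit implementation.
    import torch
    import torch.nn as nn

    class ToyDenoiser(nn.Module):
        """Stand-in for a text-conditioned video diffusion UNet."""
        def __init__(self, latent_dim=8, text_dim=8):
            super().__init__()
            self.net = nn.Linear(latent_dim + text_dim + 1, latent_dim)

        def forward(self, z_t, t, text_emb):
            t_feat = t.float().view(-1, 1) / 1000.0  # crude timestep embedding
            return self.net(torch.cat([z_t, text_emb, t_feat], dim=-1))

    def neutralize(text_emb):
        # Hypothetical hook: marks where a neutralization of the inputs
        # driving the tuning-editing process would be applied.
        return text_emb

    def tune(model, video_latents, src_emb, steps=100, T=1000):
        """Phase 1: fine-tune the denoiser to reconstruct the source video."""
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(steps):
            t = torch.randint(0, T, (video_latents.size(0),))
            noise = torch.randn_like(video_latents)
            z_t = video_latents + noise  # a real pipeline scales by scheduler alphas
            pred = model(z_t, t, neutralize(src_emb))
            loss = nn.functional.mse_loss(pred, noise)  # standard noise-prediction loss
            opt.zero_grad(); loss.backward(); opt.step()
        return model

    @torch.no_grad()
    def edit(model, z_T, tgt_emb, T=50):
        """Phase 2: denoise from inverted latents under the target prompt."""
        z = z_T
        for t in reversed(range(T)):
            eps = model(z, torch.full((z.size(0),), t), tgt_emb)
            z = z - eps / T  # toy update; a real editor uses DDIM steps
        return z

    frames, latent_dim = 8, 8
    video = torch.randn(frames, latent_dim)  # latents of the source video frames
    src = torch.randn(frames, 8)             # source-prompt embedding
    tgt = torch.randn(frames, 8)             # target-prompt embedding
    model = tune(ToyDenoiser(), video, src)
    edited = edit(model, torch.randn_like(video), tgt)
    print(edited.shape)

Because the loop treats the denoiser as a black box, any modification of the video/text inputs (as the neutralize hook suggests) applies regardless of the underlying model, which is what model-agnostic means here.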

Introduction

Editing results from recent editing systems compared with our editing system.

Method

Neutralization concept for diffusion-based video editing systems.

Non-rigid editing results

Neutralization enables editing systems to perform non-rigid editing, including motion variations.

Rigid editing results

Neutralization enables editing systems to perform effective rigid editing while preserving temporal consistency and fidelity.

BibTeX

@misc{yoon2023neutral,
  title={Neutral Editing Framework for Diffusion-based Video Editing},
  author={Sunjae Yoon and Gwanhyeong Koo and Ji Woo Hong and Chang D. Yoo},
  year={2023},
  eprint={2312.06708},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}