Diffutoon - Convert real-person videos into anime style

Diffutoon, a sub-project of DiffSynth Studio, converts real-person videos into anime style. The project is a collaboration between Alibaba and East China Normal University. It can render highly detailed, high-resolution, long-duration videos in anime style, and an additional editing branch lets it modify the video content based on text prompts.

Four main components:

  1. Diffutoon uses a model based on Stable Diffusion to convert video frames into anime styles while preserving the core features of the original content.

  2. It integrates the AnimateDiff motion module to improve temporal consistency between frames, so the animated video plays back smoothly and stays visually coherent.

  3. A ControlNet model extracts key contour information from the video and retains these structural details during the animation process.

  4. A high-resolution ControlNet model enhances color saturation and contrast, improving overall visual quality even when the input video is low-resolution.
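
The four items above can be summarized as a small configuration sketch. It is illustrative only: the `Component` class, its field names, and the `reads` labels are hypothetical and are not part of the DiffSynth Studio API. The sketch simply records which model plays which role.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One building block of the toon shading pipeline (hypothetical structure)."""
    name: str   # model family named in the text
    role: str   # what it contributes, per the list above
    reads: str  # signal it works from; these labels are assumptions for illustration


PIPELINE = (
    Component("Stable Diffusion-based model",
              "convert frames into anime style", "video frames"),
    Component("AnimateDiff motion module",
              "keep frames temporally consistent", "frame sequence"),
    Component("ControlNet (contours)",
              "retain structural details", "contours extracted from the video"),
    Component("ControlNet (high resolution)",
              "enhance color saturation and contrast", "video frames"),
)


def summarize(pipeline):
    """Return one human-readable line per component, numbered from 1."""
    return [f"{i}. {c.name}: {c.role} (reads: {c.reads})"
            for i, c in enumerate(pipeline, start=1)]


if __name__ == "__main__":
    for line in summarize(PIPELINE):
        print(line)
```

Keeping the components in one ordered tuple makes it easy to see that the last two entries are both ControlNet models with different jobs: one preserves structure, the other improves color and contrast.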

Usage:

You can try it directly on the project's official website: https://diffu-toon.com/

Principle

The overall architecture of Diffutoon consists of two parts: the main toon shading pipeline (the upper part of the architecture diagram) and the editing branch (the lower part). The editing branch generates editing signals in the form of a color video, which the main toon shading pipeline then consumes.
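
The data flow between the two parts can be sketched with stub functions. This is a toy model under stated assumptions, not the project's implementation: the function names are hypothetical, frames are plain nested lists, the "rendering" is a placeholder average, and the fallback of using the input video as the color signal when no prompt is given is an assumption made for illustration.

```python
from typing import List, Optional

Frame = List[List[int]]  # toy grayscale frame: rows of pixel values
Video = List[Frame]


def editing_branch(video: Video, prompt: str) -> Video:
    """Stand-in for the editing branch: returns a 'color video' editing signal.

    A real branch would synthesize edited colors from the prompt; this stub
    only copies the frames so that the data flow can be followed.
    """
    return [[row[:] for row in frame] for frame in video]


def main_toon_shading(video: Video, color_signal: Video) -> Video:
    """Stand-in for the main pipeline: consumes the input video and the color signal."""
    if len(video) != len(color_signal):
        raise ValueError("editing signal must have one frame per input frame")
    # Placeholder 'rendering': average the input and the color signal per pixel.
    return [
        [[(a + b) // 2 for a, b in zip(row_in, row_col)]
         for row_in, row_col in zip(frame_in, frame_col)]
        for frame_in, frame_col in zip(video, color_signal)
    ]


def diffutoon_sketch(video: Video, prompt: Optional[str] = None) -> Video:
    """Run the optional editing branch, then the main toon shading pipeline."""
    # Assumption: with no prompt, the input video itself acts as the color signal.
    color_signal = editing_branch(video, prompt) if prompt else video
    return main_toon_shading(video, color_signal)


if __name__ == "__main__":
    clip = [[[0, 100], [200, 50]], [[10, 110], [210, 60]]]
    result = diffutoon_sketch(clip, prompt="a girl is dancing, orange dress")
    print(len(result), "frames rendered")
```

The point of the sketch is the interface: the editing branch emits a video-shaped signal with one frame per input frame, so the main pipeline can treat it like any other per-frame conditioning input.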

Comparison

In the visual comparison with other methods, the editing prompt used is: "best quality, perfect anime illustration, a girl is dancing, smile, solo, orange dress, black hair, white shoes, blue sky." Because the generated video has an extremely high resolution, the comparison is best judged from the enlarged details.