ControlNet and T2I-Adapter

T2I-Adapter is an adapter for text-to-image generation developed by Tencent's ARC team. The paper was released in February 2023, and the code has been open-sourced on GitHub.

Paper address: https://arxiv.org/pdf/2302.08453.pdf

Code address: https://github.com/TencentARC/T2I-Adapter

It is a small model that can be attached to a large text-to-image diffusion model to enhance its controllability. T2I-Adapter works by learning to align external control signals (such as sketches, depth maps, or pose keypoints) with the internal knowledge of the frozen diffusion model. This lets users control the structure and layout of generated images through an extra condition image, rather than through the text prompt alone.

Some advantages of T2I-Adapter mentioned in the official paper:

  • Plug-and-play. Does not affect the original network topology or generative capabilities.
  • Simple and compact. Approximately 77M parameters and roughly 300 MB of storage.
  • Flexible. Adapters can be trained for a variety of control conditions.
  • Combinable. Multiple adapters can be used to achieve multi-condition control.
  • Universal. Can be directly applied to custom models.
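The "plug-and-play" and "combinable" points above can be illustrated with a toy sketch. This is not the real architecture (the actual adapter is a small convolutional network trained end to end); here plain average pooling stands in for it, just to show the mechanism the paper describes: the adapter turns a condition map into multi-scale features, several adapters can be combined as a weighted sum, and the result is simply added to the frozen U-Net encoder's features at matching scales.

```python
import numpy as np

def adapter_features(cond, num_scales=4):
    """Toy stand-in for a T2I-Adapter: turn a condition map
    (e.g. a sketch or depth map) into multi-scale feature maps.
    The real adapter is a small conv network; here we just
    average-pool down to each scale for illustration."""
    feats = []
    f = cond
    for _ in range(num_scales):
        h, w = f.shape[0] // 2, f.shape[1] // 2
        # 2x2 average pooling with stride 2
        f = f[: h * 2, : w * 2].reshape(h, 2, w, 2).mean(axis=(1, 3))
        feats.append(f)
    return feats

def compose(adapter_feats_list, weights):
    """Multi-adapter composition: a weighted sum of the features
    produced by several adapters (e.g. sketch + depth)."""
    return [
        sum(w * f[i] for w, f in zip(weights, adapter_feats_list))
        for i in range(len(adapter_feats_list[0]))
    ]

def inject(encoder_feats, adapter_feats):
    """Plug-and-play injection: adapter features are added to the
    frozen U-Net encoder features at matching scales, leaving the
    original network untouched."""
    return [e + a for e, a in zip(encoder_feats, adapter_feats)]

# Two toy 64x64 condition maps, standing in for a sketch and a depth map.
rng = np.random.default_rng(0)
sketch = rng.random((64, 64))
depth = rng.random((64, 64))

fs = adapter_features(sketch)
fd = adapter_features(depth)
combined = compose([fs, fd], weights=[0.6, 0.4])

# Pretend these are the frozen encoder's features at each scale.
encoder = [np.zeros_like(f) for f in combined]
out = inject(encoder, combined)
print([f.shape for f in out])  # scales: 32x32, 16x16, 8x8, 4x4
```

Because injection is pure addition, the base model's weights and topology never change, which is exactly why an adapter trained once can be reused with custom fine-tuned checkpoints of the same base model.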

Both ControlNet and T2I-Adapter are technologies for controllable text-to-image generation: each uses a small auxiliary model to steer a large diffusion model. However, there are some differences between them:

  • ControlNet's control branch is a trainable copy of the diffusion U-Net's encoder blocks, connected to the frozen model through zero convolutions, while T2I-Adapter is a much smaller standalone convolutional network.
  • ControlNet can provide finer control over the generated images, whereas T2I-Adapter is more lightweight.
  • ControlNet requires more training data and computational resources, while T2I-Adapter is easier to train.

The author of ControlNet, Lvmin Zhang, has been a PhD student in Stanford University's Computer Science department since 2022. He graduated from Soochow University in 2021 with a Bachelor's degree in Engineering. His research areas include computational art and design, interactive content creation, computer graphics, and image and video processing, with a particular focus on anime. He organized a special interest research group called Style2Paints Research and also developed an anime drawing tool named Style2Paints.

In the extensions section of the Stable Diffusion web UI, models for both ControlNet and T2I-Adapter are available.

In terms of output quality, ControlNet should in theory perform better, but many users report no significant difference in practice. T2I-Adapter, however, generates images roughly three times faster than ControlNet.

Previously, T2I-Adapter had fewer model types available, but recently more T2I-Adapter models have appeared on Civitai: https://civitai.com/models/17220?modelVersionId=20330

You can install the ones you need by following the instructions on that page.

Of course, ControlNet and T2I-Adapter can also be used together.