Researchers from UIUC and Microsoft published a paper titled "Multi-LoRA Composition for Image Generation."
Low-Rank Adaptation (LoRA) is widely used in text-to-image models to render specific elements accurately in generated images, such as particular characters or styles. I have introduced LoRA before.
However, existing methods face challenges in effectively combining multiple LoRAs, especially when the number of LoRAs to be integrated increases, thus hindering the creation of complex images. This paper studies the combination of multiple LoRAs from a decoding perspective and proposes two training-free methods:
LoRA Switch, which alternates between different LoRAs at each denoising step.
LoRA Composite, which combines all LoRAs simultaneously to guide more coherent image synthesis.
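The two methods above can be illustrated with a small sketch. This is not the authors' implementation. It assumes that LoRA Switch activates one LoRA at a time in round-robin order, and that LoRA Composite averages one classifier-free-guidance estimate per LoRA with equal weights. The function names, the `switch_every` parameter, and the toy vectors are all hypothetical and exist only for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def lora_switch_index(step: int, num_loras: int, switch_every: int = 1) -> int:
    """Return the index of the single LoRA active at a denoising step.

    Assumption: LoRAs are visited round-robin, changing every `switch_every` steps.
    """
    if num_loras < 1 or switch_every < 1:
        raise ValueError("num_loras and switch_every must be positive")
    return (step // switch_every) % num_loras


def guided(uncond: Vector, cond: Vector, scale: float) -> Vector:
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def lora_composite_estimate(
    unconds: Sequence[Vector], conds: Sequence[Vector], scale: float
) -> Vector:
    """Average the guided noise estimates produced with each LoRA enabled.

    Assumption: every LoRA contributes with equal weight.
    """
    if not unconds or len(unconds) != len(conds):
        raise ValueError("need one (uncond, cond) pair per LoRA")
    per_lora = [guided(u, c, scale) for u, c in zip(unconds, conds)]
    k = len(per_lora)
    return [sum(values) / k for values in zip(*per_lora)]


# LoRA Switch: which of 3 LoRAs is active over 8 steps, switching every 2 steps.
schedule = [lora_switch_index(t, num_loras=3, switch_every=2) for t in range(8)]

# LoRA Composite: toy 2-dimensional noise estimates for 2 LoRAs (made-up numbers).
composite = lora_composite_estimate(
    unconds=[[0.0, 0.0], [2.0, 2.0]],
    conds=[[1.0, 0.0], [2.0, 4.0]],
    scale=2.0,
)

print(schedule)   # [0, 0, 1, 1, 2, 2, 0, 0]
print(composite)  # [2.0, 3.0]
```

In a real pipeline the vectors would be the denoiser's noise predictions for the current latent. The point of the sketch is only that Switch changes which LoRA is loaded per step, while Composite runs every LoRA at every step and averages the results.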
Project Features
🚀 Training-free methods
LoRA Switch and LoRA Composite achieve dynamic and precise integration of multiple LoRAs without fine-tuning. Unlike methods that merge LoRA weights, our approach focuses on the decoding process, keeping all LoRA weights intact.
📊 ComposLoRA Testbed
A comprehensive new testbed comprising 480 composition sets and 22 pre-trained LoRAs across six categories. ComposLoRA is designed for quantitative evaluation of LoRA-based composable image generation.
📝 Evaluator Based on GPT-4V
We propose using GPT-4V as an evaluator to assess the effectiveness of combinations and image quality. This evaluator has shown better correlation with human judgment.
🏆 Outstanding Performance
Both automated and human evaluations show that our methods significantly outperform the popular LoRA Merge baseline. The advantage is most pronounced when generating complex compositions.
Generation Process
An overview of three Multi-LoRA techniques: Each colored LoRA represents a unique element. The popular method, LoRA Merge, linearly merges multiple LoRAs into one. In contrast, our methods focus on the denoising process: LoRA Switch cycles through different LoRAs during denoising, while LoRA Composite involves all LoRAs guiding the entire generation process.
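For contrast with the decoding-time methods, the weight-level merge can be sketched as follows. This is a minimal illustration under a common formulation, not code from the paper: it assumes each LoRA is a low-rank pair of matrices (`up`, `down`) and that merging adds a weighted sum of their products to the base weight. The helper names `matmul` and `merge_loras` and the toy matrices are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, sufficient for tiny illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_loras(
    base: Matrix,
    loras: Sequence[Tuple[Matrix, Matrix]],
    weights: Sequence[float],
) -> Matrix:
    """Return base + sum_i weights[i] * (up_i @ down_i).

    Assumption: each LoRA is stored as an (up, down) pair whose product
    has the same shape as the base weight matrix.
    """
    if len(loras) != len(weights):
        raise ValueError("need one weight per LoRA")
    merged = [row[:] for row in base]  # copy so the base weights stay intact
    for (up, down), w in zip(loras, weights):
        delta = matmul(up, down)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += w * value
    return merged


# Toy example: a 2x2 base weight and two rank-1 LoRAs (made-up numbers).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = ([[1.0], [0.0]], [[0.0, 2.0]])  # up is 2x1, down is 1x2
lora_b = ([[0.0], [1.0]], [[4.0, 0.0]])

merged = merge_loras(base, [lora_a, lora_b], weights=[0.5, 0.5])
print(merged)  # [[1.0, 1.0], [2.0, 1.0]]
```

After merging, the individual LoRAs are no longer separable: only the single matrix `merged` is used during denoising. LoRA Switch and LoRA Composite instead keep each LoRA's weights intact and act on the denoising process.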