Last Tuesday, Stability AI launched Stable Diffusion XL Turbo (SDXL Turbo), its latest generative AI model. Like Midjourney or DALL-E, it creates images from text. What sets SDXL Turbo apart is one feature in particular: it can generate images “in real time”.
SDXL Turbo’s ability to generate an image in a single step is a major advance over both competing models and its own predecessors. There is no more waiting: the generated image updates as you type the prompt.
This efficiency comes from a new technique called Adversarial Diffusion Distillation (ADD). ADD trains the model in two ways at once: the model learns from an existing image synthesis model that acts as a teacher, and it is trained against a discriminator that tries to tell real images from generated ones. The adversarial part is also what improves the realism of the generated images.
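For readers who want to try single-step generation themselves, here is a minimal sketch. It rests on assumptions that are not in the article: that you use the Hugging Face `diffusers` library, the `stabilityai/sdxl-turbo` checkpoint published on the Hugging Face Hub, and a CUDA-capable GPU. The helper names `turbo_call_kwargs` and `generate` are made up for this illustration.

```python
def turbo_call_kwargs(prompt: str) -> dict:
    """Build the arguments for a single-step generation call."""
    return {
        "prompt": prompt,
        "num_inference_steps": 1,  # one denoising step instead of dozens
        "guidance_scale": 0.0,     # assumption: guidance is disabled for this checkpoint
    }


def generate(prompt: str, device: str = "cuda"):
    """Load SDXL Turbo and return one image for the prompt.

    Assumes the `torch` and `diffusers` packages are installed. The pipeline
    is reloaded on every call to keep the sketch short; a real application
    would load it once and reuse it.
    """
    # Imported lazily so turbo_call_kwargs works without these packages.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo",
        torch_dtype=torch.float16,
        variant="fp16",
    )
    pipe.to(device)
    return pipe(**turbo_call_kwargs(prompt)).images[0]
```

Calling `generate("a lighthouse at dusk").save("out.png")` would write the result to disk. The important setting is `num_inference_steps=1`, which is what makes the single-step behaviour described above visible in practice.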
What makes SDXL Turbo stand out the most is its speed. According to Stability AI, on an NVIDIA A100 (a GPU designed for artificial intelligence workloads), SDXL Turbo can create a 512×512 pixel image in just 207 milliseconds. This figure covers the entire process: encoding, denoising and decoding.
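To put the 207-millisecond figure in perspective, the short calculation below converts it into images per second. The 30 frames-per-second target is an assumption chosen here as a typical video frame rate; it is not a number from Stability AI.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to images per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def speedup_needed(latency_ms: float, target_fps: float) -> float:
    """How many times faster generation must be to reach target_fps."""
    return target_fps / frames_per_second(latency_ms)


REPORTED_LATENCY_MS = 207.0  # figure reported by Stability AI on an A100
ASSUMED_VIDEO_FPS = 30.0     # assumption: a common video frame rate

print(f"{frames_per_second(REPORTED_LATENCY_MS):.1f} images per second")   # 4.8
print(f"{speedup_needed(REPORTED_LATENCY_MS, ASSUMED_VIDEO_FPS):.1f}x faster needed")  # 6.2x
```

At the reported latency, the model produces a little under five images per second. That is fast enough to feel interactive while typing a prompt, but still several times short of typical video frame rates.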
The speed of SDXL Turbo opens up new possibilities, such as real-time AI-driven video filters or even advanced graphics for video games. For these applications to be viable, however, challenges such as coherence still need to be solved. Coherence here means keeping successive frames or generations consistent with one another, so that the image does not warp from one to the next.
If you want to know more, Stability AI has published a detailed article on its own website explaining how ADD and SDXL Turbo work.