February 22, 2024

Stability AI is out today with an early preview of Stable Diffusion 3.0, its next-generation flagship text-to-image generative AI model.

Stability AI has been steadily iterating on and releasing image models over the past year, each showing increasing sophistication and quality. The SDXL release in July dramatically improved the Stable Diffusion base model, and now the company is looking to go significantly further.

The new Stable Diffusion 3.0 model aims to provide improved image quality and better performance in generating images from multi-subject prompts. It will also provide significantly better typography than prior Stable Diffusion models, enabling more accurate and consistent spelling inside generated images. Typography has been an area of weakness for Stable Diffusion in the past, and it is one that rivals including DALL-E 3, Ideogram and Midjourney have also been working to improve in recent releases. Stability AI is building out Stable Diffusion 3.0 in multiple model sizes ranging from 800M to 8B parameters.

Stable Diffusion 3.0 isn’t just a new version of a model that Stability AI has already released; it is based on a new architecture.

“Stable Diffusion 3 is a diffusion transformer, a new type of architecture similar to the one used in the recent OpenAI Sora model,” Emad Mostaque, CEO of Stability AI, told VentureBeat. “It is the real successor to the original Stable Diffusion.”

Stability AI has been experimenting with multiple approaches to generating images. Earlier this month the company released a preview of Stable Cascade, which uses the Würstchen architecture to improve performance and accuracy. Stable Diffusion 3.0 takes a different approach by using diffusion transformers.

“Stable Diffusion did not have a transformer before,” Mostaque said.

